Josh Hunt 23c6c1e50b CI: Speed up Playwright e2e tests workflow (#119277)
* CI: Replace Dagger builds with native make for Playwright e2e tests

Switch from Dagger-based builds to native Go/JS builds for the
Playwright e2e test pipeline. Grafana now runs as a native binary
on the CI runner instead of in a Docker container.

Key changes:
- build-backend: actions/setup-go + make build-go (instead of Dagger)
- build-frontend: actions/setup-node + make build-js + yarn e2e:plugin:build
- run-playwright-tests: downloads artifacts, uses start-server script
  to run Grafana natively (instead of Docker container from GHCR)
- build-grafana: standalone full Dagger build, off the Playwright
  critical path (still produces Docker/tarball for push-docker-image
  and run-a11y-test)
- required-playwright-tests: no longer depends on build-grafana
- Remove debug env vars (ACTIONS_STEP_DEBUG, RUNNER_DEBUG)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix zizmor template-injection: use docker/login-action for GHCR login in Playwright job

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace GHA service container with docker run for Grafana

GHA service containers start before any step (including checkout), so
volume-mounted config files don't exist yet and Grafana crashes. The
health check never passes, blocking all steps from running.

Switch to docker run -d in a step after checkout, so all files are
available when the container starts. This eliminates the need for the
docker restart workaround and the zizmor unpinned-images suppression.
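
A minimal sketch of the "start the container in a step, then wait for it" pattern described above. The `wait_until` helper is hypothetical, and the container name, port mapping, config path and image variable in the comments are placeholders, not values taken from the workflow.

```shell
set -eu

# wait_until ATTEMPTS DELAY CMD...
# Run CMD until it succeeds, sleeping DELAY seconds between tries.
# Returns 1 if CMD still fails after ATTEMPTS tries.
wait_until() {
  attempts="$1"; delay="$2"; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Assumed usage in a step that runs after checkout (all values are placeholders):
#   docker run -d --name grafana -p 3000:3000 \
#     -v "$PWD/grafana.ini:/etc/grafana/grafana.ini:ro" "$IMAGE"
#   wait_until 30 2 curl -fsS http://localhost:3000/api/health
```

Because the container is started by an ordinary step, the bind-mounted files already exist on the runner when Grafana boots, and the polling loop replaces the service container's built-in health check.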

Verified locally: built all three Dagger steps (backend, frontend,
assembly with --import-dir + chmod +x), loaded the Docker image, and
confirmed Grafana starts successfully with volume-mounted config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix Docker image binary permissions lost by actions/upload-artifact

actions/upload-artifact strips execute permissions (all files become 644).
The backend binaries need +x restored before Dagger packages them into the
Docker image, otherwise the container fails with "Permission denied" when
trying to exec the grafana binary.

Verified locally: pulled the CI-built image from GHCR, confirmed binaries
had 664 permissions, added chmod +x, and tested the full service container
restart flow successfully.
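
A minimal sketch of restoring the execute bit after the artifact download. The `restore_exec` helper and the `bin` directory in the usage note are hypothetical; the actual binary path used by the workflow is not stated here.

```shell
set -eu

# restore_exec DIR
# Add the execute bit to every regular file under DIR.
# DIR stands in for wherever the downloaded backend binaries land.
restore_exec() {
  find "$1" -type f -exec chmod +x {} +
}
```

Calling `restore_exec bin` before the Dagger packaging step would give the binaries their execute permission back before they are copied into the image.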

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix CI failures: pin docker/login-action, fix docker tarball glob, suppress zizmor unpinned-images

- Pin docker/login-action@v3 to hash @5e57cd118135c172c3672efd75eb46360885c0ef
- Use glob *.docker.tar.gz in push-docker-image (Dagger produces versioned filenames)
- Add unpinned-images ignore for pr-e2e-tests.yml (dynamic build output image)
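
A minimal sketch of resolving the versioned tarball through the glob. The `find_tarball` helper and the `dist` directory in the usage note are hypothetical; only the `*.docker.tar.gz` pattern comes from this commit.

```shell
set -eu

# find_tarball DIR
# Print the single *.docker.tar.gz file in DIR.
# Fails when there are zero or several matches, so a wrong file is never used.
find_tarball() {
  set -- "$1"/*.docker.tar.gz
  if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
    echo "expected exactly one *.docker.tar.gz" >&2
    return 1
  fi
  printf '%s\n' "$1"
}
```

A push step could then run `docker load -i "$(find_tarball dist)"` without hard-coding the version in the filename.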

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CI: Fix missing bundled-plugins directory in build-grafana

actions/upload-artifact skips empty directories, so the bundled-plugins
dir (empty in OSS builds) doesn't exist after download. Create it before
running Dagger to prevent the --import-dir from failing.
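
A minimal sketch of recreating the directory that the artifact round trip drops. The `ensure_dir` helper is hypothetical, and the path in the usage note is a placeholder for wherever the workflow expects bundled-plugins to live.

```shell
set -eu

# ensure_dir DIR
# Create DIR (and any missing parents). Safe to call when DIR already exists.
ensure_dir() {
  mkdir -p "$1"
}
```

Running something like `ensure_dir artifacts/bundled-plugins` after the download step and before Dagger keeps the import from failing on OSS builds, where the directory is empty.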

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CI: Speed up Playwright e2e tests workflow

Split the monolithic Grafana build into three Dagger jobs (backend,
frontend, assembly) with granular caching. Use the --import-dir flag to
pre-populate the artifact store so the assembly step skips compilation.
Run each Playwright shard with 4 parallel workers instead of 1, reduce
the shard count from 8 to 6, and use GHA service containers with
bind-mounted config instead of building custom e2e Docker images. Add
workflow concurrency, job timeouts, and dependency caching. This reduces
the critical path from ~32 minutes to ~17 minutes on cold builds and to
~9 minutes with warm caches.

Expected impact:
- Parallel backend/frontend builds save 6-8 minutes (vs sequential)
- GHA output cache hits reduce build time to 0 seconds
- Docker service container approach eliminates per-shard overhead (5-7 min saved)
- 4 workers per shard and reduced retry count improve test throughput
- Workflow concurrency prevents wasted runs
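
A minimal sketch of the per-shard invocation implied by the shard and worker counts above. `npx playwright test` as the entry point is an assumption (the workflow may wrap it in a yarn script), and `shard_command` is a hypothetical helper; `--shard` and `--workers` are standard Playwright CLI flags.

```shell
set -eu

# shard_command INDEX TOTAL WORKERS
# Print the Playwright invocation for shard INDEX of TOTAL.
# "npx playwright test" is an assumed entry point, not taken from the workflow.
shard_command() {
  printf 'npx playwright test --shard=%s/%s --workers=%s\n' "$1" "$2" "$3"
}

# Six shards with four workers each, as described in this commit.
for shard in 1 2 3 4 5 6; do
  shard_command "$shard" 6 4
done
```

In CI each shard would normally be its own matrix job rather than a loop iteration; the loop here only lists the commands.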

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

try merging frontend artifact

run the tests

shard tree artifact, delete artifacts, fix ts files being excluded

copy bin

fix path

fix path

fix script

another try

fix incorrect permissions

* try stitching together standalone grafana build

* include more dirs in frontend build

fix paths

* try caching node-modules better

try caching node-modules

disable YARN_ENABLE_HARDENED_MODE

temporarily stop caching node_modules to test performance

temp don't cache node_modules to measure perf

fix frontend cache

* add script for downloading the report and viewing it locally

* Update codeowners

* Add workflow to build grafana docker image

* add placeholder check

* Use hosted runners for everything

* Bump actions versions

* Don't cache playwright browser installs

* build e2e test plugins in each shard

* Split bench report into separate step

and update bench to v1

* try packaging less of the public dir

* Package up whole public directory

it's needed for some reason

* Run the grafana server migrations in the background while playwright installs

* Fix flaky time picker preferences tests

* Fix detect-changes always running e2e tests

* Skip building frontend source maps

* Don't check out repo in report steps

* Add per-shard failure instructions
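
A minimal sketch of overlapping the server startup (and its migrations) with the Playwright install, as mentioned in the list above. `start_in_background` is a hypothetical helper, and the script path and install command in the comments are assumptions rather than the workflow's actual steps.

```shell
set -eu

# start_in_background LOGFILE CMD...
# Launch CMD in the background with its output sent to LOGFILE.
# The PID of the background job is left in $bg_pid.
start_in_background() {
  logfile="$1"; shift
  "$@" > "$logfile" 2>&1 &
  bg_pid=$!
}

# Assumed usage (paths and commands are placeholders):
#   start_in_background server.log ./scripts/start-server
#   npx playwright install --with-deps   # runs while migrations proceed
#   ...then poll the server's health endpoint before starting the tests
```

The server is long-running, so the tests should wait on a health check rather than on the background job itself.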
2026-03-05 12:54:49 +00:00