feat: Enhance CI workflows by adding linting step, updating documentation, and configuring development dependencies

2025-10-27 08:54:11 +01:00
parent 70db34d088
commit e8a86b15e4
10 changed files with 279 additions and 78 deletions


@@ -117,46 +117,6 @@ pytest tests/e2e/ --headed
When adding new workflows, mirror this structure to ensure secrets, caching, and deployment steps remain aligned with the production environment.
## CI Owner Coordination Notes
### Key Findings
- Self-hosted runner: ASUS System Product Name chassis with AMD Ryzen 7 7700X (8 physical cores / 16 threads) and 63.2 GB usable RAM; `act_runner` configuration not overridden, so only one workflow job runs concurrently today.
- Unit test matrix job: completes 117 pytest cases in roughly 4.1 seconds after Postgres spins up; Docker services consume ~150 MB for `postgres:16-alpine`, with minimal sustained CPU load once tests begin.
- End-to-end matrix job: `pytest tests/e2e` averages 21–22 seconds of execution, but a cold run downloads ~179 MB of apt packages plus ~470 MB of Playwright browser bundles (Chromium, Firefox, WebKit, FFmpeg), exceeding 650 MB network transfer and adding several gigabytes of disk writes if caches are absent.
- Both jobs reuse existing Python package caches when available; absent a shared cache service, repeated Playwright installs remain the dominant cost driver for cold executions.
### Open Questions
- Can we raise the runner concurrency above the default single job, or provision an additional runner, so the test matrix can execute without serializing queued workflows?
- Is there a central cache or artifact service available for Python wheels and Playwright browser bundles to avoid ~650 MB downloads on cold starts?
- Are we permitted to bake Playwright browsers into the base runner image, or should we pursue a shared cache/proxy solution instead?
### Outreach Draft
```text
Subject: CalMiner CI parallelization support
Hi <CI Owner>,
We recently updated the CalMiner test workflow to fan out unit and Playwright E2E suites in parallel. While validating the change, we gathered the following:
- Runner host: ASUS System Product Name with AMD Ryzen 7 7700X (8 cores / 16 threads), ~63 GB RAM, default `act_runner` concurrency (1 job at a time).
- Unit job finishes in ~4.1 s once Postgres is ready; light CPU and network usage.
- E2E job finishes in ~22 s, but a cold run pulls ~179 MB of apt packages plus ~470 MB of Playwright browser payloads (>650 MB download, several GB disk writes) because we do not have a shared cache yet.
To move forward, could you help with the following?
1. Confirm whether we can raise the runner concurrency limit or provision an additional runner so parallel jobs do not queue behind one another.
2. Let us know if a central cache (Artifactory, Nexus, etc.) is available for Python wheels and Playwright browser bundles, or if we should consider baking the browsers into the runner image instead.
3. Share any guidance on preferred caching or proxy solutions for large binary installs on self-hosted runners.
Once we have clarity, we can finalize the parallel rollout and update the documentation accordingly.
Thanks,
<Your Name>
```
## Workflow Optimization Opportunities
### `test.yml`
@@ -216,3 +176,43 @@ Thanks,
- Benefits: centralizes proxy logic and dependency installs, reduces duplication across matrix jobs, and keeps future lint/type-check jobs lightweight by disabling database setup.
- Implementation status: action available at `.gitea/actions/setup-python-env` and consumed by `test.yml`; extend it to additional workflows as they adopt the shared routine (a hedged sketch follows this list).
- Obsolete steps removed: the individual apt proxy, dependency-install, Playwright-install, and database-setup commands were pruned from `test.yml` once the composite action was integrated.
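
For orientation, here is a minimal sketch of what the composite action might look like. The input names (`install-playwright`, `setup-database`), the `requirements-dev.txt` file, and the database helper are assumptions for illustration; the committed `action.yml` may differ.

```yaml
# Hypothetical sketch of .gitea/actions/setup-python-env/action.yml;
# input names, file names, and steps are illustrative, not the committed version.
name: Setup Python environment
description: Shared dependency installation for CalMiner CI jobs
inputs:
  install-playwright:
    description: Download Playwright browser bundles when "true"
    required: false
    default: "false"
  setup-database:
    description: Prepare the Postgres service when "true"
    required: false
    default: "false"
runs:
  using: composite
  steps:
    - name: Install Python dependencies
      shell: bash
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt -r requirements-dev.txt
    - name: Install Playwright browsers
      if: ${{ inputs.install-playwright == 'true' }}
      shell: bash
      run: python -m playwright install --with-deps
    - name: Prepare database
      if: ${{ inputs.setup-database == 'true' }}
      shell: bash
      run: python -m scripts.init_db  # hypothetical helper; substitute the real setup command
```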
## CI Owner Coordination Notes
### Key Findings
- Self-hosted runner: ASUS System Product Name chassis with AMD Ryzen 7 7700X (8 physical cores / 16 threads) and 63.2 GB usable RAM; the `act_runner` configuration is not overridden, so only one workflow job runs concurrently today (see the configuration sketch after this list).
- Unit test matrix job: completes 117 pytest cases in roughly 4.1 seconds after Postgres spins up; Docker services consume ~150 MB for `postgres:16-alpine`, with minimal sustained CPU load once tests begin.
- End-to-end matrix job: `pytest tests/e2e` averages 21–22 seconds of execution, but a cold run downloads ~179 MB of apt packages plus ~470 MB of Playwright browser bundles (Chromium, Firefox, WebKit, FFmpeg), exceeding 650 MB network transfer and adding several gigabytes of disk writes if caches are absent.
- Both jobs reuse existing Python package caches when available; absent a shared cache service, repeated Playwright installs remain the dominant cost driver for cold executions.
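
If the CI owner signs off, raising concurrency is a small change in the runner's `config.yaml`. A minimal sketch, assuming the stock `act_runner` configuration layout and an illustrative capacity of 2:

```yaml
# act_runner config.yaml fragment; the capacity value is illustrative and
# should be agreed with the CI owner before rollout.
runner:
  capacity: 2   # default is 1, which serializes queued workflow jobs
  timeout: 3h
```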
### Open Questions
- Can we raise the runner concurrency above the default single job, or provision an additional runner, so the test matrix can execute without serializing queued workflows?
- Is there a central cache or artifact service available for Python wheels and Playwright browser bundles to avoid ~650 MB downloads on cold starts? (An illustrative per-workflow cache sketch follows this list.)
- Are we permitted to bake Playwright browsers into the base runner image, or should we pursue a shared cache/proxy solution instead?
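
Until a central cache or artifact service is confirmed, per-workflow caching could already remove most of the ~650 MB cold-start download. An illustrative sketch using `actions/cache`, assuming the Gitea cache server is reachable from the runner and Playwright uses its default browser path:

```yaml
# Illustrative cache steps for test.yml; key patterns and paths are assumptions.
- name: Cache pip wheels
  uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: pip-${{ runner.os }}-${{ hashFiles('requirements*.txt') }}
- name: Cache Playwright browsers
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright   # default Playwright browser cache on Linux
    key: playwright-${{ runner.os }}-${{ hashFiles('requirements*.txt') }}
```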
### Outreach Draft
```text
Subject: CalMiner CI parallelization support
Hi <CI Owner>,
We recently updated the CalMiner test workflow to fan out unit and Playwright E2E suites in parallel. While validating the change, we gathered the following:
- Runner host: ASUS System Product Name with AMD Ryzen 7 7700X (8 cores / 16 threads), ~63 GB RAM, default `act_runner` concurrency (1 job at a time).
- Unit job finishes in ~4.1 s once Postgres is ready; light CPU and network usage.
- E2E job finishes in ~22 s, but a cold run pulls ~179 MB of apt packages plus ~470 MB of Playwright browser payloads (>650 MB download, several GB disk writes) because we do not have a shared cache yet.
To move forward, could you help with the following?
1. Confirm whether we can raise the runner concurrency limit or provision an additional runner so parallel jobs do not queue behind one another.
2. Let us know if a central cache (Artifactory, Nexus, etc.) is available for Python wheels and Playwright browser bundles, or if we should consider baking the browsers into the runner image instead.
3. Share any guidance on preferred caching or proxy solutions for large binary installs on self-hosted runners.
Once we have clarity, we can finalize the parallel rollout and update the documentation accordingly.
Thanks,
<Your Name>
```
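
For context, the fan-out described in the draft could be expressed as a single matrix job. The sketch below is a rough outline only; the suite names, unit-test path, and composite-action inputs are assumptions rather than the committed `test.yml`.

```yaml
# Rough outline of a matrix fan-out; job names and paths are illustrative.
jobs:
  tests:
    runs-on: self-hosted
    strategy:
      matrix:
        suite: [unit, e2e]
    steps:
      - uses: actions/checkout@v4
      - uses: ./.gitea/actions/setup-python-env
        with:
          install-playwright: ${{ matrix.suite == 'e2e' }}
          setup-database: "true"
      - name: Run unit tests
        if: ${{ matrix.suite == 'unit' }}
        run: pytest tests/unit   # assumed unit-test path
      - name: Run E2E tests
        if: ${{ matrix.suite == 'e2e' }}
        run: pytest tests/e2e
```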