# Testing, CI and Quality Assurance

This chapter centralizes the project's testing strategy, CI configuration, and quality targets.

## Overview

CalMiner uses a combination of unit, integration, and end-to-end tests to ensure quality.

### Frameworks

- Backend: pytest for unit and integration tests.
- Frontend: pytest with Playwright for E2E tests.
- Database: pytest fixtures with psycopg2 for DB tests.

### Test Types

- Unit Tests: Test individual functions/modules.
- Integration Tests: Test API endpoints and DB interactions.
- E2E Tests: Playwright for full user flows.

### CI/CD

- Use Gitea Actions for CI/CD; workflows live under `.gitea/workflows/`.
- `test.yml` runs on every push, provisions a temporary Postgres 16 service, waits for readiness, executes the setup script in dry-run and live modes, then fans out into parallel matrix jobs for unit (`pytest tests/unit`) and end-to-end (`pytest tests/e2e`) suites. Playwright browsers install only for the E2E job.
- `build-and-push.yml` runs only after the **Run Tests** workflow finishes successfully (triggered via `workflow_run` on `main`). Once tests pass, it builds the Docker image with `docker/build-push-action@v2`, reuses cache-backed layers, and pushes to the Gitea registry.
- `deploy.yml` runs only after the build workflow reports success on `main`. It connects to the target host (via `appleboy/ssh-action`), pulls the Docker image tagged with the build commit SHA, and restarts the container with that exact image reference.
- Mandatory secrets: `REGISTRY_USERNAME`, `REGISTRY_PASSWORD`, `REGISTRY_URL`, `SSH_HOST`, `SSH_USERNAME`, `SSH_PRIVATE_KEY`.
- Run tests on pull requests to shared branches; enforce a minimum coverage of ≥80% via pytest-cov.

### Running Tests

- Unit: `pytest tests/unit/`
- E2E: `pytest tests/e2e/`
- All: `pytest`

### Test Directory Structure

Organize tests under the `tests/` directory, mirroring the application structure:

```text
tests/
  unit/
    test_*.py
  e2e/
    test_*.py
  fixtures/
    conftest.py
```

### Fixtures and Test Data

- Define reusable fixtures in `tests/fixtures/conftest.py`.
- Use temporary in-memory databases or isolated schemas for DB tests.
- Load sample data via fixtures for consistent test environments.
- Leverage the `seeded_ui_data` fixture in `tests/unit/conftest.py` to populate scenarios with related cost, maintenance, and simulation records for deterministic UI route checks.

### E2E (Playwright) Tests

The E2E test suite, located in `tests/e2e/`, uses Playwright to simulate user interactions in a live browser environment. These tests are designed to catch issues in the UI, frontend-backend integration, and overall application flow.

#### Fixtures

- `live_server`: A session-scoped fixture that launches the FastAPI application in a separate process, making it accessible to the browser.
- `playwright_instance`, `browser`, `page`: Standard `pytest-playwright` fixtures for managing the Playwright instance, browser, and individual pages.

#### Smoke Tests

- UI Page Loading: `test_smoke.py` contains a parameterized test that systematically navigates to all UI routes to ensure they load without errors, have the correct title, and display a primary heading (see the sketch below).
- Form Submissions: Each major form in the application has a corresponding test file (e.g., `test_scenarios.py`, `test_costs.py`) that verifies the page loads, an item can be created by filling the form, a success message appears, and the UI updates.
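For orientation, here is a minimal sketch of such a parameterized smoke test. The route list is illustrative, and it assumes `live_server` yields the base URL of the running app; the authoritative version lives in `tests/e2e/test_smoke.py`.

```python
import pytest

# Illustrative route list; the real parameters live in tests/e2e/test_smoke.py.
UI_ROUTES = ["/", "/scenarios", "/costs"]


@pytest.mark.parametrize("path", UI_ROUTES)
def test_ui_page_loads(live_server, page, path):
    # Assumption: live_server resolves to the base URL of the app under test.
    page.goto(f"{live_server}{path}")
    # The title should carry the app name and a primary heading should render.
    assert "CalMiner" in page.title()
    assert page.locator("h1").first.is_visible()
```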
### Running E2E Tests

To run the Playwright tests:

```bash
pytest tests/e2e/
```

To run in headed mode:

```bash
pytest tests/e2e/ --headed
```

### Mocking and Dependency Injection

- Use `unittest.mock` to mock external dependencies.
- Inject dependencies via function parameters or FastAPI's dependency overrides in tests (see the sketch below).
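A hedged sketch of the override pattern follows. The module paths `app.main` and `app.dependencies.get_repository`, the repository method, and the `/api/scenarios` route are hypothetical stand-ins for the real application object, provider, and endpoint.

```python
from unittest.mock import MagicMock

from fastapi.testclient import TestClient

# Hypothetical imports standing in for the real app and dependency provider.
from app.main import app
from app.dependencies import get_repository


def test_list_scenarios_uses_stubbed_repository():
    # Replace the real dependency with a mock for the duration of the test.
    repo = MagicMock()
    repo.list_scenarios.return_value = [{"id": 1, "name": "baseline"}]
    app.dependency_overrides[get_repository] = lambda: repo
    try:
        client = TestClient(app)
        response = client.get("/api/scenarios")  # hypothetical route
        assert response.status_code == 200
        assert response.json()[0]["name"] == "baseline"
    finally:
        # Always clear overrides so other tests see the real dependency.
        app.dependency_overrides.clear()
```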
### Code Coverage

- Install `pytest-cov` to generate coverage reports.
- Run with coverage: `pytest --cov --cov-report=term` (use `--cov-report=html` when visualizing hotspots).
- Target 95%+ overall coverage. Focus on historically low modules: `services/simulation.py`, `services/reporting.py`, `middleware/validation.py`, and `routes/ui.py`.
- Latest snapshot (2025-10-21): `pytest --cov=. --cov-report=term-missing` reports **91%** overall coverage.

### CI Integration

`test.yml` encapsulates the steps below:

- Check out the repository and set up Python 3.10.
- Configure the runner's apt proxy (if available), install project dependencies (requirements + test extras), and download Playwright browsers.
- Run `pytest` (extend with `--cov` flags when enforcing coverage).

> The pip cache step is temporarily disabled in `test.yml` until the self-hosted cache service is exposed (see `docs/ci-cache-troubleshooting.md`).

`build-and-push.yml` adds:

- Registry login using repository secrets.
- Docker image build/push with GHA cache storage (`cache-from`/`cache-to` set to `type=gha`).

`deploy.yml` handles:

- SSH into the deployment host.
- Pull the tagged image from the registry.
- Stop, remove, and relaunch the `calminer` container exposing port 8000.

When adding new workflows, mirror this structure so that secrets, caching, and deployment steps remain aligned with the production environment.

## CI Owner Coordination Notes

### Key Findings

- Self-hosted runner: ASUS System Product Name chassis with AMD Ryzen 7 7700X (8 physical cores / 16 threads) and 63.2 GB usable RAM; the `act_runner` configuration is not overridden, so only one workflow job runs concurrently today.
- Unit test matrix job: completes 117 pytest cases in roughly 4.1 seconds after Postgres spins up; Docker services consume ~150 MB for `postgres:16-alpine`, with minimal sustained CPU load once tests begin.
- End-to-end matrix job: `pytest tests/e2e` averages 21-22 seconds of execution, but a cold run downloads ~179 MB of apt packages plus ~470 MB of Playwright browser bundles (Chromium, Firefox, WebKit, FFmpeg), exceeding 650 MB of network transfer and adding several gigabytes of disk writes if caches are absent.
- Both jobs reuse existing Python package caches when available; absent a shared cache service, repeated Playwright installs remain the dominant cost driver for cold executions.

### Open Questions

- Can we raise the runner concurrency above the default single job, or provision an additional runner, so the test matrix can execute without serializing queued workflows?
- Is there a central cache or artifact service available for Python wheels and Playwright browser bundles to avoid ~650 MB downloads on cold starts?
- Are we permitted to bake Playwright browsers into the base runner image, or should we pursue a shared cache/proxy solution instead?

### Outreach Draft

```text
Subject: CalMiner CI parallelization support

Hi ,

We recently updated the CalMiner test workflow to fan out unit and Playwright E2E suites in parallel. While validating the change, we gathered the following:

- Runner host: ASUS System Product Name with AMD Ryzen 7 7700X (8 cores / 16 threads), ~63 GB RAM, default `act_runner` concurrency (1 job at a time).
- Unit job finishes in ~4.1 s once Postgres is ready; light CPU and network usage.
- E2E job finishes in ~22 s, but a cold run pulls ~179 MB of apt packages plus ~470 MB of Playwright browser payloads (>650 MB download, several GB disk writes) because we do not have a shared cache yet.

To move forward, could you help with the following?

1. Confirm whether we can raise the runner concurrency limit or provision an additional runner so parallel jobs do not queue behind one another.
2. Let us know if a central cache (Artifactory, Nexus, etc.) is available for Python wheels and Playwright browser bundles, or if we should consider baking the browsers into the runner image instead.
3. Share any guidance on preferred caching or proxy solutions for large binary installs on self-hosted runners.

Once we have clarity, we can finalize the parallel rollout and update the documentation accordingly.

Thanks,
```

## Workflow Optimization Opportunities

### `test.yml`

- Run the apt-proxy setup once via a composite action or preconfigured runner image if additional matrix jobs are added.
- Collapse dependency installation into a single `pip install -r requirements-test.txt` call (which includes the base requirements) once caching is restored.
- Investigate caching or pre-baking Playwright browser binaries to eliminate >650 MB cold downloads per run.

### `build-and-push.yml`

- Skip QEMU setup, or explicitly constrain Buildx to linux/amd64, to reduce startup time.
- Enable `cache-from`/`cache-to` settings (registry or `type=gha`) to reuse Docker build layers between runs.

### `deploy.yml`

- Extract the deployment script into a reusable shell script or compose file to minimize inline secrets and ease multi-environment scaling.
- Add a post-deploy health check (e.g., a `curl` readiness probe, sketched below) before declaring success.
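A minimal sketch of such a readiness probe, written in Python (rather than `curl`) for consistency with the rest of the codebase; the `/health` route and port 8000 are assumptions to adjust to whatever endpoint the app actually exposes.

```python
import sys
import time
import urllib.request

# Hypothetical endpoint; adjust to the health route the app really serves.
HEALTH_URL = "http://localhost:8000/health"


def wait_for_healthy(url: str = HEALTH_URL, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll the endpoint until it returns HTTP 200 or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # Connection refused or timeout: container still starting.
        time.sleep(delay)
    return False


if __name__ == "__main__":
    sys.exit(0 if wait_for_healthy() else 1)
```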
### Priority Overview

1. Restore shared caching for Python wheels and Playwright browsers once infrastructure exposes the cache service (highest impact on runtime and bandwidth; requires coordination with CI owners).
2. Enable Docker layer caching in `build-and-push.yml` to shorten build cycles (medium effort, immediate benefit to release workflows).
3. Add post-deploy health verification to `deploy.yml` (low effort, improves confidence in automation).
4. Streamline redundant setup steps in `test.yml` (medium effort once the cache strategy is in place; consider composite actions or base image updates).

### Setup Consolidation Opportunities

- The `Run Tests` matrix jobs each execute the apt proxy configuration, pip installs, database wait, and setup scripts. A composite action or shell-script wrapper could centralize these routines and parameterize target-specific behavior (unit vs. e2e) to avoid copy/paste maintenance as additional jobs (lint, type check) are introduced.
- Both the test and build workflows perform a `checkout` step; while unavoidable per workflow, shared git submodule or sparse-checkout rules could be encapsulated in a composite action to keep options consistent.
- The database setup script currently runs twice (dry-run and live) for every matrix leg. Evaluate whether the dry-run remains necessary once migrations stabilize; if retained, consider adding an environment variable toggle to skip redundant seed operations for read-only suites (e.g., lint).

### Proposed Shared Setup Action

- Location: `.gitea/actions/setup-python-env/action.yml` (composite action).
- Inputs:
  - `python-version` (default `3.10`): forwarded to `actions/setup-python`.
  - `install-playwright` (default `false`): when `true`, run `python -m playwright install --with-deps`.
  - `install-requirements` (default `requirements.txt requirements-test.txt`): space-delimited list of requirement files the pip installs iterate over.
  - `run-db-setup` (default `true`): toggles the database wait + setup scripts.
  - `db-dry-run` (default `true`): controls whether the dry-run invocation executes.
- Steps encapsulated:
  1. Set up Python via `actions/setup-python@v5` using the provided version.
  2. Configure the apt proxy via a shared shell snippet (with graceful fallback when the proxy is offline).
  3. Iterate over the requirement files and execute `pip install -r <file>` for each.
  4. If `install-playwright == true`, install the browsers.
  5. If `run-db-setup == true`, run the wait-for-Postgres Python snippet (a sketch appears at the end of this section) and call `scripts/setup_database.py`, honoring the `db-dry-run` toggle.
- Usage sketch (in `test.yml`):

```yaml
- name: Prepare Python environment
  uses: ./.gitea/actions/setup-python-env
  with:
    install-playwright: ${{ matrix.target == 'e2e' }}
    db-dry-run: true
```

- Benefits: centralizes proxy logic and dependency installs, reduces duplication across matrix jobs, and keeps future lint/type-check jobs lightweight by disabling database setup.
- Implementation status: the action is available at `.gitea/actions/setup-python-env` and consumed by `test.yml`; extend it to additional workflows as they adopt the shared routine.
- Obsolete steps removed: the individual apt proxy, dependency install, Playwright, and database setup commands were pruned from `test.yml` once the composite action was integrated.
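For reference, a minimal sketch of what the wait-for-Postgres snippet from step 5 might look like. The repository's actual script may differ, and the `DATABASE_URL` variable name and default DSN are assumptions that should match whatever the workflow exports.

```python
import os
import sys
import time

import psycopg2

# Assumed env var and default DSN mirroring the Postgres service in test.yml.
DSN = os.environ.get("DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/postgres")


def wait_for_postgres(dsn: str = DSN, attempts: int = 30, delay: float = 1.0) -> bool:
    """Try to connect until Postgres accepts connections or attempts run out."""
    for _ in range(attempts):
        try:
            psycopg2.connect(dsn).close()
            return True
        except psycopg2.OperationalError:
            time.sleep(delay)  # Service container still initializing.
    return False


if __name__ == "__main__":
    sys.exit(0 if wait_for_postgres() else 1)
```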