feat: Enhance CI workflows by adding linting step, updating documentation, and configuring development dependencies
@@ -20,7 +20,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        target: [unit, e2e]
+        target: [unit, e2e, lint]
     services:
       postgres:
         image: postgres:16-alpine
@@ -44,6 +44,8 @@ jobs:
         run: |
           if [ "${{ matrix.target }}" = "unit" ]; then
            pytest tests/unit
+          elif [ "${{ matrix.target }}" = "lint" ]; then
+            ruff check .
           else
            pytest tests/e2e
           fi
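The branch added above can be exercised outside CI. A minimal sketch of the same dispatch, with the matrix value passed as a plain argument instead of `${{ matrix.target }}`; command strings are echoed rather than executed so the logic can be checked locally:

```shell
# Mirrors the workflow's run step: pick a command per matrix target.
select_command() {
  target="$1"
  if [ "$target" = "unit" ]; then
    echo "pytest tests/unit"
  elif [ "$target" = "lint" ]; then
    echo "ruff check ."
  else
    echo "pytest tests/e2e"
  fi
}

select_command unit   # pytest tests/unit
select_command lint   # ruff check .
select_command e2e    # pytest tests/e2e
```

Note the `else` branch absorbs any unknown target; a stricter guard could fail fast on unexpected matrix values.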
.prettierrc (new file, 8 lines)
@@ -0,0 +1,8 @@
+{
+  "semi": true,
+  "singleQuote": true,
+  "trailingComma": "es5",
+  "printWidth": 80,
+  "tabWidth": 2,
+  "useTabs": false
+}
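Since `.prettierrc` is plain JSON, a quick sanity check after editing it is to parse it. A sketch, assuming `python3` is on PATH; it writes a copy under `/tmp` to stay self-contained:

```shell
# Recreate the committed .prettierrc and verify it parses as JSON.
cat > /tmp/prettierrc.json <<'EOF'
{
  "semi": true,
  "singleQuote": true,
  "trailingComma": "es5",
  "printWidth": 80,
  "tabWidth": 2,
  "useTabs": false
}
EOF
python3 -m json.tool < /tmp/prettierrc.json > /dev/null && echo "valid JSON"
```

In the repository itself the equivalent check is simply running prettier against the tree.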
@@ -117,46 +117,6 @@ pytest tests/e2e/ --headed
 
 When adding new workflows, mirror this structure to ensure secrets, caching, and deployment steps remain aligned with the production environment.
-
-## CI Owner Coordination Notes
-
-### Key Findings
-
-- Self-hosted runner: ASUS System Product Name chassis with AMD Ryzen 7 7700X (8 physical cores / 16 threads) and 63.2 GB usable RAM; `act_runner` configuration not overridden, so only one workflow job runs concurrently today.
-- Unit test matrix job: completes 117 pytest cases in roughly 4.1 seconds after Postgres spins up; Docker services consume ~150 MB for `postgres:16-alpine`, with minimal sustained CPU load once tests begin.
-- End-to-end matrix job: `pytest tests/e2e` averages 21-22 seconds of execution, but a cold run downloads ~179 MB of apt packages plus ~470 MB of Playwright browser bundles (Chromium, Firefox, WebKit, FFmpeg), exceeding 650 MB network transfer and adding several gigabytes of disk writes if caches are absent.
-- Both jobs reuse existing Python package caches when available; absent a shared cache service, repeated Playwright installs remain the dominant cost driver for cold executions.
-
-### Open Questions
-
-- Can we raise the runner concurrency above the default single job, or provision an additional runner, so the test matrix can execute without serializing queued workflows?
-- Is there a central cache or artifact service available for Python wheels and Playwright browser bundles to avoid ~650 MB downloads on cold starts?
-- Are we permitted to bake Playwright browsers into the base runner image, or should we pursue a shared cache/proxy solution instead?
-
-### Outreach Draft
-
-```text
-Subject: CalMiner CI parallelization support
-
-Hi <CI Owner>,
-
-We recently updated the CalMiner test workflow to fan out unit and Playwright E2E suites in parallel. While validating the change, we gathered the following:
-
-- Runner host: ASUS System Product Name with AMD Ryzen 7 7700X (8 cores / 16 threads), ~63 GB RAM, default `act_runner` concurrency (1 job at a time).
-- Unit job finishes in ~4.1 s once Postgres is ready; light CPU and network usage.
-- E2E job finishes in ~22 s, but a cold run pulls ~179 MB of apt packages plus ~470 MB of Playwright browser payloads (>650 MB download, several GB disk writes) because we do not have a shared cache yet.
-
-To move forward, could you help with the following?
-
-1. Confirm whether we can raise the runner concurrency limit or provision an additional runner so parallel jobs do not queue behind one another.
-2. Let us know if a central cache (Artifactory, Nexus, etc.) is available for Python wheels and Playwright browser bundles, or if we should consider baking the browsers into the runner image instead.
-3. Share any guidance on preferred caching or proxy solutions for large binary installs on self-hosted runners.
-
-Once we have clarity, we can finalize the parallel rollout and update the documentation accordingly.
-
-Thanks,
-<Your Name>
-```
 
 ## Workflow Optimization Opportunities
 
 ### `test.yml`
@@ -216,3 +176,43 @@ Thanks,
 - Benefits: centralizes proxy logic and dependency installs, reduces duplication across matrix jobs, and keeps future lint/type-check jobs lightweight by disabling database setup.
 - Implementation status: action available at `.gitea/actions/setup-python-env` and consumed by `test.yml`; extend to additional workflows as they adopt the shared routine.
 - Obsolete steps removed: individual apt proxy, dependency install, Playwright, and database setup commands pruned from `test.yml` once the composite action was integrated.
+
+## CI Owner Coordination Notes
+
+### Key Findings
+
+- Self-hosted runner: ASUS System Product Name chassis with AMD Ryzen 7 7700X (8 physical cores / 16 threads) and 63.2 GB usable RAM; `act_runner` configuration not overridden, so only one workflow job runs concurrently today.
+- Unit test matrix job: completes 117 pytest cases in roughly 4.1 seconds after Postgres spins up; Docker services consume ~150 MB for `postgres:16-alpine`, with minimal sustained CPU load once tests begin.
+- End-to-end matrix job: `pytest tests/e2e` averages 21-22 seconds of execution, but a cold run downloads ~179 MB of apt packages plus ~470 MB of Playwright browser bundles (Chromium, Firefox, WebKit, FFmpeg), exceeding 650 MB network transfer and adding several gigabytes of disk writes if caches are absent.
+- Both jobs reuse existing Python package caches when available; absent a shared cache service, repeated Playwright installs remain the dominant cost driver for cold executions.
+
+### Open Questions
+
+- Can we raise the runner concurrency above the default single job, or provision an additional runner, so the test matrix can execute without serializing queued workflows?
+- Is there a central cache or artifact service available for Python wheels and Playwright browser bundles to avoid ~650 MB downloads on cold starts?
+- Are we permitted to bake Playwright browsers into the base runner image, or should we pursue a shared cache/proxy solution instead?
+
+### Outreach Draft
+
+```text
+Subject: CalMiner CI parallelization support
+
+Hi <CI Owner>,
+
+We recently updated the CalMiner test workflow to fan out unit and Playwright E2E suites in parallel. While validating the change, we gathered the following:
+
+- Runner host: ASUS System Product Name with AMD Ryzen 7 7700X (8 cores / 16 threads), ~63 GB RAM, default `act_runner` concurrency (1 job at a time).
+- Unit job finishes in ~4.1 s once Postgres is ready; light CPU and network usage.
+- E2E job finishes in ~22 s, but a cold run pulls ~179 MB of apt packages plus ~470 MB of Playwright browser payloads (>650 MB download, several GB disk writes) because we do not have a shared cache yet.
+
+To move forward, could you help with the following?
+
+1. Confirm whether we can raise the runner concurrency limit or provision an additional runner so parallel jobs do not queue behind one another.
+2. Let us know if a central cache (Artifactory, Nexus, etc.) is available for Python wheels and Playwright browser bundles, or if we should consider baking the browsers into the runner image instead.
+3. Share any guidance on preferred caching or proxy solutions for large binary installs on self-hosted runners.
+
+Once we have clarity, we can finalize the parallel rollout and update the documentation accordingly.
+
+Thanks,
+<Your Name>
+```
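Until a shared cache service is confirmed, per-runner environment variables can at least keep repeat installs warm across jobs on the same host. A hedged sketch: the `/tmp/ci-cache` paths are hypothetical placeholders, while `PIP_CACHE_DIR` and `PLAYWRIGHT_BROWSERS_PATH` are the standard pip and Playwright variables:

```shell
# Point pip and Playwright at persistent per-runner directories so the
# cold-start downloads happen once per host rather than once per job.
# The paths below are hypothetical; pick a volume that survives job teardown.
export PIP_CACHE_DIR=/tmp/ci-cache/pip
export PLAYWRIGHT_BROWSERS_PATH=/tmp/ci-cache/ms-playwright
mkdir -p "$PIP_CACHE_DIR" "$PLAYWRIGHT_BROWSERS_PATH"
```

This does not replace a central artifact service, but it removes the repeated ~470 MB browser download on warm hosts.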
docs/architecture/07_deployment/07_03_gitea_action_runner.md (new file, 152 lines)
@@ -0,0 +1,152 @@
+# Gitea Action Runner Setup
+
+This guide describes how to provision, configure, and maintain self-hosted runners for CalMiner's Gitea-based CI/CD pipelines.
+
+## 1. Purpose and Scope
+
+- Explain the role runners play in executing GitHub Actions–compatible workflows inside our private Gitea instance.
+- Define supported environments (Windows hosts running Docker for Linux containers today, Alpine or other Linux variants as future additions).
+- Provide repeatable steps so additional runners can be brought online quickly and consistently.
+
+## 2. Prerequisites
+
+- **Hardware**: Minimum 8 vCPU, 16 GB RAM, and 50 GB free disk. For Playwright-heavy suites, plan for ≥60 GB free to absorb browser caches.
+- **Operating system**: Current runner uses Windows 11 Pro (10.0.26100, 64-bit). Linux instructions mirror the same flow; see section 7 for Alpine specifics.
+- **Container engine**: Docker Desktop (Windows) or Docker Engine (Linux) with pull access to `docker.gitea.com/runner-images` and `postgres:16-alpine`.
+- **Dependencies**: `curl`, `tar`, PowerShell 7+ (Windows), or standard GNU utilities (Linux) to unpack releases.
+- **Gitea access**: Repository admin or site admin token with permission to register self-hosted runners (`Settings → Runners → New Runner`).
+
+### Current Runner Inventory (October 2025)
+
+- Hostname `DESKTOP-GLB3A15`; ASUS System Product Name chassis with AMD Ryzen 7 7700X (8C/16T) and ~63 GB usable RAM.
+- Windows 11 Pro 10.0.26100 (64-bit) hosting Docker containers for Ubuntu-based job images.
+- `act_runner` version `v0.2.13`; no `act_runner.yaml` present, so defaults apply (single concurrency, no custom labels beyond registration).
+- Registered against `http://192.168.88.30:3000` with labels:
+  - `ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest`
+  - `ubuntu-24.04:docker://docker.gitea.com/runner-images:ubuntu-24.04`
+  - `ubuntu-22.04:docker://docker.gitea.com/runner-images:ubuntu-22.04`
+- Runner metadata stored in `.runner`; removing this file forces re-registration and should only be done intentionally.
+
+## 3. Runner Installation
+
+### 3.1 Download and Extract
+
+```powershell
+$runnerVersion = "v0.2.13"
+$downloadUrl = "https://gitea.com/gitea/act_runner/releases/download/$runnerVersion/act_runner_${runnerVersion}_windows_amd64.zip"
+Invoke-WebRequest -Uri $downloadUrl -OutFile act_runner.zip
+Expand-Archive act_runner.zip -DestinationPath C:\Tools\act-runner -Force
+```
+
+For Linux, download the `linux_amd64.tar.gz` artifact and extract with `tar -xzf` into `/opt/act-runner`.
+
+### 3.2 Configure Working Directory
+
+```powershell
+Set-Location C:\Tools\act-runner
+New-Item -ItemType Directory -Path logs -Force | Out-Null
+```
+
+Ensure the directory is writable by the service account that will execute the runner.
+
+### 3.3 Register With Gitea
+
+1. In Gitea, navigate to the repository or organization **Settings → Runners → New Runner**.
+2. Copy the registration token and instance URL.
+3. Execute the registration wizard:
+
+```powershell
+.\act_runner.exe register --instance http://192.168.88.30:3000 --token <TOKEN> --labels "ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest" "ubuntu-24.04:docker://docker.gitea.com/runner-images:ubuntu-24.04" "ubuntu-22.04:docker://docker.gitea.com/runner-images:ubuntu-22.04"
+```
+
+Linux syntax is identical using `./act_runner register`.
+
+This command populates `.runner` with the runner ID, UUID, and labels.
+
+## 4. Service Configuration
+
+### 4.1 Windows Service
+
+Act Runner provides a built-in service helper:
+
+```powershell
+.\act_runner.exe install
+.\act_runner.exe start
+```
+
+The service runs under `LocalSystem` by default. Use `.\act_runner.exe install --user <DOMAIN\User> --password <Secret>` if isolation is required.
+
+### 4.2 Linux systemd Unit
+
+Create `/etc/systemd/system/act-runner.service`:
+
+```ini
+[Unit]
+Description=Gitea Act Runner
+After=docker.service
+Requires=docker.service
+
+[Service]
+WorkingDirectory=/opt/act-runner
+ExecStart=/opt/act-runner/act_runner daemon
+Restart=always
+RestartSec=10
+Environment="HTTP_PROXY=http://apt-cacher:3142" "HTTPS_PROXY=http://apt-cacher:3142"
+
+[Install]
+WantedBy=multi-user.target
+```
+
+Enable and start:
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable --now act-runner.service
+```
+
+### 4.3 Environment Variables and Proxy Settings
+
+- Configure `HTTP_PROXY`, `HTTPS_PROXY`, and their lowercase variants to leverage the shared apt cache (`http://apt-cacher:3142`).
+- Persist Docker registry credentials (for `docker.gitea.com`) in the service user profile using `docker login`; workflows rely on cached authentication for builds.
+- To expose pip caching once infrastructure is available, set `PIP_INDEX_URL` and `PIP_EXTRA_INDEX_URL` at the service level.
+
+### 4.4 Logging
+
+- Windows services write to `%ProgramData%\act-runner\logs`. Redirect or forward to centralized logging if required.
+- Linux installations can leverage `journalctl -u act-runner` and logrotate rules for `/opt/act-runner/logs`.
+
+## 5. Network and Security
+
+- **Outbound**: Allow HTTPS traffic to the Gitea instance, Docker Hub, docker.gitea.com, npm (for Playwright), PyPI, and the apt cache proxy.
+- **Inbound**: No inbound ports are required; block unsolicited traffic on internet-facing hosts.
+- **Credentials**: Store deployment SSH keys and registry credentials in Gitea secrets, not on the runner host.
+- **Least privilege**: Run the service under a dedicated account with access only to Docker and required directories.
+
+## 6. Maintenance and Upgrades
+
+- **Version checks**: Monitor `https://gitea.com/gitea/act_runner/releases` and schedule upgrades quarterly or when security fixes drop.
+- **Upgrade procedure**: Stop the service, replace the `act_runner` binary, and restart. Re-registration is not required as long as `.runner` remains intact.
+- **Health checks**: Periodically validate connectivity with `act_runner exec --detect-event -W .gitea/workflows/test.yml` and inspect workflow durations to catch regressions.
+- **Cleanup**: Purge Docker images and volumes monthly (`docker system prune -af`) to reclaim disk space.
+- **Troubleshooting**: Use `act_runner diagnose` (if available in newer versions) or review logs for repeated failures; reset by stopping the service, deleting stale job containers (`docker ps -a`), and restarting.
+
+## 7. Alpine-based Runner Notes
+
+- Install baseline packages: `apk add docker bash curl coreutils nodejs npm python3 py3-pip libstdc++`.
+- Playwright requirements: add `apk add chromium nss freetype harfbuzz ca-certificates mesa-gl` or install Playwright browsers via `npx playwright install --with-deps` using the Alpine bundle.
+- Musl vs glibc: When workflows require glibc (e.g., certain Python wheels), include `apk add gcompat` or base images on `frolvlad/alpine-glibc`.
+- Systemd alternative: Use `rc-service` or `supervisord` to manage `act_runner daemon` on Alpine since systemd is absent.
+- Storage: Mount `/var/lib/docker` to persistent storage if running inside a VM, ensuring browser downloads and layer caches survive restarts.
+
+## 8. Appendix
+
+- **Troubleshooting checklist**:
+  - Verify Docker daemon is healthy (`docker info`).
+  - Confirm `.runner` file exists and lists expected labels.
+  - Re-run `act_runner register` if the runner no longer appears in Gitea.
+  - Check proxy endpoints are reachable before jobs start downloading dependencies.
+- **Related documentation**:
+  - `docs/architecture/07_deployment/07_01_testing_ci.md` (workflow architecture and CI owner coordination).
+  - `docs/ci-cache-troubleshooting.md` (pip caching status and known issues).
+  - `.gitea/actions/setup-python-env/action.yml` (shared job preparation logic referenced in workflows).
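The download step in §3.1 of the new guide differs only in artifact suffix between platforms. A small helper capturing the naming convention used in that section; the pattern simply mirrors the URLs shown above, so check the release page for the authoritative artifact names:

```shell
# Build the act_runner release artifact name per OS, following the
# URL pattern used in section 3.1 of the runner guide.
artifact_name() {
  version="$1"  # e.g. v0.2.13
  os="$2"       # windows | linux
  case "$os" in
    windows) echo "act_runner_${version}_windows_amd64.zip" ;;
    linux)   echo "act_runner_${version}_linux_amd64.tar.gz" ;;
    *)       echo "unsupported" ;;
  esac
}

artifact_name v0.2.13 windows   # act_runner_v0.2.13_windows_amd64.zip
artifact_name v0.2.13 linux     # act_runner_v0.2.13_linux_amd64.tar.gz
```

Keeping the version in one variable, as the PowerShell snippet does, makes quarterly upgrades a one-line change.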
@@ -41,16 +41,42 @@ The infrastructure components for the application include:
 
 ```mermaid
 graph TD
+    G[Git Repository] --> C[CI/CD Pipeline]
+    C --> GAW[Gitea Action Workflows]
+    GAW --> GAR[Gitea Action Runners]
+    GAR --> T[Testing]
+    GAR --> CI[Continuous Integration]
+    T --> G
+    CI --> G
+
     W[Web Server] --> DB[Database Server]
+    RP[Reverse Proxy] --> W
+    I((Internet)) <--> RP
+    PO[Containerization] --> W
+    C[CI/CD Pipeline] --> PO
     W --> S[Static File Server]
-    P[Reverse Proxy] --> W
+    S --> RP
-    C[CI/CD Pipeline] --> W
+    PO --> DB
-    F[Containerization] --> W
+    PO --> S
 ```
 
 ## Environments
 
-The application can be deployed in multiple environments to support development, testing, and production:
+The application can be deployed in multiple environments to support development, testing, and production.
+
+```mermaid
+graph TD
+    R[Repository] --> DEV[Development Environment]
+    R[Repository] --> TEST[Testing Environment]
+    R[Repository] --> PROD[Production Environment]
+
+    DEV --> W_DEV[Web Server - Dev]
+    DEV --> DB_DEV[Database Server - Dev]
+    TEST --> W_TEST[Web Server - Test]
+    TEST --> DB_TEST[Database Server - Test]
+    PROD --> W_PROD[Web Server - Prod]
+    PROD --> DB_PROD[Database Server - Prod]
+```
 
 ### Development Environment
 
@@ -73,7 +99,7 @@ The production environment is set up for serving live traffic and includes:
 
 - Production PostgreSQL instance
 - FastAPI server running in production mode
-- Load balancer (e.g., Nginx) for distributing incoming requests
+- Load balancer (Traefik) for distributing incoming requests
 - Monitoring and logging tools for tracking application performance
 
 ## Containerized Deployment Flow
@@ -84,12 +110,12 @@ The Docker-based deployment path aligns with the solution strategy documented in
 
 - The multi-stage `Dockerfile` installs dependencies in a builder layer (including system compilers and Python packages) and copies only the required runtime artifacts to the final image.
 - Build arguments are minimal; database configuration is supplied at runtime via granular variables (`DATABASE_DRIVER`, `DATABASE_HOST`, `DATABASE_PORT`, `DATABASE_USER`, `DATABASE_PASSWORD`, `DATABASE_NAME`, optional `DATABASE_SCHEMA`). Secrets and configuration should be passed via environment variables or an orchestrator.
-- The resulting image exposes port `8000` and starts `uvicorn main:app` (s. [README.md](../../README.md)).
+- The resulting image exposes port `8000` and starts `uvicorn main:app` (see main [README.md](../../README.md)).
 
 ### Runtime Environment
 
 - For single-node deployments, run the container alongside PostgreSQL/Redis using Docker Compose or an equivalent orchestrator.
-- A reverse proxy (e.g., Nginx) terminates TLS and forwards traffic to the container on port `8000`.
+- A reverse proxy (Traefik) terminates TLS and forwards traffic to the container on port `8000`.
 - Migrations must be applied prior to rolling out a new image; automation can hook into the deploy step to run `scripts/run_migrations.py`.
 
 ### CI/CD Integration
@@ -168,8 +168,6 @@ docker compose -f docker-compose.postgres.yml down
 docker volume rm calminer_postgres_local_postgres_data # optional cleanup
 ```
 
-Document successful runs (or issues encountered) in `.github/instructions/DONE.TODO.md` for future reference.
-
 ### Seeding reference data
 
 `scripts/seed_data.py` provides targeted control over the baseline datasets when the full setup script is not required:
@@ -202,7 +200,7 @@ After a failure and rollback, rerun the full setup once the environment issues a
 The `.gitea/workflows/test.yml` job spins up a temporary PostgreSQL 16 container and runs the setup script twice: once with `--dry-run` to validate the plan and again without it to apply migrations and seeds. No external secrets are required; the workflow sets the following environment variables for both invocations and for pytest:
 
 | Variable | Value | Purpose |
-| --- | --- | --- |
+| ----------------------------- | ------------- | ------------------------------------------------- |
 | `DATABASE_DRIVER` | `postgresql` | Signals the driver to the setup script |
 | `DATABASE_HOST` | `postgres` | Hostname of the Postgres job service container |
 | `DATABASE_PORT` | `5432` | Default service port |
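At test time these granular variables compose into a single connection URL. A sketch of the assembly; the driver, host, and port values mirror the table above, while the user, password, and database name are placeholders for rows not shown in this hunk:

```shell
# Assemble a driver://user:password@host:port/name URL from the granular
# variables. user/password/name below are placeholders, not the real values.
DATABASE_DRIVER=postgresql
DATABASE_HOST=postgres
DATABASE_PORT=5432
DATABASE_USER=calminer      # placeholder
DATABASE_PASSWORD=secret    # placeholder
DATABASE_NAME=calminer_test # placeholder

DATABASE_URL="${DATABASE_DRIVER}://${DATABASE_USER}:${DATABASE_PASSWORD}@${DATABASE_HOST}:${DATABASE_PORT}/${DATABASE_NAME}"
echo "$DATABASE_URL"   # postgresql://calminer:secret@postgres:5432/calminer_test
```

Keeping the pieces granular lets the workflow override only the host and port when the Postgres service container moves.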
@@ -228,8 +226,6 @@ Recommended execution order:
 2. Execute the live run with the same flags minus `--dry-run` to provision the database, role grants, migrations, and seed data. Save the log as `reports/setup_staging_apply.log`.
 3. Repeat the dry run to verify idempotency and record the result (for example `reports/setup_staging_post_apply.log`).
 
-Record any issues in `.github/instructions/TODO.md` or `.github/instructions/DONE.TODO.md` as appropriate so the team can track follow-up actions.
-
 ## Database Objects
 
 The database contains tables such as `capex`, `opex`, `chemical_consumption`, `fuel_consumption`, `water_consumption`, `scrap_consumption`, `production_output`, `equipment_operation`, `ore_batch`, `exchange_rate`, and `simulation_result`.
@@ -17,7 +17,7 @@ This guide outlines how to provision and validate the CalMiner staging database
 Populate the following environment variables before invoking the setup script. Store them in a secure location such as `config/setup_staging.env` (excluded from source control) and load them with `dotenv` or your shell profile.
 
 | Variable | Description |
-| --- | --- |
+| ----------------------------- | ----------------------------------------------------------------------------------------- |
 | `DATABASE_HOST` | Staging PostgreSQL hostname or IP (for example `staging-db.internal`). |
 | `DATABASE_PORT` | Port exposed by the staging PostgreSQL service (default `5432`). |
 | `DATABASE_NAME` | CalMiner staging database name (for example `calminer_staging`). |
@@ -98,4 +98,3 @@ Run the setup script in three phases to validate idempotency and capture diagnos
 ## Next Steps
 
 - Keep this document updated as staging infrastructure evolves (for example, when migrating to managed services or rotating credentials).
-- Once staging validation is complete, summarize the outcome in `.github/instructions/DONE.TODO.md` and cross-link the relevant log files.
pyproject.toml (new file, 16 lines)
@@ -0,0 +1,16 @@
+[tool.black]
+line-length = 80
+target-version = ['py310']
+include = '\.pyi?$'
+exclude = '''
+/(
+    .git
+  | .hg
+  | .mypy_cache
+  | .tox
+  | .venv
+  | build
+  | dist
+)/
+'''
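Black's `include` setting above is a regular expression, not a glob: `\.pyi?$` matches paths ending in `.py` or `.pyi` (but not `.pyc`). A quick emulation with `grep -E`:

```shell
# Emulate black's include pattern \.pyi?$ against candidate paths.
matches() {
  printf '%s\n' "$1" | grep -qE '\.pyi?$' && echo yes || echo no
}

matches app/main.py    # yes
matches stubs/api.pyi  # yes
matches app/main.pyc   # no
```

The same regex semantics apply to the `exclude` block, which is why its directory names sit inside a `/( ... )/` alternation.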
requirements-dev.txt (new file, 1 line)
@@ -0,0 +1 @@
+black
@@ -3,3 +3,4 @@ pytest-cov
 pytest-httpx
 playwright
 pytest-playwright
+ruff