3 Commits

11 changed files with 249 additions and 121 deletions

.gitignore

@@ -16,6 +16,9 @@ env/
 # environment variables
 .env
+*.env
+# except example files
+!config/*.env.example
 # github instruction files
 .github/instructions/

config/setup_staging.env.example

@@ -0,0 +1,11 @@
# Sample environment configuration for staging deployment
DATABASE_HOST=staging-db.internal
DATABASE_PORT=5432
DATABASE_NAME=calminer_staging
DATABASE_USER=calminer_app
DATABASE_PASSWORD=<app-password>
# Admin connection used for provisioning database and roles
DATABASE_SUPERUSER=postgres
DATABASE_SUPERUSER_PASSWORD=<admin-password>
DATABASE_SUPERUSER_DB=postgres
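
For orientation, this sample defines two credential sets: an admin connection for provisioning (which targets `DATABASE_SUPERUSER_DB`, not the application database) and the application connection used at runtime. A minimal sketch of the two DSNs these variables imply, using standard libpq URL syntax (the setup script's own helpers, such as `config.admin_dsn`, appear later in this change):

```python
import os

# Admin DSN: used for CREATE DATABASE / CREATE ROLE, so it connects to the
# maintenance database (DATABASE_SUPERUSER_DB), not the app database.
admin_dsn = (
    f"postgresql://{os.environ['DATABASE_SUPERUSER']}:"
    f"{os.environ['DATABASE_SUPERUSER_PASSWORD']}@"
    f"{os.environ['DATABASE_HOST']}:{os.environ['DATABASE_PORT']}/"
    f"{os.environ['DATABASE_SUPERUSER_DB']}"
)

# Application DSN: what the FastAPI app and tests use day to day.
app_dsn = (
    f"postgresql://{os.environ['DATABASE_USER']}:"
    f"{os.environ['DATABASE_PASSWORD']}@"
    f"{os.environ['DATABASE_HOST']}:{os.environ['DATABASE_PORT']}/"
    f"{os.environ['DATABASE_NAME']}"
)
```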

View File

@@ -1,13 +1,14 @@
 # Sample environment configuration for running scripts/setup_database.py against a test instance
 DATABASE_DRIVER=postgresql
-DATABASE_HOST=192.168.88.35
+DATABASE_HOST=postgres
 DATABASE_PORT=5432
 DATABASE_NAME=calminer_test
 DATABASE_USER=calminer_test
-DATABASE_PASSWORD=calminer_test_password
+DATABASE_PASSWORD=<test-password>
-DATABASE_SCHEMA=public
+# optional: specify schema if different from 'public'
+#DATABASE_SCHEMA=public
 # Admin connection used for provisioning database and roles
 DATABASE_SUPERUSER=postgres
-DATABASE_SUPERUSER_PASSWORD=M11ffpgm.
+DATABASE_SUPERUSER_PASSWORD=<superuser-password>
 DATABASE_SUPERUSER_DB=postgres

docker-compose.postgres.yml

@@ -0,0 +1,23 @@
version: "3.9"
services:
  postgres:
    image: postgres:16-alpine
    container_name: calminer_postgres_local
    restart: unless-stopped
    environment:
      POSTGRES_DB: calminer_local
      POSTGRES_USER: calminer
      POSTGRES_PASSWORD: secret
    ports:
      - "5433:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U calminer -d calminer_local"]
      interval: 10s
      timeout: 5s
      retries: 10
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
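
As a quick sanity check before running the setup script, you can confirm the compose service accepts connections; a minimal sketch, assuming `psycopg2-binary` is installed locally and using the credentials hard-coded in the compose file (host port `5433` because of the `"5433:5432"` mapping):

```python
import psycopg2  # assumption: psycopg2-binary installed in the local venv

conn = psycopg2.connect(
    host="127.0.0.1",
    port=5433,
    user="calminer",
    password="secret",
    dbname="calminer_local",
)
# `with conn` wraps a transaction; it does not close the connection.
with conn, conn.cursor() as cur:
    cur.execute("SELECT version()")
    print(cur.fetchone()[0])  # e.g. "PostgreSQL 16.x ..."
conn.close()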

View File

@@ -37,7 +37,7 @@ The application can be deployed in multiple environments to support development,
 The development environment is set up for local development and testing. It includes:
-- Local PostgreSQL instance
+- Local PostgreSQL instance (Docker Compose recommended; compose file at `docker-compose.postgres.yml`)
 - FastAPI server running in debug mode
 ### Testing Environment

View File

@@ -21,7 +21,7 @@ CalMiner uses a combination of unit, integration, and end-to-end tests to ensure
 ### CI/CD
 - Use Gitea Actions for CI/CD; workflows live under `.gitea/workflows/`.
-- `test.yml` runs on every push with cached Python dependencies via `actions/cache@v3`.
+- `test.yml` runs on every push, provisions a temporary Postgres 16 service, waits for readiness, executes the setup script in dry-run and live modes, installs Playwright browsers, and finally runs the full pytest suite.
 - `build-and-push.yml` builds the Docker image with `docker/build-push-action@v2`, reusing GitHub Actions cache-backed layers, and pushes to the Gitea registry.
 - `deploy.yml` connects to the target host (via `appleboy/ssh-action`) to pull the freshly pushed image and restart the container.
 - Mandatory secrets: `REGISTRY_USERNAME`, `REGISTRY_PASSWORD`, `REGISTRY_URL`, `SSH_HOST`, `SSH_USERNAME`, `SSH_PRIVATE_KEY`.
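
The "waits for readiness" step itself is not shown in this diff; one way such a wait can be implemented is a short retry loop like the sketch below (the workflow may equally use `pg_isready`; `psycopg2` on the runner and the placeholder password are assumptions, while hostname, port, and database mirror the CI table later in this change):

```python
import time

import psycopg2  # assumption: installed in the runner's virtualenv


def wait_for_postgres(dsn: str, retries: int = 30, delay: float = 2.0) -> None:
    """Retry until the Postgres service container accepts connections."""
    for attempt in range(1, retries + 1):
        try:
            psycopg2.connect(dsn).close()
            return
        except psycopg2.OperationalError:
            if attempt == retries:
                raise
            time.sleep(delay)


# <password> is a placeholder for the inline value the workflow sets.
wait_for_postgres("postgresql://calminer:<password>@postgres:5432/calminer_ci")
```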
@@ -99,10 +99,11 @@ pytest tests/e2e/ --headed
 `test.yml` encapsulates the steps below:
 - Check out the repository and set up Python 3.10.
-- Restore the pip cache (keyed by `requirements.txt`).
-- Install project dependencies and Playwright browsers (if needed for E2E).
+- Configure the runner's apt proxy (if available), install project dependencies (requirements + test extras), and download Playwright browsers.
 - Run `pytest` (extend with `--cov` flags when enforcing coverage).
+
+> The pip cache step is temporarily disabled in `test.yml` until the self-hosted cache service is exposed (see `docs/ci-cache-troubleshooting.md`).
 `build-and-push.yml` adds:
 - Registry login using repository secrets.

docs/quickstart.md

@@ -120,6 +120,45 @@ Typical log output confirms:
 After a successful run the target database contains all application tables plus `schema_migrations`, and that table records each applied migration file. New installations only record `000_base.sql`; upgraded environments retain historical entries alongside the baseline.
+
+### Local Postgres via Docker Compose
+
+For local validation without installing Postgres directly, use the provided compose file:
+
+```powershell
+docker compose -f docker-compose.postgres.yml up -d
+```
+
+#### Summary
+
+1. Start the Postgres container with `docker compose -f docker-compose.postgres.yml up -d`.
+2. Export the granular database environment variables (host `127.0.0.1`, port `5433`, database `calminer_local`, user/password `calminer`/`secret`).
+3. Run the setup script twice: first with `--dry-run` to preview actions, then without it to apply changes.
+4. When finished, stop and optionally remove the container/volume using `docker compose -f docker-compose.postgres.yml down`.
+
+The service exposes Postgres 16 on `localhost:5433` with database `calminer_local` and role `calminer`/`secret`. When the container is running, set the granular environment variables before invoking the setup script:
+
+```powershell
+$env:DATABASE_DRIVER = 'postgresql'
+$env:DATABASE_HOST = '127.0.0.1'
+$env:DATABASE_PORT = '5433'
+$env:DATABASE_USER = 'calminer'
+$env:DATABASE_PASSWORD = 'secret'
+$env:DATABASE_NAME = 'calminer_local'
+$env:DATABASE_SCHEMA = 'public'
+python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v
+python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data -v
+```
+
+When testing is complete, shut down the container (and optional persistent volume) with:
+
+```powershell
+docker compose -f docker-compose.postgres.yml down
+docker volume rm calminer_postgres_local_postgres_data # optional cleanup
+```
+
+Document successful runs (or issues encountered) in `.github/instructions/DONE.TODO.md` for future reference.
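
As an aside, the granular variables above collapse into the single `DATABASE_URL` that SQLAlchemy reads (mentioned again in the CI notes below); a minimal sketch, assuming SQLAlchemy 1.4+:

```python
import os

from sqlalchemy.engine import URL

# Assemble the DSN from the granular variables exported above.
url = URL.create(
    drivername=os.environ["DATABASE_DRIVER"],  # 'postgresql'
    username=os.environ["DATABASE_USER"],      # 'calminer'
    password=os.environ["DATABASE_PASSWORD"],  # 'secret'
    host=os.environ["DATABASE_HOST"],          # '127.0.0.1'
    port=int(os.environ["DATABASE_PORT"]),     # 5433
    database=os.environ["DATABASE_NAME"],      # 'calminer_local'
)
print(url.render_as_string(hide_password=False))
# postgresql://calminer:secret@127.0.0.1:5433/calminer_local
```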
 ### Seeding reference data
 `scripts/seed_data.py` provides targeted control over the baseline datasets when the full setup script is not required:
@@ -154,7 +193,7 @@ The `.gitea/workflows/test.yml` job spins up a temporary PostgreSQL 16 container
 | Variable | Value | Purpose |
 | --- | --- | --- |
 | `DATABASE_DRIVER` | `postgresql` | Signals the driver to the setup script |
-| `DATABASE_HOST` | `127.0.0.1` | Points to the linked job service |
+| `DATABASE_HOST` | `postgres` | Hostname of the Postgres job service container |
 | `DATABASE_PORT` | `5432` | Default service port |
 | `DATABASE_NAME` | `calminer_ci` | Target database created by the workflow |
 | `DATABASE_USER` | `calminer` | Application role used during tests |
@@ -166,7 +205,19 @@ The `.gitea/workflows/test.yml` job spins up a temporary PostgreSQL 16 container
 The workflow also updates `DATABASE_URL` for pytest to point at the CI Postgres instance. Existing tests continue to work unchanged, since SQLAlchemy reads the URL exactly as it does locally.
-Because the workflow provisions everything inline, no repository or organization secrets need to be configured for basic CI runs. If you later move the setup step to staging or production pipelines, replace these inline values with secrets managed by the CI platform.
+Because the workflow provisions everything inline, no repository or organization secrets need to be configured for basic CI runs. If you later move the setup step to staging or production pipelines, replace these inline values with secrets managed by the CI platform. When running on self-hosted runners behind an HTTP proxy or apt cache, ensure Playwright dependencies and OS packages inherit the same proxy settings that the workflow configures prior to installing browsers.
+
+### Staging environment workflow
+
+Use the staging checklist in `docs/staging_environment_setup.md` when running the setup script against the shared environment. A sample variable file (`config/setup_staging.env`) records the expected inputs (host, port, admin/application roles); copy it outside the repository or load the values securely via your shell before executing the workflow.
+
+Recommended execution order:
+
+1. Dry run with `--dry-run -v` to confirm connectivity and review planned operations. Capture the output to `reports/setup_staging_dry_run.log` (or similar) for auditing.
+2. Execute the live run with the same flags minus `--dry-run` to provision the database, role grants, migrations, and seed data. Save the log as `reports/setup_staging_apply.log`.
+3. Repeat the dry run to verify idempotency and record the result (for example `reports/setup_staging_post_apply.log`).
+
+Record any issues in `.github/instructions/TODO.md` or `.github/instructions/DONE.TODO.md` as appropriate so the team can track follow-up actions.
 ## Database Objects

docs/staging_environment_setup.md

@@ -0,0 +1,101 @@
# Staging Environment Setup
This guide outlines how to provision and validate the CalMiner staging database using `scripts/setup_database.py`. It complements the local and CI-focused instructions in `docs/quickstart.md`.
## Prerequisites
- Network access to the staging infrastructure (VPN or bastion, as required by ops).
- Provisioned PostgreSQL instance with superuser or delegated admin credentials for maintenance.
- Application credentials (role + password) dedicated to CalMiner staging.
- The application repository checked out with Python dependencies installed (`pip install -r requirements.txt`).
- Optional but recommended: a writable directory (for example `reports/`) to capture setup logs.
> Replace the placeholder values in the examples below with the actual host, port, and credential details supplied by ops.
## Environment Configuration
Populate the following environment variables before invoking the setup script. Store them in a secure location such as `config/setup_staging.env` (excluded from source control) and load them with `dotenv` or your shell profile.
| Variable | Description |
| --- | --- |
| `DATABASE_HOST` | Staging PostgreSQL hostname or IP (for example `staging-db.internal`). |
| `DATABASE_PORT` | Port exposed by the staging PostgreSQL service (default `5432`). |
| `DATABASE_NAME` | CalMiner staging database name (for example `calminer_staging`). |
| `DATABASE_USER` | Application role used by the FastAPI app (for example `calminer_app`). |
| `DATABASE_PASSWORD` | Password for the application role. |
| `DATABASE_SCHEMA` | Optional non-public schema; omit or set to `public` otherwise. |
| `DATABASE_SUPERUSER` | Administrative role with rights to create roles/databases (for example `calminer_admin`). |
| `DATABASE_SUPERUSER_PASSWORD` | Password for the administrative role. |
| `DATABASE_SUPERUSER_DB` | Database to connect to for admin tasks (default `postgres`). |
| `DATABASE_ADMIN_URL` | Optional DSN that overrides the granular admin settings above. |
You may also set `DATABASE_URL` for application runtime convenience, but the setup script only requires the values listed in the table.
### Loading Variables (PowerShell example)
```powershell
$env:DATABASE_HOST = "staging-db.internal"
$env:DATABASE_PORT = "5432"
$env:DATABASE_NAME = "calminer_staging"
$env:DATABASE_USER = "calminer_app"
$env:DATABASE_PASSWORD = "<app-password>"
$env:DATABASE_SUPERUSER = "calminer_admin"
$env:DATABASE_SUPERUSER_PASSWORD = "<admin-password>"
$env:DATABASE_SUPERUSER_DB = "postgres"
```
For bash shells, export the same variables using `export VARIABLE=value` or load them through `dotenv`.
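A minimal sketch of the `dotenv` route, assuming the `python-dotenv` package and the sample path above; loading from a file keeps credentials out of shell history:

```python
from pathlib import Path

from dotenv import load_dotenv  # assumption: python-dotenv is installed

# Load the staging variables kept outside source control.
env_file = Path("config/setup_staging.env")
if not env_file.exists():
    raise SystemExit(f"Missing environment file: {env_file}")
load_dotenv(env_file)
```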
## Setup Workflow
Run the setup script in three phases to validate idempotency and capture diagnostics:
1. **Dry run (diagnostic):**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_dry_run.log
```
Confirm that the script reports planned actions without failures. If the application role is missing, a dry run will log skip messages until a live run creates the role.
2. **Apply changes:**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_apply.log
```
Verify the log for successful database creation, role grants, migration execution, and seed verification.
3. **Post-apply dry run:**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_post_apply.log
```
This run should confirm that all schema objects, migrations, and seed data are already in place.
## Validation Checklist
- [ ] Confirm the staging application can connect using the application DSN (for example, run `pytest tests/e2e/test_smoke.py` against staging or trigger a smoke test workflow).
- [ ] Inspect `schema_migrations` to ensure the baseline migration (`000_base.sql`) is recorded.
- [ ] Spot-check seeded reference data (`currency`, `measurement_unit`) for correctness.
- [ ] Capture and archive the three setup logs in a shared location for audit purposes.
## Troubleshooting
- If the dry run reports skipped actions because the application role does not exist, proceed with the live run; subsequent dry runs will validate as expected.
- Connection errors usually stem from network restrictions or incorrect credentials. Validate reachability with `psql` or `pg_isready` using the same host/port and credentials.
- For permission issues during migrations or seeding, confirm the admin role has rights on the target database and that the application role inherits the expected privileges.
## Rollback Guidance
- Database creation and role grants register rollback actions when not running in dry-run mode. If a later step fails, rerun the script without `--dry-run`; it will automatically revoke grants or drop newly created resources as part of the rollback routine.
- For staged environments where manual intervention is required, coordinate with ops before dropping databases or roles.
## Next Steps
- Keep this document updated as staging infrastructure evolves (for example, when migrating to managed services or rotating credentials).
- Once staging validation is complete, summarize the outcome in `.github/instructions/DONE.TODO.md` and cross-link the relevant log files.

View File

@@ -1,29 +0,0 @@
-- CalMiner Migration: add currency and unit metadata columns
-- Date: 2025-10-21
-- Purpose: align persisted schema with API changes introducing currency selection for
-- CAPEX/OPEX costs and unit selection for consumption/production records.
BEGIN;
-- CAPEX / OPEX
ALTER TABLE capex
    ADD COLUMN IF NOT EXISTS currency_code VARCHAR(3) NOT NULL DEFAULT 'USD';
ALTER TABLE opex
    ADD COLUMN IF NOT EXISTS currency_code VARCHAR(3) NOT NULL DEFAULT 'USD';
-- Consumption tracking
ALTER TABLE consumption
    ADD COLUMN IF NOT EXISTS unit_name VARCHAR(64);
ALTER TABLE consumption
    ADD COLUMN IF NOT EXISTS unit_symbol VARCHAR(16);
-- Production output
ALTER TABLE production_output
    ADD COLUMN IF NOT EXISTS unit_name VARCHAR(64);
ALTER TABLE production_output
    ADD COLUMN IF NOT EXISTS unit_symbol VARCHAR(16);
COMMIT;

View File

@@ -1,66 +0,0 @@
-- Migration: create currency referential table and convert capex/opex to FK
-- Date: 2025-10-22
BEGIN;
-- 1) Create currency table
CREATE TABLE IF NOT EXISTS currency (
    id SERIAL PRIMARY KEY,
    code VARCHAR(3) NOT NULL UNIQUE,
    name VARCHAR(128) NOT NULL,
    symbol VARCHAR(8),
    is_active BOOLEAN NOT NULL DEFAULT TRUE
);
-- 2) Seed some common currencies (idempotent)
INSERT INTO currency (code, name, symbol, is_active)
SELECT * FROM (VALUES
    ('USD','United States Dollar','$',TRUE),
    ('EUR','Euro','€',TRUE),
    ('CLP','Chilean Peso','CLP$',TRUE),
    ('RMB','Chinese Yuan','¥',TRUE),
    ('GBP','British Pound','£',TRUE),
    ('CAD','Canadian Dollar','C$',TRUE),
    ('AUD','Australian Dollar','A$',TRUE)
) AS v(code,name,symbol,is_active)
ON CONFLICT (code) DO NOTHING;
-- 3) Add currency_id columns to capex and opex with nullable true to allow backfill
ALTER TABLE capex ADD COLUMN IF NOT EXISTS currency_id INTEGER;
ALTER TABLE opex ADD COLUMN IF NOT EXISTS currency_id INTEGER;
-- 4) Backfill currency_id using existing currency_code column where present
--    Only do this if the currency_code column exists
DO $$
BEGIN
    IF EXISTS (SELECT 1 FROM information_schema.columns WHERE table_name='capex' AND column_name='currency_code') THEN
        UPDATE capex SET currency_id = (
            SELECT id FROM currency WHERE code = capex.currency_code LIMIT 1
        );
    END IF;
    IF EXISTS (SELECT 1 FROM information_schema.columns WHERE table_name='opex' AND column_name='currency_code') THEN
        UPDATE opex SET currency_id = (
            SELECT id FROM currency WHERE code = opex.currency_code LIMIT 1
        );
    END IF;
END$$;
-- 5) Make currency_id non-nullable and add FK constraint, default to USD where missing
UPDATE currency SET is_active = TRUE WHERE code = 'USD';
-- Ensure any NULL currency_id uses USD
UPDATE capex SET currency_id = (SELECT id FROM currency WHERE code='USD') WHERE currency_id IS NULL;
UPDATE opex SET currency_id = (SELECT id FROM currency WHERE code='USD') WHERE currency_id IS NULL;
ALTER TABLE capex ALTER COLUMN currency_id SET NOT NULL;
ALTER TABLE opex ALTER COLUMN currency_id SET NOT NULL;
ALTER TABLE capex ADD CONSTRAINT fk_capex_currency FOREIGN KEY (currency_id) REFERENCES currency(id);
ALTER TABLE opex ADD CONSTRAINT fk_opex_currency FOREIGN KEY (currency_id) REFERENCES currency(id);
-- 6) Optionally drop old currency_code columns if they exist
ALTER TABLE capex DROP COLUMN IF EXISTS currency_code;
ALTER TABLE opex DROP COLUMN IF EXISTS currency_code;
COMMIT;

scripts/setup_database.py

@@ -559,6 +559,26 @@ class DatabaseSetup:
             schema_name,
         )
+
+    def application_role_exists(self) -> bool:
+        try:
+            with self._admin_connection(self.config.admin_database) as conn:
+                with conn.cursor() as cursor:
+                    try:
+                        cursor.execute(
+                            "SELECT 1 FROM pg_roles WHERE rolname = %s",
+                            (self.config.user,),
+                        )
+                    except psycopg2.Error as exc:
+                        message = (
+                            "Unable to inspect existing roles while checking for role '%s'."
+                            " Verify admin permissions."
+                        ) % self.config.user
+                        logger.error(message)
+                        raise RuntimeError(message) from exc
+                    return cursor.fetchone() is not None
+        except RuntimeError:
+            raise
 
     def _admin_connection(self, database: Optional[str] = None) -> PGConnection:
         target_db = database or self.config.admin_database
         dsn = self.config.admin_dsn(database)
@@ -1101,13 +1121,26 @@ def main() -> None:
     setup = DatabaseSetup(config, dry_run=args.dry_run)
     admin_tasks_requested = args.ensure_database or args.ensure_role or args.ensure_schema
-    application_tasks_requested = args.initialize_schema or args.run_migrations
     if admin_tasks_requested:
         setup.validate_admin_connection()
     app_validated = False
+
+    def ensure_application_connection_for(operation: str) -> bool:
+        nonlocal app_validated
+        if app_validated:
+            return True
+        if setup.dry_run and not setup.application_role_exists():
+            logger.info(
+                "Dry run: skipping %s because application role '%s' does not exist yet.",
+                operation,
+                setup.config.user,
+            )
+            return False
+        setup.validate_application_connection()
+        app_validated = True
+        return True
+
     try:
         if args.ensure_database:
             setup.ensure_database()
@@ -1117,22 +1150,21 @@ def main() -> None:
             setup.ensure_schema()
         if args.initialize_schema:
-            if not app_validated and application_tasks_requested:
-                setup.validate_application_connection()
-                app_validated = True
-            setup.initialize_schema()
+            if ensure_application_connection_for(
+                "SQLAlchemy schema initialization"
+            ):
+                setup.initialize_schema()
         if args.run_migrations:
-            if not app_validated and application_tasks_requested:
-                setup.validate_application_connection()
-                app_validated = True
-            migrations_path = Path(
-                args.migrations_dir) if args.migrations_dir else None
-            setup.run_migrations(migrations_path)
+            if ensure_application_connection_for("migration execution"):
+                migrations_path = (
+                    Path(args.migrations_dir)
+                    if args.migrations_dir
+                    else None
+                )
+                setup.run_migrations(migrations_path)
         if args.seed_data:
-            if not app_validated:
-                setup.validate_application_connection()
-                app_validated = True
-            setup.seed_baseline_data(dry_run=args.dry_run)
+            if ensure_application_connection_for("baseline data seeding"):
+                setup.seed_baseline_data(dry_run=args.dry_run)
     except Exception:
         if not setup.dry_run:
             setup.execute_rollbacks()