feat: Add staging environment setup guide and configuration files; update .gitignore
All checks were successful
Run Tests / test (push) Successful in 1m49s

2025-10-25 18:01:46 +02:00
parent f3ce095b71
commit e74ec79cc9
7 changed files with 181 additions and 21 deletions

.gitignore vendored

@@ -16,6 +16,9 @@ env/
 # environment variables
 .env
+*.env
+# except example files
+!config/*.env.example
 # github instruction files
 .github/instructions/


@@ -0,0 +1,11 @@
# Sample environment configuration for staging deployment
DATABASE_HOST=staging-db.internal
DATABASE_PORT=5432
DATABASE_NAME=calminer_staging
DATABASE_USER=calminer_app
DATABASE_PASSWORD=<app-password>
# Admin connection used for provisioning database and roles
DATABASE_SUPERUSER=postgres
DATABASE_SUPERUSER_PASSWORD=<admin-password>
DATABASE_SUPERUSER_DB=postgres


@@ -1,13 +1,14 @@
 # Sample environment configuration for running scripts/setup_database.py against a test instance
 DATABASE_DRIVER=postgresql
-DATABASE_HOST=192.168.88.35
+DATABASE_HOST=postgres
 DATABASE_PORT=5432
 DATABASE_NAME=calminer_test
 DATABASE_USER=calminer_test
-DATABASE_PASSWORD=calminer_test_password
+DATABASE_PASSWORD=<test-password>
-DATABASE_SCHEMA=public
+# optional: specify schema if different from 'public'
+#DATABASE_SCHEMA=public
 # Admin connection used for provisioning database and roles
 DATABASE_SUPERUSER=postgres
-DATABASE_SUPERUSER_PASSWORD=M11ffpgm.
+DATABASE_SUPERUSER_PASSWORD=<superuser-password>
 DATABASE_SUPERUSER_DB=postgres


@@ -37,7 +37,7 @@ The application can be deployed in multiple environments to support development,
 The development environment is set up for local development and testing. It includes:
-- Local PostgreSQL instance
+- Local PostgreSQL instance (docker compose recommended, script available at `docker-compose.postgres.yml`)
 - FastAPI server running in debug mode
 ### Testing Environment


@@ -207,6 +207,18 @@ The workflow also updates `DATABASE_URL` for pytest to point at the CI Postgres
 Because the workflow provisions everything inline, no repository or organization secrets need to be configured for basic CI runs. If you later move the setup step to staging or production pipelines, replace these inline values with secrets managed by the CI platform. When running on self-hosted runners behind an HTTP proxy or apt cache, ensure Playwright dependencies and OS packages inherit the same proxy settings that the workflow configures prior to installing browsers.
+### Staging environment workflow
+Use the staging checklist in `docs/staging_environment_setup.md` when running the setup script against the shared environment. A sample variable file (`config/setup_staging.env`) records the expected inputs (host, port, admin/application roles); copy it outside the repository or load the values securely via your shell before executing the workflow.
+Recommended execution order:
+1. Dry run with `--dry-run -v` to confirm connectivity and review planned operations. Capture the output to `reports/setup_staging_dry_run.log` (or similar) for auditing.
+2. Execute the live run with the same flags minus `--dry-run` to provision the database, role grants, migrations, and seed data. Save the log as `reports/setup_staging_apply.log`.
+3. Repeat the dry run to verify idempotency and record the result (for example `reports/setup_staging_post_apply.log`).
+Record any issues in `.github/instructions/TODO.md` or `.github/instructions/DONE.TODO.md` as appropriate so the team can track follow-up actions.
 ## Database Objects
 The database contains tables such as `capex`, `opex`, `chemical_consumption`, `fuel_consumption`, `water_consumption`, `scrap_consumption`, `production_output`, `equipment_operation`, `ore_batch`, `exchange_rate`, and `simulation_result`.


@@ -0,0 +1,101 @@
# Staging Environment Setup
This guide outlines how to provision and validate the CalMiner staging database using `scripts/setup_database.py`. It complements the local and CI-focused instructions in `docs/quickstart.md`.
## Prerequisites
- Network access to the staging infrastructure (VPN or bastion, as required by ops).
- Provisioned PostgreSQL instance with superuser or delegated admin credentials for maintenance.
- Application credentials (role + password) dedicated to CalMiner staging.
- The application repository checked out with Python dependencies installed (`pip install -r requirements.txt`).
- Optional but recommended: a writable directory (for example `reports/`) to capture setup logs.
> Replace the placeholder values in the examples below with the actual host, port, and credential details supplied by ops.
## Environment Configuration
Populate the following environment variables before invoking the setup script. Store them in a secure location such as `config/setup_staging.env` (excluded from source control) and load them with `dotenv` or your shell profile.
| Variable | Description |
| --- | --- |
| `DATABASE_HOST` | Staging PostgreSQL hostname or IP (for example `staging-db.internal`). |
| `DATABASE_PORT` | Port exposed by the staging PostgreSQL service (default `5432`). |
| `DATABASE_NAME` | CalMiner staging database name (for example `calminer_staging`). |
| `DATABASE_USER` | Application role used by the FastAPI app (for example `calminer_app`). |
| `DATABASE_PASSWORD` | Password for the application role. |
| `DATABASE_SCHEMA` | Optional non-public schema; omit or set to `public` otherwise. |
| `DATABASE_SUPERUSER` | Administrative role with rights to create roles/databases (for example `calminer_admin`). |
| `DATABASE_SUPERUSER_PASSWORD` | Password for the administrative role. |
| `DATABASE_SUPERUSER_DB` | Database to connect to for admin tasks (default `postgres`). |
| `DATABASE_ADMIN_URL` | Optional DSN that overrides the granular admin settings above. |
You may also set `DATABASE_URL` for application runtime convenience, but the setup script only requires the values listed in the table.
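When `DATABASE_ADMIN_URL` is set it overrides the granular admin settings. As a rough illustration of how the granular values map onto a libpq-style DSN, here is a hypothetical sketch (the function name `admin_dsn` and its dict argument are illustrative, not the script's actual API; the real logic lives in `scripts/setup_database.py` and may differ):

```python
# Hypothetical sketch: how the granular admin variables could combine into a
# DSN when DATABASE_ADMIN_URL is absent. Not the script's actual code.
def admin_dsn(env: dict) -> str:
    explicit = env.get("DATABASE_ADMIN_URL")
    if explicit:
        return explicit  # the explicit DSN wins over the granular settings
    return "postgresql://{user}:{password}@{host}:{port}/{db}".format(
        user=env["DATABASE_SUPERUSER"],
        password=env["DATABASE_SUPERUSER_PASSWORD"],
        host=env["DATABASE_HOST"],
        port=env.get("DATABASE_PORT", "5432"),
        db=env.get("DATABASE_SUPERUSER_DB", "postgres"),
    )

print(admin_dsn({
    "DATABASE_SUPERUSER": "calminer_admin",
    "DATABASE_SUPERUSER_PASSWORD": "<admin-password>",
    "DATABASE_HOST": "staging-db.internal",
}))
# prints postgresql://calminer_admin:<admin-password>@staging-db.internal:5432/postgres
```
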
### Loading Variables (PowerShell example)
```powershell
$env:DATABASE_HOST = "staging-db.internal"
$env:DATABASE_PORT = "5432"
$env:DATABASE_NAME = "calminer_staging"
$env:DATABASE_USER = "calminer_app"
$env:DATABASE_PASSWORD = "<app-password>"
$env:DATABASE_SUPERUSER = "calminer_admin"
$env:DATABASE_SUPERUSER_PASSWORD = "<admin-password>"
$env:DATABASE_SUPERUSER_DB = "postgres"
```
For bash shells, export the same variables using `export VARIABLE=value` or load them through `dotenv`.
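A bash equivalent of the PowerShell snippet above, with the same placeholder values:

```shell
# Export the staging variables for the current bash session.
# Replace the <...> placeholders with the credentials supplied by ops.
export DATABASE_HOST="staging-db.internal"
export DATABASE_PORT="5432"
export DATABASE_NAME="calminer_staging"
export DATABASE_USER="calminer_app"
export DATABASE_PASSWORD="<app-password>"
export DATABASE_SUPERUSER="calminer_admin"
export DATABASE_SUPERUSER_PASSWORD="<admin-password>"
export DATABASE_SUPERUSER_DB="postgres"
```

If the values already live in `config/setup_staging.env`, `set -a; source config/setup_staging.env; set +a` exports them in one step.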
## Setup Workflow
Run the setup script in three phases to validate idempotency and capture diagnostics:
1. **Dry run (diagnostic):**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_dry_run.log
```
Confirm that the script reports planned actions without failures. If the application role is missing, a dry run will log skip messages until a live run creates the role.
2. **Apply changes:**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_apply.log
```
Verify the log for successful database creation, role grants, migration execution, and seed verification.
3. **Post-apply dry run:**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_post_apply.log
```
This run should confirm that all schema objects, migrations, and seed data are already in place.
## Validation Checklist
- [ ] Confirm the staging application can connect using the application DSN (for example, run `pytest tests/e2e/test_smoke.py` against staging or trigger a smoke test workflow).
- [ ] Inspect `schema_migrations` to ensure the baseline migration (`000_base.sql`) is recorded.
- [ ] Spot-check seeded reference data (`currency`, `measurement_unit`) for correctness.
- [ ] Capture and archive the three setup logs in a shared location for audit purposes.
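The migration and seed spot-checks above can be run through `psql` using the admin credentials. A sketch of the queries (the exact columns of `schema_migrations` are an assumption; adjust to the actual table layout):

```sql
-- Confirm the baseline migration (000_base.sql) is recorded; column layout assumed.
SELECT * FROM schema_migrations;

-- Spot-check seeded reference data.
SELECT COUNT(*) FROM currency;
SELECT COUNT(*) FROM measurement_unit;
```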
## Troubleshooting
- If the dry run reports skipped actions because the application role does not exist, proceed with the live run; subsequent dry runs will validate as expected.
- Connection errors usually stem from network restrictions or incorrect credentials. Validate reachability with `psql` or `pg_isready` using the same host/port and credentials.
- For permission issues during migrations or seeding, confirm the admin role has rights on the target database and that the application role inherits the expected privileges.
## Rollback Guidance
- Database creation and role grants register rollback actions when not running in dry-run mode. If a later step fails during a live run, the script automatically revokes grants and drops newly created resources as part of its rollback routine.
- For staged environments where manual intervention is required, coordinate with ops before dropping databases or roles.
## Next Steps
- Keep this document updated as staging infrastructure evolves (for example, when migrating to managed services or rotating credentials).
- Once staging validation is complete, summarize the outcome in `.github/instructions/DONE.TODO.md` and cross-link the relevant log files.


@@ -559,6 +559,26 @@ class DatabaseSetup:
             schema_name,
         )
+    def application_role_exists(self) -> bool:
+        try:
+            with self._admin_connection(self.config.admin_database) as conn:
+                with conn.cursor() as cursor:
+                    try:
+                        cursor.execute(
+                            "SELECT 1 FROM pg_roles WHERE rolname = %s",
+                            (self.config.user,),
+                        )
+                    except psycopg2.Error as exc:
+                        message = (
+                            "Unable to inspect existing roles while checking for role '%s'."
+                            " Verify admin permissions."
+                        ) % self.config.user
+                        logger.error(message)
+                        raise RuntimeError(message) from exc
+                    return cursor.fetchone() is not None
+        except RuntimeError:
+            raise
     def _admin_connection(self, database: Optional[str] = None) -> PGConnection:
         target_db = database or self.config.admin_database
         dsn = self.config.admin_dsn(database)
@@ -1101,13 +1121,26 @@ def main() -> None:
     setup = DatabaseSetup(config, dry_run=args.dry_run)
     admin_tasks_requested = args.ensure_database or args.ensure_role or args.ensure_schema
-    application_tasks_requested = args.initialize_schema or args.run_migrations
     if admin_tasks_requested:
         setup.validate_admin_connection()
     app_validated = False
+    def ensure_application_connection_for(operation: str) -> bool:
+        nonlocal app_validated
+        if app_validated:
+            return True
+        if setup.dry_run and not setup.application_role_exists():
+            logger.info(
+                "Dry run: skipping %s because application role '%s' does not exist yet.",
+                operation,
+                setup.config.user,
+            )
+            return False
+        setup.validate_application_connection()
+        app_validated = True
+        return True
     try:
         if args.ensure_database:
             setup.ensure_database()
@@ -1117,21 +1150,20 @@ def main() -> None:
             setup.ensure_schema()
         if args.initialize_schema:
-            if not app_validated and application_tasks_requested:
-                setup.validate_application_connection()
-                app_validated = True
-            setup.initialize_schema()
+            if ensure_application_connection_for(
+                "SQLAlchemy schema initialization"
+            ):
+                setup.initialize_schema()
         if args.run_migrations:
-            if not app_validated and application_tasks_requested:
-                setup.validate_application_connection()
-                app_validated = True
-            migrations_path = Path(
-                args.migrations_dir) if args.migrations_dir else None
-            setup.run_migrations(migrations_path)
+            if ensure_application_connection_for("migration execution"):
+                migrations_path = (
+                    Path(args.migrations_dir)
+                    if args.migrations_dir
+                    else None
+                )
+                setup.run_migrations(migrations_path)
         if args.seed_data:
-            if not app_validated:
-                setup.validate_application_connection()
-                app_validated = True
-            setup.seed_baseline_data(dry_run=args.dry_run)
+            if ensure_application_connection_for("baseline data seeding"):
+                setup.seed_baseline_data(dry_run=args.dry_run)
     except Exception:
         if not setup.dry_run: