3 Commits

11 changed files with 249 additions and 121 deletions

.gitignore

@@ -16,6 +16,9 @@ env/
 # environment variables
 .env
+*.env
+# except example files
+!config/*.env.example
 # github instruction files
 .github/instructions/

config/setup_staging.env.example

@@ -0,0 +1,11 @@
# Sample environment configuration for staging deployment
DATABASE_HOST=staging-db.internal
DATABASE_PORT=5432
DATABASE_NAME=calminer_staging
DATABASE_USER=calminer_app
DATABASE_PASSWORD=<app-password>
# Admin connection used for provisioning database and roles
DATABASE_SUPERUSER=postgres
DATABASE_SUPERUSER_PASSWORD=<admin-password>
DATABASE_SUPERUSER_DB=postgres
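
For orientation, this sample defines two credential sets: an admin connection for provisioning (which targets `DATABASE_SUPERUSER_DB`, not the application database) and the application connection used at runtime. A minimal sketch of the two DSNs these variables imply, using standard libpq URL syntax (the setup script's own helpers, such as `config.admin_dsn`, appear later in this change):

```python
import os

# Admin DSN: used for CREATE DATABASE / CREATE ROLE, so it connects to the
# maintenance database (DATABASE_SUPERUSER_DB), not the app database.
admin_dsn = (
    f"postgresql://{os.environ['DATABASE_SUPERUSER']}:"
    f"{os.environ['DATABASE_SUPERUSER_PASSWORD']}@"
    f"{os.environ['DATABASE_HOST']}:{os.environ['DATABASE_PORT']}/"
    f"{os.environ['DATABASE_SUPERUSER_DB']}"
)

# Application DSN: what the FastAPI app and tests use day to day.
app_dsn = (
    f"postgresql://{os.environ['DATABASE_USER']}:"
    f"{os.environ['DATABASE_PASSWORD']}@"
    f"{os.environ['DATABASE_HOST']}:{os.environ['DATABASE_PORT']}/"
    f"{os.environ['DATABASE_NAME']}"
)
```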

View File

@@ -1,13 +1,14 @@
 # Sample environment configuration for running scripts/setup_database.py against a test instance
 DATABASE_DRIVER=postgresql
-DATABASE_HOST=192.168.88.35
+DATABASE_HOST=postgres
 DATABASE_PORT=5432
 DATABASE_NAME=calminer_test
 DATABASE_USER=calminer_test
-DATABASE_PASSWORD=calminer_test_password
+DATABASE_PASSWORD=<test-password>
-DATABASE_SCHEMA=public
+# optional: specify schema if different from 'public'
+#DATABASE_SCHEMA=public
 # Admin connection used for provisioning database and roles
 DATABASE_SUPERUSER=postgres
-DATABASE_SUPERUSER_PASSWORD=M11ffpgm.
+DATABASE_SUPERUSER_PASSWORD=<superuser-password>
 DATABASE_SUPERUSER_DB=postgres

docker-compose.postgres.yml

@@ -0,0 +1,23 @@
version: "3.9"
services:
  postgres:
    image: postgres:16-alpine
    container_name: calminer_postgres_local
    restart: unless-stopped
    environment:
      POSTGRES_DB: calminer_local
      POSTGRES_USER: calminer
      POSTGRES_PASSWORD: secret
    ports:
      - "5433:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U calminer -d calminer_local"]
      interval: 10s
      timeout: 5s
      retries: 10
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
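
As a quick sanity check before running the setup script, you can confirm the compose service accepts connections; a minimal sketch, assuming `psycopg2-binary` is installed locally and using the credentials hard-coded in the compose file (host port `5433` because of the `"5433:5432"` mapping):

```python
import psycopg2  # assumption: psycopg2-binary installed in the local venv

conn = psycopg2.connect(
    host="127.0.0.1",
    port=5433,
    user="calminer",
    password="secret",
    dbname="calminer_local",
)
# `with conn` wraps a transaction; it does not close the connection.
with conn, conn.cursor() as cur:
    cur.execute("SELECT version()")
    print(cur.fetchone()[0])  # e.g. "PostgreSQL 16.x ..."
conn.close()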

View File

@@ -37,7 +37,7 @@ The application can be deployed in multiple environments to support development,
 The development environment is set up for local development and testing. It includes:
-- Local PostgreSQL instance
+- Local PostgreSQL instance (Docker Compose recommended; compose file at `docker-compose.postgres.yml`)
 - FastAPI server running in debug mode
 ### Testing Environment

View File

@@ -21,7 +21,7 @@ CalMiner uses a combination of unit, integration, and end-to-end tests to ensure
 ### CI/CD
 - Use Gitea Actions for CI/CD; workflows live under `.gitea/workflows/`.
-- `test.yml` runs on every push with cached Python dependencies via `actions/cache@v3`.
+- `test.yml` runs on every push, provisions a temporary Postgres 16 service, waits for readiness, executes the setup script in dry-run and live modes, installs Playwright browsers, and finally runs the full pytest suite.
 - `build-and-push.yml` builds the Docker image with `docker/build-push-action@v2`, reusing GitHub Actions cache-backed layers, and pushes to the Gitea registry.
 - `deploy.yml` connects to the target host (via `appleboy/ssh-action`) to pull the freshly pushed image and restart the container.
 - Mandatory secrets: `REGISTRY_USERNAME`, `REGISTRY_PASSWORD`, `REGISTRY_URL`, `SSH_HOST`, `SSH_USERNAME`, `SSH_PRIVATE_KEY`.
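
The "waits for readiness" step itself is not shown in this diff; one way such a wait can be implemented is a short retry loop like the sketch below (the workflow may equally use `pg_isready`; `psycopg2` on the runner and the placeholder password are assumptions, while hostname, port, and database mirror the CI table later in this change):

```python
import time

import psycopg2  # assumption: installed in the runner's virtualenv


def wait_for_postgres(dsn: str, retries: int = 30, delay: float = 2.0) -> None:
    """Retry until the Postgres service container accepts connections."""
    for attempt in range(1, retries + 1):
        try:
            psycopg2.connect(dsn).close()
            return
        except psycopg2.OperationalError:
            if attempt == retries:
                raise
            time.sleep(delay)


# <password> is a placeholder for the inline value the workflow sets.
wait_for_postgres("postgresql://calminer:<password>@postgres:5432/calminer_ci")
```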
@@ -99,10 +99,11 @@ pytest tests/e2e/ --headed
 `test.yml` encapsulates the steps below:
 - Check out the repository and set up Python 3.10.
-- Restore the pip cache (keyed by `requirements.txt`).
-- Install project dependencies and Playwright browsers (if needed for E2E).
+- Configure the runner's apt proxy (if available), install project dependencies (requirements + test extras), and download Playwright browsers.
 - Run `pytest` (extend with `--cov` flags when enforcing coverage).
+
+> The pip cache step is temporarily disabled in `test.yml` until the self-hosted cache service is exposed (see `docs/ci-cache-troubleshooting.md`).
 `build-and-push.yml` adds:
 - Registry login using repository secrets.

docs/quickstart.md

@@ -120,6 +120,45 @@ Typical log output confirms:
 After a successful run the target database contains all application tables plus `schema_migrations`, and that table records each applied migration file. New installations only record `000_base.sql`; upgraded environments retain historical entries alongside the baseline.
+
+### Local Postgres via Docker Compose
+
+For local validation without installing Postgres directly, use the provided compose file:
+
+```powershell
+docker compose -f docker-compose.postgres.yml up -d
+```
+
+#### Summary
+
+1. Start the Postgres container with `docker compose -f docker-compose.postgres.yml up -d`.
+2. Export the granular database environment variables (host `127.0.0.1`, port `5433`, database `calminer_local`, user/password `calminer`/`secret`).
+3. Run the setup script twice: first with `--dry-run` to preview actions, then without it to apply changes.
+4. When finished, stop and optionally remove the container/volume using `docker compose -f docker-compose.postgres.yml down`.
+
+The service exposes Postgres 16 on `localhost:5433` with database `calminer_local` and role `calminer`/`secret`. When the container is running, set the granular environment variables before invoking the setup script:
+
+```powershell
+$env:DATABASE_DRIVER = 'postgresql'
+$env:DATABASE_HOST = '127.0.0.1'
+$env:DATABASE_PORT = '5433'
+$env:DATABASE_USER = 'calminer'
+$env:DATABASE_PASSWORD = 'secret'
+$env:DATABASE_NAME = 'calminer_local'
+$env:DATABASE_SCHEMA = 'public'
+python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v
+python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data -v
+```
+
+When testing is complete, shut down the container (and optional persistent volume) with:
+
+```powershell
+docker compose -f docker-compose.postgres.yml down
+docker volume rm calminer_postgres_local_postgres_data # optional cleanup
+```
+
+Document successful runs (or issues encountered) in `.github/instructions/DONE.TODO.md` for future reference.
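
As an aside, the granular variables above collapse into the single `DATABASE_URL` that SQLAlchemy reads (mentioned again in the CI notes below); a minimal sketch, assuming SQLAlchemy 1.4+:

```python
import os

from sqlalchemy.engine import URL

# Assemble the DSN from the granular variables exported above.
url = URL.create(
    drivername=os.environ["DATABASE_DRIVER"],  # 'postgresql'
    username=os.environ["DATABASE_USER"],      # 'calminer'
    password=os.environ["DATABASE_PASSWORD"],  # 'secret'
    host=os.environ["DATABASE_HOST"],          # '127.0.0.1'
    port=int(os.environ["DATABASE_PORT"]),     # 5433
    database=os.environ["DATABASE_NAME"],      # 'calminer_local'
)
print(url.render_as_string(hide_password=False))
# postgresql://calminer:secret@127.0.0.1:5433/calminer_local
```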
 ### Seeding reference data
 `scripts/seed_data.py` provides targeted control over the baseline datasets when the full setup script is not required:
@@ -154,7 +193,7 @@ The `.gitea/workflows/test.yml` job spins up a temporary PostgreSQL 16 container
 | Variable | Value | Purpose |
 | --- | --- | --- |
 | `DATABASE_DRIVER` | `postgresql` | Signals the driver to the setup script |
-| `DATABASE_HOST` | `127.0.0.1` | Points to the linked job service |
+| `DATABASE_HOST` | `postgres` | Hostname of the Postgres job service container |
 | `DATABASE_PORT` | `5432` | Default service port |
 | `DATABASE_NAME` | `calminer_ci` | Target database created by the workflow |
 | `DATABASE_USER` | `calminer` | Application role used during tests |
@@ -166,7 +205,19 @@ The `.gitea/workflows/test.yml` job spins up a temporary PostgreSQL 16 container
 The workflow also updates `DATABASE_URL` for pytest to point at the CI Postgres instance. Existing tests continue to work unchanged, since SQLAlchemy reads the URL exactly as it does locally.
-Because the workflow provisions everything inline, no repository or organization secrets need to be configured for basic CI runs. If you later move the setup step to staging or production pipelines, replace these inline values with secrets managed by the CI platform.
+Because the workflow provisions everything inline, no repository or organization secrets need to be configured for basic CI runs. If you later move the setup step to staging or production pipelines, replace these inline values with secrets managed by the CI platform. When running on self-hosted runners behind an HTTP proxy or apt cache, ensure Playwright dependencies and OS packages inherit the same proxy settings that the workflow configures prior to installing browsers.
+
+### Staging environment workflow
+
+Use the staging checklist in `docs/staging_environment_setup.md` when running the setup script against the shared environment. A sample variable file (`config/setup_staging.env`) records the expected inputs (host, port, admin/application roles); copy it outside the repository or load the values securely via your shell before executing the workflow.
+
+Recommended execution order:
+
+1. Dry run with `--dry-run -v` to confirm connectivity and review planned operations. Capture the output to `reports/setup_staging_dry_run.log` (or similar) for auditing.
+2. Execute the live run with the same flags minus `--dry-run` to provision the database, role grants, migrations, and seed data. Save the log as `reports/setup_staging_apply.log`.
+3. Repeat the dry run to verify idempotency and record the result (for example `reports/setup_staging_post_apply.log`).
+
+Record any issues in `.github/instructions/TODO.md` or `.github/instructions/DONE.TODO.md` as appropriate so the team can track follow-up actions.
 ## Database Objects

docs/staging_environment_setup.md

@@ -0,0 +1,101 @@
# Staging Environment Setup
This guide outlines how to provision and validate the CalMiner staging database using `scripts/setup_database.py`. It complements the local and CI-focused instructions in `docs/quickstart.md`.
## Prerequisites
- Network access to the staging infrastructure (VPN or bastion, as required by ops).
- Provisioned PostgreSQL instance with superuser or delegated admin credentials for maintenance.
- Application credentials (role + password) dedicated to CalMiner staging.
- The application repository checked out with Python dependencies installed (`pip install -r requirements.txt`).
- Optional but recommended: a writable directory (for example `reports/`) to capture setup logs.
> Replace the placeholder values in the examples below with the actual host, port, and credential details supplied by ops.
## Environment Configuration
Populate the following environment variables before invoking the setup script. Store them in a secure location such as `config/setup_staging.env` (excluded from source control) and load them with `dotenv` or your shell profile.
| Variable | Description |
| --- | --- |
| `DATABASE_HOST` | Staging PostgreSQL hostname or IP (for example `staging-db.internal`). |
| `DATABASE_PORT` | Port exposed by the staging PostgreSQL service (default `5432`). |
| `DATABASE_NAME` | CalMiner staging database name (for example `calminer_staging`). |
| `DATABASE_USER` | Application role used by the FastAPI app (for example `calminer_app`). |
| `DATABASE_PASSWORD` | Password for the application role. |
| `DATABASE_SCHEMA` | Optional non-public schema; omit or set to `public` otherwise. |
| `DATABASE_SUPERUSER` | Administrative role with rights to create roles/databases (for example `calminer_admin`). |
| `DATABASE_SUPERUSER_PASSWORD` | Password for the administrative role. |
| `DATABASE_SUPERUSER_DB` | Database to connect to for admin tasks (default `postgres`). |
| `DATABASE_ADMIN_URL` | Optional DSN that overrides the granular admin settings above. |
You may also set `DATABASE_URL` for application runtime convenience, but the setup script only requires the values listed in the table.
### Loading Variables (PowerShell example)
```powershell
$env:DATABASE_HOST = "staging-db.internal"
$env:DATABASE_PORT = "5432"
$env:DATABASE_NAME = "calminer_staging"
$env:DATABASE_USER = "calminer_app"
$env:DATABASE_PASSWORD = "<app-password>"
$env:DATABASE_SUPERUSER = "calminer_admin"
$env:DATABASE_SUPERUSER_PASSWORD = "<admin-password>"
$env:DATABASE_SUPERUSER_DB = "postgres"
```
For bash shells, export the same variables using `export VARIABLE=value` or load them through `dotenv`.
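A minimal sketch of the `dotenv` route, assuming the `python-dotenv` package and the sample path above; loading from a file keeps credentials out of shell history:

```python
from pathlib import Path

from dotenv import load_dotenv  # assumption: python-dotenv is installed

# Load the staging variables kept outside source control.
env_file = Path("config/setup_staging.env")
if not env_file.exists():
    raise SystemExit(f"Missing environment file: {env_file}")
load_dotenv(env_file)
```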
## Setup Workflow
Run the setup script in three phases to validate idempotency and capture diagnostics:
1. **Dry run (diagnostic):**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_dry_run.log
```
Confirm that the script reports planned actions without failures. If the application role is missing, a dry run will log skip messages until a live run creates the role.
2. **Apply changes:**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_apply.log
```
Verify the log for successful database creation, role grants, migration execution, and seed verification.
3. **Post-apply dry run:**
```powershell
python scripts/setup_database.py --ensure-database --ensure-role --ensure-schema --initialize-schema --run-migrations --seed-data --dry-run -v `
2>&1 | Tee-Object -FilePath reports/setup_staging_post_apply.log
```
This run should confirm that all schema objects, migrations, and seed data are already in place.
## Validation Checklist
- [ ] Confirm the staging application can connect using the application DSN (for example, run `pytest tests/e2e/test_smoke.py` against staging or trigger a smoke test workflow).
- [ ] Inspect `schema_migrations` to ensure the baseline migration (`000_base.sql`) is recorded.
- [ ] Spot-check seeded reference data (`currency`, `measurement_unit`) for correctness.
- [ ] Capture and archive the three setup logs in a shared location for audit purposes.
## Troubleshooting
- If the dry run reports skipped actions because the application role does not exist, proceed with the live run; subsequent dry runs will validate as expected.
- Connection errors usually stem from network restrictions or incorrect credentials. Validate reachability with `psql` or `pg_isready` using the same host/port and credentials.
- For permission issues during migrations or seeding, confirm the admin role has rights on the target database and that the application role inherits the expected privileges.
## Rollback Guidance
- Database creation and role grants register rollback actions when not running in dry-run mode. If a later step fails, rerun the script without `--dry-run`; it will automatically revoke grants or drop newly created resources as part of the rollback routine.
- For staged environments where manual intervention is required, coordinate with ops before dropping databases or roles.
## Next Steps
- Keep this document updated as staging infrastructure evolves (for example, when migrating to managed services or rotating credentials).
- Once staging validation is complete, summarize the outcome in `.github/instructions/DONE.TODO.md` and cross-link the relevant log files.

View File

@@ -1,29 +0,0 @@
-- CalMiner Migration: add currency and unit metadata columns
-- Date: 2025-10-21
-- Purpose: align persisted schema with API changes introducing currency selection for
-- CAPEX/OPEX costs and unit selection for consumption/production records.
BEGIN;
-- CAPEX / OPEX
ALTER TABLE capex
    ADD COLUMN IF NOT EXISTS currency_code VARCHAR(3) NOT NULL DEFAULT 'USD';
ALTER TABLE opex
    ADD COLUMN IF NOT EXISTS currency_code VARCHAR(3) NOT NULL DEFAULT 'USD';
-- Consumption tracking
ALTER TABLE consumption
    ADD COLUMN IF NOT EXISTS unit_name VARCHAR(64);
ALTER TABLE consumption
    ADD COLUMN IF NOT EXISTS unit_symbol VARCHAR(16);
-- Production output
ALTER TABLE production_output
    ADD COLUMN IF NOT EXISTS unit_name VARCHAR(64);
ALTER TABLE production_output
    ADD COLUMN IF NOT EXISTS unit_symbol VARCHAR(16);
COMMIT;

View File

@@ -1,66 +0,0 @@
-- Migration: create currency referential table and convert capex/opex to FK
-- Date: 2025-10-22
BEGIN;
-- 1) Create currency table
CREATE TABLE IF NOT EXISTS currency (
    id SERIAL PRIMARY KEY,
    code VARCHAR(3) NOT NULL UNIQUE,
    name VARCHAR(128) NOT NULL,
    symbol VARCHAR(8),
    is_active BOOLEAN NOT NULL DEFAULT TRUE
);
-- 2) Seed some common currencies (idempotent)
INSERT INTO currency (code, name, symbol, is_active)
SELECT * FROM (VALUES
    ('USD','United States Dollar','$',TRUE),
    ('EUR','Euro','€',TRUE),
    ('CLP','Chilean Peso','CLP$',TRUE),
    ('RMB','Chinese Yuan','¥',TRUE),
    ('GBP','British Pound','£',TRUE),
    ('CAD','Canadian Dollar','C$',TRUE),
    ('AUD','Australian Dollar','A$',TRUE)
) AS v(code,name,symbol,is_active)
ON CONFLICT (code) DO NOTHING;
-- 3) Add currency_id columns to capex and opex with nullable true to allow backfill
ALTER TABLE capex ADD COLUMN IF NOT EXISTS currency_id INTEGER;
ALTER TABLE opex ADD COLUMN IF NOT EXISTS currency_id INTEGER;
-- 4) Backfill currency_id using existing currency_code column where present
--    Only do this if the currency_code column exists
DO $$
BEGIN
    IF EXISTS (SELECT 1 FROM information_schema.columns WHERE table_name='capex' AND column_name='currency_code') THEN
        UPDATE capex SET currency_id = (
            SELECT id FROM currency WHERE code = capex.currency_code LIMIT 1
        );
    END IF;
    IF EXISTS (SELECT 1 FROM information_schema.columns WHERE table_name='opex' AND column_name='currency_code') THEN
        UPDATE opex SET currency_id = (
            SELECT id FROM currency WHERE code = opex.currency_code LIMIT 1
        );
    END IF;
END$$;
-- 5) Make currency_id non-nullable and add FK constraint, default to USD where missing
UPDATE currency SET is_active = TRUE WHERE code = 'USD';
-- Ensure any NULL currency_id uses USD
UPDATE capex SET currency_id = (SELECT id FROM currency WHERE code='USD') WHERE currency_id IS NULL;
UPDATE opex SET currency_id = (SELECT id FROM currency WHERE code='USD') WHERE currency_id IS NULL;
ALTER TABLE capex ALTER COLUMN currency_id SET NOT NULL;
ALTER TABLE opex ALTER COLUMN currency_id SET NOT NULL;
ALTER TABLE capex ADD CONSTRAINT fk_capex_currency FOREIGN KEY (currency_id) REFERENCES currency(id);
ALTER TABLE opex ADD CONSTRAINT fk_opex_currency FOREIGN KEY (currency_id) REFERENCES currency(id);
-- 6) Optionally drop old currency_code columns if they exist
ALTER TABLE capex DROP COLUMN IF EXISTS currency_code;
ALTER TABLE opex DROP COLUMN IF EXISTS currency_code;
COMMIT;

scripts/setup_database.py

@@ -559,6 +559,26 @@ class DatabaseSetup:
             schema_name,
         )
+
+    def application_role_exists(self) -> bool:
+        try:
+            with self._admin_connection(self.config.admin_database) as conn:
+                with conn.cursor() as cursor:
+                    try:
+                        cursor.execute(
+                            "SELECT 1 FROM pg_roles WHERE rolname = %s",
+                            (self.config.user,),
+                        )
+                    except psycopg2.Error as exc:
+                        message = (
+                            "Unable to inspect existing roles while checking for role '%s'."
+                            " Verify admin permissions."
+                        ) % self.config.user
+                        logger.error(message)
+                        raise RuntimeError(message) from exc
+                    return cursor.fetchone() is not None
+        except RuntimeError:
+            raise
 
     def _admin_connection(self, database: Optional[str] = None) -> PGConnection:
         target_db = database or self.config.admin_database
         dsn = self.config.admin_dsn(database)
@@ -1101,13 +1121,26 @@ def main() -> None:
     setup = DatabaseSetup(config, dry_run=args.dry_run)
     admin_tasks_requested = args.ensure_database or args.ensure_role or args.ensure_schema
-    application_tasks_requested = args.initialize_schema or args.run_migrations
     if admin_tasks_requested:
         setup.validate_admin_connection()
     app_validated = False
+
+    def ensure_application_connection_for(operation: str) -> bool:
+        nonlocal app_validated
+        if app_validated:
+            return True
+        if setup.dry_run and not setup.application_role_exists():
+            logger.info(
+                "Dry run: skipping %s because application role '%s' does not exist yet.",
+                operation,
+                setup.config.user,
+            )
+            return False
+        setup.validate_application_connection()
+        app_validated = True
+        return True
+
     try:
         if args.ensure_database:
             setup.ensure_database()
@@ -1117,22 +1150,21 @@ def main() -> None:
             setup.ensure_schema()
         if args.initialize_schema:
-            if not app_validated and application_tasks_requested:
-                setup.validate_application_connection()
-                app_validated = True
-            setup.initialize_schema()
+            if ensure_application_connection_for(
+                "SQLAlchemy schema initialization"
+            ):
+                setup.initialize_schema()
         if args.run_migrations:
-            if not app_validated and application_tasks_requested:
-                setup.validate_application_connection()
-                app_validated = True
-            migrations_path = Path(
-                args.migrations_dir) if args.migrations_dir else None
-            setup.run_migrations(migrations_path)
+            if ensure_application_connection_for("migration execution"):
+                migrations_path = (
+                    Path(args.migrations_dir)
+                    if args.migrations_dir
+                    else None
+                )
+                setup.run_migrations(migrations_path)
         if args.seed_data:
-            if not app_validated:
-                setup.validate_application_connection()
-                app_validated = True
-            setup.seed_baseline_data(dry_run=args.dry_run)
+            if ensure_application_connection_for("baseline data seeding"):
+                setup.seed_baseline_data(dry_run=args.dry_run)
     except Exception:
         if not setup.dry_run:
             setup.execute_rollbacks()