# Architecture Documentation

## Overview

CalMiner is a FastAPI application that collects mining project inputs, persists scenario-specific records, and surfaces aggregated insights. The platform targets Monte Carlo-driven planning: deterministic CRUD features are in place, and the simulation logic is staged for future work.

Frontend components are server-rendered Jinja2 templates, with Chart.js powering the dashboard visualization.

The backend leverages SQLAlchemy for ORM mapping to a PostgreSQL database.

## System Components

- **FastAPI backend** (`main.py`, `routes/`): hosts REST endpoints for scenarios, parameters, costs, consumption, production, equipment, maintenance, simulations, and reporting. Each router encapsulates request/response schemas and DB access patterns, leveraging a shared dependency module (`routes/dependencies.get_db`) for SQLAlchemy session management.
- **Service layer** (`services/`): houses business logic. `services/reporting.py` produces statistical summaries, while `services/simulation.py` provides the Monte Carlo integration point.
- **Persistence** (`models/`, `config/database.py`): SQLAlchemy models map to PostgreSQL tables in schema `bricsium_platform`. Relationships connect scenarios to derived domain entities.
- **Presentation** (`templates/`, `components/`): server-rendered views extend a shared `base.html` layout with a persistent left sidebar, pull global styles from `static/css/main.css`, and surface data entry (scenario and parameter forms) alongside the Chart.js-powered dashboard.
  - **Reusable partials** (`templates/partials/components.html`): macro library that standardises select inputs, feedback/empty states, and table wrappers so pages remain consistent while keeping DOM hooks stable for existing JavaScript modules.
- **Middleware** (`middleware/validation.py`): applies JSON validation before requests reach routers.
- **Testing** (`tests/unit/`): pytest suite covering route and service behavior, including UI rendering checks and negative-path router validation tests to ensure consistent HTTP error semantics. Playwright end-to-end coverage is planned for core smoke flows (dashboard load, scenario inputs, reporting) and will run in CI once the scaffolding is complete.

## Runtime Flow

1. Users open the form templates in a browser, or call the API from a client, to manage scenarios, parameters, and operational data.
2. FastAPI routers validate payloads with Pydantic models, then delegate to SQLAlchemy sessions for persistence.
3. Simulation runs (placeholder `services/simulation.py`) will consume stored parameters to emit iteration results via `/api/simulations/run`.
4. Reporting requests POST simulation outputs to `/api/reporting/summary`; the reporting service calculates aggregates (count, min/max, mean, median, percentiles, standard deviation, variance, and tail-risk metrics at the 95% confidence level).
5. `templates/Dashboard.html` fetches summaries, renders metric cards, and plots distribution charts with Chart.js for stakeholder review.
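
Step 4 depends on the payload being an array of objects that each carry a numeric `result`. The following is a minimal shape check in plain Python, shown only to illustrate that contract. The function name `validate_results` and its error messages are hypothetical; the actual routers rely on Pydantic models and `_validate_payload`, which may behave differently.

```python
from typing import Any


def validate_results(payload: Any) -> list[float]:
    """Return the numeric results from a list of {"result": <number>} objects.

    Hypothetical helper: raises ValueError on the first malformed entry.
    """
    if not isinstance(payload, list):
        raise ValueError("payload must be a JSON array")

    values: list[float] = []
    for index, item in enumerate(payload):
        if not isinstance(item, dict) or "result" not in item:
            raise ValueError(f"item {index} must be an object with a 'result' key")
        result = item["result"]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(result, bool) or not isinstance(result, (int, float)):
            raise ValueError(f"item {index} has a non-numeric 'result'")
        values.append(float(result))
    return values
```

In a router, a `ValueError` like this would typically be translated into a 4xx response so that malformed input never reaches the reporting service.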

### Dashboard Flow Review — 2025-10-20

- The dashboard now renders at the root route (`/`) and leverages `_load_dashboard` within `routes/ui.py` to aggregate scenarios, parameters, costs, production, consumption, simulations, and maintenance data before templating.
- Client-side logic consumes a server-rendered JSON payload and can request live refreshes via `GET /ui/dashboard/data`, ensuring charts and summary cards stay synchronized with the database without a full page reload.
- Chart.js visualises cost mix (CAPEX vs OPEX) and activity throughput (production vs consumption) per scenario. When datasets are empty the UI swaps the chart canvas for contextual guidance.
- Simulation metrics draw on the aggregated reporting service; once `simulation_result` persists records, the dashboard lists recent runs with iteration counts, mean values, and percentile highlights.
- Maintenance reminders pull the next five scheduled events, providing equipment and scenario context alongside formatted costs.
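
The maintenance reminder bullet above can be sketched as a filter, sort, and slice. This is an illustration only: the `MaintenanceEvent` fields, the function name `upcoming_reminders`, and the rule that "next" means "scheduled on or after today" are assumptions, not the code in `routes/ui.py`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MaintenanceEvent:
    # Hypothetical shape; the real model's columns may differ.
    equipment_name: str
    scenario_name: str
    scheduled_on: date
    cost: float


def upcoming_reminders(
    events: list[MaintenanceEvent], today: date, limit: int = 5
) -> list[dict[str, str]]:
    """Return the next `limit` events on or after `today`, soonest first."""
    upcoming = sorted(
        (event for event in events if event.scheduled_on >= today),
        key=lambda event: event.scheduled_on,
    )
    return [
        {
            "equipment": event.equipment_name,
            "scenario": event.scenario_name,
            "date": event.scheduled_on.isoformat(),
            # Thousands separator and two decimals for display.
            "cost": f"{event.cost:,.2f}",
        }
        for event in upcoming[:limit]
    ]
```

In a database-backed version the same result would more likely come from an ordered, limited query than from in-memory sorting.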

### Reporting Pipeline and UI Integration

1. **Data Sources**

   - Scenario-linked calculations (costs, consumption, production) produce raw figures stored in dedicated tables (`capex`, `opex`, `consumption`, `production_output`).
   - Monte Carlo simulations (currently transient) generate arrays of `{ "result": float }` objects that the dashboard or downstream tooling passes directly to reporting endpoints.

2. **API Contract**

   - `POST /api/reporting/summary` accepts a JSON array of result objects and validates shape through `_validate_payload` in `routes/reporting.py`.
   - On success it returns a structured payload (`ReportSummary`) containing count, mean, median, min/max, standard deviation, and percentile values, all as floats.

3. **Service Layer**

   - `services/reporting.generate_report` converts the sanitized payload into descriptive statistics using Python’s standard library (`statistics` module) to avoid external dependencies.
   - The service remains stateless; no database read/write occurs, which keeps summary calculations deterministic and idempotent.
   - Extended KPIs (surfaced in the API and dashboard):
     - `variance`: population variance computed as the square of the population standard deviation.
     - `percentile_5` and `percentile_95`: lower and upper tail interpolated percentiles for sensitivity bounds.
     - `value_at_risk_95`: 5th percentile threshold representing the minimum outcome within a 95% confidence band.
     - `expected_shortfall_95`: mean of all outcomes at or below the `value_at_risk_95`, highlighting tail exposure.

4. **UI Consumption**

   - `templates/Dashboard.html` posts the user-provided dataset to the summary endpoint, renders metric cards for each field, and charts the distribution using Chart.js.
   - `SUMMARY_FIELDS` now includes variance, 5th/10th/90th/95th percentiles, and tail-risk metrics (VaR/Expected Shortfall at 95%); tooltip annotations surface the tail metrics alongside the percentile line chart.
   - Error handling surfaces HTTP failures inline so users can address malformed JSON or backend availability issues without leaving the page.

5. **Future Integration Points**
   - Once `/api/simulations/run` persists to `simulation_result`, the dashboard can fetch precalculated runs per scenario, removing the manual JSON step.
   - Additional reporting endpoints (e.g., scenario comparisons) can reuse the same service layer, ensuring consistency across UI and API consumers.
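
The statistics described in the service layer step can be reproduced with the standard library alone. The sketch below is not the code in `services/reporting.py`: the function names, the `std_dev`, `min`, and `max` keys, and the linear interpolation rule used for percentiles are assumptions chosen to match the descriptions above.

```python
import math
import statistics


def percentile(sorted_values: list[float], pct: float) -> float:
    """Linearly interpolated percentile of pre-sorted values (pct in 0..100)."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    rank = (len(sorted_values) - 1) * pct / 100
    lower, upper = math.floor(rank), math.ceil(rank)
    low_value, high_value = sorted_values[lower], sorted_values[upper]
    # This form never drops below low_value, even with rounding error.
    return low_value + (high_value - low_value) * (rank - lower)


def summarize(values: list[float]) -> dict[str, float]:
    """Descriptive statistics plus tail-risk metrics at the 95% level."""
    ordered = sorted(values)
    std_dev = statistics.pstdev(ordered)  # population standard deviation
    var_95 = percentile(ordered, 5)  # 5th percentile threshold
    tail = [value for value in ordered if value <= var_95]
    return {
        "count": float(len(ordered)),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "std_dev": std_dev,
        "variance": std_dev**2,
        "percentile_5": var_95,
        "percentile_95": percentile(ordered, 95),
        "value_at_risk_95": var_95,
        # Mean of all outcomes at or below the VaR threshold.
        "expected_shortfall_95": statistics.fmean(tail),
    }
```

Because the function takes only a list of floats and touches no external state, the same input always yields the same summary, which is the property the stateless service design relies on.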

## Data Model Highlights

- `scenario`: central entity describing a mining scenario; owns relationships to cost, consumption, production, equipment, and maintenance tables.
- `capex`, `opex`: monetary tracking linked to scenarios.
- `consumption`: resource usage entries parameterized by scenario and description.
- `parameter`: scenario inputs with base `value` and optional distribution linkage via `distribution_id`, `distribution_type`, and JSON `distribution_parameters` to support simulation sampling.
- `production_output`: production metrics per scenario.
- `equipment` and `maintenance`: equipment inventory and maintenance events with dates/costs.
- `simulation_result`: staging table for future Monte Carlo outputs (not yet populated by `run_simulation`).

Foreign keys enforce referential integrity between domain tables and their scenarios, enabling per-scenario analytics.
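
To make the `parameter` distribution fields concrete, here is one way a sampler could interpret `distribution_type` and `distribution_parameters`. The supported type names (`normal`, `uniform`, `triangular`) and the JSON keys (`mean`, `std_dev`, `min`, `max`, `mode`) are assumptions for illustration; the document does not specify them.

```python
import json
import random
from typing import Optional, Union


def sample_parameter(
    base_value: float,
    distribution_type: Optional[str],
    distribution_parameters: Union[str, dict, None],
    rng: random.Random,
) -> float:
    """Draw one value for a parameter, falling back to its base value."""
    if distribution_type is None:
        # No distribution linked: the parameter is deterministic.
        return float(base_value)

    # Accept either a raw JSON string or an already-decoded dict.
    if isinstance(distribution_parameters, str):
        params = json.loads(distribution_parameters)
    else:
        params = distribution_parameters or {}

    if distribution_type == "normal":
        return rng.gauss(params["mean"], params["std_dev"])
    if distribution_type == "uniform":
        return rng.uniform(params["min"], params["max"])
    if distribution_type == "triangular":
        return rng.triangular(params["min"], params["max"], params["mode"])
    raise ValueError(f"unsupported distribution_type: {distribution_type!r}")
```

Passing the random generator in explicitly keeps sampling reproducible when a seed is supplied by the caller.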

## Integrations and Future Work

- **Monte Carlo engine**: `services/simulation.py` will incorporate stochastic sampling (e.g., NumPy, SciPy) to populate `simulation_result` and feed reporting.
- **Persistence of results**: `/api/simulations/run` currently returns in-memory results; the next iteration should persist them to `simulation_result` and link each run to its scenario.
- **Authentication**: not yet implemented; all endpoints are open.
- **Deployment**: documentation focuses on local development; containerization and CI/CD pipelines remain to be defined.

For extended diagrams and setup instructions, see:

- [docs/development_setup.md](development_setup.md) — environment provisioning and tooling.
- [docs/testing.md](testing.md) — pytest workflow and coverage expectations.
- [docs/mvp.md](mvp.md) — roadmap and milestone scope.
- [docs/implementation_plan.md](implementation_plan.md) — feature breakdown aligned with the TODO tracker.
- [docs/architecture_overview.md](architecture_overview.md) — supplementary module map and request flow diagram.

### Reusable Template Components — 2025-10-21

To reduce duplication across form-centric pages, shared Jinja macros live in `templates/partials/components.html`.

- `select_field(...)`: renders labeled `<select>` controls with consistent placeholder handling and optional preselection. Existing JavaScript modules continue to target the generated IDs, so template calls must pass the same identifiers (`consumption-form-scenario`, etc.).
- `feedback(...)` and `empty_state(...)`: wrap status messages in standard classes (`feedback`, `empty-state`) with optional `hidden` toggles so scripts can control visibility without reimplementing markup.
- `table_container(...)`: provides a semantic wrapper and optional heading around tabular content; the `{% call %}` body supplies the `<thead>`, `<tbody>`, and `<tfoot>` elements while the macro applies the `table-container` class and manages hidden state.

Pages like `templates/consumption.html` and `templates/costs.html` already consume these helpers to keep markup aligned while preserving existing JavaScript selectors.

Pages should import these macros via `{% from "partials/components.html" import ... with context %}` to ensure scenario lists or other context variables stay available inside the macro body.

### Styling Audit Notes — 2025-10-21

- **Spacing**: Panels (`section.panel`) sometimes lack consistent vertical rhythm between headings, form grids, and tables. Extra top/bottom margin utilities would help align content.
- **Typography**: Headings rely on browser defaults; font-size scale is uneven between `<h2>` and `<h3>`. Define explicit scale tokens (e.g., `--font-size-lg`) for predictable sizing.
- **Forms**: `.form-grid` uses fixed column gaps that collapse on small screens; introduce responsive grid rules to stack gracefully below ~768px.
- **Tables**: `.table-container` wrappers need overflow handling for narrow viewports; consider `overflow-x: auto` with padding adjustments.
- **Feedback/Empty states**: Messages use default font weight and spacing; a utility class for margin/padding would ensure consistent separation from forms or tables.

### Styling Utilities — 2025-10-21

- Added spacing and typography CSS variables (e.g., `--space-sm`, `--font-size-xl`) and applied them to `.panel` and `.form-grid` elements for consistent vertical rhythm.
- Standardised heading weights/sizes within panels so `<h2>` and `<h3>` share explicit scale tokens.
- Updated form controls to use the new spacing tokens, preparing the layout for further responsive tweaks.

### UI Frontend-Backend Integration Requirements — 2025-10-20

#### Scenarios (`templates/ScenarioForm.html`)

- **Data**: `GET /api/scenarios/` to list existing scenarios for navigation and to hydrate dropdowns in downstream forms; optional aggregation of scenario counts for dashboard badges.
- **Actions**: `POST /api/scenarios/` to create new scenarios; future delete/update flows would reuse the same router once endpoints exist.

#### Parameters (`templates/ParameterInput.html`)

- **Data**: Scenario catalogue from `GET /api/scenarios/`; parameter inventory via `GET /api/parameters/` with client-side filtering by `scenario_id`; optional distribution catalogue from `models/distribution` when exposed.
- **Actions**: `POST /api/parameters/` to add parameters; extend UI to support editing or deleting parameters as routes arrive.

#### Costs (`templates/costs.html`)

- **Data**: CAPEX list `GET /api/costs/capex`; OPEX list `GET /api/costs/opex`; computed totals grouped by scenario for summary panels.
- **Actions**: `POST /api/costs/capex` and `POST /api/costs/opex` for new entries; planned future edits/deletes once routers expand.
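
The "computed totals grouped by scenario" could be derived from the two list responses roughly as follows. The entry keys `scenario_id` and `amount` are assumed field names, and `totals_by_scenario` is a hypothetical helper, not an existing function in the codebase.

```python
from collections import defaultdict


def totals_by_scenario(
    capex: list[dict], opex: list[dict]
) -> dict[int, dict[str, float]]:
    """Sum CAPEX and OPEX amounts per scenario and add a combined total."""
    totals: dict[int, dict[str, float]] = defaultdict(
        lambda: {"capex": 0.0, "opex": 0.0}
    )
    for entry in capex:
        totals[entry["scenario_id"]]["capex"] += entry["amount"]
    for entry in opex:
        totals[entry["scenario_id"]]["opex"] += entry["amount"]

    # Sort by scenario id so summary panels render in a stable order.
    return {
        scenario_id: {**sums, "total": sums["capex"] + sums["opex"]}
        for scenario_id, sums in sorted(totals.items())
    }
```

A scenario that has entries in only one of the two lists still appears in the result, with `0.0` for the missing side.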

#### Consumption (`templates/consumption.html`)

- **Data**: Consumption entries by scenario via `GET /api/consumption/`; scenario metadata for filtering and empty-state messaging.
- **Actions**: `POST /api/consumption/` to log consumption items; include optimistic UI refresh after persistence.

#### Production (`templates/production.html`)

- **Data**: Production records from `GET /api/production/`; scenario list for filter chips; optional aggregates (totals, averages) derived client-side.
- **Actions**: `POST /api/production/` to capture production outputs; notify users when data drives downstream reporting.

#### Equipment (`templates/equipment.html`)

- **Data**: Equipment roster from `GET /api/equipment/`; scenario context to scope inventory; maintenance counts per asset once joins are introduced.
- **Actions**: `POST /api/equipment/` to add equipment; integrate delete/edit when endpoints arrive.

#### Maintenance (`templates/maintenance.html`)

- **Data**: Maintenance schedule `GET /api/maintenance/` with pagination support (`skip`, `limit`); equipment list (`GET /api/equipment/`) to map IDs to names; scenario catalogue for filtering.
- **Actions**: CRUD operations through `POST /api/maintenance/`, `PUT /api/maintenance/{id}`, `DELETE /api/maintenance/{id}`; view detail via `GET /api/maintenance/{id}` for modal display.
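
The `skip`/`limit` pagination and the ID-to-name mapping described above amount to a slice and a dictionary lookup. The sketch below works on plain dictionaries; the keys `id`, `name`, and `equipment_id`, the default `limit` of 100, and both function names are assumptions made for illustration.

```python
def paginate(records: list[dict], skip: int = 0, limit: int = 100) -> list[dict]:
    """Return at most `limit` records after skipping the first `skip`."""
    if skip < 0 or limit < 1:
        raise ValueError("skip must be >= 0 and limit must be >= 1")
    return records[skip : skip + limit]


def with_equipment_names(records: list[dict], equipment: list[dict]) -> list[dict]:
    """Attach a display name to each maintenance record via its equipment ID."""
    names = {item["id"]: item["name"] for item in equipment}
    return [
        # Fall back to a placeholder when the equipment is not in the list.
        {**record, "equipment_name": names.get(record["equipment_id"], "Unknown")}
        for record in records
    ]
```

For example, `paginate(records, skip=20, limit=10)` would correspond to the third page of ten items.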

#### Simulations (`templates/simulations.html`)

- **Data**: Scenario list `GET /api/scenarios/`; parameter sets `GET /api/parameters/` filtered client-side; persisted simulation results (future) via dedicated endpoint or by caching `/api/simulations/run` responses.
- **Actions**: `POST /api/simulations/run` to execute simulations; surface run configuration (iterations, seed) and feed response summary into reporting page.
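
The run configuration mentioned above (iterations, seed) maps naturally onto a seeded loop that emits `{"result": float}` objects. This is a sketch under assumptions rather than the placeholder in `services/simulation.py`: the name `run_simulation_sketch` and the idea of passing the model as a callable are illustrative choices.

```python
import random
from typing import Callable, Optional


def run_simulation_sketch(
    model: Callable[[random.Random], float],
    iterations: int,
    seed: Optional[int] = None,
) -> list[dict[str, float]]:
    """Evaluate `model` once per iteration and collect result objects."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    # A dedicated generator keeps runs reproducible for a given seed.
    rng = random.Random(seed)
    return [{"result": float(model(rng))} for _ in range(iterations)]


def example_margin(rng: random.Random) -> float:
    """Toy model: uncertain revenue minus uncertain cost (made-up numbers)."""
    revenue = rng.gauss(1000.0, 100.0)
    cost = rng.uniform(600.0, 800.0)
    return revenue - cost
```

Calling `run_simulation_sketch(example_margin, iterations=1000, seed=42)` twice returns identical lists, which makes a run repeatable when the same seed is surfaced in the UI.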

#### Reporting (`templates/reporting.html` and `templates/Dashboard.html`)

- **Data**: Simulation outputs either from recent `/api/simulations/run` calls or a planned `GET` endpoint against `simulation_result`; summary metrics via `POST /api/reporting/summary`; scenario metadata for header context.
- **Actions**: Trigger summary refreshes by posting batched results; allow export/download actions once implemented; integrate future comparison requests across scenarios.