Add comprehensive architecture documentation and related scripts

- Introduced multiple architecture documentation files covering building block view, runtime view, deployment view, concepts, architecture decisions, quality requirements, technical risks, glossary, UI and styling, testing, CI, and development setup.
- Migrated existing content from `architecture_overview.md` and `implementation_plan.md` into structured documentation.
- Created scripts for checking broken links in documentation and formatting Markdown files for consistency.
- Updated quickstart guide to provide clearer setup instructions and usage overview.
- Removed outdated MVP features and testing strategy documents to streamline documentation.
2025-10-21 15:39:17 +02:00
parent 2bae273d9e
commit 4b3a15ed15
26 changed files with 796 additions and 675 deletions

README.md

@@ -24,169 +24,10 @@ A range of features are implemented to support these functionalities.
- **Modular Frontend Scripts**: Page-specific interactions now live in `static/js/` modules, keeping templates lean while enabling browser caching and reuse.
- **Monte Carlo Simulation (in progress)**: Services and routes are scaffolded for future stochastic analysis.
## Architecture
## Documentation & quickstart
The architecture is documented in [docs/architecture.md](docs/architecture.md).
This repository contains detailed developer and architecture documentation in the `docs/` folder. For a short quickstart, troubleshooting notes, migration/backfill instructions and the current implementation status, see `docs/quickstart.md`.
## Project Structure
Key architecture documents: see `docs/architecture/README.md` for the arc42-based architecture documentation.
The project is organized into several key directories:
- `models/`: Contains SQLAlchemy models representing database tables.
- `routes/`: Defines FastAPI routes for API endpoints; shared dependencies like `get_db` live in `routes/dependencies.py`.
- `services/`: Business logic and service layer.
- `components/`: Frontend components (to be defined).
- `config/`: Configuration files and settings.
- `middleware/`: Custom middleware for request/response processing.
- `tests/`: Unit and integration tests.
- `templates/`: Jinja2 HTML templates for server-side rendering.
- `docs/`: Documentation files.
Key files include:
- `main.py`: FastAPI application entry point.
- `.env`: Environment variables for configuration.
- `requirements.txt`: Python dependencies.
## Development
The development setup instructions are provided in [docs/development_setup.md](docs/development_setup.md).
To get started locally:
```powershell
# Clone the repository
git clone https://git.allucanget.biz/allucanget/calminer.git
cd calminer
# Create and activate a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
# Install dependencies
pip install -r requirements.txt
# Start the development server
uvicorn main:app --reload
```
## Usage Overview
- **API base URL**: `http://localhost:8000/api`
- **Key routes**:
- `POST /api/scenarios/` create scenarios
- `POST /api/parameters/` manage process parameters; payload supports optional `distribution_id` or inline `distribution_type`/`distribution_parameters` fields for simulation metadata
- `POST /api/costs/capex` and `POST /api/costs/opex` capture project costs
- `POST /api/consumption/` add consumption entries
- `POST /api/production/` register production output
- `POST /api/equipment/` create equipment records
- `POST /api/maintenance/` log maintenance events
- `POST /api/reporting/summary` aggregate simulation results, returning count, mean/median, min/max, standard deviation, variance, percentile bands (5/10/90/95), value-at-risk (95%) and expected shortfall (95%); see the example after this list
- **UI entries** (rendered via FastAPI templates, also reachable from the sidebar):
- `GET /` operations overview dashboard
- `GET /ui/dashboard` legacy dashboard alias
- `GET /ui/scenarios` scenario creation form
- `GET /ui/parameters` parameter input form
- `GET /ui/costs`, `/ui/consumption`, `/ui/production`, `/ui/equipment`, `/ui/maintenance`, `/ui/simulations`, `/ui/reporting` placeholder views aligned with future integrations
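As a quick sanity check of the reporting endpoint, the snippet below posts a handful of hypothetical results and prints part of the summary. It is a minimal sketch using only the standard library; the sample values and the running server at `http://localhost:8000` are assumptions from the setup above.

```python
import json
import urllib.request

# Hypothetical simulation outputs; the endpoint expects a JSON array
# of {"result": float} objects.
payload = json.dumps([{"result": v} for v in (102.5, 98.1, 110.4, 95.0)]).encode()

request = urllib.request.Request(
    "http://localhost:8000/api/reporting/summary",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

with urllib.request.urlopen(request) as response:
    summary = json.loads(response.read())

# Fields documented above: count, mean/median, min/max, percentiles, VaR/ES.
print(summary["mean"], summary["value_at_risk_95"])
```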
### Dashboard Preview
1. Start the FastAPI server and navigate to `/`.
2. Review the headline metrics, scenario snapshot table, and cost/activity charts sourced from the current database state.
3. Use the "Refresh Dashboard" button to pull freshly aggregated data via `/ui/dashboard/data` without reloading the page.
4. Populate scenarios, costs, production, consumption, simulations, and maintenance records to see charts and lists update.
5. The legacy `/ui/dashboard` route remains available but now serves the same consolidated overview.
## Testing
Testing guidelines and best practices are outlined in [docs/testing.md](docs/testing.md).
To execute the unit test suite:
```powershell
pytest
```
### End-to-End Tests
- Playwright-based E2E tests rely on a session-scoped `live_server` fixture that auto-starts the FastAPI app on `http://localhost:8001`, so no per-test `@pytest.mark.usefixtures("live_server")` annotations are required.
- The fixture now polls `http://localhost:8001` until it responds (up to ~30s), ensuring the uvicorn subprocess is ready before Playwright starts navigating; it then preloads `/` and waits for a `networkidle` state so sidebar navigation and global assets are ready for each test.
- Latest run (`pytest tests/e2e/` on 2025-10-21) passes end-to-end smoke and form coverage after aligning form selectors, titles, and the live server startup behaviour.
### Coverage Snapshot (2025-10-21)
- `pytest --cov=. --cov-report=term-missing` reports **91%** overall coverage.
- Recent additions pushed `routes/ui.py` and `services/simulation.py` to 100%; remaining gaps are concentrated in `config/database.py`, several `models/*.py` loaders, and `services/reporting.py` (95%).
- Playwright specs under `tests/e2e/` are excluded from the coverage run to keep browser automation optional; their files show as uncovered because they are not executed in the `pytest --cov` workflow.
## Database Objects
The database is composed of several tables that store different types of information.
- **CAPEX** — `capex`: Stores data on capital expenditures.
- **OPEX** — `opex`: Contains information on operational expenditures.
- **Chemical consumption** — `chemical_consumption`: Tracks the consumption of chemical reagents.
- **Fuel consumption** — `fuel_consumption`: Records the amount of fuel consumed.
- **Water consumption** — `water_consumption`: Monitors the use of water.
- **Scrap consumption** — `scrap_consumption`: Tracks the consumption of scrap materials.
- **Production output** — `production_output`: Stores data on production output, such as tons produced and recovery rates.
- **Equipment operation** — `equipment_operation`: Contains operational data for each piece of equipment.
- **Ore batch** — `ore_batch`: Stores information on ore batches, including their grade and other characteristics.
- **Exchange rate** — `exchange_rate`: Contains currency exchange rates.
- **Simulation result** — `simulation_result`: Stores the results of the Monte Carlo simulations.
### Currency normalization and migrations
The project now includes a referential `currency` table and associated migration and backfill tooling to normalize free-text currency fields into a canonical lookup.
- New model: `models/currency.py` implements the `currency` table (id, code, name, symbol, is_active).
- Database migration: `scripts/migrations/20251022_create_currency_table_and_fks.sql` creates the `currency` table, seeds common currencies, adds `currency_id` to `capex`/`opex`, backfills where possible, and adds foreign key constraints. The migration is written to be idempotent and safe on databases where parts of the work were already applied.
- Backfill utility: `scripts/backfill_currency.py` is an idempotent, developer-friendly script to preview and apply backfill updates. It supports `--dry-run` and `--create-missing` flags.
- API & models: `models/capex.py` and `models/opex.py` now reference `currency_id` and include a compatibility property accepting a legacy `currency_code` during creates/updates. Routes accept either `currency_id` or `currency_code`; see `routes/costs.py` and the new `routes/currencies.py` (GET /api/currencies/). A sketch of this resolution logic appears after this list.
- Frontend: server templates and `static/js` now fetch `/api/currencies/` to populate currency selects for cost forms so users can pick a normalized currency instance.
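A minimal sketch of how the dual `currency_id`/`currency_code` input can be resolved server-side, assuming the `Currency` class name for `models/currency.py` and the `id`/`code` columns listed above; the actual logic in `routes/costs.py` and the model compatibility property may differ:

```python
from sqlalchemy import select
from sqlalchemy.orm import Session

from models.currency import Currency  # assumed class name for models/currency.py

def resolve_currency_id(
    db: Session,
    currency_id: int | None = None,
    currency_code: str | None = None,
) -> int | None:
    """Prefer an explicit id; otherwise look the legacy code up in the lookup table."""
    if currency_id is not None:
        return currency_id
    if currency_code is None:
        return None
    row = db.execute(
        select(Currency).where(Currency.code == currency_code.upper())
    ).scalar_one_or_none()
    if row is None:
        raise ValueError(f"unknown currency code: {currency_code}")
    return row.id
```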
#### How to run migrations and backfill (development)
1. Ensure `DATABASE_URL` is set in your PowerShell session to point at a development Postgres instance.
```powershell
$env:DATABASE_URL = 'postgresql://user:pass@host/db'
python scripts/run_migrations.py
python scripts/backfill_currency.py --dry-run
python scripts/backfill_currency.py --create-missing
```
Use `--dry-run` first to verify what will change. The migration and backfill scripts are designed to be safe and idempotent, but always back up production databases before applying schema changes.
## Static Parameters
These are values that are not expected to change frequently and are used for configuration purposes. Some examples include:
- **Currencies**: `currency_code`, `currency_name`.
- **Distribution types**: `distribution_name`.
- **Units**: `unit_name`, `unit_symbol`, `unit_system`, `conversion_to_base`.
- **Parameter categories**: `category_name`.
- **Material types**: `type_name`, `category`.
- **Chemical reagents**: `reagent_name`, `chemical_formula`.
- **Fuel**: `fuel_name`.
- **Water**: `water_type`.
- **Scrap material**: `scrap_name`.
## Variables
These are dynamic data points that are recorded over time and used in calculations and simulations. Some examples include:
- **CAPEX**: `amount`.
- **OPEX**: `amount`.
- **Chemical consumption**: `quantity`, `efficiency`, `waste_factor`.
- **Fuel consumption**: `quantity`.
- **Water consumption**: `quantity`.
- **Scrap consumption**: `quantity`.
- **Production output**: `tons_produced`, `recovery_rate`, `metal_content`, `metallurgical_loss`, `net_revenue`.
- **Equipment operation**: `hours_operated`, `downtime_hours`.
- **Ore batch**: `ore_grade`, `moisture`, `sulfur`, `chlorine`.
- **Exchange rate**: `rate`.
- **Parameter values**: `value`.
- **Simulation result**: NPV (`npv`), IRR (`irr`), EBITDA (`ebitda`), `net_revenue`.
- **Cementation parameters**: `temperature`, pH (`ph`), `reaction_time`, `copper_concentration`, `iron_surface_area`.
- **Precipitate product**: `density`, `melting_point`, `boiling_point`.
For contributors: the `routes/`, `models/` and `services/` folders contain the primary application code. Tests and E2E specs are in `tests/`.


@@ -1,167 +1,41 @@
# Architecture Documentation
## Overview
This folder contains the project's architecture documents split into arc42-inspired chapters (Markdown).
CalMiner is a FastAPI application that collects mining project inputs, persists scenario-specific records, and surfaces aggregated insights. The platform targets Monte Carlo driven planning, with deterministic CRUD features in place and simulation logic staged for future work.
Start here (per-chapter files live under `docs/architecture/`):
Frontend components are server-rendered Jinja2 templates, with Chart.js powering the dashboard visualization.
- `docs/architecture/README.md` — mapping and next steps (overview of the chapter layout)
- `docs/architecture/01_introduction_and_goals.md` — overview and runtime flow
- `docs/architecture/02_architecture_constraints.md` — constraints
- `docs/architecture/03_context_and_scope.md` — context and external actors
- `docs/architecture/04_solution_strategy.md` — solution strategy and simulation roadmap
- `docs/architecture/04_solution_strategy_extended.md` — implementation plan & MVP roadmap
- `docs/architecture/05_building_block_view.md` — system components and static structure
- `docs/architecture/06_runtime_view.md` — reporting pipeline and runtime interactions
- `docs/architecture/07_deployment_view.md` — deployment and infrastructure notes
- `docs/architecture/08_concepts.md` — domain concepts and data model
- `docs/architecture/09_architecture_decisions.md` — architecture decision records
- `docs/architecture/10_quality_requirements.md` — quality targets and checks
- `docs/architecture/11_technical_risks.md` — technical risks and mitigations
- `docs/architecture/12_glossary.md` — glossary
- `docs/architecture/13_ui_and_style.md` — UI templates, macros and style guidance
- `docs/architecture/14_testing_ci.md` — testing strategy and CI guidance
- `docs/architecture/15_development_setup.md` — local development setup
The backend leverages SQLAlchemy for ORM mapping to a PostgreSQL database.
Overview (migrated content):
## System Components
This repository includes an architecture overview that complements the chapter files above and maps high-level module layout and request flow into the per-chapter documents.
- **FastAPI backend** (`main.py`, `routes/`): hosts REST endpoints for scenarios, parameters, costs, consumption, production, equipment, maintenance, simulations, and reporting. Each router encapsulates request/response schemas and DB access patterns, leveraging a shared dependency module (`routes/dependencies.get_db`) for SQLAlchemy session management.
- **Service layer** (`services/`): houses business logic. `services/reporting.py` produces statistical summaries, while `services/simulation.py` provides the Monte Carlo integration point.
- **Persistence** (`models/`, `config/database.py`): SQLAlchemy models map to PostgreSQL tables. Relationships connect scenarios to derived domain entities.
- **Presentation** (`templates/`, `components/`): server-rendered views extend a shared `base.html` layout with a persistent left sidebar, pull global styles from `static/css/main.css`, and surface data entry (scenario and parameter forms) alongside the Chart.js-powered dashboard.
- **Reusable partials** (`templates/partials/components.html`): macro library that standardises select inputs, feedback/empty states, and table wrappers so pages remain consistent while keeping DOM hooks stable for existing JavaScript modules.
- **Middleware** (`middleware/validation.py`): applies JSON validation before requests reach routers.
- **Testing** (`tests/unit/`): pytest suite covering route and service behavior, including UI rendering checks and negative-path router validation tests to ensure consistent HTTP error semantics. Playwright end-to-end coverage is planned for core smoke flows (dashboard load, scenario inputs, reporting) and will attach in CI once scaffolding is completed.
Key pointers:
## Runtime Flow
- Module map & components: `docs/architecture/05_building_block_view.md`
- Request flow & runtime interactions: `docs/architecture/06_runtime_view.md`
- Simulation roadmap & strategy: `docs/architecture/04_solution_strategy.md`
1. Users navigate to form templates or API clients to manage scenarios, parameters, and operational data.
2. FastAPI routers validate payloads with Pydantic models, then delegate to SQLAlchemy sessions for persistence.
3. Simulation runs (placeholder `services/simulation.py`) will consume stored parameters to emit iteration results via `/api/simulations/run`.
4. Reporting requests POST simulation outputs to `/api/reporting/summary`; the reporting service calculates aggregates (count, min/max, mean, median, percentiles, standard deviation, variance, and tail-risk metrics at the 95% confidence level).
5. `templates/Dashboard.html` fetches summaries, renders metric cards, and plots distribution charts with Chart.js for stakeholder review.
Developer quickstart, migrations, testing and current implementation status remain in `docs/quickstart.md` (canonical quickstart). If you prefer, the quickstart can be split into specific chapters, but currently it serves as a single onboarding document.
### Dashboard Flow Review — 2025-10-20
To continue expanding the architecture docs:
- The dashboard now renders at the root route (`/`) and leverages `_load_dashboard` within `routes/ui.py` to aggregate scenarios, parameters, costs, production, consumption, simulations, and maintenance data before templating.
- Client-side logic consumes a server-rendered JSON payload and can request live refreshes via `GET /ui/dashboard/data`, ensuring charts and summary cards stay synchronized with the database without a full page reload.
- Chart.js visualises cost mix (CAPEX vs OPEX) and activity throughput (production vs consumption) per scenario. When datasets are empty the UI swaps the chart canvas for contextual guidance.
- Simulation metrics draw on the aggregated reporting service; once `simulation_result` persists records, the dashboard lists recent runs with iteration counts, mean values, and percentile highlights.
- Maintenance reminders pull the next five scheduled events, providing equipment and scenario context alongside formatted costs.
### Reporting Pipeline and UI Integration
1. **Data Sources**
- Scenario-linked calculations (costs, consumption, production) produce raw figures stored in dedicated tables (`capex`, `opex`, `consumption`, `production_output`).
- Monte Carlo simulations (currently transient) generate arrays of `{ "result": float }` objects that the dashboard or downstream tooling passes directly to reporting endpoints.
2. **API Contract**
- `POST /api/reporting/summary` accepts a JSON array of result objects and validates shape through `_validate_payload` in `routes/reporting.py`.
- On success it returns a structured payload (`ReportSummary`) containing count, mean, median, min/max, standard deviation, and percentile values, all as floats.
3. **Service Layer**
- `services/reporting.generate_report` converts the sanitized payload into descriptive statistics using Python's standard library (`statistics` module) to avoid external dependencies.
- The service remains stateless; no database read/write occurs, which keeps summary calculations deterministic and idempotent.
- Extended KPIs (surfaced in the API and dashboard):
- `variance`: population variance computed as the square of the population standard deviation.
- `percentile_5` and `percentile_95`: lower and upper tail interpolated percentiles for sensitivity bounds.
- `value_at_risk_95`: the 5th-percentile threshold, i.e., the level that outcomes fall below only 5% of the time (a 95% confidence floor).
- `expected_shortfall_95`: mean of all outcomes at or below the `value_at_risk_95`, highlighting tail exposure.
4. **UI Consumption**
- `templates/Dashboard.html` posts the user-provided dataset to the summary endpoint, renders metric cards for each field, and charts the distribution using Chart.js.
- `SUMMARY_FIELDS` now includes variance, 5th/10th/90th/95th percentiles, and tail-risk metrics (VaR/Expected Shortfall at 95%); tooltip annotations surface the tail metrics alongside the percentile line chart.
- Error handling surfaces HTTP failures inline so users can address malformed JSON or backend availability issues without leaving the page.
5. **Future Integration Points**
- Once `/api/simulations/run` persists to `simulation_result`, the dashboard can fetch precalculated runs per scenario, removing the manual JSON step.
- Additional reporting endpoints (e.g., scenario comparisons) can reuse the same service layer, ensuring consistency across UI and API consumers.
## Data Model Highlights
- `scenario`: central entity describing a mining scenario; owns relationships to cost, consumption, production, equipment, and maintenance tables.
- `capex`, `opex`: monetary tracking linked to scenarios.
- `consumption`: resource usage entries parameterized by scenario and description.
- `parameter`: scenario inputs with base `value` and optional distribution linkage via `distribution_id`, `distribution_type`, and JSON `distribution_parameters` to support simulation sampling.
- `production_output`: production metrics per scenario.
- `equipment` and `maintenance`: equipment inventory and maintenance events with dates/costs.
- `simulation_result`: staging table for future Monte Carlo outputs (not yet populated by `run_simulation`).
Foreign keys secure referential integrity between domain tables and their scenarios, enabling per-scenario analytics.
## Integrations and Future Work
- **Monte Carlo engine**: `services/simulation.py` will incorporate stochastic sampling (e.g., NumPy, SciPy) to populate `simulation_result` and feed reporting.
- **Persistence of results**: `/api/simulations/run` currently returns in-memory results; next iteration should persist to `simulation_result` and reference scenarios.
- **Authentication**: not yet implemented; all endpoints are open.
- **Deployment**: documentation focuses on local development; containerization and CI/CD pipelines remain to be defined.
For extended diagrams and setup instructions reference:
- [docs/development_setup.md](development_setup.md) — environment provisioning and tooling.
- [docs/testing.md](testing.md) — pytest workflow and coverage expectations.
- [docs/mvp.md](mvp.md) — roadmap and milestone scope.
- [docs/implementation_plan.md](implementation_plan.md) — feature breakdown aligned with the TODO tracker.
- [docs/architecture_overview.md](architecture_overview.md) — supplementary module map and request flow diagram.
### UI Frontend-Backend Integration Requirements — 2025-10-20
### Reusable Template Components — 2025-10-21
To reduce duplication across form-centric pages, shared Jinja macros live in `templates/partials/components.html`.
- `select_field(...)`: renders labeled `<select>` controls with consistent placeholder handling and optional preselection. Existing JavaScript modules continue to target the generated IDs, so template calls must pass the same identifiers (`consumption-form-scenario`, etc.).
- `feedback(...)` and `empty_state(...)`: wrap status messages in standard classes (`feedback`, `empty-state`) with optional `hidden` toggles so scripts can control visibility without reimplementing markup.
- `table_container(...)`: provides a semantic wrapper and optional heading around tabular content; the `{% call %}` body supplies the `<thead>`, `<tbody>`, and `<tfoot>` elements while the macro applies the `table-container` class and manages hidden state.
Pages like `templates/consumption.html` and `templates/costs.html` already consume these helpers to keep markup aligned while preserving existing JavaScript selectors.
Pages should import these macros via `{% from "partials/components.html" import ... with context %}` to ensure scenario lists or other context variables stay available inside the macro body.
### Styling Audit Notes — 2025-10-21
- **Spacing**: Panels (`section.panel`) sometimes lack consistent vertical rhythm between headings, form grids, and tables. Extra top/bottom margin utilities would help align content.
- **Typography**: Headings rely on browser defaults; font-size scale is uneven between `<h2>` and `<h3>`. Define explicit scale tokens (e.g., `--font-size-lg`) for predictable sizing.
- **Forms**: `.form-grid` uses fixed column gaps that collapse on small screens; introduce responsive grid rules to stack gracefully below ~768px.
- **Tables**: `.table-container` wrappers need overflow handling for narrow viewports; consider `overflow-x: auto` with padding adjustments.
- **Feedback/Empty states**: Messages use default font weight and spacing; a utility class for margin/padding would ensure consistent separation from forms or tables.
### Styling Utilities — 2025-10-21
- Added spacing and typography CSS variables (e.g., `--space-sm`, `--font-size-xl`) and applied them to `.panel` and `.form-grid` elements for consistent vertical rhythm.
- Standardised heading weights/sizes within panels so `<h2>` and `<h3>` share explicit scale tokens.
- Updated form controls to use the new spacing tokens, preparing the layout for further responsive tweaks.
#### Scenarios (`templates/ScenarioForm.html`)
- **Data**: `GET /api/scenarios/` to list existing scenarios for navigation and to hydrate dropdowns in downstream forms; optional aggregation of scenario counts for dashboard badges.
- **Actions**: `POST /api/scenarios/` to create new scenarios; future delete/update flows would reuse the same router once endpoints exist.
#### Parameters (`templates/ParameterInput.html`)
- **Data**: Scenario catalogue from `GET /api/scenarios/`; parameter inventory via `GET /api/parameters/` with client-side filtering by `scenario_id`; optional distribution catalogue from `models/distribution` when exposed.
- **Actions**: `POST /api/parameters/` to add parameters; extend UI to support editing or deleting parameters as routes arrive.
#### Costs (`templates/costs.html`)
- **Data**: CAPEX list `GET /api/costs/capex`; OPEX list `GET /api/costs/opex`; computed totals grouped by scenario for summary panels.
- **Actions**: `POST /api/costs/capex` and `POST /api/costs/opex` for new entries; planned future edits/deletes once routers expand.
#### Consumption (`templates/consumption.html`)
- **Data**: Consumption entries by scenario via `GET /api/consumption/`; scenario metadata for filtering and empty-state messaging.
- **Actions**: `POST /api/consumption/` to log consumption items; include optimistic UI refresh after persistence.
#### Production (`templates/production.html`)
- **Data**: Production records from `GET /api/production/`; scenario list for filter chips; optional aggregates (totals, averages) derived client-side.
- **Actions**: `POST /api/production/` to capture production outputs; notify users when data drives downstream reporting.
#### Equipment (`templates/equipment.html`)
- **Data**: Equipment roster from `GET /api/equipment/`; scenario context to scope inventory; maintenance counts per asset once joins are introduced.
- **Actions**: `POST /api/equipment/` to add equipment; integrate delete/edit when endpoints arrive.
#### Maintenance (`templates/maintenance.html`)
- **Data**: Maintenance schedule `GET /api/maintenance/` with pagination support (`skip`, `limit`); equipment list (`GET /api/equipment/`) to map IDs to names; scenario catalogue for filtering.
- **Actions**: CRUD operations through `POST /api/maintenance/`, `PUT /api/maintenance/{id}`, `DELETE /api/maintenance/{id}`; view detail via `GET /api/maintenance/{id}` for modal display.
#### Simulations (`templates/simulations.html`)
- **Data**: Scenario list `GET /api/scenarios/`; parameter sets `GET /api/parameters/` filtered client-side; persisted simulation results (future) via dedicated endpoint or by caching `/api/simulations/run` responses.
- **Actions**: `POST /api/simulations/run` to execute simulations; surface run configuration (iterations, seed) and feed response summary into reporting page.
#### Reporting (`templates/reporting.html` and `templates/Dashboard.html`)
- **Data**: Simulation outputs either from recent `/api/simulations/run` calls or a planned `GET` endpoint against `simulation_result`; summary metrics via `POST /api/reporting/summary`; scenario metadata for header context.
- **Actions**: Trigger summary refreshes by posting batched results; allow export/download actions once implemented; integrate future comparison requests across scenarios.
1. Open the chapter file you want to expand under `docs/architecture/`.
2. Move or add content from other docs into the chapter and ensure the chapter references code files (e.g., `services/simulation.py`, `routes/reporting.py`) using relative links.
3. After edits, run `scripts/check_docs_links.py` to validate local links.
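For orientation, a docs link check of this kind typically resolves each relative Markdown link against the containing file's directory. The sketch below is illustrative only and is not the actual `scripts/check_docs_links.py`:

```python
import re
from pathlib import Path

# Matches the target of a Markdown inline link, stopping at ')' or a '#fragment'.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)")

def broken_links(docs_root: Path) -> list[tuple[Path, str]]:
    """Return (file, target) pairs whose relative targets do not exist."""
    missing = []
    for md in docs_root.rglob("*.md"):
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            if target.startswith(("http://", "https://", "mailto:")):
                continue  # only local links are checked in this sketch
            if not (md.parent / target).exists():
                missing.append((md, target))
    return missing

if __name__ == "__main__":
    for file, target in broken_links(Path("docs")):
        print(f"{file}: broken link -> {target}")
```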


@@ -0,0 +1,44 @@
# 01 — Introduction and Goals
Status: skeleton
Describe the system purpose, stakeholders, and high-level goals. Fill this file with project introduction and business/technical goals.
## Overview
CalMiner is a FastAPI application that collects mining project inputs, persists scenario-specific records, and surfaces aggregated insights. The platform targets Monte Carlo driven planning, with deterministic CRUD features in place and simulation logic staged for future work.
Frontend components are server-rendered Jinja2 templates, with Chart.js powering the dashboard visualization. The backend leverages SQLAlchemy for ORM mapping to a PostgreSQL database.
### Runtime Flow
1. Users navigate to form templates or API clients to manage scenarios, parameters, and operational data.
2. FastAPI routers validate payloads with Pydantic models, then delegate to SQLAlchemy sessions for persistence.
3. Simulation runs (placeholder `services/simulation.py`) will consume stored parameters to emit iteration results via `/api/simulations/run`.
4. Reporting requests POST simulation outputs to `/api/reporting/summary`; the reporting service calculates aggregates (count, min/max, mean, median, percentiles, standard deviation, variance, and tail-risk metrics at the 95% confidence level).
5. `templates/Dashboard.html` fetches summaries, renders metric cards, and plots distribution charts with Chart.js for stakeholder review.
### Current implementation status (summary)
- Currency normalization, simulation scaffold, and reporting service exist; see `docs/quickstart.md` for full status and migration instructions.
## MVP Features (migrated)
The following MVP features and priorities were migrated from `docs/mvp.md`.
### Prioritized Features
1. **Scenario Creation and Management** (High Priority): Allow users to create, edit, and delete scenarios. Rationale: Core functionality for what-if analysis.
1. **Parameter Input and Validation** (High Priority): Input process parameters with validation. Rationale: Ensures data integrity for simulations.
1. **Monte Carlo Simulation Run** (High Priority): Execute simulations and store results. Rationale: Key differentiator for risk analysis.
1. **Basic Reporting** (Medium Priority): Display NPV, IRR, EBITDA from simulation results. Rationale: Essential for decision-making.
1. **Cost Tracking Dashboard** (Medium Priority): Visualize CAPEX and OPEX. Rationale: Helps monitor expenses.
1. **Consumption Monitoring** (Low Priority): Track resource consumption. Rationale: Useful for optimization.
1. **User Authentication** (Medium Priority): Basic login/logout. Rationale: Security for multi-user access.
1. **Export Results** (Low Priority): Export simulation data to CSV/PDF. Rationale: For external analysis.
### Rationale for Prioritization
- High: Core simulation and scenario features first.
- Medium: Reporting and auth for usability.
- Low: Nice-to-haves after basics.


@@ -0,0 +1,5 @@
# 02 — Architecture Constraints
Status: skeleton
Document imposed constraints: technical, organizational, regulatory, and environmental constraints that affect architecture decisions.


@@ -0,0 +1,5 @@
# 03 — Context and Scope
Status: skeleton
Describe system context, external actors, and the scope of the architecture.


@@ -0,0 +1,20 @@
# 04 — Solution Strategy
Status: skeleton
High-level solution strategy describing major approaches, technology choices, and trade-offs.
## Monte Carlo engine & persistence
- **Monte Carlo engine**: `services/simulation.py` will incorporate stochastic sampling (e.g., NumPy, SciPy) to populate `simulation_result` and feed reporting.
- **Persistence of simulation results**: plan to extend `/api/simulations/run` to persist iterations to `models/simulation_result` and provide a retrieval endpoint for historical runs.
## Simulation Roadmap
- Implement stochastic sampling in `services/simulation.py` (e.g., NumPy random draws based on parameter distributions).
- Store iterations in `models/simulation_result.py` via `/api/simulations/run`.
- Feed persisted results into reporting for downstream analytics and historical comparisons.
### Status update (2025-10-21)
- A scaffolded simulation service (`services/simulation.py`) and `/api/simulations/run` route exist and return in-memory results. Persisting those iterations to `models/simulation_result` is scheduled for a follow-up change.
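A minimal sketch of the planned sampling step, assuming parameters carry the `distribution_type` and `distribution_parameters` metadata described in the data model; the distribution names and parameter keys below are assumptions, and the scaffold in `services/simulation.py` may evolve differently:

```python
import random

def sample_parameter(rng: random.Random, distribution_type: str, params: dict) -> float:
    """Draw one value from a parameter's distribution metadata."""
    if distribution_type == "normal":
        return rng.gauss(params["mean"], params["std_dev"])
    if distribution_type == "uniform":
        return rng.uniform(params["low"], params["high"])
    if distribution_type == "triangular":
        return rng.triangular(params["low"], params["high"], params["mode"])
    raise ValueError(f"unsupported distribution: {distribution_type}")

def run_iterations(parameters: list[dict], iterations: int, seed: int | None = None) -> list[float]:
    """Return one aggregate result per iteration; persisting them is the follow-up work."""
    rng = random.Random(seed)
    return [
        sum(
            sample_parameter(rng, p["distribution_type"], p["distribution_parameters"])
            for p in parameters
        )
        for _ in range(iterations)
    ]
```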


@@ -0,0 +1,129 @@
# Implementation Plan (extended)
This file contains the migrated implementation plan (MVP features, steps, and estimates) originally in `docs/implementation_plan.md`.
## Project Setup
1. Connect to PostgreSQL database with schema `calminer`.
1. Create and activate a virtual environment and install dependencies via `requirements.txt`.
1. Define environment variables in `.env`, including `DATABASE_URL`.
1. Configure FastAPI entrypoint in `main.py` to include routers.
## Feature: Scenario Management
### Scenario Management — Steps
1. Create `models/scenario.py` for scenario CRUD.
1. Implement API endpoints in `routes/scenarios.py` (GET, POST, PUT, DELETE).
1. Write unit tests in `tests/unit/test_scenario.py`.
1. Build UI component `components/ScenarioForm.html`.
## Feature: Process Parameters
### Parameters — Steps
1. Create `models/parameters.py` for process parameters.
1. Implement Pydantic schemas in `routes/parameters.py`.
1. Add validation middleware in `middleware/validation.py`.
1. Write unit tests in `tests/unit/test_parameter.py`.
1. Build UI component `components/ParameterInput.html`.
## Feature: Stochastic Variables
### Stochastic Variables — Steps
1. Create `models/distribution.py` for variable distributions.
1. Implement API routes in `routes/distributions.py`.
1. Write Pydantic schemas and validations.
1. Write unit tests in `tests/unit/test_distribution.py`.
1. Build UI component `components/DistributionEditor.html`.
## Feature: Cost Tracking
### Cost Tracking — Steps
1. Create `models/capex.py` and `models/opex.py`.
1. Implement API routes in `routes/costs.py`.
1. Write Pydantic schemas for CAPEX/OPEX.
1. Write unit tests in `tests/unit/test_costs.py`.
1. Build UI component `components/CostForm.html`.
## Feature: Consumption Tracking
### Consumption Tracking — Steps
1. Create models for consumption: `chemical_consumption.py`, `fuel_consumption.py`, `water_consumption.py`, `scrap_consumption.py`.
1. Implement API routes in `routes/consumption.py`.
1. Write Pydantic schemas for consumption data.
1. Write unit tests in `tests/unit/test_consumption.py`.
1. Build UI component `components/ConsumptionDashboard.html`.
## Feature: Production Output
### Production Output — Steps
1. Create `models/production_output.py`.
1. Implement API routes in `routes/production.py`.
1. Write Pydantic schemas for production output.
1. Write unit tests in `tests/unit/test_production.py`.
1. Build UI component `components/ProductionChart.html`.
## Feature: Equipment Management
### Equipment Management — Steps
1. Create `models/equipment.py` for equipment data.
1. Implement API routes in `routes/equipment.py`.
1. Write Pydantic schemas for equipment.
1. Write unit tests in `tests/unit/test_equipment.py`.
1. Build UI component `components/EquipmentList.html`.
## Feature: Maintenance Logging
### Maintenance Logging — Steps
1. Create `models/maintenance.py` for maintenance events.
1. Implement API routes in `routes/maintenance.py`.
1. Write Pydantic schemas for maintenance logs.
1. Write unit tests in `tests/unit/test_maintenance.py`.
1. Build UI component `components/MaintenanceLog.html`.
## Feature: Monte Carlo Simulation Engine
### Monte Carlo Engine — Steps
1. Implement Monte Carlo logic in `services/simulation.py`.
1. Persist results in `models/simulation_result.py`.
1. Expose endpoint in `routes/simulations.py`.
1. Write integration tests in `tests/unit/test_simulation.py`.
1. Build UI component `components/SimulationRunner.html`.
## Feature: Reporting / Dashboard
### Reporting / Dashboard — Steps
1. Implement report calculations in `services/reporting.py`.
1. Add detailed and summary endpoints in `routes/reporting.py`.
1. Write unit tests in `tests/unit/test_reporting.py`.
1. Enhance UI in `components/Dashboard.html` with charts.
## MVP Feature Analysis (summary)
Goal: Identify core MVP features, acceptance criteria, and quick estimates.
### Edge cases to consider
- Large simulation runs (memory / timeouts) — use streaming, chunking, or background workers.
- DB migration and schema versioning.
- Authentication/authorization for scenario access.
### Next actionable items
1. Break Scenario Management into sub-issues (models, routes, tests, simple UI).
1. Scaffold Parameter Input & Validation (models/parameters.py, middleware, routes, tests).
1. Prototype the simulation engine with a small deterministic runner and unit tests.
1. Scaffold Monte Carlo Simulation endpoints (`services/simulation.py`, `routes/simulations.py`, tests).
1. Scaffold Reporting endpoints (`services/reporting.py`, `routes/reporting.py`, front-end Dashboard, tests).
1. Add CI job for tests and coverage.
See `docs/architecture/13_ui_and_style.md` for the UI template audit, layout guidance, and next steps.


@@ -0,0 +1,34 @@
# 05 — Building Block View
Status: skeleton
Explain the static structure: modules, components, services and their relationships.
## System Components
- **FastAPI backend** (`main.py`, `routes/`): hosts REST endpoints for scenarios, parameters, costs, consumption, production, equipment, maintenance, simulations, and reporting. Each router encapsulates request/response schemas and DB access patterns, leveraging a shared dependency module (`routes/dependencies.get_db`) for SQLAlchemy session management.
- **Service layer** (`services/`): houses business logic. `services/reporting.py` produces statistical summaries, while `services/simulation.py` provides the Monte Carlo integration point.
- **Persistence** (`models/`, `config/database.py`): SQLAlchemy models map to PostgreSQL tables. Relationships connect scenarios to derived domain entities.
- **Presentation** (`templates/`, `components/`): server-rendered views extend a shared `base.html` layout with a persistent left sidebar, pull global styles from `static/css/main.css`, and surface data entry (scenario and parameter forms) alongside the Chart.js-powered dashboard.
- **Reusable partials** (`templates/partials/components.html`): macro library that standardises select inputs, feedback/empty states, and table wrappers so pages remain consistent while keeping DOM hooks stable for existing JavaScript modules.
- **Middleware** (`middleware/validation.py`): applies JSON validation before requests reach routers.
- **Testing** (`tests/unit/`): pytest suite covering route and service behavior, including UI rendering checks and negative-path router validation tests to ensure consistent HTTP error semantics. Playwright end-to-end coverage is planned for core smoke flows (dashboard load, scenario inputs, reporting) and will attach in CI once scaffolding is completed.
## Module Map (code)
- `scenario.py`: central scenario entity with relationships to cost, consumption, production, equipment, maintenance, and simulation results.
- `capex.py`, `opex.py`: financial expenditures tied to scenarios.
- `consumption.py`, `production_output.py`: operational data tables.
- `equipment.py`, `maintenance.py`: asset management models.
## Architecture overview (migrated)
This overview complements `docs/architecture.md` with a high-level map of CalMiner's module layout and request flow.
Refer to the detailed architecture chapters in `docs/architecture/`:
- Module map & components: `docs/architecture/05_building_block_view.md`
- Request flow & runtime interactions: `docs/architecture/06_runtime_view.md`
- Simulation roadmap & strategy: `docs/architecture/04_solution_strategy.md`
Currency normalization and backfill tooling have been added (see `scripts/backfill_currency.py` and related migrations) to support canonical currency lookups across cost tables.


@@ -0,0 +1,33 @@
# 06 — Runtime View
Status: skeleton
Describe runtime aspects: request flows, lifecycle of key interactions, and runtime components.
## Reporting Pipeline and UI Integration
1. **Data Sources**
- Scenario-linked calculations (costs, consumption, production) produce raw figures stored in dedicated tables (`capex`, `opex`, `consumption`, `production_output`).
- Monte Carlo simulations (currently transient) generate arrays of `{ "result": float }` objects that the dashboard or downstream tooling passes directly to reporting endpoints.
2. **API Contract**
- `POST /api/reporting/summary` accepts a JSON array of result objects and validates shape through `_validate_payload` in `routes/reporting.py`.
- On success it returns a structured payload (`ReportSummary`) containing count, mean, median, min/max, standard deviation, and percentile values, all as floats.
3. **Service Layer**
- `services/reporting.generate_report` converts the sanitized payload into descriptive statistics using Python's standard library (`statistics` module) to avoid external dependencies.
- The service remains stateless; no database read/write occurs, which keeps summary calculations deterministic and idempotent.
- Extended KPIs (surfaced in the API and dashboard):
- `variance`: population variance computed as the square of the population standard deviation.
- `percentile_5` and `percentile_95`: lower and upper tail interpolated percentiles for sensitivity bounds.
- `value_at_risk_95`: the 5th-percentile threshold, i.e., the level that outcomes fall below only 5% of the time (a 95% confidence floor).
- `expected_shortfall_95`: mean of all outcomes at or below the `value_at_risk_95`, highlighting tail exposure; a sketch computing these metrics follows this list.
4. **UI Consumption**
- `templates/Dashboard.html` posts the user-provided dataset to the summary endpoint, renders metric cards for each field, and charts the distribution using Chart.js.
- `SUMMARY_FIELDS` now includes variance, 5th/10th/90th/95th percentiles, and tail-risk metrics (VaR/Expected Shortfall at 95%); tooltip annotations surface the tail metrics alongside the percentile line chart.
- Error handling surfaces HTTP failures inline so users can address malformed JSON or backend availability issues without leaving the page.
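To make the tail metrics concrete, the sketch below derives them from sorted results using only the standard library, mirroring the stateless style described above; the exact implementation in `services/reporting.py` may differ:

```python
import statistics

def tail_metrics(values: list[float]) -> dict[str, float]:
    """Compute the extended KPIs from raw results (assumes at least two values)."""
    ordered = sorted(values)
    # statistics.quantiles with n=100 yields 99 interpolated percentile cut points.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    var_95 = cuts[4]  # the 5th percentile doubles as VaR at 95% confidence
    tail = [v for v in ordered if v <= var_95]
    return {
        "variance": statistics.pstdev(ordered) ** 2,
        "percentile_5": cuts[4],
        "percentile_95": cuts[94],
        "value_at_risk_95": var_95,
        "expected_shortfall_95": statistics.fmean(tail),  # mean of the tail at/below VaR
    }
```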


@@ -0,0 +1,10 @@
# 07 — Deployment View
Status: skeleton
Describe deployment topology, infrastructure components, and environments (dev/stage/prod).
## Integrations and Future Work (deployment-related)
- **Persistence of results**: `/api/simulations/run` currently returns in-memory results; next iteration should persist to `simulation_result` and reference scenarios.
- **Deployment**: documentation focuses on local development; containerization and CI/CD pipelines remain to be defined. Consider Docker + GitHub Actions or a simple Docker Compose for local stacks.


@@ -0,0 +1,17 @@
# 08 — Concepts
Status: skeleton
Document key concepts, domain models, and terminology used throughout the architecture documentation.
## Data Model Highlights
- `scenario`: central entity describing a mining scenario; owns relationships to cost, consumption, production, equipment, and maintenance tables.
- `capex`, `opex`: monetary tracking linked to scenarios.
- `consumption`: resource usage entries parameterized by scenario and description.
- `parameter`: scenario inputs with base `value` and optional distribution linkage via `distribution_id`, `distribution_type`, and JSON `distribution_parameters` to support simulation sampling.
- `production_output`: production metrics per scenario.
- `equipment` and `maintenance`: equipment inventory and maintenance events with dates/costs.
- `simulation_result`: staging table for future Monte Carlo outputs (not yet populated by `run_simulation`).
Foreign keys secure referential integrity between domain tables and their scenarios, enabling per-scenario analytics.
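To make the relationship pattern concrete, here is a trimmed sketch in SQLAlchemy 2.0 declarative style; column names beyond those listed above are assumptions, and the real models in `models/` carry more fields:

```python
from sqlalchemy import Float, ForeignKey, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

class Base(DeclarativeBase):
    pass

class Scenario(Base):
    __tablename__ = "scenario"

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column(String(255))
    # One scenario owns many cost rows; the FK below enforces referential integrity.
    capex_items: Mapped[list["Capex"]] = relationship(back_populates="scenario")

class Capex(Base):
    __tablename__ = "capex"

    id: Mapped[int] = mapped_column(primary_key=True)
    scenario_id: Mapped[int] = mapped_column(ForeignKey("scenario.id"))
    amount: Mapped[float] = mapped_column(Float)
    scenario: Mapped["Scenario"] = relationship(back_populates="capex_items")
```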


@@ -0,0 +1,5 @@
# 09 — Architecture Decisions
Status: skeleton
Record important architectural decisions, their rationale, and alternatives considered.


@@ -0,0 +1,5 @@
# 10 — Quality Requirements
Status: skeleton
List non-functional requirements (performance, scalability, reliability, security) and measurable acceptance criteria.


@@ -0,0 +1,5 @@
# 11 — Technical Risks
Status: skeleton
Document potential technical risks, mitigation strategies, and monitoring suggestions.


@@ -0,0 +1,5 @@
# 12 — Glossary
Status: skeleton
Project glossary and definitions for domain-specific terms.


@@ -0,0 +1,85 @@
# 13 — UI, templates and styling
Status: migrated
This chapter collects UI integration notes, reusable template components, styling audit points and per-page UI data/actions.
## Reusable Template Components
To reduce duplication across form-centric pages, shared Jinja macros live in `templates/partials/components.html`.
- `select_field(...)`: renders labeled `<select>` controls with consistent placeholder handling and optional preselection. Existing JavaScript modules continue to target the generated IDs, so template calls must pass the same identifiers (`consumption-form-scenario`, etc.).
- `feedback(...)` and `empty_state(...)`: wrap status messages in standard classes (`feedback`, `empty-state`) with optional `hidden` toggles so scripts can control visibility without reimplementing markup.
- `table_container(...)`: provides a semantic wrapper and optional heading around tabular content; the `{% call %}` body supplies the `<thead>`, `<tbody>`, and `<tfoot>` elements while the macro applies the `table-container` class and manages hidden state.
Pages like `templates/consumption.html` and `templates/costs.html` already consume these helpers to keep markup aligned while preserving existing JavaScript selectors.
Import macros via:
```jinja
{% from "partials/components.html" import select_field, feedback, table_container with context %}
```
## Styling Audit Notes (2025-10-21)
- **Spacing**: Panels (`section.panel`) sometimes lack consistent vertical rhythm between headings, form grids, and tables. Extra top/bottom margin utilities would help align content.
- **Typography**: Headings rely on browser defaults; font-size scale is uneven between `<h2>` and `<h3>`. Define explicit scale tokens (e.g., `--font-size-lg`) for predictable sizing.
- **Forms**: `.form-grid` uses fixed column gaps that collapse on small screens; introduce responsive grid rules to stack gracefully below ~768px.
- **Tables**: `.table-container` wrappers need overflow handling for narrow viewports; consider `overflow-x: auto` with padding adjustments.
- **Feedback/Empty states**: Messages use default font weight and spacing; a utility class for margin/padding would ensure consistent separation from forms or tables.
## Per-page data & actions
Short reference of per-page APIs and primary actions used by templates and scripts.
- Scenarios (`templates/ScenarioForm.html`):
- Data: `GET /api/scenarios/`
- Actions: `POST /api/scenarios/`
- Parameters (`templates/ParameterInput.html`):
- Data: `GET /api/scenarios/`, `GET /api/parameters/`
- Actions: `POST /api/parameters/`
- Costs (`templates/costs.html`):
- Data: `GET /api/costs/capex`, `GET /api/costs/opex`
- Actions: `POST /api/costs/capex`, `POST /api/costs/opex`
- Consumption (`templates/consumption.html`):
- Data: `GET /api/consumption/`
- Actions: `POST /api/consumption/`
- Production (`templates/production.html`):
- Data: `GET /api/production/`
- Actions: `POST /api/production/`
- Equipment (`templates/equipment.html`):
- Data: `GET /api/equipment/`
- Actions: `POST /api/equipment/`
- Maintenance (`templates/maintenance.html`):
- Data: `GET /api/maintenance/` (pagination support)
- Actions: `POST /api/maintenance/`, `PUT /api/maintenance/{id}`, `DELETE /api/maintenance/{id}`
- Simulations (`templates/simulations.html`):
- Data: `GET /api/scenarios/`, `GET /api/parameters/`
- Actions: `POST /api/simulations/run`
- Reporting (`templates/reporting.html` and `templates/Dashboard.html`):
- Data: `POST /api/reporting/summary` (accepts arrays of `{ "result": float }` objects)
- Actions: Trigger summary refreshes and export/download actions.
## UI Template Audit (2025-10-20)
- Existing HTML templates: `ScenarioForm.html`, `ParameterInput.html`, and `Dashboard.html` (reporting summary view).
- Coverage gaps remain for costs, consumption, production, equipment, maintenance, and simulation workflows—no dedicated templates yet.
- Shared layout primitives (navigation/header/footer) are absent; current pages duplicate boilerplate markup.
- Dashboard currently covers reporting metrics but should be wired to a central `/` route once the shared layout lands.
- Next steps: introduce a `base.html`, refactor existing templates to extend it, and scaffold placeholder pages for the remaining features.


@@ -0,0 +1,101 @@
# 14 Testing, CI and Quality Assurance
This chapter centralizes the project's testing strategy, CI configuration, and quality targets.
## Overview
CalMiner uses a combination of unit, integration, and end-to-end tests to ensure quality.
### Frameworks
- Backend: pytest for unit and integration tests.
- Frontend: pytest with Playwright for E2E tests.
- Database: pytest fixtures with psycopg2 for DB tests.
### Test Types
- Unit Tests: Test individual functions/modules.
- Integration Tests: Test API endpoints and DB interactions.
- E2E Tests: Playwright for full user flows.
### CI/CD
- Use GitHub Actions for CI.
- Run tests on pull requests.
- Code coverage target: 80% (using pytest-cov).
### Running Tests
- Unit: `pytest tests/unit/`
- E2E: `pytest tests/e2e/`
- All: `pytest`
### Test Directory Structure
Organize tests under the `tests/` directory mirroring the application structure:
```text
tests/
  unit/
    test_<module>.py
  e2e/
    test_<flow>.py
  fixtures/
    conftest.py
```
### Fixtures and Test Data
- Define reusable fixtures in `tests/fixtures/conftest.py`.
- Use temporary in-memory databases or isolated schemas for DB tests.
- Load sample data via fixtures for consistent test environments.
- Leverage the `seeded_ui_data` fixture in `tests/unit/conftest.py` to populate scenarios with related cost, maintenance, and simulation records for deterministic UI route checks.
### E2E (Playwright) Tests
The E2E test suite, located in `tests/e2e/`, uses Playwright to simulate user interactions in a live browser environment. These tests are designed to catch issues in the UI, frontend-backend integration, and overall application flow.
#### Fixtures
- `live_server`: A session-scoped fixture that launches the FastAPI application in a separate process, making it accessible to the browser; a sketch of such a fixture follows below.
- `playwright_instance`, `browser`, `page`: Standard `pytest-playwright` fixtures for managing the Playwright instance, browser, and individual pages.
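A minimal sketch of how such a session-scoped fixture can be assembled; the uvicorn invocation and poll interval are assumptions, and the project's actual fixture lives in the E2E conftest:

```python
import subprocess
import time
import urllib.request

import pytest

@pytest.fixture(scope="session")
def live_server():
    """Start the app once per session and wait until it answers HTTP."""
    proc = subprocess.Popen(["uvicorn", "main:app", "--port", "8001"])
    deadline = time.monotonic() + 30  # poll for up to ~30s, as described above
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen("http://localhost:8001/", timeout=1)
            break
        except OSError:
            time.sleep(0.5)
    else:
        proc.terminate()
        raise RuntimeError("live server did not start within 30s")
    yield "http://localhost:8001"
    proc.terminate()
    proc.wait(timeout=10)
```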
#### Smoke Tests
- UI Page Loading: `test_smoke.py` contains a parameterized test that systematically navigates to all UI routes to ensure they load without errors, have the correct title, and display a primary heading.
- Form Submissions: Each major form in the application has a corresponding test file (e.g., `test_scenarios.py`, `test_costs.py`) that verifies the page loads, an item can be created by filling the form, a success message appears, and the UI updates.
### Running E2E Tests
To run the Playwright tests:
```bash
pytest tests/e2e/
```
To run headed mode:
```bash
pytest tests/e2e/ --headed
```
### Mocking and Dependency Injection
- Use `unittest.mock` to mock external dependencies.
- Inject dependencies via function parameters or FastAPI's dependency overrides in tests.
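For example, the shared `get_db` dependency can be swapped for a test double via FastAPI's `dependency_overrides`; `fake_session` below stands in for whatever fixture supplies a test session:

```python
from fastapi.testclient import TestClient

from main import app
from routes.dependencies import get_db

def test_list_scenarios_with_overridden_db(fake_session):  # fake_session: hypothetical fixture
    app.dependency_overrides[get_db] = lambda: fake_session
    try:
        client = TestClient(app)
        response = client.get("/api/scenarios/")
        assert response.status_code == 200
    finally:
        app.dependency_overrides.clear()  # avoid leaking overrides across tests
```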
### Code Coverage
- Install `pytest-cov` to generate coverage reports.
- Run with coverage: `pytest --cov --cov-report=term` (use `--cov-report=html` when visualizing hotspots).
- Target 95%+ overall coverage. Focus on historically low modules: `services/simulation.py`, `services/reporting.py`, `middleware/validation.py`, and `routes/ui.py`.
- Latest snapshot (2025-10-21): `pytest --cov=. --cov-report=term-missing` returns **91%** overall coverage.
### CI Integration
- Configure GitHub Actions workflow in `.github/workflows/ci.yml` to:
- Install dependencies, including Playwright browsers (`playwright install`).
- Run `pytest` with coverage for unit tests.
- Run `pytest tests/e2e/` for E2E tests.
- Fail on coverage <80%.
- Upload coverage artifacts under `reports/coverage/`.


@@ -1,4 +1,6 @@
# 15 Development Setup Guide
This document outlines the local development environment and steps to get the project running.
## Prerequisites
@@ -12,7 +14,7 @@
# Clone the repository
git clone https://git.allucanget.biz/allucanget/calminer.git
cd calminer
```
## Virtual Environment
@@ -20,13 +22,13 @@ cd calminer
# Create and activate a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
```
## Install Dependencies
```powershell
pip install -r requirements.txt
```
## Database Setup
@@ -36,29 +38,29 @@ pip install -r requirements.txt
CREATE USER calminer_user WITH PASSWORD 'your_password';
```
1. Create database:
```sql
CREATE DATABASE calminer;
```
## Environment Variables
1. Copy `.env.example` to `.env` at project root.
1. Edit `.env` to set database connection string:
```dotenv
DATABASE_URL=postgresql://<user>:<password>@localhost:5432/calminer
```
1. The application uses `python-dotenv` to load these variables.
## Running the Application
```powershell
# Start the FastAPI server
uvicorn main:app --reload
```
## Testing
@@ -66,6 +68,4 @@ uvicorn main:app --reload
pytest
```
## Frontend Setup
(TBD - add when frontend implemented)
E2E tests use Playwright and a session-scoped `live_server` fixture that starts the app at `http://localhost:8001` for browser-driven tests.


@@ -0,0 +1,34 @@
---
title: "CalMiner Architecture Documentation"
description: "arc42-based architecture documentation for the CalMiner project"
---
# Architecture documentation (arc42 mapping)
This folder mirrors the arc42 chapter structure (adapted to Markdown).
Files:
- `01_introduction_and_goals.md`
- `02_architecture_constraints.md`
- `03_context_and_scope.md`
- `04_solution_strategy.md`
- `05_building_block_view.md`
- `06_runtime_view.md`
- `07_deployment_view.md`
- `08_concepts.md`
- `09_architecture_decisions.md`
- `10_quality_requirements.md`
- `11_technical_risks.md`
- `12_glossary.md`
Mapping notes:
- `docs/architecture.md` and `docs/architecture_overview.md` contain high-level content that will be split into these chapter files.
- `docs/quickstart.md` contains developer quickstart, migrations, testing and current status and will remain as a separate quickstart reference.
Next steps:
1. Move relevant sections from `docs/architecture.md` and `docs/architecture_overview.md` into the appropriate chapter files.
2. Expand each chapter with diagrams, examples, and references to code (files and modules).
3. Add links from the top-level `README.md` to `docs/architecture/README.md` and `docs/quickstart.md`.


@@ -1,44 +0,0 @@
# Architecture Overview
This overview complements `docs/architecture.md` with a high-level map of CalMiner's module layout and request flow.
## Module Map
- `main.py`: FastAPI entry point bootstrapping routers and middleware.
- `models/`: SQLAlchemy declarative models for all database tables. Key modules:
- `scenario.py`: central scenario entity with relationships to cost, consumption, production, equipment, maintenance, and simulation results.
- `capex.py`, `opex.py`: financial expenditures tied to scenarios.
- `consumption.py`, `production_output.py`: operational data tables.
- `equipment.py`, `maintenance.py`: asset management models.
- `routes/`: REST endpoints grouped by domain (scenarios, parameters, costs, consumption, production, equipment, maintenance, reporting, simulations, UI).
- `services/`: business logic abstractions. `reporting.py` supplies summary statistics; `simulation.py` hosts the Monte Carlo extension point.
- `middleware/validation.py`: request JSON validation prior to hitting routers.
- `templates/`: Jinja2 templates for UI (scenario form, parameter input, dashboard).
## Request Flow
```mermaid
graph TD
A[Browser / API Client] -->|HTTP| B[FastAPI Router]
B --> C[Dependency Injection]
C --> D[SQLAlchemy Session]
B --> E[Service Layer]
E --> D
E --> F[Reporting / Simulation Logic]
D --> G[PostgreSQL]
F --> H[Summary Response]
G --> H
H --> A
```
## Dashboard Interaction
1. User loads `/dashboard`, served by `templates/Dashboard.html`.
2. Template fetches `/api/reporting/summary` with sample or user-provided simulation outputs.
3. Response metrics populate the summary grid and Chart.js visualization.
## Simulation Roadmap
- Implement stochastic sampling in `services/simulation.py` (e.g., NumPy random draws based on parameter distributions).
- Store iterations in `models/simulation_result.py` via `/api/simulations/run`.
- Feed persisted results into reporting for downstream analytics and historical comparisons.


@@ -1,161 +0,0 @@
# Implementation Plan
This document outlines the MVP features and implementation steps for CalMiner.
Refer to the following for context alignment:
- System architecture: [docs/architecture.md](architecture.md)
- Development setup: [docs/development_setup.md](development_setup.md)
## Project Setup
1. Connect to a PostgreSQL database using the `calminer` schema.
2. Create and activate a virtual environment and install dependencies via `requirements.txt`.
3. Define environment variables in `.env`, including `DATABASE_URL`.
4. Configure the FastAPI entrypoint in `main.py` to include the routers (sketched below).
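Step 4 amounts to wiring each domain router into the application. A condensed sketch, assuming each `routes/` module exports a `router` object (import details may differ from the actual `main.py`):

```python
# Condensed, illustrative main.py; actual router wiring may differ.
from fastapi import FastAPI

from routes import costs, parameters, scenarios  # assumed module exports

app = FastAPI(title="CalMiner")

app.include_router(scenarios.router, prefix="/api/scenarios")
app.include_router(parameters.router, prefix="/api/parameters")
app.include_router(costs.router, prefix="/api/costs")
```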
## Feature: Scenario Management
### Implementation Steps
1. Create `models/scenario.py` for scenario CRUD.
2. Implement API endpoints in `routes/scenarios.py` (GET, POST, PUT, DELETE).
3. Write unit tests in `tests/unit/test_scenario.py`.
4. Build UI component `components/ScenarioForm.html`.
## Feature: Process Parameters
### Implementation Steps
1. Create `models/parameters.py` for process parameters.
2. Implement Pydantic schemas in `routes/parameters.py`.
3. Add validation middleware in `middleware/validation.py`.
4. Write unit tests in `tests/unit/test_parameter.py`.
5. Build UI component `components/ParameterInput.html`.
## Feature: Stochastic Variables
### Implementation Steps
1. Create `models/distribution.py` for variable distributions.
2. Implement API routes in `routes/distributions.py`.
3. Write Pydantic schemas and validations.
4. Write unit tests in `tests/unit/test_distribution.py`.
5. Build UI component `components/DistributionEditor.html`.
## Feature: Cost Tracking
### Implementation Steps
1. Create `models/capex.py` and `models/opex.py`.
2. Implement API routes in `routes/costs.py`.
3. Write Pydantic schemas for CAPEX/OPEX.
4. Write unit tests in `tests/unit/test_costs.py`.
5. Build UI component `components/CostForm.html`.
## Feature: Consumption Tracking
### Implementation Steps
1. Create models for consumption: `chemical_consumption.py`, `fuel_consumption.py`, `water_consumption.py`, `scrap_consumption.py`.
2. Implement API routes in `routes/consumption.py`.
3. Write Pydantic schemas for consumption data.
4. Write unit tests in `tests/unit/test_consumption.py`.
5. Build UI component `components/ConsumptionDashboard.html`.
## Feature: Production Output
### Implementation Steps
1. Create `models/production_output.py`.
2. Implement API routes in `routes/production.py`.
3. Write Pydantic schemas for production output.
4. Write unit tests in `tests/unit/test_production.py`.
5. Build UI component `components/ProductionChart.html`.
## Feature: Equipment Management
### Implementation Steps
1. Create `models/equipment.py` for equipment data.
2. Implement API routes in `routes/equipment.py`.
3. Write Pydantic schemas for equipment.
4. Write unit tests in `tests/unit/test_equipment.py`.
5. Build UI component `components/EquipmentList.html`.
## Feature: Maintenance Logging
### Implementation Steps
1. Create `models/maintenance.py` for maintenance events.
2. Implement API routes in `routes/maintenance.py`.
3. Write Pydantic schemas for maintenance logs.
4. Write unit tests in `tests/unit/test_maintenance.py`.
5. Build UI component `components/MaintenanceLog.html`.
## Feature: Monte Carlo Simulation Engine
### Implementation Steps
1. Implement Monte Carlo logic in `services/simulation.py`.
2. Persist results in `models/simulation_result.py`.
3. Expose endpoint in `routes/simulations.py`.
4. Write integration tests in `tests/unit/test_simulation.py`.
5. Build UI component `components/SimulationRunner.html`.
## Feature: Reporting / Dashboard
### Implementation Steps
1. Implement report calculations in `services/reporting.py` (a sketch follows this list).
2. Add detailed and summary endpoints in `routes/reporting.py`.
3. Write unit tests in `tests/unit/test_reporting.py`.
4. Enhance UI in `components/Dashboard.html` with charts.
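As a rough illustration of those report calculations, assuming raw NPV draws arrive as a list of floats (metric names here are illustrative, not the actual `services/reporting.py` contract):

```python
# Hypothetical summary helper over simulation outputs; field names assumed.
from statistics import mean, median, quantiles


def summarize_outputs(npv_samples: list[float]) -> dict:
    """Reduce raw NPV draws to headline metrics for the dashboard."""
    if len(npv_samples) < 2:
        return {"count": len(npv_samples)}
    deciles = quantiles(npv_samples, n=10)  # nine cut points: P10 .. P90
    return {
        "count": len(npv_samples),
        "mean_npv": mean(npv_samples),
        "median_npv": median(npv_samples),
        "p10_npv": deciles[0],
        "p90_npv": deciles[8],
    }
```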
## MVP Feature Analysis (summary)
Goal: Identify core MVP features, acceptance criteria, and quick estimates.
Features:

- Scenario Management
  - Acceptance: create/read/update/delete scenarios; persist to DB; API coverage with tests.
  - Estimate: 3-5 days (backend + minimal UI).
- Parameter Input & Validation
  - Acceptance: define parameter schemas, validate inputs, surface errors to API/UI.
  - Estimate: 2-3 days.
- Monte Carlo Simulation Engine
  - Acceptance: run parameterised simulations, store results, ability to rerun with different seeds, basic progress reporting.
  - Estimate: 1-2 weeks (core engine + persistence).
- Reporting / Dashboard
  - Acceptance: display simulation outputs (NPV, IRR distributions), basic charts, export CSV.
  - Estimate: 4-7 days.
Edge cases to consider:
- Large simulation runs (memory / timeouts) — use streaming, chunking, or background workers.
- DB migration and schema versioning.
- Authentication/authorization for scenario access.
Next actionable items:
1. Break Scenario Management into sub-issues (models, routes, tests, simple UI).
2. Scaffold Parameter Input & Validation (models/parameters.py, middleware, routes, tests).
3. Prototype the simulation engine with a small deterministic runner and unit tests.
4. Scaffold Monte Carlo Simulation endpoints (`services/simulation.py`, `routes/simulations.py`, tests).
5. Scaffold Reporting endpoints (`services/reporting.py`, `routes/reporting.py`, front-end Dashboard, tests).
6. Add CI job for tests and coverage.
## UI Template Audit (2025-10-20)
- Existing HTML templates: `ScenarioForm.html`, `ParameterInput.html`, and `Dashboard.html` (reporting summary view).
- Coverage gaps remain for costs, consumption, production, equipment, maintenance, and simulation workflows—no dedicated templates yet.
- Shared layout primitives (navigation/header/footer) are absent; current pages duplicate boilerplate markup.
- Dashboard currently covers reporting metrics but should be wired to a central `/` route once the shared layout lands.
- Next steps align with the updated TODO checklist: introduce a `base.html`, refactor existing templates to extend it, and scaffold placeholder pages for the remaining features.


@@ -1,18 +0,0 @@
# MVP Features
## Prioritized Features
1. **Scenario Creation and Management** (High Priority): Allow users to create, edit, and delete scenarios. Rationale: Core functionality for what-if analysis.
2. **Parameter Input and Validation** (High Priority): Input process parameters with validation. Rationale: Ensures data integrity for simulations.
3. **Monte Carlo Simulation Run** (High Priority): Execute simulations and store results. Rationale: Key differentiator for risk analysis.
4. **Basic Reporting** (Medium Priority): Display NPV, IRR, EBITDA from simulation results. Rationale: Essential for decision-making.
5. **Cost Tracking Dashboard** (Medium Priority): Visualize CAPEX and OPEX. Rationale: Helps monitor expenses.
6. **Consumption Monitoring** (Low Priority): Track resource consumption. Rationale: Useful for optimization.
7. **User Authentication** (Medium Priority): Basic login/logout. Rationale: Security for multi-user access.
8. **Export Results** (Low Priority): Export simulation data to CSV/PDF. Rationale: For external analysis.
## Rationale for Prioritization
- High: Core simulation and scenario features first.
- Medium: Reporting and auth for usability.
- Low: Nice-to-haves after basics.

docs/quickstart.md (new file)

@@ -0,0 +1,83 @@
# Quickstart & Expanded Project Documentation
This document contains the expanded development, usage, testing, and migration guidance moved out of the top-level README for brevity.
## Development
To get started locally:
```powershell
# Clone the repository
git clone https://git.allucanget.biz/allucanget/calminer.git
cd calminer
# Create and activate a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
# Install dependencies
pip install -r requirements.txt
# Start the development server
uvicorn main:app --reload
```
## Usage Overview
- **API base URL**: `http://localhost:8000/api`
- Key routes include creating scenarios, parameters, costs, consumption, production, equipment, maintenance, and reporting summaries. See the `routes/` directory for full details; an example request follows below.
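For example, creating a scenario from Python might look like this (field names are illustrative; the Pydantic schemas in `routes/scenarios.py` are authoritative, and the `requests` package is assumed to be installed):

```python
# Assumes the dev server from the Development section is running.
import requests

payload = {"name": "Base case", "description": "Initial what-if scenario"}
resp = requests.post("http://localhost:8000/api/scenarios/", json=payload)
resp.raise_for_status()
print(resp.json())  # the persisted scenario as returned by the API
```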
## Dashboard Preview
1. Start the FastAPI server and navigate to `/`.
2. Review the headline metrics, scenario snapshot table, and cost/activity charts sourced from the current database state.
3. Use the "Refresh Dashboard" button to pull freshly aggregated data via `/ui/dashboard/data` without reloading the page.
## Testing
Run the unit test suite:
```powershell
pytest
```
E2E tests use Playwright and a session-scoped `live_server` fixture that starts the app at `http://localhost:8001` for browser-driven tests.
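A minimal sketch of such a fixture, assuming the server is launched as a subprocess (the project's actual `tests/e2e/` conftest may differ):

```python
# Hypothetical session-scoped server fixture; the real conftest may differ.
import subprocess
import time

import pytest


@pytest.fixture(scope="session")
def live_server():
    """Start uvicorn in a subprocess and yield the base URL for browser tests."""
    proc = subprocess.Popen(["uvicorn", "main:app", "--port", "8001"])
    time.sleep(2)  # crude readiness wait; polling the URL would be more robust
    yield "http://localhost:8001"
    proc.terminate()
    proc.wait()
```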
## Migrations & Currency Backfill
The project includes a referential `currency` table and migration/backfill tooling to normalize legacy currency fields.
### Run migrations and backfill (development)
Ensure `DATABASE_URL` is set in your PowerShell session to point at a development Postgres instance.
```powershell
$env:DATABASE_URL = 'postgresql://user:pass@host/db'
python scripts/run_migrations.py
python scripts/backfill_currency.py --dry-run
python scripts/backfill_currency.py --create-missing
```
Use `--dry-run` first to verify what will change.
## Database Objects
The database contains tables such as `capex`, `opex`, `chemical_consumption`, `fuel_consumption`, `water_consumption`, `scrap_consumption`, `production_output`, `equipment_operation`, `ore_batch`, `exchange_rate`, and `simulation_result`.
## Current implementation status (2025-10-21)
- Currency normalization: a `currency` table and backfill scripts exist; routes accept `currency_id` and `currency_code` for compatibility.
- Simulation engine: `services/simulation.py` and `/api/simulations/run` are scaffolded and return in-memory results; persistence to `models/simulation_result` is planned.
- Reporting: `services/reporting.py` provides summary statistics used by `POST /api/reporting/summary`.
- Tests & coverage: unit and E2E suites exist; recent local coverage is >90%.
- Remaining work: authentication, persist simulation runs, CI/CD and containerization.
## Where to look next
- Architecture overview & chapters: `docs/architecture.md` (per-chapter files under `docs/architecture/`)
- Testing & CI: `docs/architecture/14_testing_ci.md`
- Development setup: `docs/architecture/15_development_setup.md`
- Implementation plan & roadmap: `docs/architecture/04_solution_strategy_extended.md`
- Routes: `routes/` directory
- Services: `services/` directory
- Scripts: `scripts/` directory (migrations and backfills)


@@ -1,113 +0,0 @@
# Testing Strategy
## Overview
CalMiner will use a combination of unit, integration, and end-to-end tests to ensure quality.
## Frameworks
- **Backend**: pytest for unit and integration tests.
- **Frontend**: pytest with Playwright for E2E tests.
- **Database**: pytest fixtures with psycopg2 for DB tests.
## Test Types
- **Unit Tests**: Test individual functions/modules.
- **Integration Tests**: Test API endpoints and DB interactions.
- **E2E Tests**: Playwright for full user flows.
## CI/CD
- Use GitHub Actions for CI.
- Run tests on pull requests.
- Code coverage gate: 80% minimum (using pytest-cov); see the Code Coverage section below for the longer-term 95%+ target.
## Running Tests
- Unit: `pytest tests/unit/`
- E2E: `pytest tests/e2e/`
- All: `pytest`
## Test Directory Structure
Organize tests under the `tests/` directory mirroring the application structure:
```text
tests/
  unit/
    test_<module>.py
  e2e/
    test_<flow>.py
  fixtures/
    conftest.py
```
## Writing Tests
- Name tests with the `test_` prefix.
- Group related tests in classes or modules.
- Use descriptive assertion messages (see the example below).
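Putting those conventions together, a unit test might look like this (the payload fields are assumptions; consult the actual schemas):

```python
# Illustrative unit test following the conventions above.
from fastapi.testclient import TestClient

from main import app

client = TestClient(app)


def test_create_scenario_returns_persisted_record():
    payload = {"name": "Smoke scenario"}  # field names are assumptions
    response = client.post("/api/scenarios/", json=payload)
    assert response.status_code in (200, 201), f"unexpected status: {response.status_code}"
    assert response.json().get("name") == "Smoke scenario", "response should echo the scenario name"
```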
## Fixtures and Test Data
- Define reusable fixtures in `tests/fixtures/conftest.py`.
- Use temporary in-memory databases or isolated schemas for DB tests.
- Load sample data via fixtures for consistent test environments.
- Leverage the `seeded_ui_data` fixture in `tests/unit/conftest.py` to populate scenarios with related cost, maintenance, and simulation records for deterministic UI route checks.
- Use `tests/unit/test_ui_routes.py` to verify that `/ui/dashboard`, `/ui/scenarios`, and `/ui/reporting` render expected context and that `/ui/dashboard/data` emits aggregated JSON payloads.
- Use `tests/unit/test_router_validation.py` to exercise request validation branches for scenario creation, parameter distribution rules, simulation inputs, reporting summaries, and maintenance costs.
## E2E (Playwright) Tests
The E2E test suite, located in `tests/e2e/`, uses Playwright to simulate user interactions in a live browser environment. These tests are designed to catch issues in the UI, frontend-backend integration, and overall application flow.
### Fixtures
- `live_server`: A session-scoped fixture that launches the FastAPI application in a separate process, making it accessible to the browser.
- `playwright_instance`, `browser`, `page`: Standard `pytest-playwright` fixtures for managing the Playwright instance, browser, and individual pages.
### Smoke Tests
- **UI Page Loading**: `test_smoke.py` contains a parameterized test that systematically navigates to all UI routes to ensure they load without errors, have the correct title, and display a primary heading.
- **Form Submissions**: Each major form in the application has a corresponding test file (e.g., `test_scenarios.py`, `test_costs.py`) that verifies:
  - The form page loads correctly.
  - A new item can be created by filling out and submitting the form.
  - The application provides immediate visual feedback (e.g., a success message).
  - The UI is dynamically updated to reflect the new item (e.g., a new row in a table).
### Running E2E Tests
To run the Playwright tests, use the following command:
```bash
pytest tests/e2e/
```
To run the tests in headed mode and observe the browser interactions, use:
```bash
pytest tests/e2e/ --headed
```
## Mocking and Dependency Injection
- Use `unittest.mock` to mock external dependencies.
- Inject dependencies via function parameters or FastAPI's dependency overrides in tests (sketched below).
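For example, the `get_db` dependency can be swapped for a test double (here a `MagicMock` stand-in; real tests might instead substitute a session bound to a temporary database):

```python
# Sketch of FastAPI dependency overrides; the mock session is a stand-in.
from unittest.mock import MagicMock

from fastapi.testclient import TestClient

from main import app
from routes.dependencies import get_db


def override_get_db():
    # Yield a mock instead of a real SQLAlchemy session so no DB is needed.
    yield MagicMock()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)
# ... exercise endpoints via `client`, then restore the real dependency:
app.dependency_overrides.clear()
```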
## Code Coverage
- Install `pytest-cov` to generate coverage reports.
- Run with coverage: `pytest --cov --cov-report=term` for quick baselines (use `--cov-report=html` when visualizing hotspots).
- Target 95%+ overall coverage. Focus on historically low modules: `services/simulation.py`, `services/reporting.py`, `middleware/validation.py`, and `routes/ui.py`.
- Recent additions include unit tests that validate Monte Carlo parameter errors, reporting fallbacks, and JSON middleware rejection paths to guard against malformed inputs.
- Latest snapshot (2025-10-21): `pytest --cov=. --cov-report=term-missing` returns **91%** overall coverage after achieving full coverage in `routes/ui.py` and `services/simulation.py`.
- Archive coverage artifacts by running `pytest --cov=. --cov-report=xml:reports/coverage/coverage-2025-10-21.xml --cov-report=term-missing`; the generated XML lives under `reports/coverage/` for CI uploads or historical comparisons.
## CI Integration
- Configure GitHub Actions workflow in `.github/workflows/ci.yml` to:
- Install dependencies, including Playwright browsers (`playwright install`).
- Run `pytest` with coverage for unit tests.
- Run `pytest tests/e2e/` for E2E tests.
- Fail on coverage <80%.
- Upload coverage artifact.


@@ -0,0 +1,43 @@
"""Simple Markdown link checker for local docs/ files.
Checks only local file links (relative paths) and reports missing targets.
Run from the repository root using the project's Python environment.
"""
import re
from pathlib import Path
ROOT = Path(__file__).resolve().parent.parent
DOCS = ROOT / 'docs'
MD_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
errors = []
for md in DOCS.rglob('*.md'):
text = md.read_text(encoding='utf-8')
for m in MD_LINK_RE.finditer(text):
label, target = m.groups()
# skip URLs
if target.startswith('http://') or target.startswith('https://') or target.startswith('#'):
continue
# strip anchors
target_path = target.split('#')[0]
# if link is to a directory index, allow
candidate = (md.parent / target_path).resolve()
if candidate.exists():
continue
# check common implicit index: target/ -> target/README.md or target/index.md
candidate_dir = md.parent / target_path
if candidate_dir.is_dir():
if (candidate_dir / 'README.md').exists() or (candidate_dir / 'index.md').exists():
continue
errors.append((str(md.relative_to(ROOT)), target, label))
if errors:
print('Broken local links found:')
for src, tgt, label in errors:
print(f'- {src} -> {tgt} ({label})')
exit(2)
print('No broken local links detected.')

scripts/format_docs_md.py (new file)

@@ -0,0 +1,79 @@
"""Lightweight Markdown formatter: normalizes first-line H1, adds code-fence language hints for common shebangs, trims trailing whitespace.
This is intentionally small and non-destructive; it touches only files under docs/ and makes safe changes.
"""
import re
from pathlib import Path
DOCS = Path(__file__).resolve().parents[1] / "docs"
CODE_LANG_HINTS = {
'powershell': ('powershell',),
'bash': ('bash', 'sh'),
'sql': ('sql',),
'python': ('python',),
}
def add_code_fence_language(match):
fence = match.group(0)
inner = match.group(1)
# If language already present, return unchanged
if fence.startswith('```') and len(fence.splitlines()[0].strip()) > 3:
return fence
# Try to infer language from the code content
code = inner.strip().splitlines()[0] if inner.strip() else ''
lang = ''
if code.startswith('$') or code.startswith('PS') or code.lower().startswith('powershell'):
lang = 'powershell'
elif code.startswith('#') or code.startswith('import') or code.startswith('from'):
lang = 'python'
elif re.match(r'^(select|insert|update|create)\b', code.strip(), re.I):
lang = 'sql'
elif code.startswith('git') or code.startswith('./') or code.startswith('sudo'):
lang = 'bash'
if lang:
return f'```{lang}\n{inner}\n```'
return fence
def normalize_file(path: Path):
text = path.read_text(encoding='utf-8')
orig = text
# Trim trailing whitespace and ensure single trailing newline
text = '\n'.join(line.rstrip() for line in text.splitlines()) + '\n'
# Ensure first non-empty line is H1
lines = text.splitlines()
for i, ln in enumerate(lines):
if ln.strip():
if not ln.startswith('#'):
lines[i] = '# ' + ln
break
text = '\n'.join(lines) + '\n'
# Add basic code fence languages where missing (simple heuristic)
text = re.sub(r'```\n([\s\S]*?)\n```', add_code_fence_language, text)
if text != orig:
path.write_text(text, encoding='utf-8')
return True
return False
def main():
changed = []
for p in DOCS.rglob('*.md'):
if p.is_file():
try:
if normalize_file(p):
changed.append(str(p.relative_to(Path.cwd())))
except Exception as e:
print(f"Failed to format {p}: {e}")
if changed:
print('Formatted files:')
for c in changed:
print(' -', c)
else:
print('No formatting changes required.')
if __name__ == '__main__':
main()