# Baseline Seed Data Plan

This document captures the datasets that should be present in a fresh CalMiner installation and the structure required to manage them through `scripts/seed_data.py`.

## Currency Catalog

The `currency` table already exists and is seeded today via `scripts/seed_data.py`. The goal is to keep the canonical list in one place and ensure the default currency (USD) is always active.

| Code | Name              | Symbol | Notes                                      |
| ---- | ----------------- | ------ | ------------------------------------------ |
| USD  | US Dollar         | $      | Default currency (`DEFAULT_CURRENCY_CODE`) |
| EUR  | Euro              | €      |                                            |
| CLP  | Chilean Peso      | $      |                                            |
| RMB  | Chinese Yuan      | ¥      | ISO 4217 code is `CNY`                     |
| GBP  | British Pound     | £      |                                            |
| CAD  | Canadian Dollar   | $      |                                            |
| AUD  | Australian Dollar | $      |                                            |

Seeding behaviour:

- Upsert by currency code; keep the existing name/symbol when a row has been updated manually.
- Ensure `is_active` remains true for USD and defaults to true for new rows.
- Defer to runtime validation in `routes.currencies` for enforcing default behaviour.

## Measurement Units

UI routes (`routes/ui.py`) currently rely on the in-memory `MEASUREMENT_UNITS` list to populate dropdowns for consumption and production forms. To make this configurable and available to the API, introduce a dedicated `measurement_unit` table and seed it.

Proposed schema:

| Column     | Type            | Notes                                  |
| ---------- | --------------- | -------------------------------------- |
| id         | SERIAL / BIGINT | Primary key.                           |
| code       | TEXT            | Stable slug (e.g. `tonnes`). Unique.   |
| name       | TEXT            | Display label.                         |
| symbol     | TEXT            | Short symbol (nullable).               |
| unit_type  | TEXT            | Category (`mass`, `volume`, `energy`). |
| is_active  | BOOLEAN         | Default `true` for soft disabling.     |
| created_at | TIMESTAMP       | Optional `NOW()` default.              |
| updated_at | TIMESTAMP       | Optional `NOW()` trigger/default.      |

Initial seed set (mirrors the existing UI list plus type categorisation):

| Code           | Name           | Symbol | Unit Type |
| -------------- | -------------- | ------ | --------- |
| tonnes         | Tonnes         | t      | mass      |
| kilograms      | Kilograms      | kg     | mass      |
| pounds         | Pounds         | lb     | mass      |
| liters         | Liters         | L      | volume    |
| cubic_meters   | Cubic Meters   | m3     | volume    |
| kilowatt_hours | Kilowatt Hours | kWh    | energy    |

Seeding behaviour:

- Upsert rows by `code`.
- Preserve `unit_type` and `symbol` unless explicitly changed via administration tooling.
- Continue surfacing unit options to the UI by querying this table instead of the static constant.

## Default Settings

The application expects certain defaults to exist:

- **Default currency**: enforced by `routes.currencies._ensure_default_currency`; ensure seeds keep USD active.
- **Fallback measurement unit**: the UI currently auto-selects the first option in the list. Once units move to the database, expose an application setting to choose a fallback (future work tracked under "Application Settings management").

## Seeding Structure Updates

To support the datasets above:

1. Extend `scripts/seed_data.py` with a `SeedDataset` registry so each dataset (currencies, units, future defaults) can declare its loader/upsert function and optional dependencies.
2. Add a `--dataset` CLI selector for targeted seeding while keeping `--all` as the default for `setup_database.py` integrations.
3. Update `scripts/setup_database.py` to:
   - Run the migration that ensures the `measurement_unit` table exists.
   - Execute the unit seeder after currencies when `--seed-data` is supplied.
   - Verify post-seed counts, logging which datasets were inserted/updated.
4. Adjust UI routes to load measurement units from the database and remove the hard-coded list once the table is available.

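
The registry and CLI selector described in steps 1 and 2 could look roughly like the sketch below. Only the `SeedDataset` name and the `--dataset` / `--all` flags come from this plan; the field names, `register`, `resolve_order`, the stub loaders, and the choice to model "units after currencies" as a declared dependency are all assumptions made for illustration, not the existing contents of `scripts/seed_data.py`.

```python
"""Hypothetical sketch of a SeedDataset registry with a --dataset selector."""
import argparse
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class SeedDataset:
    name: str
    loader: Callable[[], int]          # assumed contract: returns rows upserted
    depends_on: Tuple[str, ...] = ()   # names of datasets that must run first


REGISTRY: Dict[str, SeedDataset] = {}


def register(dataset: SeedDataset) -> None:
    REGISTRY[dataset.name] = dataset


def resolve_order(selected: Sequence[str]) -> List[str]:
    """Return the selected datasets plus their dependencies, dependencies first."""
    ordered: List[str] = []
    visiting = set()

    def visit(name: str) -> None:
        if name in ordered:
            return
        if name in visiting:
            raise ValueError(f"circular dependency involving {name!r}")
        if name not in REGISTRY:
            raise KeyError(f"unknown dataset {name!r}")
        visiting.add(name)
        for dep in REGISTRY[name].depends_on:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for name in selected:
        visit(name)
    return ordered


# Stub loaders: a real loader would upsert rows and return the affected count.
# The constants match the row counts of the two seed tables in this plan.
register(SeedDataset("currencies", loader=lambda: 7))
register(SeedDataset("measurement_units", loader=lambda: 6,
                     depends_on=("currencies",)))  # assumed ordering rule


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Seed baseline data")
    parser.add_argument("--dataset", action="append", choices=sorted(REGISTRY),
                        help="seed only this dataset (repeatable)")
    parser.add_argument("--all", action="store_true",
                        help="seed every registered dataset (the default)")
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> Dict[str, int]:
    args = parse_args(argv)
    selected = list(REGISTRY) if args.all or not args.dataset else args.dataset
    counts = {name: REGISTRY[name].loader() for name in resolve_order(selected)}
    for name, count in counts.items():
        print(f"{name}: {count} rows upserted")
    return counts
```

Under these assumptions, `main(["--dataset", "measurement_units"])` runs the currency loader first and then the unit loader, while `main([])` behaves like `--all`. Returning the per-dataset counts gives `setup_database.py` something to check in its post-seed verification.
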
This plan aligns with the TODO item for seeding initial data and lays the groundwork for consolidating migrations around a single baseline file that introduces both the schema and seed data in an idempotent manner.
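
As a minimal sketch of what "idempotent" means for the currency seed, the example below inserts missing rows, leaves manually edited rows alone, and re-activates the default currency on every run. It uses an in-memory SQLite database purely so that it runs anywhere; the table layout, the `seed_currencies` helper, and the subset of rows are assumptions for illustration, and the project's real schema and database driver may differ. The `ON CONFLICT (code) DO NOTHING` clause is accepted by both SQLite (3.24+) and PostgreSQL.

```python
import sqlite3

DEFAULT_CURRENCY_CODE = "USD"

# Illustrative subset of the currency catalog.
CURRENCIES = [
    ("USD", "US Dollar", "$"),
    ("CLP", "Chilean Peso", "$"),
    ("CAD", "Canadian Dollar", "$"),
    ("AUD", "Australian Dollar", "$"),
]


def seed_currencies(conn, rows):
    """Insert missing currencies without overwriting existing rows."""
    conn.executemany(
        "INSERT INTO currency (code, name, symbol, is_active) "
        "VALUES (?, ?, ?, 1) "
        "ON CONFLICT (code) DO NOTHING",
        rows,
    )
    # The default currency must stay active even if it was disabled by hand.
    conn.execute(
        "UPDATE currency SET is_active = 1 WHERE code = ?",
        (DEFAULT_CURRENCY_CODE,),
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM currency").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE currency ("
    " code TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " symbol TEXT,"
    " is_active INTEGER NOT NULL DEFAULT 1)"
)

first_count = seed_currencies(conn, CURRENCIES)

# Simulate manual changes between two seed runs.
conn.execute("UPDATE currency SET name = 'Peso chileno' WHERE code = 'CLP'")
conn.execute("UPDATE currency SET is_active = 0 WHERE code = 'USD'")

second_count = seed_currencies(conn, CURRENCIES)
clp_name = conn.execute(
    "SELECT name FROM currency WHERE code = 'CLP'"
).fetchone()[0]
usd_active = conn.execute(
    "SELECT is_active FROM currency WHERE code = 'USD'"
).fetchone()[0]
```

Running the seeder twice leaves the row count unchanged, keeps the manually edited name for CLP, and restores `is_active` for USD, which is the behaviour the Currency Catalog section asks for.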