feat: documentation update
- Completed export workflow implementation (query builders, CSV/XLSX serializers, streaming API endpoints, UI modals, automated tests).
- Added export modal UI and client script to trigger downloads directly from dashboard.
- Documented import/export field mapping and usage guidelines in FR-008.
- Updated installation guide with export environment variables, dependencies, and CLI/CI usage instructions.
141
specifications/monte_carlo_simulation.md
Normal file
@@ -0,0 +1,141 @@
# Monte Carlo Simulation Specification

## 1. Purpose

Define the configuration, inputs, and outputs for CalMiner's Monte Carlo
simulation engine used to evaluate project scenarios with stochastic cash-flow
assumptions. The engine augments deterministic profitability metrics by
sampling cash-flow distributions and aggregating resulting Net Present Value
(NPV), Internal Rate of Return (IRR), and Payback Period statistics.

## 2. Scope

- Applies to scenario-level profitability analysis executed via
  `services/simulation.py`.
- Covers configuration dataclasses (`SimulationConfig`, `CashFlowSpec`,
  `DistributionSpec`) and supported distribution families.
- Outlines expectations for downstream reporting and visualization modules that
  consume simulation results.

## 3. Inputs

### 3.1 Cash Flow Specifications

Each Monte Carlo run receives an ordered collection of `CashFlowSpec` entries.
Each spec pairs a deterministic `CashFlow` (amount, period index/date) with an
optional `DistributionSpec`. When no distribution is provided the deterministic
value is used for every iteration.
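
The pairing described above might look roughly like the following sketch. The field names (`cash_flow`, `distribution`, `period_index`) and the `is_stochastic` helper are assumptions made for illustration; the actual dataclasses in `services/simulation.py` may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class CashFlow:
    """Deterministic baseline cash flow (hypothetical field names)."""
    amount: float
    period_index: int


@dataclass(frozen=True)
class DistributionSpec:
    """Minimal stand-in; section 4 lists the full property set."""
    type: str
    parameters: dict = field(default_factory=dict)


@dataclass(frozen=True)
class CashFlowSpec:
    """Pairs a baseline cash flow with an optional distribution."""
    cash_flow: CashFlow
    distribution: Optional[DistributionSpec] = None

    @property
    def is_stochastic(self) -> bool:
        # Without a distribution, every iteration reuses cash_flow.amount.
        return self.distribution is not None


# Ordered collection: a fixed initial outlay, then an uncertain inflow.
specs = [
    CashFlowSpec(CashFlow(amount=-1_000_000.0, period_index=0)),
    CashFlowSpec(
        CashFlow(amount=350_000.0, period_index=1),
        DistributionSpec(type="normal", parameters={"std_dev": 50_000.0}),
    ),
]
```

Keeping the distribution optional lets a single collection mix fixed and uncertain cash flows without a separate code path for deterministic scenarios.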

### 3.2 Simulation Configuration

`SimulationConfig` controls execution:

| Field | Description |
| ------------------------------------- | ----------------------------------------------------------------- |
| `iterations` | Number of Monte Carlo iterations (must be > 0). |
| `discount_rate` | Annual discount rate (decimal) passed to NPV helper. |
| `seed` | Optional RNG seed to ensure reproducible sampling. |
| `metrics` | Tuple of requested metrics (`npv`, `irr`, `payback`). |
| `percentiles` | Percentile cutoffs (0–100) computed for each metric. |
| `compounds_per_year` | Compounding frequency reused by financial helpers. |
| `return_samples` | When `True`, raw metric samples are returned alongside summaries. |
| `residual_value` / `residual_periods` | Optional residual cash flow inputs reused by NPV. |
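
As a rough illustration of the table above, a configuration dataclass could validate its fields on construction. The field names come from the table; the default values and the exact checks are assumptions, not the engine's confirmed behaviour.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ALLOWED_METRICS = {"npv", "irr", "payback"}


@dataclass(frozen=True)
class SimulationConfig:
    iterations: int
    discount_rate: float
    seed: Optional[int] = None
    # Defaults below are illustrative assumptions.
    metrics: Tuple[str, ...] = ("npv", "irr", "payback")
    percentiles: Tuple[float, ...] = (5.0, 50.0, 95.0)
    compounds_per_year: int = 1
    return_samples: bool = False
    residual_value: Optional[float] = None
    residual_periods: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject configurations the table marks as invalid.
        if self.iterations <= 0:
            raise ValueError("iterations must be > 0")
        for cutoff in self.percentiles:
            if not 0 <= cutoff <= 100:
                raise ValueError(f"percentile out of range: {cutoff}")
        unknown = set(self.metrics) - ALLOWED_METRICS
        if unknown:
            raise ValueError(f"unknown metrics: {sorted(unknown)}")


config = SimulationConfig(iterations=10_000, discount_rate=0.08, seed=42)
```

Validating in `__post_init__` means a malformed configuration fails before any sampling work starts.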

### 3.3 Context Metadata

Optional dictionaries provide dynamic parameters when sourcing distribution
means or other values:

- `scenario_context`: scenario-specific values (e.g., salvage mean, cost
  overrides).
- `metadata`: shared configuration (e.g., global commodity price expectations).

## 4. Distributions

`DistributionSpec` defines stochastic behaviour:

| Property | Description |
| ------------ | ------------------------------------------------------------------------------- |
| `type` | `normal`, `lognormal`, `triangular`, or `discrete`. |
| `parameters` | Mapping of required parameters per distribution family. |
| `source` | How base parameters are sourced: `static`, `scenario_field`, or `metadata_key`. |
| `source_key` | Identifier used for non-static sources. |
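
One way to read the `source` and `source_key` properties is as a lookup into the context dictionaries from section 3.3. The sketch below assumes that `scenario_field` reads from `scenario_context`, that `metadata_key` reads from `metadata`, and that `static` keeps the baseline value. The function name `resolve_base_value` is hypothetical, and the exception class is a local stand-in for the one named in this specification.

```python
from typing import Any, Mapping, Optional


class DistributionConfigError(Exception):
    """Stand-in for the error type named in this specification."""


def resolve_base_value(
    source: str,
    source_key: Optional[str],
    baseline: float,
    scenario_context: Optional[Mapping[str, Any]] = None,
    metadata: Optional[Mapping[str, Any]] = None,
) -> float:
    """Return the base parameter value for a distribution (illustrative)."""
    if source == "static":
        return baseline
    lookups = {"scenario_field": scenario_context, "metadata_key": metadata}
    if source not in lookups:
        raise DistributionConfigError(f"unknown source: {source!r}")
    context = lookups[source]
    # A missing key or missing context is a configuration error.
    if not source_key or context is None or source_key not in context:
        raise DistributionConfigError(
            f"{source} requires {source_key!r} to be present in its context"
        )
    return float(context[source_key])


salvage_mean = resolve_base_value(
    "scenario_field", "salvage_mean", 0.0,
    scenario_context={"salvage_mean": 125_000},
)
```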

### 4.1 Parameter Validation

- `normal`: requires non-negative `std_dev`; defaults `mean` to baseline cash
  flow amount when omitted.
- `lognormal`: requires `mean` (mu in log space) and non-negative `sigma`.
- `triangular`: requires `min`, `mode`, `max` with constraint `min <= mode <= max`.
- `discrete`: requires paired `values`/`probabilities` sequences; probabilities
  must be non-negative and sum to 1.0.

Invalid definitions raise `DistributionConfigError` before sampling.
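
The rules above could be checked up front along these lines. This is a sketch only: the `validate_parameters` name, the floating-point tolerance on the probability sum, and the local exception class are assumptions rather than the engine's actual code.

```python
import math
from typing import Any, Dict, Mapping


class DistributionConfigError(Exception):
    """Stand-in for the error type named in this specification."""


def _require(params: Mapping[str, Any], *names: str) -> None:
    missing = [name for name in names if name not in params]
    if missing:
        raise DistributionConfigError(f"missing parameters: {missing}")


def validate_parameters(
    dist_type: str, parameters: Mapping[str, Any], baseline_amount: float
) -> Dict[str, Any]:
    """Return a validated copy of the parameters, or raise before sampling."""
    params = dict(parameters)
    if dist_type == "normal":
        # The mean falls back to the deterministic cash flow amount.
        params.setdefault("mean", baseline_amount)
        _require(params, "std_dev")
        if params["std_dev"] < 0:
            raise DistributionConfigError("std_dev must be non-negative")
    elif dist_type == "lognormal":
        _require(params, "mean", "sigma")
        if params["sigma"] < 0:
            raise DistributionConfigError("sigma must be non-negative")
    elif dist_type == "triangular":
        _require(params, "min", "mode", "max")
        if not params["min"] <= params["mode"] <= params["max"]:
            raise DistributionConfigError("expected min <= mode <= max")
    elif dist_type == "discrete":
        _require(params, "values", "probabilities")
        values, probs = params["values"], params["probabilities"]
        if not values or len(values) != len(probs):
            raise DistributionConfigError("values/probabilities must pair up")
        if any(p < 0 for p in probs):
            raise DistributionConfigError("probabilities must be non-negative")
        # The tolerance is an assumption to absorb float rounding.
        if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
            raise DistributionConfigError("probabilities must sum to 1.0")
    else:
        raise DistributionConfigError(f"unsupported type: {dist_type!r}")
    return params
```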

## 5. Algorithm Overview

1. Seed a NumPy `Generator` (`default_rng(seed)`) unless a generator instance is
   supplied.
2. For each iteration:
   - Realise cash flows by sampling distributions or using deterministic
     values.
   - Compute requested metrics using shared helpers from
     `services/financial.py`:
     - NPV via `net_present_value` (respecting `residual_value` inputs).
     - IRR via `internal_rate_of_return`; non-converging or invalid trajectories
       return `NaN` and increment `failed_runs`.
     - Payback via `payback_period`; scenarios failing to hit non-negative
       cumulative cash flow record `NaN`.
3. Aggregate results into per-metric arrays; calculate summary statistics:
   mean, sample standard deviation, min/max, and configured percentiles using
   `numpy.percentile`.
4. Assemble `SimulationResult` containing summary descriptors and optional raw
samples when `return_samples` is enabled.
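
The loop above can be sketched in a few lines for the NPV metric alone. To stay self-contained, this sketch uses the standard library's `random.Random` in place of the NumPy `Generator`, supports only normal distributions, and substitutes a simple per-period discounting function for `net_present_value`. All function names here are hypothetical.

```python
import random
import statistics
from typing import Dict, List, Optional, Sequence, Tuple

# Each flow is (baseline amount, std_dev or None for deterministic).
Flow = Tuple[float, Optional[float]]


def simple_npv(rate: float, amounts: Sequence[float]) -> float:
    """Discount one amount per period; a stand-in for net_present_value."""
    return sum(a / (1.0 + rate) ** t for t, a in enumerate(amounts))


def realise(rng: random.Random, flows: Sequence[Flow]) -> List[float]:
    """Sample stochastic flows, pass deterministic ones through unchanged."""
    return [a if sd is None else rng.normalvariate(a, sd) for a, sd in flows]


def run_npv_simulation(
    flows: Sequence[Flow],
    iterations: int,
    discount_rate: float,
    seed: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> Dict[str, float]:
    if iterations <= 0:
        raise ValueError("iterations must be > 0")
    # Step 1: seed a generator unless one is supplied.
    rng = rng if rng is not None else random.Random(seed)
    # Step 2: realise cash flows and compute the metric per iteration.
    samples = [
        simple_npv(discount_rate, realise(rng, flows)) for _ in range(iterations)
    ]
    # Step 3: aggregate (percentiles omitted here for brevity).
    return {
        "mean": statistics.fmean(samples),
        "std_dev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "minimum": min(samples),
        "maximum": max(samples),
    }


flows = [(-1000.0, None), (600.0, 50.0), (600.0, 50.0)]
first = run_npv_simulation(flows, iterations=200, discount_rate=0.1, seed=7)
second = run_npv_simulation(flows, iterations=200, discount_rate=0.1, seed=7)
```

With the same seed, `first` and `second` are identical, which is the reproducibility property the `seed` field is meant to provide.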

## 6. Outputs

`SimulationResult` includes:

- `iterations`: total iteration count executed.
- `summaries`: mapping of `SimulationMetric` to `MetricSummary` objects with:
  - `mean`, `std_dev`, `minimum`, `maximum`.
  - `percentiles`: mapping of configured percentile cutoffs to values.
  - `sample_size`: number of successful (non-NaN) samples.
  - `failed_runs`: count of iterations producing `NaN` for the metric.
- `samples`: optional mapping of metric to raw `numpy.ndarray` of samples when
detailed analysis is required downstream.
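
A summary with these fields could be derived from one metric's samples roughly as follows. The sketch assumes `NaN` samples are excluded before statistics are computed (which the `sample_size` definition suggests) and uses a hand-written linear-interpolation percentile in place of `numpy.percentile` so that it runs without third-party packages. The `summarise` name is hypothetical.

```python
import math
import statistics
from typing import Dict, Sequence


def percentile(sorted_values: Sequence[float], cutoff: float) -> float:
    """Linear interpolation between the two closest ranks."""
    position = (len(sorted_values) - 1) * cutoff / 100.0
    lower, upper = math.floor(position), math.ceil(position)
    span = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + span * (position - lower)


def summarise(samples: Sequence[float], cutoffs: Sequence[float]) -> Dict[str, object]:
    """Build MetricSummary-like fields from raw samples (illustrative)."""
    valid = sorted(x for x in samples if not math.isnan(x))
    summary: Dict[str, object] = {
        "sample_size": len(valid),
        "failed_runs": len(samples) - len(valid),
    }
    if not valid:
        # Nothing succeeded: report NaN statistics rather than raising.
        summary.update(mean=math.nan, std_dev=math.nan,
                       minimum=math.nan, maximum=math.nan, percentiles={})
        return summary
    summary.update(
        mean=statistics.fmean(valid),
        std_dev=statistics.stdev(valid) if len(valid) > 1 else 0.0,
        minimum=valid[0],
        maximum=valid[-1],
        percentiles={c: percentile(valid, c) for c in cutoffs},
    )
    return summary


irr_summary = summarise([1.0, 2.0, math.nan, 3.0, 4.0], cutoffs=(50.0,))
```

Here one of five iterations failed, so `sample_size` is 4, `failed_runs` is 1, and the median of the remaining samples is 2.5.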

## 7. Error Handling

- Invalid configuration or missing context raises `DistributionConfigError`.
- Zero iterations or invalid percentile ranges raise `ValueError`.
- Financial helper exceptions (`ConvergenceError`, `PaybackNotReachedError`)
  are captured per iteration and converted to `NaN` samples to preserve
aggregate results while flagging failure counts.
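
The per-iteration capture might be implemented with a small wrapper like the one below. The exception classes and the `payback_period` function are local stand-ins for the helpers in `services/financial.py`, and `metric_or_nan` is a hypothetical name.

```python
import math
from typing import Callable, Sequence


class ConvergenceError(Exception):
    """Stand-in for the IRR helper's failure type."""


class PaybackNotReachedError(Exception):
    """Stand-in for the payback helper's failure type."""


def payback_period(amounts: Sequence[float]) -> float:
    """First period where cumulative cash flow is non-negative (simplified)."""
    cumulative = 0.0
    for period, amount in enumerate(amounts):
        cumulative += amount
        if cumulative >= 0:
            return float(period)
    raise PaybackNotReachedError("cumulative cash flow never recovers")


def metric_or_nan(helper: Callable[..., float], *args: object) -> float:
    """Convert expected helper failures into NaN so the run continues."""
    try:
        return helper(*args)
    except (ConvergenceError, PaybackNotReachedError):
        return math.nan


trajectories = ([-100.0, 60.0, 60.0], [-100.0, 30.0, 30.0])
samples = [metric_or_nan(payback_period, flows) for flows in trajectories]
failed_runs = sum(math.isnan(s) for s in samples)
```

Only the two expected exception types are caught, so programming errors in a helper still surface instead of silently turning into `NaN`.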

## 8. Usage Guidance

- Scenario services should construct `CashFlowSpec` instances from persisted
  financial inputs and optional uncertainty definitions stored alongside the
  scenario.
- Reporting routes can request raw samples when producing histogram or violin
  plots; otherwise rely on `MetricSummary` statistics for tabular output.
- Visualizations implementing FR-005 should leverage percentile outputs to
  render fan charts or confidence intervals.
- When integrating with scheduling workflows, persist the deterministic seed to
  ensure repeated runs remain comparable.

## 9. Testing

`tests/test_simulation.py` covers deterministic parity with financial helpers,
seed reproducibility, context parameter sourcing, failure accounting for metrics
that cannot be computed, error handling for misconfigured distributions, and
sample-return functionality. Additional regression cases should accompany new
metrics or distribution families.

## 10. References

- Implementation: `calminer/services/simulation.py`
- Financial helpers: `calminer/services/financial.py`
- Tests: `calminer/tests/test_simulation.py`
- Related specification: `calminer-docs/specifications/financial_metrics.md`