Monte Carlo Simulation Specification
1. Purpose
Define the configuration, inputs, and outputs for CalMiner's Monte Carlo simulation engine used to evaluate project scenarios with stochastic cash-flow assumptions. The engine augments deterministic profitability metrics by sampling cash-flow distributions and aggregating resulting Net Present Value (NPV), Internal Rate of Return (IRR), and Payback Period statistics.
2. Scope
- Applies to scenario-level profitability analysis executed via `services/simulation.py`.
- Covers configuration dataclasses (`SimulationConfig`, `CashFlowSpec`, `DistributionSpec`) and supported distribution families.
- Outlines expectations for downstream reporting and visualization modules that consume simulation results.
3. Inputs
3.1 Cash Flow Specifications
Each Monte Carlo run receives an ordered collection of CashFlowSpec entries.
Each spec pairs a deterministic CashFlow (amount, period index/date) with an
optional DistributionSpec. When no distribution is provided the deterministic
value is used for every iteration.
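The pairing described above can be sketched as follows. This is a minimal illustration, not the project's actual dataclasses: the field names (`amount`, `period_index`, `cash_flow`, `distribution`) and the `realise` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CashFlow:
    amount: float       # deterministic baseline amount
    period_index: int   # hypothetical field name for the period index


@dataclass(frozen=True)
class DistributionSpec:
    type: str           # e.g. "normal"
    parameters: dict    # parameters required by the distribution family


@dataclass(frozen=True)
class CashFlowSpec:
    cash_flow: CashFlow
    distribution: Optional[DistributionSpec] = None


def realise(spec: CashFlowSpec, draw: Callable[[CashFlowSpec], float]) -> float:
    """Return a sampled amount, or the baseline when no distribution is set."""
    if spec.distribution is None:
        return spec.cash_flow.amount
    return draw(spec)
```

Here `draw` stands in for whatever sampling routine the engine applies to a spec that carries a distribution.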
3.2 Simulation Configuration
SimulationConfig controls execution:
| Field | Description |
|---|---|
| `iterations` | Number of Monte Carlo iterations (must be > 0). |
| `discount_rate` | Annual discount rate (decimal) passed to NPV helper. |
| `seed` | Optional RNG seed to ensure reproducible sampling. |
| `metrics` | Tuple of requested metrics (`npv`, `irr`, `payback`). |
| `percentiles` | Percentile cutoffs (0–100) computed for each metric. |
| `compounds_per_year` | Compounding frequency reused by financial helpers. |
| `return_samples` | When `True`, raw metric samples are returned alongside summaries. |
| `residual_value` / `residual_periods` | Optional residual cash flow inputs reused by NPV. |
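A configuration object with these fields might look like the sketch below. The field names come from the table; the default values, the type annotations, and the use of `__post_init__` for validation are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SimulationConfig:
    iterations: int
    discount_rate: float
    seed: Optional[int] = None
    metrics: Tuple[str, ...] = ("npv", "irr", "payback")
    percentiles: Tuple[float, ...] = (5.0, 50.0, 95.0)  # assumed defaults
    compounds_per_year: int = 1                          # assumed default
    return_samples: bool = False
    residual_value: Optional[float] = None
    residual_periods: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject configurations that cannot produce meaningful statistics.
        if self.iterations <= 0:
            raise ValueError("iterations must be greater than zero")
        if any(not 0 <= p <= 100 for p in self.percentiles):
            raise ValueError("percentiles must lie between 0 and 100")
```

Validating at construction time means a bad configuration fails before any sampling starts.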
3.3 Context Metadata
Optional dictionaries provide dynamic parameters when sourcing distribution means or other values:
- `scenario_context`: scenario-specific values (e.g., salvage mean, cost overrides).
- `metadata`: shared configuration (e.g., global commodity price expectations).
4. Distributions
DistributionSpec defines stochastic behaviour:
| Property | Description |
|---|---|
| `type` | `normal`, `lognormal`, `triangular`, or `discrete`. |
| `parameters` | Mapping of required parameters per distribution family. |
| `source` | How base parameters are sourced: `static`, `scenario_field`, or `metadata_key`. |
| `source_key` | Identifier used for non-static sources. |
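One way the `source` and `source_key` properties could be resolved against the context dictionaries from section 3.3 is sketched below. The function name `resolve_base_value`, its signature, and the local `DistributionConfigError` class are hypothetical stand-ins, not the project's API.

```python
class DistributionConfigError(Exception):
    """Stand-in for the error type named in this specification."""


def resolve_base_value(source, source_key, baseline,
                       scenario_context=None, metadata=None):
    """Return the base value for a distribution according to its source."""
    if source == "static":
        # Static sources fall back to the deterministic baseline.
        return float(baseline)
    lookup = {"scenario_field": scenario_context, "metadata_key": metadata}
    if source not in lookup:
        raise DistributionConfigError(f"unknown source: {source!r}")
    context = lookup[source] or {}
    if source_key not in context:
        raise DistributionConfigError(
            f"{source} requires key {source_key!r}, which is missing from the context"
        )
    return float(context[source_key])
```

For example, `resolve_base_value("scenario_field", "salvage_mean", 0.0, scenario_context={"salvage_mean": 250})` would read the mean from the scenario context instead of the baseline.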
4.1 Parameter Validation
- `normal`: requires non-negative `std_dev`; defaults `mean` to the baseline cash flow amount when omitted.
- `lognormal`: requires `mean` (mu in log space) and non-negative `sigma`.
- `triangular`: requires `min`, `mode`, `max` with constraint `min <= mode <= max`.
- `discrete`: requires paired `values`/`probabilities` sequences; probabilities must be non-negative and sum to 1.0.
Invalid definitions raise DistributionConfigError before sampling.
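The validation rules can be combined with sampling in a single dispatch function, as in the sketch below. It uses NumPy's `Generator` methods (`normal`, `lognormal`, `triangular`, `choice`); the function name `sample`, its signature, and the local error class are assumptions for the example, and the real engine may validate once up front rather than on each draw.

```python
import numpy as np


class DistributionConfigError(Exception):
    """Stand-in for the error type named in this specification."""


def sample(dist_type, params, baseline, rng):
    """Validate a distribution definition, then draw one value from it."""
    if dist_type == "normal":
        std_dev = params.get("std_dev")
        if std_dev is None or std_dev < 0:
            raise DistributionConfigError("normal requires a non-negative std_dev")
        # mean defaults to the baseline cash flow amount when omitted
        return float(rng.normal(params.get("mean", baseline), std_dev))
    if dist_type == "lognormal":
        if "mean" not in params or params.get("sigma", -1) < 0:
            raise DistributionConfigError("lognormal requires mean and non-negative sigma")
        return float(rng.lognormal(params["mean"], params["sigma"]))
    if dist_type == "triangular":
        try:
            low, mode, high = params["min"], params["mode"], params["max"]
        except KeyError as exc:
            raise DistributionConfigError("triangular requires min, mode, max") from exc
        if not low <= mode <= high:
            raise DistributionConfigError("triangular requires min <= mode <= max")
        if low == high:
            return float(low)  # degenerate case: NumPy needs left < right
        return float(rng.triangular(low, mode, high))
    if dist_type == "discrete":
        values = list(params.get("values", ()))
        probs = np.asarray(params.get("probabilities", ()), dtype=float)
        if not values or len(values) != probs.size:
            raise DistributionConfigError("discrete requires paired values/probabilities")
        if (probs < 0).any() or not np.isclose(probs.sum(), 1.0):
            raise DistributionConfigError("probabilities must be non-negative and sum to 1.0")
        # Renormalise to absorb rounding error within the tolerance above.
        return float(rng.choice(values, p=probs / probs.sum()))
    raise DistributionConfigError(f"unsupported distribution type: {dist_type!r}")
```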
5. Algorithm Overview
- Seed a NumPy `Generator` (`default_rng(seed)`) unless a generator instance is supplied.
- For each iteration:
  - Realise cash flows by sampling distributions or using deterministic values.
  - Compute requested metrics using shared helpers from `services/financial.py`:
    - NPV via `net_present_value` (respecting `residual_value` inputs).
    - IRR via `internal_rate_of_return`; non-converging or invalid trajectories return `NaN` and increment `failed_runs`.
    - Payback via `payback_period`; scenarios failing to hit non-negative cumulative cash flow record `NaN`.
- Aggregate results into per-metric arrays; calculate summary statistics: mean, sample standard deviation, min/max, and configured percentiles using `numpy.percentile`.
- Assemble `SimulationResult` containing summary descriptors and optional raw samples when `return_samples` is enabled.
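The steps above can be sketched for the NPV metric alone. This is a simplified illustration under stated assumptions: the `npv` function is a stand-in for `net_present_value` that discounts one cash flow per annual period and ignores residual values and compounding frequency, every stochastic cash flow is assumed normal, and `run_npv_simulation` is a hypothetical name.

```python
import numpy as np


def npv(rate, amounts):
    """Stand-in NPV helper: amounts[t] is discounted by (1 + rate) ** t."""
    periods = np.arange(len(amounts))
    return float(np.sum(np.asarray(amounts, dtype=float) / (1.0 + rate) ** periods))


def run_npv_simulation(baseline, std_devs, rate, iterations,
                       seed=None, percentiles=(5, 50, 95)):
    """Sample cash flows, compute NPV per iteration, and summarise the results."""
    rng = np.random.default_rng(seed)
    samples = np.empty(iterations)
    for i in range(iterations):
        # A std_dev of None marks a deterministic cash flow.
        realised = [
            amount if std is None else rng.normal(amount, std)
            for amount, std in zip(baseline, std_devs)
        ]
        samples[i] = npv(rate, realised)
    cutoffs = np.percentile(samples, percentiles)
    return {
        "mean": float(samples.mean()),
        # ddof=1 gives the sample standard deviation
        "std_dev": float(samples.std(ddof=1)) if iterations > 1 else 0.0,
        "minimum": float(samples.min()),
        "maximum": float(samples.max()),
        "percentiles": {p: float(v) for p, v in zip(percentiles, cutoffs)},
    }
```

With every `std_devs` entry set to `None`, each iteration reproduces the deterministic NPV, which is the parity property the tests in section 9 check against the financial helpers.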
6. Outputs
SimulationResult includes:
- `iterations`: total iteration count executed.
- `summaries`: mapping of `SimulationMetric` to `MetricSummary` objects with:
  - `mean`, `std_dev`, `minimum`, `maximum`.
  - `percentiles`: mapping of configured percentile cutoffs to values.
  - `sample_size`: number of successful (non-NaN) samples.
  - `failed_runs`: count of iterations producing `NaN` for the metric.
- `samples`: optional mapping of metric to raw `numpy.ndarray` of samples when detailed analysis is required downstream.
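A per-metric summary with the fields listed above could be computed as in the following sketch. The function name `summarise` and the choice to return a plain dictionary (instead of a `MetricSummary` object) are assumptions for the example, as is the behaviour when every sample is `NaN`.

```python
import numpy as np


def summarise(samples, percentiles=(5, 50, 95)):
    """Summarise one metric, ignoring NaN samples but counting them as failures."""
    values = np.asarray(samples, dtype=float)
    valid = values[~np.isnan(values)]
    summary = {
        "sample_size": int(valid.size),
        "failed_runs": int(values.size - valid.size),
    }
    nan = float("nan")
    if valid.size == 0:
        # Nothing to aggregate: report NaN statistics rather than raising.
        summary.update(mean=nan, std_dev=nan, minimum=nan, maximum=nan,
                       percentiles={p: nan for p in percentiles})
        return summary
    cutoffs = np.percentile(valid, percentiles)
    summary.update(
        mean=float(valid.mean()),
        std_dev=float(valid.std(ddof=1)) if valid.size > 1 else 0.0,
        minimum=float(valid.min()),
        maximum=float(valid.max()),
        percentiles={p: float(v) for p, v in zip(percentiles, cutoffs)},
    )
    return summary
```

Filtering before aggregation keeps a handful of failed IRR or payback computations from turning every statistic into `NaN`.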
7. Error Handling
- Invalid configuration or missing context raises `DistributionConfigError`.
- Zero iterations or invalid percentile ranges raise `ValueError`.
- Financial helper exceptions (`ConvergenceError`, `PaybackNotReachedError`) are captured per iteration and converted to `NaN` samples to preserve aggregate results while flagging failure counts.
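The per-iteration capture can be illustrated with a small wrapper. The exception classes below are local stand-ins for the ones named above, and both `safe_metric` and this simplified `payback_period` are hypothetical; the real helper in `services/financial.py` may differ in signature and return type.

```python
import math


class ConvergenceError(Exception):
    """Stand-in for the IRR helper's exception."""


class PaybackNotReachedError(Exception):
    """Stand-in for the payback helper's exception."""


def safe_metric(compute, *args):
    """Run a metric helper and convert known failures into a NaN sample."""
    try:
        return float(compute(*args))
    except (ConvergenceError, PaybackNotReachedError):
        return math.nan


def payback_period(amounts):
    """Simplified helper: first period where cumulative cash flow is non-negative."""
    total = 0.0
    for period, amount in enumerate(amounts):
        total += amount
        if total >= 0:
            return period
    raise PaybackNotReachedError("cumulative cash flow never turns non-negative")
```

Only the two expected exception types are caught, so unrelated bugs in a helper still surface instead of being silently recorded as failed runs.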
8. Usage Guidance
- Scenario services should construct `CashFlowSpec` instances from persisted financial inputs and optional uncertainty definitions stored alongside the scenario.
- Reporting routes can request raw samples when producing histogram or violin plots; otherwise rely on `MetricSummary` statistics for tabular output.
- Visualizations implementing FR-005 should leverage percentile outputs to render fan charts or confidence intervals.
- When integrating with scheduling workflows, persist the deterministic seed to ensure repeated runs remain comparable.
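The value of persisting the seed can be shown with NumPy directly. The function `draw_prices` and its parameters are made up for this illustration; only `numpy.random.default_rng` reflects the generator described in section 5.

```python
import numpy as np


def draw_prices(seed, size=5):
    """Draw a small batch of hypothetical commodity prices from a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=100.0, scale=10.0, size=size)


first = draw_prices(seed=2024)
second = draw_prices(seed=2024)   # same seed: identical draws
other = draw_prices(seed=2025)    # different seed: different draws
```

Storing the seed with each scheduled run means a later rerun against the same inputs can be compared sample for sample.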
9. Testing
tests/test_simulation.py covers deterministic parity with financial helpers,
seed reproducibility, context parameter sourcing, failure accounting for metrics
that cannot be computed, error handling for misconfigured distributions, and
sample-return functionality. Additional regression cases should accompany new
metrics or distribution families.
10. References
- Implementation: `calminer/services/simulation.py`
- Financial helpers: `calminer/services/financial.py`
- Tests: `calminer/tests/test_simulation.py`
- Related specification: `calminer-docs/specifications/financial_metrics.md`