calminer-docs/specifications/monte_carlo_simulation.md
zwitschi 29f16139a3 feat: documentation update
2025-11-11 18:34:02 +01:00

Monte Carlo Simulation Specification

1. Purpose

Define the configuration, inputs, and outputs for CalMiner's Monte Carlo simulation engine used to evaluate project scenarios with stochastic cash-flow assumptions. The engine augments deterministic profitability metrics by sampling cash-flow distributions and aggregating resulting Net Present Value (NPV), Internal Rate of Return (IRR), and Payback Period statistics.

2. Scope

  • Applies to scenario-level profitability analysis executed via services/simulation.py.
  • Covers configuration dataclasses (SimulationConfig, CashFlowSpec, DistributionSpec) and supported distribution families.
  • Outlines expectations for downstream reporting and visualization modules that consume simulation results.

3. Inputs

3.1 Cash Flow Specifications

Each Monte Carlo run receives an ordered collection of CashFlowSpec entries. Each spec pairs a deterministic CashFlow (amount plus period index or date) with an optional DistributionSpec. When no distribution is provided, the deterministic value is used for every iteration.
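As a rough illustration of this pairing, the dataclasses below show one way the three types could relate. The field names (amount, period_index, cash_flow, distribution) and the is_stochastic helper are assumptions made for the sketch, not the definitions in services/simulation.py.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class CashFlow:
    """Deterministic cash flow; field names are assumed for illustration."""
    amount: float
    period_index: int


@dataclass(frozen=True)
class DistributionSpec:
    """Reduced stand-in carrying only a family name and its parameters."""
    type: str
    parameters: Mapping[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class CashFlowSpec:
    """Pairs a deterministic cash flow with an optional distribution."""
    cash_flow: CashFlow
    distribution: Optional[DistributionSpec] = None

    @property
    def is_stochastic(self) -> bool:
        # Without a distribution, the deterministic amount is reused as-is.
        return self.distribution is not None


# Ordered collection: an up-front investment followed by an uncertain inflow.
specs = [
    CashFlowSpec(CashFlow(amount=-1_000_000.0, period_index=0)),
    CashFlowSpec(
        CashFlow(amount=350_000.0, period_index=1),
        DistributionSpec(type="normal", parameters={"std_dev": 40_000.0}),
    ),
]
```

Here the first entry stays fixed in every iteration, while the second would be resampled around its baseline amount.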

3.2 Simulation Configuration

SimulationConfig controls execution:

| Field | Description |
| --- | --- |
| iterations | Number of Monte Carlo iterations (must be > 0). |
| discount_rate | Annual discount rate (decimal) passed to NPV helper. |
| seed | Optional RNG seed to ensure reproducible sampling. |
| metrics | Tuple of requested metrics (npv, irr, payback). |
| percentiles | Percentile cutoffs (0–100) computed for each metric. |
| compounds_per_year | Compounding frequency reused by financial helpers. |
| return_samples | When True, raw metric samples are returned alongside summaries. |
| residual_value / residual_periods | Optional residual cash flow inputs reused by NPV. |
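A minimal sketch of how such a configuration object might look is given below. The field names follow the table, but the types, the default values, and the use of __post_init__ for validation are assumptions; only the ValueError on non-positive iterations and out-of-range percentiles is taken from this specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SimulationConfig:
    """Illustrative stand-in; defaults below are assumed, not documented."""
    iterations: int
    discount_rate: float
    seed: Optional[int] = None
    metrics: Tuple[str, ...] = ("npv", "irr", "payback")
    percentiles: Tuple[float, ...] = (5.0, 50.0, 95.0)  # assumed default
    compounds_per_year: int = 1                          # assumed default
    return_samples: bool = False
    residual_value: Optional[float] = None
    residual_periods: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject configurations that could not produce any samples.
        if self.iterations <= 0:
            raise ValueError("iterations must be greater than zero")
        # Percentile cutoffs are expressed on a 0-100 scale.
        for cutoff in self.percentiles:
            if not 0.0 <= cutoff <= 100.0:
                raise ValueError(f"percentile {cutoff} is outside 0-100")


config = SimulationConfig(iterations=10_000, discount_rate=0.08, seed=42)
```

Validating at construction time means a bad configuration fails before any sampling starts.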

3.3 Context Metadata

Optional dictionaries provide dynamic parameters when sourcing distribution means or other values:

  • scenario_context: scenario-specific values (e.g., salvage mean, cost overrides).
  • metadata: shared configuration (e.g., global commodity price expectations).
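The lookup these dictionaries support might be sketched as follows, using the source and source_key properties described in section 4. The function name resolve_base_value, its signature, and the example keys (salvage_mean, copper_price) are hypothetical, and the exception class is a local stand-in whose real base class is not stated here.

```python
from typing import Any, Mapping, Optional


class DistributionConfigError(Exception):
    """Local stand-in for the project's exception of the same name."""


def resolve_base_value(
    source: str,
    source_key: Optional[str],
    static_value: Optional[float],
    scenario_context: Optional[Mapping[str, Any]] = None,
    metadata: Optional[Mapping[str, Any]] = None,
) -> Optional[float]:
    """Return a base parameter from the configured source (illustrative)."""
    if source == "static":
        return static_value
    contexts = {"scenario_field": scenario_context, "metadata_key": metadata}
    if source not in contexts:
        raise DistributionConfigError(f"unknown source: {source!r}")
    context = contexts[source] or {}
    # A missing dictionary or key is treated as missing context.
    if source_key is None or source_key not in context:
        raise DistributionConfigError(
            f"{source} lookup failed for key {source_key!r}"
        )
    return float(context[source_key])


scenario_context = {"salvage_mean": 250_000.0}  # hypothetical scenario values
metadata = {"copper_price": 9_100.0}            # hypothetical shared values

salvage = resolve_base_value(
    "scenario_field", "salvage_mean", None, scenario_context, metadata
)
price = resolve_base_value(
    "metadata_key", "copper_price", None, scenario_context, metadata
)
```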

4. Distributions

DistributionSpec defines stochastic behaviour:

| Property | Description |
| --- | --- |
| type | normal, lognormal, triangular, or discrete. |
| parameters | Mapping of required parameters per distribution family. |
| source | How base parameters are sourced: static, scenario_field, or metadata_key. |
| source_key | Identifier used for non-static sources. |

4.1 Parameter Validation

  • normal: requires non-negative std_dev; defaults mean to baseline cash flow amount when omitted.
  • lognormal: requires mean (mu in log space) and non-negative sigma.
  • triangular: requires min, mode, max with constraint min <= mode <= max.
  • discrete: requires paired values/probabilities sequences; probabilities must be non-negative and sum to 1.0.

Invalid definitions raise DistributionConfigError before sampling.
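The rules above could be checked along these lines. This is a sketch only: validate_distribution is a hypothetical name, the exception class is a local stand-in, and the tolerance used for the "sum to 1.0" check is an assumption, since the specification does not state one.

```python
import math
from typing import Any, Mapping


class DistributionConfigError(Exception):
    """Local stand-in for the project's exception of the same name."""


def validate_distribution(dist_type: str, params: Mapping[str, Any]) -> None:
    """Raise DistributionConfigError when a definition breaks a rule."""
    if dist_type == "normal":
        # mean may be omitted; it then defaults to the baseline amount.
        if "std_dev" not in params or params["std_dev"] < 0:
            raise DistributionConfigError("normal requires non-negative std_dev")
    elif dist_type == "lognormal":
        if "mean" not in params or params.get("sigma", -1.0) < 0:
            raise DistributionConfigError(
                "lognormal requires mean and non-negative sigma"
            )
    elif dist_type == "triangular":
        try:
            low, mode, high = params["min"], params["mode"], params["max"]
        except KeyError as exc:
            raise DistributionConfigError(f"triangular is missing {exc}") from exc
        if not low <= mode <= high:
            raise DistributionConfigError("triangular requires min <= mode <= max")
    elif dist_type == "discrete":
        values = list(params.get("values", ()))
        probs = list(params.get("probabilities", ()))
        if not values or len(values) != len(probs):
            raise DistributionConfigError(
                "discrete requires paired values/probabilities"
            )
        # Tolerance is an assumed choice to absorb floating-point rounding.
        if any(p < 0 for p in probs) or not math.isclose(
            sum(probs), 1.0, abs_tol=1e-9
        ):
            raise DistributionConfigError(
                "probabilities must be non-negative and sum to 1.0"
            )
    else:
        raise DistributionConfigError(f"unsupported type: {dist_type!r}")


validate_distribution("triangular", {"min": 40.0, "mode": 60.0, "max": 70.0})
```

Running every definition through a check like this before the first draw keeps configuration errors separate from per-iteration numerical failures.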

5. Algorithm Overview

  1. Seed a NumPy Generator (default_rng(seed)) unless a generator instance is supplied.
  2. For each iteration:
    • Realise cash flows by sampling distributions or using deterministic values.
    • Compute requested metrics using shared helpers from services/financial.py:
      • NPV via net_present_value (respecting residual_value inputs).
      • IRR via internal_rate_of_return; non-converging or invalid trajectories record NaN and increment failed_runs.
      • Payback via payback_period; iterations whose cumulative cash flow never becomes non-negative record NaN.
  3. Aggregate results into per-metric arrays; calculate summary statistics: mean, sample standard deviation, min/max, and configured percentiles using numpy.percentile.
  4. Assemble SimulationResult containing summary descriptors and optional raw samples when return_samples is enabled.
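The steps above can be sketched for the NPV metric alone as follows. The npv function is a simplified stand-in for net_present_value (annual discounting, no residual value), and run_npv_simulation, sample_amount, and the (baseline, distribution) tuple layout are hypothetical names chosen for the example rather than the API of services/simulation.py.

```python
import numpy as np


def npv(rate, amounts):
    """Simplified stand-in for net_present_value; period 0 is undiscounted."""
    return sum(a / (1.0 + rate) ** t for t, a in enumerate(amounts))


def sample_amount(rng, baseline, dist):
    """Realise one cash flow; dist is None for deterministic entries."""
    if dist is None:
        return baseline
    kind, params = dist["type"], dist["parameters"]
    if kind == "normal":
        return rng.normal(params.get("mean", baseline), params["std_dev"])
    if kind == "lognormal":
        return rng.lognormal(params["mean"], params["sigma"])
    if kind == "triangular":
        return rng.triangular(params["min"], params["mode"], params["max"])
    if kind == "discrete":
        return rng.choice(params["values"], p=params["probabilities"])
    raise ValueError(f"unsupported distribution type: {kind!r}")


def run_npv_simulation(specs, iterations, discount_rate, seed=None,
                       percentiles=(5, 50, 95)):
    rng = np.random.default_rng(seed)            # step 1: seeded generator
    samples = np.empty(iterations)
    for i in range(iterations):                  # step 2: realise and score
        amounts = [sample_amount(rng, base, dist) for base, dist in specs]
        samples[i] = npv(discount_rate, amounts)
    cutoffs = np.percentile(samples, percentiles)
    return {                                     # step 3: aggregate
        "mean": float(samples.mean()),
        "std_dev": float(samples.std(ddof=1)),   # sample standard deviation
        "minimum": float(samples.min()),
        "maximum": float(samples.max()),
        "percentiles": {q: float(v) for q, v in zip(percentiles, cutoffs)},
    }


specs = [
    (-100.0, None),
    (60.0, {"type": "normal", "parameters": {"std_dev": 5.0}}),
    (60.0, {"type": "triangular",
            "parameters": {"min": 40.0, "mode": 60.0, "max": 70.0}}),
]
summary = run_npv_simulation(specs, iterations=1_000, discount_rate=0.10, seed=42)
```

A per-iteration Python loop is used here for readability; a production engine might vectorise the sampling, which this specification does not prescribe either way.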

6. Outputs

SimulationResult includes:

  • iterations: total iteration count executed.
  • summaries: mapping of SimulationMetric to MetricSummary objects with:
    • mean, std_dev, minimum, maximum.
    • percentiles: mapping of configured percentile cutoffs to values.
    • sample_size: number of successful (non-NaN) samples.
    • failed_runs: count of iterations producing NaN for the metric.
  • samples: optional mapping of metric to raw numpy.ndarray of samples when detailed analysis is required downstream.
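To make the relationship between sample_size, failed_runs, and the statistics concrete, the sketch below builds a summary from an array that contains NaN entries. The MetricSummary field names follow the list above, but the summarise function, the exclusion of NaN values from the statistics, and the 0.0 standard deviation for a single valid sample are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Sequence

import numpy as np


@dataclass(frozen=True)
class MetricSummary:
    """Illustrative stand-in mirroring the fields listed above."""
    mean: float
    std_dev: float
    minimum: float
    maximum: float
    percentiles: Dict[float, float]
    sample_size: int
    failed_runs: int


def summarise(samples: Sequence[float], percentiles: Sequence[float]) -> MetricSummary:
    arr = np.asarray(samples, dtype=float)
    valid = arr[~np.isnan(arr)]           # assumed: NaN runs are excluded
    failed = int(arr.size - valid.size)
    if valid.size == 0:
        nan = float("nan")
        return MetricSummary(nan, nan, nan, nan,
                             {p: nan for p in percentiles}, 0, failed)
    # Sample standard deviation needs at least two observations.
    std = float(valid.std(ddof=1)) if valid.size > 1 else 0.0
    cutoffs = np.percentile(valid, list(percentiles))
    return MetricSummary(
        mean=float(valid.mean()),
        std_dev=std,
        minimum=float(valid.min()),
        maximum=float(valid.max()),
        percentiles={p: float(v) for p, v in zip(percentiles, cutoffs)},
        sample_size=int(valid.size),
        failed_runs=failed,
    )


irr_summary = summarise([0.10, 0.12, float("nan"), 0.14], (5.0, 50.0, 95.0))
```

In this example three of the four iterations produce a usable value, so sample_size is 3 and failed_runs is 1.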

7. Error Handling

  • Invalid distribution definitions or missing context values raise DistributionConfigError.
  • Zero iterations or invalid percentile ranges raise ValueError.
  • Financial helper exceptions (ConvergenceError, PaybackNotReachedError) are captured per iteration and converted to NaN samples to preserve aggregate results while flagging failure counts.
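The per-iteration capture described in the last bullet might look like the sketch below. The two exception classes are local stand-ins for those named above, safe_metric is a hypothetical wrapper, and payback_period is a toy helper written for the example rather than the function in services/financial.py.

```python
import math


class ConvergenceError(Exception):
    """Local stand-in for the financial helper exception."""


class PaybackNotReachedError(Exception):
    """Local stand-in for the financial helper exception."""


def safe_metric(helper, *args, **kwargs):
    """Call a financial helper, mapping known failures to NaN."""
    try:
        return float(helper(*args, **kwargs))
    except (ConvergenceError, PaybackNotReachedError):
        # The iteration is kept as a NaN sample instead of aborting the run.
        return math.nan


def payback_period(amounts):
    """Toy helper: first period with non-negative cumulative cash flow."""
    total = 0.0
    for period, amount in enumerate(amounts):
        total += amount
        if total >= 0:
            return period
    raise PaybackNotReachedError("cumulative cash flow never turns non-negative")


runs = [
    [-100.0, 60.0, 60.0],  # pays back in period 2
    [-100.0, 20.0, 20.0],  # never pays back
]
samples = [safe_metric(payback_period, amounts) for amounts in runs]
failed_runs = sum(math.isnan(value) for value in samples)
```

Only the two named exception types are caught, so unexpected errors still propagate and are not silently counted as failed runs.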

8. Usage Guidance

  • Scenario services should construct CashFlowSpec instances from persisted financial inputs and optional uncertainty definitions stored alongside the scenario.
  • Reporting routes can request raw samples when producing histogram or violin plots; otherwise rely on MetricSummary statistics for tabular output.
  • Visualizations implementing FR-005 should leverage percentile outputs to render fan charts or confidence intervals.
  • When integrating with scheduling workflows, persist the seed used for each run so that repeated runs remain comparable.
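The effect of persisting the seed can be seen in a small sketch using NumPy's default_rng. The draw function, the distribution parameters, and the stored_seed value are hypothetical and stand in for whatever a scheduled job would actually store.

```python
import numpy as np


def draw(seed, size=5):
    """Draw a few illustrative cash-flow amounts from a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=350_000.0, scale=40_000.0, size=size)


stored_seed = 20251111  # hypothetical value persisted with the scheduled job

first_run = draw(stored_seed)
repeat_run = draw(stored_seed)       # same seed, so the same draws
other_run = draw(stored_seed + 1)    # different seed, so different draws

runs_match = bool(np.array_equal(first_run, repeat_run))
```

With the seed stored, a difference between two scheduled runs can be attributed to changed inputs rather than to sampling noise.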

9. Testing

tests/test_simulation.py covers deterministic parity with financial helpers, seed reproducibility, context parameter sourcing, failure accounting for metrics that cannot be computed, error handling for misconfigured distributions, and sample-return functionality. Additional regression cases should accompany new metrics or distribution families.

10. References

  • Implementation: calminer/services/simulation.py
  • Financial helpers: calminer/services/financial.py
  • Tests: calminer/tests/test_simulation.py
  • Related specification: calminer-docs/specifications/financial_metrics.md