feat: add backtesting functionality with UI and API endpoints

- Introduced backtesting page and fragment in the dashboard for running backtests and viewing recent reports. - Implemented backtest run logic with configuration options including event path, starting balances, trade capital, and fee profiles. - Added recent backtest reports storage and retrieval. - Created a new strategy module for statistical arbitrage experiments with validation on configuration parameters. - Updated settings to include parameters for the statistical arbitrage strategy. - Enhanced dashboard controls to support the new strategy mode. - Added unit tests for backtesting functionality and strategy validation. - Updated templates for backtesting UI integration.
2026-06-02 09:28:22 +02:00
parent f612c8533a
commit 38e1d64437
17 changed files with 1089 additions and 165 deletions
@@ -349,125 +349,11 @@ Example pushed image tag shape:
 git.allucanget.biz/allucanget/arbitrade:latest
 ```

-## Project Layout
+## Architecture Docs

-```text
-arbitrade/
-├── .gitea/workflows/ci.yml
-├── .github/instructions/TODO.md
-├── PLAN.md
-├── pyproject.toml
-├── src/arbitrade/
-│   ├── api/
-│   ├── config/
-│   ├── storage/
-│   ├── logging_setup.py
-│   └── main.py
-├── tests/
-└── web/templates/
-```
+Implementation detail moved into arc42 docs:

-## Next Work
+- [arc42 overview](docs/architecture/arc42.md) - system context, building blocks, runtime, deployment, quality goals, risks.
+- [current implementation snapshot](docs/architecture/current-implementation.md) - codebase state, active routes, backtesting, strategy flags, deployment flow.

-Next planned implementation slice:
-
- Kraken REST client skeleton
- native Kraken WebSocket client
- in-memory order book cache
- latency instrumentation
-
-## Troubleshooting
-
-PowerShell blocks activation script:
-
-```powershell
-Set-ExecutionPolicy -Scope Process -ExecutionPolicy RemoteSigned
-```
-
-Then activate again:
-
-```powershell
-.\.venv\Scripts\Activate.ps1
-```
-
-If app import fails, confirm editable install ran:
-
-```powershell
-uv pip install -e .[dev]
-```
-
-If DuckDB file missing, start app once or create `data/` directory manually.
-
-## Security Hardening
-
-Threat model notes:
-
- Primary risk surfaces: environment secrets, dashboard auth credentials, exchange API key scope, and dependency supply chain.
- Assumed attacker model: leaked repository content, leaked CI logs/artifacts, or unauthorized dashboard access.
- High-impact outcomes to prevent: credential exfiltration, unauthorized withdrawals, and unsafe live-trading control changes.
-
-Hardening checklist:
-
- Use least-privilege Kraken API keys: query + trade only; never enable withdrawal.
- Rotate API keys immediately if secret scan flags a potential exposure.
- Keep dashboard auth enabled in non-local environments and avoid default/shared credentials.
- Run `pip-audit -r requirements/latest-runtime.in` in CI; treat vulnerability findings as release blockers.
- Run `python scripts/security_scan.py` before release and after major merges.
- Store secrets in environment/secret manager; never commit `.env` or key material.
-
-## Performance Hardening
-
-Profile scenarios:
-
- `book_update_burst`
- `execution_spike`
- `reconnect_storm`
-
-## Backtesting
-
-Run a deterministic replay backtest from a JSONL event stream:
-
-```powershell
-python scripts/backtest_replay.py --events path\to\replay.jsonl --starting-balances USD=1000.0
-```
-
-Run parameter sweep with train/test split and promotion scoring:
-
-```powershell
-python scripts/backtest_sweep.py --events path\to\replay.jsonl --starting-balances USD=1000.0 --output ops/backtesting/parameter_sweep_results.json
-```
-
-Replay event format:
-
-```json
-{
-  "timestamp": "2026-06-01T12:00:00Z",
-  "symbol": "BTC/USD",
-  "bids": [[100.0, 1.0]],
-  "asks": [[101.0, 1.0]]
-}
-```
-
-Notes:
-
- Events are replayed in timestamp order.
- The replay engine reuses the production detector, pre-trade validation, trade limits, and execution sequencer.
- The simulated execution path applies configurable slippage and execution latency so reports include deterministic trade/miss statistics.
- Parameter sweep splits replay data into in-sample and out-of-sample windows, ranks configurations by out-of-sample score, and flags overfit via train/test generalization-gap checks.
- Sweep output persists ranked combinations and promotion-ready candidates for paper-trading canary promotion decisions.
- Latency baseline and threshold artifacts:
-
- `ops/performance/latency_baseline.json`
- `ops/performance/latency_thresholds.json`
-
-CI guardrail:
-
- `.gitea/workflows/ci.yml` runs `scripts/check_latency_regression.py` and fails on regression.
-
-Measured optimization impact (2026-06-01):
-
- `MetricsCalculator.compute()` switched from Python row scans to DuckDB SQL aggregates/quantiles.
- Benchmark (`scripts/benchmark_metrics_compute.py`):
-  - Python scan avg: `12.623 ms`
-  - SQL aggregate avg: `11.039 ms`
-  - Speedup: `1.14x`
+For navigation from README, use the docs above instead of this file for deep architecture detail.