feat: enhance backtesting panel with flash messages and pairing checks

2026-06-07 21:51:09 +02:00
parent 5e7732b85f
commit f221464daa
4 changed files with 269 additions and 79 deletions
@@ -53,7 +53,7 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
 - `detection/` - triangular graph and incremental detector.
 - `risk/` - pre-trade and trade-limit guards.
 - `execution/` - multi-leg trade sequencing.
-- `backtesting/` - replay engine, parameter sweep, experiment scaffolds.
+- `backtesting/` - replay engine, parameter sweep, experiment scaffolds. See [backtesting.md](backtesting.md).
 - `strategy/` - experimental strategy modules such as stat-arb.
 - `storage/` - PostgreSQL schema and repositories.
 - `alerting/` - multi-channel notifications.
@@ -77,7 +77,7 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
 3. Incremental detector scores impacted cycles.
 4. Risk manager validates the opportunity.
 5. Execution sequencer places legs if approved.
-7. Trades and snapshots persist to PostgreSQL.
+6. Trades and snapshots persist to PostgreSQL.
 7. Dashboard and alerts reflect state changes.

 ### 6.2 Dashboard Control Flow
@@ -89,11 +89,14 @@ The bot consumes Kraken market data, detects opportunities, and executes trades

 ### 6.3 Backtesting Flow

-1. User selects JSONL replay file and run parameters.
-2. Replay engine loads ordered book events.
-3. Detector, risk, and execution logic run in simulation mode.
-4. Report is stored in memory for recent UI display.
-5. Parameter sweeps split data into train/test windows, rank results, and flag overfit.
+See [backtesting.md](backtesting.md) for full design and implementation details.
+
+1. User picks currency pairs (from config/pairings page, or all enabled).
+2. User sets starting balances (required), time range (required), min profit threshold (required).
+3. Fee profile defaults to "api (from Kraken)"; slippage (default 4.0 bps) and execution latency (default 20 ms) are optional.
+4. Job is queued via `POST /dashboard/backtesting/run`.
+5. Backend loads events from `market_snapshots` table, builds triangular cycles, runs replay engine.
+6. Report stored in `backtest_jobs` table, visible in recent jobs list.

 ## 7. Deployment View

@@ -0,0 +1,130 @@
+# Backtesting Architecture
+
+> Detailed design and implementation of the backtesting subsystem.
+> See [`README.md`](README.md#63-backtesting-flow) for the high-level user flow.
+
+## Data Flow
+
+```txt
+market_snapshots (DB) ─┐
+                        ├──→ load_replay_events_from_db() ──→ list[ReplayBookEvent]
+JSONL file ─────────────┘
+                                      │
+                                      ▼
+                           BacktestReplayEngine.run()
+                                      │
+                                      ▼
+                              BacktestReport
+                                      │
+                                      ▼
+                           BacktestJobRepository.store_report()
+```
+
+Two event sources:
+
+- **DB mode** (default) — loads snapshots from `market_snapshots` table. Supports symbol/time filtering.
+- **File mode** — reads JSONL files from disk (legacy, used by `backtest_replay.py` script).
+
+## Core Types
+
+### `ReplayClock`
+
+Timekeeper for simulation. Ensures events advance monotonically. Supports `advance_ms()` to model execution latency.
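To make the monotonic guarantee concrete, here is a minimal sketch of a replay clock. The class name `SketchReplayClock`, the `now` property, and `advance_to()` are hypothetical; only `advance_ms()` is named in the description above. Silently ignoring an out-of-order timestamp (rather than raising) is also an assumption.

```python
from datetime import datetime, timedelta, timezone


class SketchReplayClock:
    """Hypothetical stand-in for ReplayClock; not the project's implementation."""

    def __init__(self, start: datetime) -> None:
        self._now = start

    @property
    def now(self) -> datetime:
        return self._now

    def advance_to(self, moment: datetime) -> datetime:
        # Assumption: a timestamp at or before the current time leaves the
        # clock unchanged, so simulated time never moves backwards.
        if moment > self._now:
            self._now = moment
        return self._now

    def advance_ms(self, milliseconds: float) -> datetime:
        # Model execution latency by pushing simulated time forward.
        if milliseconds < 0:
            raise ValueError("latency must be non-negative")
        self._now += timedelta(milliseconds=milliseconds)
        return self._now


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
clock = SketchReplayClock(start)
clock.advance_ms(20.0)    # one simulated leg at the default latency
clock.advance_to(start)   # stale event: the clock stays where it is
```

Whether the real clock clamps, skips, or rejects stale events is not stated in the description; the sketch only shows one way to keep time non-decreasing.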
+
+### `ReplayBookEvent`
+
+One atomic book state at a point in time. Fields: `occurred_at`, `symbol`, `bids: tuple[BookLevel, ...]`, `asks: tuple[BookLevel, ...]`.
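The event shape can be sketched as an immutable dataclass. The field names come from the description above; the `BookLevel` fields (`price`, `volume`), the class name `SketchReplayBookEvent`, and the JSONL layout parsed by `parse_event_line()` are assumptions for illustration, not the project's actual schema.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BookLevel:
    # Assumed fields; the real BookLevel may differ.
    price: float
    volume: float


@dataclass(frozen=True)
class SketchReplayBookEvent:
    occurred_at: datetime
    symbol: str
    bids: tuple[BookLevel, ...]
    asks: tuple[BookLevel, ...]


def parse_event_line(line: str) -> SketchReplayBookEvent:
    """Parse one JSONL line, assuming levels are [price, volume] pairs."""
    raw = json.loads(line)
    return SketchReplayBookEvent(
        occurred_at=datetime.fromisoformat(raw["occurred_at"]),
        symbol=raw["symbol"],
        bids=tuple(BookLevel(float(p), float(v)) for p, v in raw["bids"]),
        asks=tuple(BookLevel(float(p), float(v)) for p, v in raw["asks"]),
    )


line = (
    '{"occurred_at": "2026-01-01T00:00:00+00:00", "symbol": "BTC/USD", '
    '"bids": [[100.0, 1.5]], "asks": [[100.5, 2.0]]}'
)
event = parse_event_line(line)
```

Tuples plus `frozen=True` keep each event hashable and safe to share between the train and test runs of a sweep.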
+
+### `BacktestConfig`
+
+| Field                    | Default  | Description                                           |
+| ------------------------ | -------- | ----------------------------------------------------- |
+| `fee_rate`               | `0.0`    | 0.0 → API-sourced fee from `kraken_account_snapshots` |
+| `min_profit_threshold`   | `0.0005` | Minimum net profit to attempt trade                   |
+| `trade_capital`          | `100.0`  | Capital allocated per trade                           |
+| `quote_asset`            | `"USD"`  | Currency in which P&L is reported                     |
+| `slippage_bps`           | `4.0`    | Simulated slippage in basis points                    |
+| `execution_latency_ms`   | `20.0`   | Simulated latency per leg                             |
+| `max_depth_levels`       | `10`     | Order book depth for detection                        |
+| `max_concurrent_trades`  | `1`      | Max simultaneous trades                               |
+| `min_order_size_by_pair` | `None`   | Per-pair min order size overrides                     |
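For reference, the table above maps naturally onto a dataclass with defaults. This is a sketch: the class name `SketchBacktestConfig`, the `frozen=True` choice, the value type of `min_order_size_by_pair`, and the helper `effective_fee_rate()` are assumptions; only the field names and default values are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchBacktestConfig:
    fee_rate: float = 0.0  # 0.0 means "use the API-sourced fee"
    min_profit_threshold: float = 0.0005
    trade_capital: float = 100.0
    quote_asset: str = "USD"
    slippage_bps: float = 4.0
    execution_latency_ms: float = 20.0
    max_depth_levels: int = 10
    max_concurrent_trades: int = 1
    min_order_size_by_pair: Optional[dict[str, float]] = None


def effective_fee_rate(config: SketchBacktestConfig, api_fee_rate: float) -> float:
    """Hypothetical helper: resolve the 0.0 sentinel to the API-sourced fee."""
    return api_fee_rate if config.fee_rate == 0.0 else config.fee_rate


# Override only what differs from the defaults.
config = SketchBacktestConfig(slippage_bps=2.0, trade_capital=250.0)
fee = effective_fee_rate(config, api_fee_rate=0.0026)
```

One consequence of using `0.0` as a sentinel is that a genuinely fee-free simulation cannot be expressed through `fee_rate` alone.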
+
+### `BacktestReport`
+
+| Field                            | Type           | Description                        |
+| -------------------------------- | -------------- | ---------------------------------- |
+| `started_at` / `finished_at`     | datetime       | Simulation window                  |
+| `processed_events`               | int            | Events consumed                    |
+| `opportunities_seen`             | int            | Detected opportunities             |
+| `trades_executed`                | int            | Successful trades                  |
+| `win_rate`                       | float or None  | Fraction of profitable trades      |
+| `fill_rate`                      | float or None  | Average fill ratio                 |
+| `realized_pnl_usd`               | float          | Net P&L after slippage             |
+| `max_drawdown_usd`               | float          | Peak-to-trough equity drop         |
+| `miss_reasons`                   | dict[str, int] | Counters for skipped opportunities |
+| `execution_latency_p50/95/99_ms` | float or None  | Latency percentiles                |
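Two of the report fields are easy to misread, so here is a sketch of how `win_rate` and `max_drawdown_usd` could be derived from a sequence of per-trade P&L values. The function names and the choice to start the equity curve at zero are assumptions; the engine may compute these differently.

```python
from typing import Optional, Sequence


def win_rate(trade_pnls: Sequence[float]) -> Optional[float]:
    """Fraction of profitable trades, or None when no trades ran."""
    if not trade_pnls:
        return None
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)


def max_drawdown(trade_pnls: Sequence[float]) -> float:
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)            # highest equity seen so far
        worst = max(worst, peak - equity)   # deepest fall below that peak
    return worst


pnls = [1.0, -0.5, -1.0, 2.0]
rate = win_rate(pnls)          # 2 of 4 trades are profitable
drawdown = max_drawdown(pnls)  # equity peaks at 1.0, then falls to -0.5
```

Returning `None` for an empty run matches the `float or None` type in the table and avoids reporting a misleading 0% win rate.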
+
+## Simulation Client
+
+`_SimulatedRestClient` replaces the real Kraken REST client during backtesting.
+
+- **Slippage model:** `fill_ratio = max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)`
+- **Latency model:** Clock advances by `execution_latency_ms` before each simulated fill
+- Orders always fill (status = `"closed"`) at the modeled ratio
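The slippage formula above can be checked with a few lines of Python. The function name `fill_ratio` is chosen here for illustration; the expression itself is copied from the slippage model bullet.

```python
def fill_ratio(slippage_bps: float) -> float:
    """Fill ratio from the documented model, floored at 0.85."""
    return max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)


default_ratio = fill_ratio(4.0)    # 1 - 0.0004 * 8, about 0.9968
no_slippage = fill_ratio(0.0)      # 1.0
floored_ratio = fill_ratio(500.0)  # raw value 0.6 is clamped to 0.85
```

With the default of 4.0 bps the simulated fill is about 99.68% of the order. The 0.85 floor starts to apply at 187.5 bps, so larger slippage settings no longer change the result.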
+
+## Job Worker
+
+`backtest_worker` runs as an `asyncio.Task` started in the `create_app()` lifespan:
+
+```python
+backtest_task = asyncio.create_task(
+    backtest_worker(backtest_queue, db),
+    name="backtest_worker",
+)
+```
+
+Workflow per job:
+
+1. Dequeue `(job_id, config_dict)` from `asyncio.Queue`
+2. Update status → `"running"` in `backtest_jobs` table
+3. Load events (DB or file)
+4. Build currency graph → triangular cycles
+5. Instantiate `BacktestReplayEngine` → `engine.run()`
+6. Store report → update status → `"completed"` (or `"failed"` on exception)
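The per-job workflow above can be sketched as a queue consumer. Everything here is hypothetical scaffolding: `sketch_backtest_worker`, the in-memory `statuses` and `reports` dicts (standing in for the `backtest_jobs` table), and the injected `run_job` coroutine (standing in for steps 3 to 5) are not the project's API.

```python
import asyncio
from typing import Any, Awaitable, Callable

JobRunner = Callable[[dict[str, Any]], Awaitable[dict[str, Any]]]


async def sketch_backtest_worker(
    queue: asyncio.Queue,
    statuses: dict[str, str],
    reports: dict[str, dict[str, Any]],
    run_job: JobRunner,
) -> None:
    while True:
        job_id, config = await queue.get()
        statuses[job_id] = "running"
        try:
            reports[job_id] = await run_job(config)
            statuses[job_id] = "completed"
        except Exception:
            # A failing job is recorded, and the loop keeps serving the queue.
            statuses[job_id] = "failed"
        finally:
            queue.task_done()


async def demo() -> dict[str, str]:
    async def run_job(config: dict[str, Any]) -> dict[str, Any]:
        if config.get("explode"):
            raise RuntimeError("simulated failure")
        return {"trades_executed": 0}

    queue: asyncio.Queue = asyncio.Queue()
    statuses: dict[str, str] = {}
    reports: dict[str, dict[str, Any]] = {}
    worker = asyncio.create_task(
        sketch_backtest_worker(queue, statuses, reports, run_job)
    )
    await queue.put(("job-1", {}))
    await queue.put(("job-2", {"explode": True}))
    await queue.join()   # wait until both jobs are marked done
    worker.cancel()      # mirror a lifespan shutdown
    await asyncio.gather(worker, return_exceptions=True)
    return statuses


statuses = asyncio.run(demo())
```

Catching `Exception` rather than `BaseException` lets cancellation at shutdown propagate while still isolating job failures from one another.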
+
+## Sweep Pipeline
+
+`run_parameter_search` performs grid search over backtest parameters:
+
+1. **Split** events into train/test windows by time ratio
+2. **Build grid** — cartesian product of `theta_values × trade_capital_values × pair_universes × staleness_threshold_values`
+3. **For each parameter set:**
+   - Filter events to pair universe + apply staleness gate
+   - Build cycles restricted to pair universe
+   - Run engine on train window → `train_report`
+   - Run engine on test window → `test_report`
+   - Score = `realized_pnl + win_rate_bonus + fill_rate_bonus - max_drawdown`
+   - Compute generalization gap = `|train_score - test_score| / max(train_score, test_score)`
+4. **Evaluate promotion:**
+   - `PromotionCriteria` checks: minimum test P&L, win rate ≥ 0.5, fill rate ≥ 0.9, max drawdown ≤ $25, generalization gap ≤ 0.5
+   - Results passing all criteria are flagged `promotion_ready`
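The grid, gap, and promotion steps above can be sketched as follows. All names here (`build_grid`, `generalization_gap`, `SketchPromotionCriteria`, `promotion_ready`) are illustrative, not the signatures in `sweep.py`. Two things are assumptions: the `min_test_pnl` default of 0.0, which the list above does not state, and the guard that returns `None` when neither score is positive, because the documented formula divides by `max(train_score, test_score)`.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


def build_grid(thetas, capitals, universes, staleness_thresholds):
    """Cartesian product of the four swept dimensions."""
    return [
        {"theta": t, "trade_capital": c, "pair_universe": u, "staleness_threshold": s}
        for t, c, u, s in product(thetas, capitals, universes, staleness_thresholds)
    ]


def generalization_gap(train_score: float, test_score: float) -> Optional[float]:
    denominator = max(train_score, test_score)
    if denominator <= 0:
        return None  # assumption: treat the gap as undefined, not as a pass
    return abs(train_score - test_score) / denominator


@dataclass(frozen=True)
class SketchPromotionCriteria:
    min_test_pnl: float = 0.0  # hypothetical default
    min_win_rate: float = 0.5
    min_fill_rate: float = 0.9
    max_drawdown_usd: float = 25.0
    max_generalization_gap: float = 0.5


def promotion_ready(criteria, test_pnl, win_rate, fill_rate, drawdown, gap) -> bool:
    # Missing metrics (None) fail the check instead of passing by default.
    if win_rate is None or fill_rate is None or gap is None:
        return False
    return (
        test_pnl >= criteria.min_test_pnl
        and win_rate >= criteria.min_win_rate
        and fill_rate >= criteria.min_fill_rate
        and drawdown <= criteria.max_drawdown_usd
        and gap <= criteria.max_generalization_gap
    )


grid = build_grid(
    [0.0005, 0.001], [100.0], [("BTC/USD", "ETH/USD", "ETH/BTC")], [500, 1000]
)
gap = generalization_gap(train_score=12.0, test_score=9.0)  # 3 / 12 = 0.25
ready = promotion_ready(SketchPromotionCriteria(), 9.0, 0.6, 0.95, 10.0, gap)
```

The grid grows multiplicatively (here 2 × 1 × 1 × 2 = 4 parameter sets, each run on both windows), which is worth keeping in mind when adding values to any dimension.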
+
+## UI
+
+> See `backtesting.html` → `partials/backtesting_panel.html`.
+
+- **Shell page** loads the panel via `hx-get="/dashboard/fragment/backtesting"`
+- **Run form** — starting balances, time range, profit threshold (required); fee profile, slippage, latency (advanced/collapsible)
+- **Status card** — current job status + message
+- **Recent jobs table** — lists last 20 jobs with status, events, trades, P&L; each row has a detail button
+- **Job detail** — `GET /dashboard/backtesting/job/{id}` returns report HTML
+
+Pairings are managed on the `/dashboard/config/pairings` page. When no symbols are specified, a backtest uses the pairings enabled in the DB.
+
+## Source Files
+
+| File                    | Role                                                                                                                                                                          |
+| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `backtesting/replay.py` | `ReplayClock`, `ReplayBookEvent`, `BacktestConfig`, `BacktestReport`, `_SimulatedRestClient`, `BacktestReplayEngine`, `load_replay_events`, `load_replay_events_from_db`      |
+| `backtesting/runner.py` | `run_backtest_job`, `backtest_worker`, `_build_cycles_from_events`, `_parse_balances`                                                                                         |
+| `backtesting/sweep.py`  | `SweepParameters`, `SweepResult`, `SweepArtifacts`, `PromotionCriteria`, `split_events_time_windows`, `build_parameter_grid`, `run_parameter_search`, `persist_sweep_results` |