feat: enhance backtesting panel with flash messages and pairing checks
CI / lint-test-build (push) Failing after 12s

This commit is contained in:
2026-06-07 21:51:09 +02:00
parent 5e7732b85f
commit f221464daa
4 changed files with 269 additions and 79 deletions
+10 -7
View File
@@ -53,7 +53,7 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
- `detection/` - triangular graph and incremental detector.
- `risk/` - pre-trade and trade-limit guards.
- `execution/` - multi-leg trade sequencing.
- `backtesting/` - replay engine, parameter sweep, experiment scaffolds.
- `backtesting/` - replay engine, parameter sweep, experiment scaffolds. See [backtesting.md](backtesting.md).
- `strategy/` - experimental strategy modules such as stat-arb.
- `storage/` - PostgreSQL schema and repositories.
- `alerting/` - multi-channel notifications.
@@ -77,7 +77,7 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
3. Incremental detector scores impacted cycles.
4. Risk manager validates the opportunity.
5. Execution sequencer places legs if approved.
7. Trades and snapshots persist to PostgreSQL.
6. Trades and snapshots persist to PostgreSQL.
7. Dashboard and alerts reflect state changes.
### 6.2 Dashboard Control Flow
@@ -89,11 +89,14 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
### 6.3 Backtesting Flow
1. User selects JSONL replay file and run parameters.
2. Replay engine loads ordered book events.
3. Detector, risk, and execution logic run in simulation mode.
4. Report is stored in memory for recent UI display.
5. Parameter sweeps split data into train/test windows, rank results, and flag overfit.
See [backtesting.md](backtesting.md) for full design and implementation details.
1. User picks currency pairs (from config/pairings page, or all enabled).
2. User sets starting balances (required), time range (required), min profit threshold (required).
3. Fee profile defaults to "api (from Kraken)"; slippage (4.0 bps) and execution latency (20 ms) are optional with sensible defaults.
4. Job is queued via `POST /dashboard/backtesting/run`.
5. Backend loads events from `market_snapshots` table, builds triangular cycles, runs replay engine.
6. Report stored in `backtest_jobs` table, visible in recent jobs list.
## 7. Deployment View
+130
View File
@@ -0,0 +1,130 @@
# Backtesting Architecture
> Detailed design and implementation of the backtesting subsystem.
> See [`README.md`](README.md#63-backtesting-flow) for the high-level user flow.
## Data Flow
```txt
market_snapshots (DB) ─┐
├──→ load_replay_events_from_db() ──→ list[ReplayBookEvent]
JSONL file ─────────────┘
BacktestReplayEngine.run()
BacktestReport
BacktestJobRepository.store_report()
```
Two event sources:
- **DB mode** (default) — loads snapshots from `market_snapshots` table. Supports symbol/time filtering.
- **File mode** — reads JSONL files from disk (legacy, used by `backtest_replay.py` script).
## Core Types
### `ReplayClock`
Timekeeper for simulation. Ensures events advance monotonically. Supports `advance_ms()` to model execution latency.
### `ReplayBookEvent`
One atomic book state at a point in time. Fields: `occurred_at`, `symbol`, `bids: tuple[BookLevel]`, `asks: tuple[BookLevel]`.
### `BacktestConfig`
| Field | Default | Description |
| ------------------------ | -------- | ----------------------------------------------------- |
| `fee_rate` | `0.0` | 0.0 → API-sourced fee from `kraken_account_snapshots` |
| `min_profit_threshold` | `0.0005` | Minimum net profit to attempt trade |
| `trade_capital` | `100.0` | Capital allocated per trade |
| `quote_asset` | `"USD"` | Base currency for P&L |
| `slippage_bps` | `4.0` | Simulated slippage in basis points |
| `execution_latency_ms` | `20.0` | Simulated latency per leg |
| `max_depth_levels` | `10` | Order book depth for detection |
| `max_concurrent_trades` | `1` | Max simultaneous trades |
| `min_order_size_by_pair` | `None` | Per-pair min order size overrides |
### `BacktestReport`
| Field | Type | Description |
| -------------------------------- | -------------- | ---------------------------------- |
| `started_at` / `finished_at` | datetime | Simulation window |
| `processed_events` | int | Events consumed |
| `opportunities_seen` | int | Detected opportunities |
| `trades_executed` | int | Successful trades |
| `win_rate` | float or None | Fraction of profitable trades |
| `fill_rate` | float or None | Average fill ratio |
| `realized_pnl_usd` | float | Net P&L after slippage |
| `max_drawdown_usd` | float | Peak-to-trough equity drop |
| `miss_reasons` | dict[str, int] | Counters for skipped opportunities |
| `execution_latency_p50/95/99_ms` | float or None | Latency percentiles |
## Simulation Client
`_SimulatedRestClient` replaces the real Kraken REST client during backtesting.
- **Slippage model:** `fill_ratio = max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)`
- **Latency model:** Clock advances by `execution_latency_ms` before each simulated fill
- Orders always fill (status = `"closed"`) at the modeled ratio
## Job Worker
`backtest_worker` is an `asyncio.Task` started in `create_app()` lifespan:
```python
backtest_task = asyncio.create_task(
backtest_worker(backtest_queue, db),
name="backtest_worker",
)
```
Workflow per job:
1. Dequeue `(job_id, config_dict)` from `asyncio.Queue`
2. Update status → `"running"` in `backtest_jobs` table
3. Load events (DB or file)
4. Build currency graph → triangular cycles
5. Instantiate `BacktestReplayEngine``engine.run()`
6. Store report → update status → `"completed"` (or `"failed"` on exception)
## Sweep Pipeline
`run_parameter_search` performs grid search over backtest parameters:
1. **Split** events into train/test windows by time ratio
2. **Build grid** — cartesian product of `theta_values × trade_capital_values × pair_universes × staleness_threshold_values`
3. **For each parameter set:**
- Filter events to pair universe + apply staleness gate
- Build cycles restricted to pair universe
- Run engine on train window → `train_report`
- Run engine on test window → `test_report`
- Score = `realized_pnl + win_rate_bonus + fill_rate_bonus - max_drawdown`
- Compute generalization gap = `|train_score - test_score| / max(train_score, test_score)`
4. **Evaluate promotion:**
- `PromotionCriteria` checks: min test P&L, min win rate ≥ 0.5, min fill rate ≥ 0.9, max drawdown ≤ $25, generalization gap ≤ 0.5
- Results passing all criteria are flagged `promotion_ready`
## UI
> See `backtesting.html` → `partials/backtesting_panel.html`.
- **Shell page** loads the panel via `hx-get="/dashboard/fragment/backtesting"`
- **Run form** — starting balances, time range, profit threshold (required); fee profile, slippage, latency (advanced/collapsible)
- **Status card** — current job status + message
- **Recent jobs table** — lists last 20 jobs with status, events, trades, P&L; each row has a detail button
- **Job detail** — `GET /dashboard/backtesting/job/{id}` returns report HTML
Pairings are managed on the `/dashboard/config/pairings` page. Backtest uses DB-enabled pairings by default when no symbols are specified.
## Source Files
| File | Role |
| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `backtesting/replay.py` | `ReplayClock`, `ReplayBookEvent`, `BacktestConfig`, `BacktestReport`, `_SimulatedRestClient`, `BacktestReplayEngine`, `load_replay_events`, `load_replay_events_from_db` |
| `backtesting/runner.py` | `run_backtest_job`, `backtest_worker`, `_build_cycles_from_events`, `_parse_balances` |
| `backtesting/sweep.py` | `SweepParameters`, `SweepResult`, `SweepArtifacts`, `PromotionCriteria`, `split_events_time_windows`, `build_parameter_grid`, `run_parameter_search`, `persist_sweep_results` |