# Backtesting Architecture

> Detailed design and implementation of the backtesting subsystem.
> See [`README.md`](README.md#63-backtesting-flow) for the high-level user flow.

## Data Flow

```txt
market_snapshots (DB) ──→ load_replay_events_from_db() ──┐
                                                         ├──→ list[ReplayBookEvent]
JSONL file ─────────────→ load_replay_events() ──────────┘
                                                                        │
                                                                        ▼
                                                           BacktestReplayEngine.run()
                                                                        │
                                                                        ▼
                                                                 BacktestReport
                                                                        │
                                                                        ▼
                                                      BacktestJobRepository.store_report()
```

Two event sources:

- **DB mode** (default): loads snapshots from the `market_snapshots` table and supports filtering by symbol and time range.
- **File mode**: reads JSONL files from disk (legacy; used by the `backtest_replay.py` script).

## Core Types

### `ReplayClock`

Timekeeper for the simulation. Ensures simulated time only moves forward as events are replayed, and supports `advance_ms()` to model execution latency.
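
A minimal sketch of such a clock, to illustrate the monotonic behaviour. Only `advance_ms()` is named in this document; the class name `ReplayClockSketch`, the `advance_to()` method, the `now` property, and the policy of silently ignoring out-of-order timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


class ReplayClockSketch:
    """Illustrative simulation clock; not the actual ReplayClock."""

    def __init__(self, start: datetime) -> None:
        self._now = start

    @property
    def now(self) -> datetime:
        return self._now

    def advance_to(self, occurred_at: datetime) -> datetime:
        # Hypothetical method: jump to the event time, but never backwards.
        if occurred_at > self._now:
            self._now = occurred_at
        return self._now

    def advance_ms(self, milliseconds: float) -> datetime:
        # Model execution latency by pushing simulated time forward.
        if milliseconds < 0:
            raise ValueError("latency must be non-negative")
        self._now += timedelta(milliseconds=milliseconds)
        return self._now


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
clock = ReplayClockSketch(start)
clock.advance_ms(20.0)    # one simulated leg at the default latency
clock.advance_to(start)   # stale event: the clock stays where it is
```

Keeping the clock separate from wall time is what lets latency be simulated deterministically: the same event stream and configuration always produce the same timestamps.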

### `ReplayBookEvent`

A single order book snapshot for one symbol at a point in time. Fields: `occurred_at`, `symbol`, `bids: tuple[BookLevel, ...]`, `asks: tuple[BookLevel, ...]`.
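
As a rough illustration, the event could be modelled as a frozen dataclass. The field names of `ReplayBookEvent` come from this document; the shape of `BookLevel` (`price`, `volume`), the ordering of levels, and the example symbol are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BookLevel:
    # Assumed shape: this document names BookLevel but not its fields.
    price: float
    volume: float


@dataclass(frozen=True)
class ReplayBookEvent:
    occurred_at: datetime
    symbol: str
    bids: tuple[BookLevel, ...]  # assumed: best (highest) bid first
    asks: tuple[BookLevel, ...]  # assumed: best (lowest) ask first


event = ReplayBookEvent(
    occurred_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    symbol="BTC/USD",
    bids=(BookLevel(42000.0, 0.5),),
    asks=(BookLevel(42001.0, 0.4),),
)
spread = event.asks[0].price - event.bids[0].price
```

Immutable tuples make each event safe to share between the train and test passes of a sweep without defensive copying.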

### `BacktestConfig`

| Field                    | Default  | Description                                                                   |
| ------------------------ | -------- | ----------------------------------------------------------------------------- |
| `fee_rate`               | `0.0`    | Fee rate; `0.0` means use the API-sourced fee from `kraken_account_snapshots` |
| `min_profit_threshold`   | `0.0005` | Minimum net profit required to attempt a trade                                |
| `trade_capital`          | `100.0`  | Capital allocated per trade                                                   |
| `quote_asset`            | `"USD"`  | Currency in which P&L is reported                                             |
| `slippage_bps`           | `4.0`    | Simulated slippage, in basis points                                           |
| `execution_latency_ms`   | `20.0`   | Simulated latency per leg, in milliseconds                                    |
| `max_depth_levels`       | `10`     | Order book depth used for opportunity detection                               |
| `max_concurrent_trades`  | `1`      | Maximum number of simultaneous trades                                         |
| `min_order_size_by_pair` | `None`   | Per-pair minimum order size overrides                                         |
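
The defaults above could be expressed as a frozen dataclass along these lines. This is a sketch: the field names and defaults are taken from the table, while the type annotations (in particular `dict[str, float] | None`) and the "pessimistic" scenario values are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BacktestConfig:
    fee_rate: float = 0.0  # 0.0 means: use the API-sourced fee
    min_profit_threshold: float = 0.0005
    trade_capital: float = 100.0
    quote_asset: str = "USD"
    slippage_bps: float = 4.0
    execution_latency_ms: float = 20.0
    max_depth_levels: int = 10
    max_concurrent_trades: int = 1
    # Assumed type: a mapping from pair name to minimum order size.
    min_order_size_by_pair: dict[str, float] | None = None


default_config = BacktestConfig()

# Hypothetical stress scenario: wider slippage and slower fills.
pessimistic = replace(default_config, slippage_bps=8.0, execution_latency_ms=50.0)
```

Deriving variants with `dataclasses.replace` leaves the base configuration untouched, which is convenient when comparing several runs side by side.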

### `BacktestReport`

| Field                            | Type           | Description                        |
| -------------------------------- | -------------- | ---------------------------------- |
| `started_at` / `finished_at`     | datetime       | Simulation window                  |
| `processed_events`               | int            | Events consumed                    |
| `opportunities_seen`             | int            | Detected opportunities             |
| `trades_executed`                | int            | Successful trades                  |
| `win_rate`                       | float or None  | Fraction of profitable trades      |
| `fill_rate`                      | float or None  | Average fill ratio                 |
| `realized_pnl_usd`               | float          | Net P&L after slippage             |
| `max_drawdown_usd`               | float          | Peak-to-trough equity drop         |
| `miss_reasons`                   | dict[str, int] | Counters for skipped opportunities |
| `execution_latency_p50/95/99_ms` | float or None  | Latency percentiles                |
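
To make the P&L fields concrete, the sketch below derives `win_rate`, `realized_pnl_usd`, and `max_drawdown_usd` from a list of per-trade P&L values. The function name `summarize_trades` is hypothetical, and it assumes the equity curve starts at zero and that a "win" is a strictly positive trade; the engine may define these differently.

```python
from __future__ import annotations


def summarize_trades(pnls_usd: list[float]) -> dict[str, float | None]:
    """Derive win rate, realized P&L, and max drawdown from per-trade P&L."""
    equity = 0.0        # running cumulative P&L (assumed to start at zero)
    peak = 0.0          # highest equity seen so far
    max_drawdown = 0.0  # largest peak-to-trough drop seen so far
    for pnl in pnls_usd:
        equity += pnl
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
    wins = sum(1 for pnl in pnls_usd if pnl > 0)
    return {
        # None when no trades ran, mirroring the "float or None" type above.
        "win_rate": wins / len(pnls_usd) if pnls_usd else None,
        "realized_pnl_usd": equity,
        "max_drawdown_usd": max_drawdown,
    }


summary = summarize_trades([1.5, -0.5, -1.0, 2.0])
empty = summarize_trades([])
```

In this example the equity curve runs 1.5, 1.0, 0.0, 2.0, so the drawdown is measured from the 1.5 peak down to 0.0 even though the run ends at a new high.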

## Simulation Client

`_SimulatedRestClient` replaces the real Kraken REST client during backtesting.

- **Slippage model:** `fill_ratio = max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)`
- **Latency model:** Clock advances by `execution_latency_ms` before each simulated fill
- Orders always fill (status = `"closed"`) at the modeled ratio
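
The fill-ratio formula can be checked with a few lines of Python. The function name `simulated_fill_ratio` is hypothetical; the expression itself is copied from the slippage model above.

```python
def simulated_fill_ratio(slippage_bps: float) -> float:
    """Fill ratio from the documented formula, floored at 0.85."""
    return max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)


# Default slippage of 4 bps: 1 - 0.0004 * 8, roughly 0.9968.
default_ratio = simulated_fill_ratio(4.0)

# Very large slippage: 1 - 0.05 * 8 = 0.6, so the 0.85 floor applies.
floored_ratio = simulated_fill_ratio(500.0)
```

With these constants the floor takes effect once `slippage_bps` exceeds 187.5, so every realistic setting stays on the linear part of the model.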

## Job Worker

`backtest_worker` runs as an `asyncio.Task`, started in the `create_app()` lifespan:

```python
backtest_task = asyncio.create_task(
    backtest_worker(backtest_queue, db),
    name="backtest_worker",
)
```

Workflow per job:

1. Dequeue `(job_id, config_dict)` from `asyncio.Queue`
2. Update status → `"running"` in `backtest_jobs` table
3. Load events (DB or file)
4. Build currency graph → triangular cycles
5. Instantiate `BacktestReplayEngine` → `engine.run()`
6. Store report → update status → `"completed"` (or `"failed"` on exception)
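
The loop below sketches the status transitions in steps 1, 2, and 6, with in-memory dictionaries standing in for the `backtest_jobs` table and a caller-supplied coroutine standing in for steps 3 to 5. `backtest_worker_sketch`, `statuses`, `reports`, and `fake_run` are all hypothetical names; the real worker takes `(backtest_queue, db)` as shown above.

```python
import asyncio
from typing import Any, Awaitable, Callable

JobRunner = Callable[[dict[str, Any]], Awaitable[dict[str, Any]]]


async def backtest_worker_sketch(
    queue: asyncio.Queue,
    statuses: dict[str, str],            # stand-in for the backtest_jobs table
    reports: dict[str, dict[str, Any]],  # stand-in for stored reports
    run_job: JobRunner,                  # stand-in for steps 3 to 5
) -> None:
    while True:
        job_id, config_dict = await queue.get()
        statuses[job_id] = "running"
        try:
            reports[job_id] = await run_job(config_dict)
            statuses[job_id] = "completed"
        except Exception:
            # A failing job is recorded but must not stop the worker loop.
            statuses[job_id] = "failed"
        finally:
            queue.task_done()


async def _demo() -> dict[str, str]:
    async def fake_run(config: dict[str, Any]) -> dict[str, Any]:
        if config.get("explode"):
            raise RuntimeError("simulated failure")
        return {"trades_executed": 0}

    queue: asyncio.Queue = asyncio.Queue()
    statuses: dict[str, str] = {}
    reports: dict[str, dict[str, Any]] = {}
    worker = asyncio.create_task(
        backtest_worker_sketch(queue, statuses, reports, fake_run)
    )
    await queue.put(("job-1", {}))
    await queue.put(("job-2", {"explode": True}))
    await queue.join()  # wait until both jobs have been processed
    worker.cancel()     # mimic shutdown at the end of the app lifespan
    await asyncio.gather(worker, return_exceptions=True)
    return statuses


final_statuses = asyncio.run(_demo())
```

Catching `Exception` rather than `BaseException` lets task cancellation propagate, so the worker still shuts down cleanly when the application stops.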

## Sweep Pipeline

`run_parameter_search` performs grid search over backtest parameters:

1. **Split** events into train/test windows by time ratio
2. **Build grid** — cartesian product of `theta_values × trade_capital_values × pair_universes × staleness_threshold_values`
3. **For each parameter set:**
   - Filter events to pair universe + apply staleness gate
   - Build cycles restricted to pair universe
   - Run engine on train window → `train_report`
   - Run engine on test window → `test_report`
   - Score = `realized_pnl + win_rate_bonus + fill_rate_bonus - max_drawdown`
   - Compute generalization gap = `|train_score - test_score| / max(train_score, test_score)`
4. **Evaluate promotion:**
   - `PromotionCriteria` checks: minimum test P&L, win rate ≥ 0.5, fill rate ≥ 0.9, max drawdown ≤ $25, generalization gap ≤ 0.5
   - Results passing all criteria are flagged `promotion_ready`
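
The pipeline's building blocks can be sketched in isolation. Everything below is illustrative: `split_by_time_ratio`, `build_grid`, and `CriteriaSketch` are hypothetical stand-ins for `split_events_time_windows`, `build_parameter_grid`, and `PromotionCriteria`; the grid values, the `min_test_pnl` default, and the handling of a non-positive denominator in the gap are assumptions not stated in this document.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from itertools import product


def split_by_time_ratio(timestamps: list[datetime], train_ratio: float):
    """Split time-sorted timestamps at start + train_ratio * (end - start)."""
    start, end = timestamps[0], timestamps[-1]
    cutoff = start + (end - start) * train_ratio
    train = [t for t in timestamps if t <= cutoff]
    test = [t for t in timestamps if t > cutoff]
    return train, test


def build_grid(thetas, capitals, universes, staleness_values):
    """Cartesian product of the four swept dimensions."""
    return list(product(thetas, capitals, universes, staleness_values))


def generalization_gap(train_score: float, test_score: float) -> float:
    denominator = max(train_score, test_score)
    if denominator <= 0:
        # Assumption: non-positive scores are treated as the worst case.
        return float("inf")
    return abs(train_score - test_score) / denominator


@dataclass(frozen=True)
class CriteriaSketch:
    min_test_pnl: float = 0.0  # assumed value; not given in this document
    min_win_rate: float = 0.5
    min_fill_rate: float = 0.9
    max_drawdown_usd: float = 25.0
    max_generalization_gap: float = 0.5

    def passes(self, test_pnl, win_rate, fill_rate, drawdown_usd, gap) -> bool:
        return (
            test_pnl >= self.min_test_pnl
            and win_rate >= self.min_win_rate
            and fill_rate >= self.min_fill_rate
            and drawdown_usd <= self.max_drawdown_usd
            and gap <= self.max_generalization_gap
        )


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
stamps = [t0 + timedelta(minutes=i) for i in range(10)]
train, test = split_by_time_ratio(stamps, train_ratio=0.7)

grid = build_grid(
    thetas=[0.0005, 0.001],                         # hypothetical values
    capitals=[50.0, 100.0],
    universes=[("BTC/USD", "ETH/USD", "ETH/BTC")],
    staleness_values=[250, 500, 1000],              # hypothetical values
)

gap = generalization_gap(train_score=10.0, test_score=8.0)
ready = CriteriaSketch().passes(
    test_pnl=3.0, win_rate=0.6, fill_rate=0.95, drawdown_usd=10.0, gap=gap
)
```

Splitting by time rather than by event count keeps the test window strictly later than the train window, which avoids leaking future book states into parameter selection.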

## UI

> See `backtesting.html` → `partials/backtesting_panel.html`.

- **Shell page** loads the panel via `hx-get="/dashboard/fragment/backtesting"`
- **Run form** — starting balances, time range, profit threshold (required); fee profile, slippage, latency (advanced/collapsible)
- **Status card** — current job status + message
- **Recent jobs table** — lists last 20 jobs with status, events, trades, P&L; each row has a detail button
- **Job detail** — `GET /dashboard/backtesting/job/{id}` returns report HTML

Pairings are managed on the `/dashboard/config/pairings` page. When no symbols are specified, a backtest uses the pairings enabled in the DB.

## Source Files

| File                    | Role                                                                                                                                                                          |
| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `backtesting/replay.py` | `ReplayClock`, `ReplayBookEvent`, `BacktestConfig`, `BacktestReport`, `_SimulatedRestClient`, `BacktestReplayEngine`, `load_replay_events`, `load_replay_events_from_db`      |
| `backtesting/runner.py` | `run_backtest_job`, `backtest_worker`, `_build_cycles_from_events`, `_parse_balances`                                                                                         |
| `backtesting/sweep.py`  | `SweepParameters`, `SweepResult`, `SweepArtifacts`, `PromotionCriteria`, `split_events_time_windows`, `build_parameter_grid`, `run_parameter_search`, `persist_sweep_results` |