feat: enhance backtesting panel with flash messages and pairing checks
CI / lint-test-build (push) Failing after 12s

2026-06-07 21:51:09 +02:00
parent 5e7732b85f
commit f221464daa
4 changed files with 269 additions and 79 deletions
+10 -7
@@ -53,7 +53,7 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
 - `detection/` - triangular graph and incremental detector.
 - `risk/` - pre-trade and trade-limit guards.
 - `execution/` - multi-leg trade sequencing.
-- `backtesting/` - replay engine, parameter sweep, experiment scaffolds.
+- `backtesting/` - replay engine, parameter sweep, experiment scaffolds. See [backtesting.md](backtesting.md).
 - `strategy/` - experimental strategy modules such as stat-arb.
 - `storage/` - PostgreSQL schema and repositories.
 - `alerting/` - multi-channel notifications.
@@ -77,7 +77,7 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
 3. Incremental detector scores impacted cycles.
 4. Risk manager validates the opportunity.
 5. Execution sequencer places legs if approved.
-7. Trades and snapshots persist to PostgreSQL.
+6. Trades and snapshots persist to PostgreSQL.
 7. Dashboard and alerts reflect state changes.
 ### 6.2 Dashboard Control Flow
@@ -89,11 +89,14 @@ The bot consumes Kraken market data, detects opportunities, and executes trades
 ### 6.3 Backtesting Flow
-1. User selects JSONL replay file and run parameters.
-2. Replay engine loads ordered book events.
-3. Detector, risk, and execution logic run in simulation mode.
-4. Report is stored in memory for recent UI display.
-5. Parameter sweeps split data into train/test windows, rank results, and flag overfit.
+See [backtesting.md](backtesting.md) for full design and implementation details.
+1. User picks currency pairs (from config/pairings page, or all enabled).
+2. User sets starting balances (required), time range (required), min profit threshold (required).
+3. Fee profile defaults to "api (from Kraken)"; slippage (4.0 bps) and execution latency (20 ms) are optional with sensible defaults.
+4. Job is queued via `POST /dashboard/backtesting/run`.
+5. Backend loads events from `market_snapshots` table, builds triangular cycles, runs replay engine.
+6. Report stored in `backtest_jobs` table, visible in recent jobs list.
 ## 7. Deployment View
+130
@@ -0,0 +1,130 @@
# Backtesting Architecture
> Detailed design and implementation of the backtesting subsystem.
> See [`README.md`](README.md#63-backtesting-flow) for the high-level user flow.
## Data Flow
```txt
market_snapshots (DB) ─┐
                       ├──→ load_replay_events_from_db() ──→ list[ReplayBookEvent]
JSONL file ────────────┘
                                      │
                                      ▼
                        BacktestReplayEngine.run()
                                      │
                                      ▼
                               BacktestReport
                                      │
                                      ▼
                   BacktestJobRepository.store_report()
```
Two event sources:
- **DB mode** (default) — loads snapshots from `market_snapshots` table. Supports symbol/time filtering.
- **File mode** — reads JSONL files from disk (legacy, used by `backtest_replay.py` script).
## Core Types
### `ReplayClock`
Timekeeper for simulation. Ensures events advance monotonically. Supports `advance_ms()` to model execution latency.
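
As a rough illustration of the behaviour described above, a monotonic simulation clock could look like the sketch below. The class name `SketchReplayClock`, the `now` property, and the `advance_to()` method are assumptions made for this example; only `advance_ms()` is named in this document, and the real `ReplayClock` may differ.

```python
from datetime import datetime, timedelta, timezone


class SketchReplayClock:
    """Hypothetical stand-in for ReplayClock: simulated time that never rewinds."""

    def __init__(self, start: datetime) -> None:
        self._now = start

    @property
    def now(self) -> datetime:
        return self._now

    def advance_to(self, moment: datetime) -> None:
        # Assumed policy: ignore timestamps in the past so time stays monotonic.
        if moment > self._now:
            self._now = moment

    def advance_ms(self, milliseconds: float) -> None:
        # Models execution latency by pushing simulated time forward.
        if milliseconds < 0:
            raise ValueError("cannot advance by a negative duration")
        self._now += timedelta(milliseconds=milliseconds)


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
clock = SketchReplayClock(start)
clock.advance_ms(20.0)    # one simulated leg at the default latency
clock.advance_to(start)   # an older event does not move the clock back
```

Silently ignoring older timestamps is only one possible policy; raising an error on out-of-order events would be an equally plausible design.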
### `ReplayBookEvent`
One atomic book state at a point in time. Fields: `occurred_at`, `symbol`, `bids: tuple[BookLevel]`, `asks: tuple[BookLevel]`.
### `BacktestConfig`
| Field | Default | Description |
| ------------------------ | -------- | ----------------------------------------------------- |
| `fee_rate` | `0.0` | 0.0 → API-sourced fee from `kraken_account_snapshots` |
| `min_profit_threshold` | `0.0005` | Minimum net profit to attempt trade |
| `trade_capital` | `100.0` | Capital allocated per trade |
| `quote_asset` | `"USD"` | Base currency for P&L |
| `slippage_bps` | `4.0` | Simulated slippage in basis points |
| `execution_latency_ms` | `20.0` | Simulated latency per leg |
| `max_depth_levels` | `10` | Order book depth for detection |
| `max_concurrent_trades` | `1` | Max simultaneous trades |
| `min_order_size_by_pair` | `None` | Per-pair min order size overrides |
### `BacktestReport`
| Field | Type | Description |
| -------------------------------- | -------------- | ---------------------------------- |
| `started_at` / `finished_at` | datetime | Simulation window |
| `processed_events` | int | Events consumed |
| `opportunities_seen` | int | Detected opportunities |
| `trades_executed` | int | Successful trades |
| `win_rate` | float or None | Fraction of profitable trades |
| `fill_rate` | float or None | Average fill ratio |
| `realized_pnl_usd` | float | Net P&L after slippage |
| `max_drawdown_usd` | float | Peak-to-trough equity drop |
| `miss_reasons` | dict[str, int] | Counters for skipped opportunities |
| `execution_latency_p50/95/99_ms` | float or None | Latency percentiles |
## Simulation Client
`_SimulatedRestClient` replaces the real Kraken REST client during backtesting.
- **Slippage model:** `fill_ratio = max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)`
- **Latency model:** Clock advances by `execution_latency_ms` before each simulated fill
- Orders always fill (status = `"closed"`) at the modeled ratio
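
The slippage formula above can be checked in isolation. The function name `modeled_fill_ratio` is hypothetical; the expression itself is copied from the bullet above.

```python
def modeled_fill_ratio(slippage_bps: float) -> float:
    """Fill ratio from the documented slippage model, floored at 0.85."""
    return max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)


# At the default 4.0 bps the ratio is roughly 0.9968.
default_ratio = modeled_fill_ratio(4.0)

# The 0.85 floor takes over once slippage reaches 187.5 bps,
# because (187.5 / 10000) * 8 = 0.15.
floored_ratio = modeled_fill_ratio(200.0)
```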
## Job Worker
`backtest_worker` is an `asyncio.Task` started in `create_app()` lifespan:
```python
backtest_task = asyncio.create_task(
backtest_worker(backtest_queue, db),
name="backtest_worker",
)
```
Workflow per job:
1. Dequeue `(job_id, config_dict)` from `asyncio.Queue`
2. Update status → `"running"` in `backtest_jobs` table
3. Load events (DB or file)
4. Build currency graph → triangular cycles
5. Instantiate `BacktestReplayEngine` → `engine.run()`
6. Store report → update status → `"completed"` (or `"failed"` on exception)
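
The per-job workflow above follows a common queue-consumer pattern. The sketch below shows that pattern only: `sketch_worker`, the in-memory `statuses` dict (standing in for the `backtest_jobs` table), and the `run_job` callable are all assumptions for illustration, not the project's `backtest_worker`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple

JobRunner = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


async def sketch_worker(
    queue: "asyncio.Queue[Tuple[str, Dict[str, Any]]]",
    statuses: Dict[str, str],
    run_job: JobRunner,
) -> None:
    """Consume (job_id, config) pairs forever, recording a status per job."""
    while True:
        job_id, config = await queue.get()
        statuses[job_id] = "running"
        try:
            await run_job(config)  # load events, build cycles, run the engine
            statuses[job_id] = "completed"
        except Exception:
            statuses[job_id] = "failed"
        finally:
            queue.task_done()


async def demo() -> Dict[str, str]:
    queue: "asyncio.Queue[Tuple[str, Dict[str, Any]]]" = asyncio.Queue()
    statuses: Dict[str, str] = {}

    async def fake_run(config: Dict[str, Any]) -> Dict[str, Any]:
        # Hypothetical job body: fails when the config asks it to.
        if config.get("fail"):
            raise ValueError("bad config")
        return {"trades_executed": 0}

    task = asyncio.create_task(sketch_worker(queue, statuses, fake_run))
    await queue.put(("job-1", {}))
    await queue.put(("job-2", {"fail": True}))
    await queue.join()  # wait until both jobs were processed
    task.cancel()       # stop the otherwise endless worker loop
    try:
        await task
    except asyncio.CancelledError:
        pass
    return statuses


result = asyncio.run(demo())
```

Catching `Exception` rather than `BaseException` keeps task cancellation working, so the worker can still be shut down cleanly when the application stops.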
## Sweep Pipeline
`run_parameter_search` performs grid search over backtest parameters:
1. **Split** events into train/test windows by time ratio
2. **Build grid** — cartesian product of `theta_values × trade_capital_values × pair_universes × staleness_threshold_values`
3. **For each parameter set:**
- Filter events to pair universe + apply staleness gate
- Build cycles restricted to pair universe
- Run engine on train window → `train_report`
- Run engine on test window → `test_report`
- Score = `realized_pnl + win_rate_bonus + fill_rate_bonus - max_drawdown`
- Compute generalization gap = `|train_score - test_score| / max(train_score, test_score)`
4. **Evaluate promotion:**
- `PromotionCriteria` checks: min test P&L, min win rate ≥ 0.5, min fill rate ≥ 0.9, max drawdown ≤ $25, generalization gap ≤ 0.5
- Results passing all criteria are flagged `promotion_ready`
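
The grid construction, generalization gap, and promotion check above can be sketched as follows. All names here (`SketchCriteria`, `generalization_gap`, `promotion_ready`) and the sample parameter values are hypothetical. The minimum test P&L has no value in this document, so the `0.0` default is a placeholder. Returning infinity when the denominator is not positive, and failing a result whose rate is `None`, are assumptions the source does not state.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


def generalization_gap(train_score: float, test_score: float) -> float:
    """|train - test| / max(train, test), as in step 3 above."""
    denominator = max(train_score, test_score)
    if denominator <= 0:
        return float("inf")  # assumption: a non-positive best score never promotes
    return abs(train_score - test_score) / denominator


@dataclass(frozen=True)
class SketchCriteria:
    min_test_pnl_usd: float = 0.0  # placeholder; no value given in this document
    min_win_rate: float = 0.5
    min_fill_rate: float = 0.9
    max_drawdown_usd: float = 25.0
    max_generalization_gap: float = 0.5


def promotion_ready(
    criteria: SketchCriteria,
    *,
    test_pnl_usd: float,
    win_rate: Optional[float],
    fill_rate: Optional[float],
    max_drawdown_usd: float,
    gap: float,
) -> bool:
    """True only when every documented criterion passes."""
    if win_rate is None or fill_rate is None:
        return False  # assumption: missing rates cannot satisfy a minimum
    return (
        test_pnl_usd >= criteria.min_test_pnl_usd
        and win_rate >= criteria.min_win_rate
        and fill_rate >= criteria.min_fill_rate
        and max_drawdown_usd <= criteria.max_drawdown_usd
        and gap <= criteria.max_generalization_gap
    )


# Cartesian product of the four swept dimensions (sample values are made up).
grid = list(
    product(
        [0.0005, 0.001],                       # theta_values
        [100.0],                               # trade_capital_values
        [("BTC/USD", "ETH/USD", "ETH/BTC")],   # pair_universes
        [500, 1000],                           # staleness_threshold_values
    )
)

gap = generalization_gap(train_score=10.0, test_score=8.0)
ready = promotion_ready(
    SketchCriteria(),
    test_pnl_usd=8.0,
    win_rate=0.6,
    fill_rate=0.95,
    max_drawdown_usd=12.0,
    gap=gap,
)
```

Because the documented formula divides by `max(train_score, test_score)`, the real implementation needs some policy for a zero or negative maximum; the infinity used here is only one option.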
## UI
> See `backtesting.html` → `partials/backtesting_panel.html`.
- **Shell page** loads the panel via `hx-get="/dashboard/fragment/backtesting"`
- **Run form** — starting balances, time range, profit threshold (required); fee profile, slippage, latency (advanced/collapsible)
- **Status card** — current job status + message
- **Recent jobs table** — lists last 20 jobs with status, events, trades, P&L; each row has a detail button
- **Job detail** — `GET /dashboard/backtesting/job/{id}` returns report HTML
Pairings are managed on the `/dashboard/config/pairings` page. Backtest uses DB-enabled pairings by default when no symbols are specified.
## Source Files
| File | Role |
| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `backtesting/replay.py` | `ReplayClock`, `ReplayBookEvent`, `BacktestConfig`, `BacktestReport`, `_SimulatedRestClient`, `BacktestReplayEngine`, `load_replay_events`, `load_replay_events_from_db` |
| `backtesting/runner.py` | `run_backtest_job`, `backtest_worker`, `_build_cycles_from_events`, `_parse_balances` |
| `backtesting/sweep.py` | `SweepParameters`, `SweepResult`, `SweepArtifacts`, `PromotionCriteria`, `split_events_time_windows`, `build_parameter_grid`, `run_parameter_search`, `persist_sweep_results` |
+10 -1
@@ -745,6 +745,8 @@ async def _backtesting_panel_context(
     return {
         "status": status,
         "message": message,
+        "flash_message": "",
+        "no_enabled_pairings": False,
         "latest_report": latest,
         "recent_reports": reports,
         "run_endpoint": "/dashboard/backtesting/run",
@@ -812,10 +814,16 @@ async def dashboard_backtesting_page(request: Request) -> HTMLResponse:
 @router.get("/dashboard/fragment/backtesting", response_class=HTMLResponse)
 async def dashboard_backtesting_fragment(request: Request) -> HTMLResponse:
-    d_context = await _dashboard_config_context(request)
+    ctx = await _backtesting_panel_context(request)
+    ctx["flash_message"] = ""
+    # Check if any pairings are enabled
+    repo = ConfigPairingRepository(request.app.state.store)
+    enabled = await repo.list_pairings(enabled_only=True)
+    ctx["no_enabled_pairings"] = len(enabled) == 0
     return templates.TemplateResponse(
         request=request,
         name="partials/backtesting_panel.html",
-        context={"request": request, **d_context},
+        context={"request": request, **ctx},
     )
@@ -1035,6 +1043,7 @@ async def dashboard_backtesting_run(request: Request) -> HTMLResponse:
             message=f"Job {msg_job}... queued. Refresh to see results.",
             defaults=defaults,
         )
+        context["flash_message"] = f"Backtest job {msg_job}... submitted successfully."
     except ValueError as exc:
         context = await _backtesting_panel_context(
             request,
@@ -45,6 +45,46 @@
 <article class="card" style="margin-top: 16px">
   <div class="label">Run Backtest</div>
+  {% if no_enabled_pairings %}
+  <div
+    class="flash"
+    style="
+      background: rgba(255, 193, 7, 0.15);
+      border: 1px solid rgba(255, 193, 7, 0.3);
+      border-radius: 8px;
+      padding: 10px 16px;
+      margin-bottom: 16px;
+      color: #ffe58f;
+      font-size: 0.9rem;
+    "
+  >
+    No enabled pairings found. Enable at least one pairing on the
+    <a href="/dashboard/config/pairings" style="color: #ffe58f"
+      >Pairings page</a
+    >
+    before running a backtest.
+  </div>
+  {% endif %} {% if flash_message %}
+  <div
+    class="flash"
+    style="
+      background: rgba(82, 196, 26, 0.15);
+      border: 1px solid rgba(82, 196, 26, 0.3);
+      border-radius: 8px;
+      padding: 10px 16px;
+      margin-bottom: 16px;
+      color: #b7eb8f;
+      font-size: 0.9rem;
+    "
+    hx-trigger="load delay:5s"
+    hx-target="this"
+    hx-swap="delete"
+  >
+    {{ flash_message }}
+  </div>
+  {% endif %}
   <form
     class="form-grid"
     hx-post="{{ run_endpoint }}"
@@ -58,26 +98,10 @@
       <a href="/dashboard/config/pairings">Configuration → Pairings</a>. Only
       enabled pairings are backtested.
     </div>
+    <!-- Required fields -->
     <label class="field">
-      <span>Start time (ISO datetime, optional)</span>
-      <input
-        name="start_time"
-        type="text"
-        value="{{ start_time | default('') }}"
-        placeholder="2025-01-01T00:00:00"
-      />
-    </label>
-    <label class="field">
-      <span>End time (ISO datetime, optional)</span>
-      <input
-        name="end_time"
-        type="text"
-        value="{{ end_time | default('') }}"
-        placeholder="2025-01-02T00:00:00"
-      />
-    </label>
-    <label class="field">
-      <span>Starting balances</span>
+      <span>Starting balances <span style="color: #ff4d4f">*</span></span>
       <input
         name="starting_balances"
         type="text"
@@ -86,17 +110,25 @@
       />
     </label>
     <label class="field">
-      <span>Trade capital</span>
+      <span>Start time <span style="color: #ff4d4f">*</span></span>
       <input
-        name="trade_capital"
-        type="number"
-        min="0"
-        step="0.01"
-        value="{{ trade_capital }}"
+        name="start_time"
+        type="text"
+        value="{{ start_time | default('') }}"
+        placeholder="2025-01-01T00:00:00"
       />
     </label>
     <label class="field">
-      <span>Min profit threshold</span>
+      <span>End time <span style="color: #ff4d4f">*</span></span>
+      <input
+        name="end_time"
+        type="text"
+        value="{{ end_time | default('') }}"
+        placeholder="2025-01-02T00:00:00"
+      />
+    </label>
+    <label class="field">
+      <span>Min profit threshold <span style="color: #ff4d4f">*</span></span>
       <input
         name="min_profit_threshold"
         type="number"
@@ -105,51 +137,67 @@
         value="{{ min_profit_threshold }}"
       />
     </label>
-    <label class="field">
-      <span>Fee profile</span>
-      <select name="fee_profile">
-        {% set sel = "selected" if fee_profile == "api" else "" %}
-        <option value="api" {{ sel }}>api (from Kraken)</option>
-        {% set sel = "selected" if fee_profile == "standard" else "" %}
-        <option value="standard" {{ sel }}>standard</option>
-        {% set sel = "selected" if fee_profile == "maker_heavy" else "" %}
-        <option value="maker_heavy" {{ sel }}>maker_heavy</option>
-        {% set sel = "selected" if fee_profile == "taker_heavy" else "" %}
-        <option value="taker_heavy" {{ sel }}>taker_heavy</option>
-        {% set sel = "selected" if fee_profile == "custom" else "" %}
-        <option value="custom" {{ sel }}>custom</option>
-      </select>
-    </label>
-    <label class="field">
-      <span>Custom fee rate (if fee profile = custom)</span>
-      <input
-        name="custom_fee_rate"
-        type="number"
-        min="0"
-        step="0.0001"
-        value="{{ custom_fee_rate }}"
-      />
-    </label>
-    <label class="field">
-      <span>Slippage (bps)</span>
-      <input
-        name="slippage_bps"
-        type="number"
-        min="0"
-        step="0.1"
-        value="{{ slippage_bps }}"
-      />
-    </label>
-    <label class="field">
-      <span>Execution latency (ms)</span>
-      <input
-        name="execution_latency_ms"
-        type="number"
-        min="0"
-        step="0.1"
-        value="{{ execution_latency_ms }}"
-      />
-    </label>
+    <!-- Advanced -->
+    <details style="grid-column: 1 / -1; margin-top: 8px">
+      <summary style="cursor: pointer; opacity: 0.7; font-size: 0.85rem">
+        Advanced options (fee profile, slippage, latency)
+      </summary>
+      <div
+        class="form-grid"
+        style="
+          margin-top: 12px;
+          grid-template-columns: repeat(auto-fit, minmax(240px, 1fr));
+        "
+      >
+        <label class="field">
+          <span>Fee profile</span>
+          <select name="fee_profile">
+            {% set sel = "selected" if fee_profile == "api" else "" %}
+            <option value="api" {{ sel }}>api (from Kraken)</option>
+            {% set sel = "selected" if fee_profile == "standard" else "" %}
+            <option value="standard" {{ sel }}>standard</option>
+            {% set sel = "selected" if fee_profile == "maker_heavy" else "" %}
+            <option value="maker_heavy" {{ sel }}>maker_heavy</option>
+            {% set sel = "selected" if fee_profile == "taker_heavy" else "" %}
+            <option value="taker_heavy" {{ sel }}>taker_heavy</option>
+            {% set sel = "selected" if fee_profile == "custom" else "" %}
+            <option value="custom" {{ sel }}>custom</option>
+          </select>
+        </label>
+        <label class="field">
+          <span>Custom fee rate (if custom profile)</span>
+          <input
+            name="custom_fee_rate"
+            type="number"
+            min="0"
+            step="0.0001"
+            value="{{ custom_fee_rate }}"
+          />
+        </label>
+        <label class="field">
+          <span>Slippage (bps)</span>
+          <input
+            name="slippage_bps"
+            type="number"
+            min="0"
+            step="0.1"
+            value="{{ slippage_bps }}"
+          />
+        </label>
+        <label class="field">
+          <span>Execution latency (ms)</span>
+          <input
+            name="execution_latency_ms"
+            type="number"
+            min="0"
+            step="0.1"
+            value="{{ execution_latency_ms }}"
+          />
+        </label>
+      </div>
+    </details>
     <button type="submit" class="button">Submit Job</button>
   </form>
 </article>