allucanget/arbitrade

zwitschi f221464daa

feat: enhance backtesting panel with flash messages and pairing checks

2026-06-07 21:51:09 +02:00

Backtesting Architecture

Detailed design and implementation of the backtesting subsystem. See README.md for the high-level user flow.

Data Flow

market_snapshots (DB) ─┐
                        ├──→ load_replay_events_from_db() ──→ list[ReplayBookEvent]
JSONL file ─────────────┘
                                      │
                                      ▼
                           BacktestReplayEngine.run()
                                      │
                                      ▼
                              BacktestReport
                                      │
                                      ▼
                           BacktestJobRepository.store_report()

Events come from one of two sources:

- DB mode (default): loads snapshots from the market_snapshots table and supports filtering by symbol and time range.
- File mode: reads JSONL files from disk (legacy; used by the backtest_replay.py script).

Core Types

`ReplayClock`

Timekeeper for the simulation. Ensures simulated time advances monotonically from event to event, and supports advance_ms() to model execution latency.
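A minimal sketch of such a clock is shown here for orientation. Only the class name and advance_ms() come from this document; the advance_to() method, the now property, and the choice to ignore (rather than reject) out-of-order timestamps are assumptions of the sketch, not the project's actual behaviour.

```python
from datetime import datetime, timedelta, timezone


class ReplayClock:
    """Illustrative simulation clock; not the project's implementation."""

    def __init__(self, start: datetime) -> None:
        self._now = start

    @property
    def now(self) -> datetime:
        return self._now

    def advance_to(self, ts: datetime) -> datetime:
        # Hypothetical helper: never move backwards. A timestamp older than
        # the current simulated time leaves the clock unchanged.
        if ts > self._now:
            self._now = ts
        return self._now

    def advance_ms(self, ms: float) -> datetime:
        # Model execution latency by pushing simulated time forward.
        if ms < 0:
            raise ValueError("latency must be non-negative")
        self._now += timedelta(milliseconds=ms)
        return self._now


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
clock = ReplayClock(start)
clock.advance_ms(20.0)    # one leg of simulated latency
clock.advance_to(start)   # older timestamp: clock stays where it is
```

In this sketch, an event that "arrives" during simulated latency cannot rewind the clock, which is one simple way to keep time monotonic.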

`ReplayBookEvent`

One atomic book state at a point in time. Fields: occurred_at, symbol, bids: tuple[BookLevel], asks: tuple[BookLevel].
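The shape of an event might look roughly like the following. The field names on ReplayBookEvent are taken from the description above; the BookLevel fields (price, volume), the parse_event helper, and the JSONL layout are hypothetical and chosen only to make the example runnable.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BookLevel:
    # Assumed fields; the document does not specify BookLevel's layout.
    price: float
    volume: float


@dataclass(frozen=True)
class ReplayBookEvent:
    occurred_at: datetime
    symbol: str
    bids: tuple[BookLevel, ...]
    asks: tuple[BookLevel, ...]


def parse_event(line: str) -> ReplayBookEvent:
    # Hypothetical JSONL layout: {"occurred_at": ISO-8601 string, "symbol": str,
    # "bids": [[price, volume], ...], "asks": [[price, volume], ...]}
    raw = json.loads(line)
    return ReplayBookEvent(
        occurred_at=datetime.fromisoformat(raw["occurred_at"]),
        symbol=raw["symbol"],
        bids=tuple(BookLevel(float(p), float(v)) for p, v in raw["bids"]),
        asks=tuple(BookLevel(float(p), float(v)) for p, v in raw["asks"]),
    )


line = (
    '{"occurred_at": "2026-01-01T00:00:00+00:00", "symbol": "BTC/USD", '
    '"bids": [[100.0, 1.5]], "asks": [[100.5, 2.0]]}'
)
event = parse_event(line)
```

Frozen dataclasses and tuples keep each event immutable, which suits the "one atomic book state" description.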

`BacktestConfig`

| Field | Default | Description |
| --- | --- | --- |
| `fee_rate` | `0.0` | Fee rate per trade; `0.0` means use the API-sourced fee from `kraken_account_snapshots` |
| `min_profit_threshold` | `0.0005` | Minimum net profit required to attempt a trade |
| `trade_capital` | `100.0` | Capital allocated per trade |
| `quote_asset` | `"USD"` | Currency in which P&L is denominated |
| `slippage_bps` | `4.0` | Simulated slippage in basis points |
| `execution_latency_ms` | `20.0` | Simulated latency per leg, in milliseconds |
| `max_depth_levels` | `10` | Order book depth used for detection |
| `max_concurrent_trades` | `1` | Maximum number of simultaneous trades |
| `min_order_size_by_pair` | `None` | Per-pair minimum order size overrides |
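For illustration, the defaults above could be expressed as a dataclass along these lines. Field names and defaults mirror the table; the use of a frozen dataclass and the exact type annotations are assumptions, and the real class may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BacktestConfig:
    fee_rate: float = 0.0                  # 0.0 -> fall back to the API-sourced fee
    min_profit_threshold: float = 0.0005
    trade_capital: float = 100.0
    quote_asset: str = "USD"
    slippage_bps: float = 4.0
    execution_latency_ms: float = 20.0
    max_depth_levels: int = 10
    max_concurrent_trades: int = 1
    min_order_size_by_pair: Optional[dict[str, float]] = None


# Override only what differs from the defaults.
config = BacktestConfig(
    slippage_bps=8.0,
    min_order_size_by_pair={"BTC/USD": 0.0001},
)
```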

`BacktestReport`

| Field | Type | Description |
| --- | --- | --- |
| `started_at` / `finished_at` | datetime | Simulation window |
| `processed_events` | int | Events consumed |
| `opportunities_seen` | int | Detected opportunities |
| `trades_executed` | int | Successful trades |
| `win_rate` | float or None | Fraction of profitable trades |
| `fill_rate` | float or None | Average fill ratio |
| `realized_pnl_usd` | float | Net P&L after slippage |
| `max_drawdown_usd` | float | Peak-to-trough equity drop |
| `miss_reasons` | dict[str, int] | Counters for skipped opportunities |
| `execution_latency_p50/95/99_ms` | float or None | Latency percentiles |
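As a worked example of two of these metrics, the sketch below derives a win rate and a peak-to-trough drawdown from a list of per-trade P&L values. The function names and the input shape are hypothetical; the engine may compute these fields differently (for example, from a marked-to-market equity curve).

```python
from typing import Optional


def max_drawdown(trade_pnls: list[float]) -> float:
    """Largest peak-to-trough drop of cumulative P&L, as a positive number."""
    equity = peak = worst = 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)            # highest equity seen so far
        worst = max(worst, peak - equity)   # deepest drop below that peak
    return worst


def win_rate(trade_pnls: list[float]) -> Optional[float]:
    """Fraction of profitable trades, or None when there were no trades."""
    if not trade_pnls:
        return None
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)


pnls = [1.0, -0.5, 2.0, -3.0, 1.0]
# Cumulative equity: 1.0, 0.5, 2.5, -0.5, 0.5 -> peak 2.5, trough -0.5
drawdown = max_drawdown(pnls)
rate = win_rate(pnls)
```

Returning None for an empty trade list matches the "float or None" type shown for win_rate in the table.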

Simulation Client

_SimulatedRestClient replaces the real Kraken REST client during backtesting.

- Slippage model: fill_ratio = max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)
- Latency model: the clock advances by execution_latency_ms before each simulated fill.
- Orders always fill (status = "closed") at the modeled ratio.
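The slippage formula above can be evaluated directly. The standalone function below is only a restatement of that formula for experimentation; the function name is hypothetical.

```python
def fill_ratio(slippage_bps: float) -> float:
    # Each basis point of slippage reduces the fill by 8 bps, floored at 85%.
    return max(0.85, 1.0 - (slippage_bps / 10000.0) * 8.0)


default_fill = fill_ratio(4.0)     # default slippage_bps
floored_fill = fill_ratio(200.0)   # large slippage hits the 0.85 floor
```

With the default of 4 bps the modeled fill is about 0.9968 of the requested size. The 0.85 floor takes over once slippage_bps reaches 187.5, since (187.5 / 10000) * 8 = 0.15.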

Job Worker

backtest_worker runs as an asyncio.Task started in the create_app() lifespan:

backtest_task = asyncio.create_task(
    backtest_worker(backtest_queue, db),
    name="backtest_worker",
)

Workflow per job:

1. Dequeue (job_id, config_dict) from the asyncio.Queue.
2. Update status → "running" in the backtest_jobs table.
3. Load events (DB or file).
4. Build the currency graph → triangular cycles.
5. Instantiate BacktestReplayEngine → engine.run().
6. Store the report → update status → "completed" (or "failed" on exception).
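The status transitions in this workflow can be sketched with a plain queue consumer. This is a simplified stand-in, not the project's worker: the statuses dict replaces the backtest_jobs table, run_job is a hypothetical callable covering the load/build/run/store steps, and running it via asyncio.to_thread is a choice made for this sketch.

```python
import asyncio
from typing import Any, Callable


async def backtest_worker(
    queue: asyncio.Queue,
    statuses: dict[str, str],
    run_job: Callable[[dict[str, Any]], Any],
) -> None:
    while True:
        job_id, config_dict = await queue.get()
        statuses[job_id] = "running"
        try:
            # Run blocking replay work in a thread so the event loop stays free.
            await asyncio.to_thread(run_job, config_dict)
            statuses[job_id] = "completed"
        except Exception:
            # Any job error marks the job failed; the worker keeps running.
            statuses[job_id] = "failed"
        finally:
            queue.task_done()


async def main() -> dict[str, str]:
    queue: asyncio.Queue = asyncio.Queue()
    statuses: dict[str, str] = {}

    def run_job(config: dict[str, Any]) -> None:
        if config.get("explode"):
            raise RuntimeError("simulated failure")

    task = asyncio.create_task(
        backtest_worker(queue, statuses, run_job), name="backtest_worker"
    )
    await queue.put(("job-1", {}))
    await queue.put(("job-2", {"explode": True}))
    await queue.join()   # wait until both jobs are processed
    task.cancel()        # mirror shutdown at the end of the app lifespan
    try:
        await task
    except asyncio.CancelledError:
        pass
    return statuses


statuses = asyncio.run(main())
```

Catching Exception (not BaseException) means task cancellation still propagates, so the worker can be stopped cleanly on shutdown.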

Sweep Pipeline

run_parameter_search performs a grid search over backtest parameters:

1. Split events into train/test windows by time ratio.
2. Build the grid: the Cartesian product of theta_values × trade_capital_values × pair_universes × staleness_threshold_values.
3. For each parameter set:
   - Filter events to the pair universe and apply the staleness gate.
   - Build cycles restricted to the pair universe.
   - Run the engine on the train window → train_report.
   - Run the engine on the test window → test_report.
   - Score = realized_pnl + win_rate_bonus + fill_rate_bonus - max_drawdown.
   - Compute the generalization gap = |train_score - test_score| / max(train_score, test_score).
4. Evaluate promotion:
   - PromotionCriteria checks: minimum test P&L, win rate ≥ 0.5, fill rate ≥ 0.9, max drawdown ≤ $25, generalization gap ≤ 0.5.
   - Results passing all criteria are flagged promotion_ready.
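The grid construction, generalization gap, and promotion check described above can be sketched as follows. The thresholds come from the list above, except min_test_pnl, whose default here is a placeholder because no value is stated. The function names, dictionary keys, and the handling of a non-positive denominator are assumptions of this sketch and may not match sweep.py.

```python
from dataclasses import dataclass
from itertools import product


def build_grid(thetas, capitals, universes, staleness_values):
    # Cartesian product of the four parameter axes, one dict per combination.
    keys = ("theta", "trade_capital", "pair_universe", "staleness_threshold")
    combos = product(thetas, capitals, universes, staleness_values)
    return [dict(zip(keys, combo)) for combo in combos]


def generalization_gap(train_score: float, test_score: float) -> float:
    denom = max(train_score, test_score)
    if denom <= 0:
        # Assumption: treat a non-positive denominator as maximally unstable.
        return float("inf")
    return abs(train_score - test_score) / denom


@dataclass(frozen=True)
class PromotionCriteria:
    min_test_pnl: float = 0.0            # placeholder; no value given in the doc
    min_win_rate: float = 0.5
    min_fill_rate: float = 0.9
    max_drawdown_usd: float = 25.0
    max_generalization_gap: float = 0.5


def promotion_ready(c, *, test_pnl, win_rate, fill_rate, drawdown, gap) -> bool:
    # A result is promoted only if it passes every criterion.
    return (
        test_pnl >= c.min_test_pnl
        and win_rate >= c.min_win_rate
        and fill_rate >= c.min_fill_rate
        and drawdown <= c.max_drawdown_usd
        and gap <= c.max_generalization_gap
    )


grid = build_grid(
    thetas=[0.0005, 0.001],
    capitals=[50.0, 100.0],
    universes=[("BTC/USD", "ETH/USD", "ETH/BTC")],
    staleness_values=[500, 1000],
)
gap = generalization_gap(train_score=12.0, test_score=9.0)
ready = promotion_ready(
    PromotionCriteria(),
    test_pnl=9.0, win_rate=0.6, fill_rate=0.95, drawdown=4.0, gap=gap,
)
```

Here the grid has 2 × 2 × 1 × 2 = 8 parameter sets, and a train score of 12 against a test score of 9 gives a gap of 3 / 12 = 0.25, which is within the 0.5 limit.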

UI

See backtesting.html → partials/backtesting_panel.html.

- Shell page: loads the panel via hx-get="/dashboard/fragment/backtesting".
- Run form: starting balances, time range, and profit threshold (required); fee profile, slippage, and latency (advanced, collapsible).
- Status card: current job status and message.
- Recent jobs table: lists the last 20 jobs with status, events, trades, and P&L; each row has a detail button.
- Job detail: GET /dashboard/backtesting/job/{id} returns the report HTML.

Pairings are managed on the /dashboard/config/pairings page. When no symbols are specified, a backtest defaults to the pairings enabled in the DB.

Source Files

| File | Role |
| --- | --- |
| `backtesting/replay.py` | `ReplayClock`, `ReplayBookEvent`, `BacktestConfig`, `BacktestReport`, `_SimulatedRestClient`, `BacktestReplayEngine`, `load_replay_events`, `load_replay_events_from_db` |
| `backtesting/runner.py` | `run_backtest_job`, `backtest_worker`, `_build_cycles_from_events`, `_parse_balances` |
| `backtesting/sweep.py` | `SweepParameters`, `SweepResult`, `SweepArtifacts`, `PromotionCriteria`, `split_events_time_windows`, `build_parameter_grid`, `run_parameter_search`, `persist_sweep_results` |
