Arbitrade
Low-latency cryptocurrency arbitrage bot scaffold for Kraken.
Current stack:
- Python 3.12+
- FastAPI + HTMX/Jinja2
- DuckDB for dev/test/prod
- Native Kraken WebSocket planned for market-data hot path
- Gitea Actions + Gitea container registry
Project plan lives in PLAN.md. Task checklist lives in .github/instructions/TODO.md.
Current Status
Bootstrap complete for foundation layer:
- repo initialized
- typed settings and env loading
- structured logging
- encrypted secret helpers
- DuckDB connection + base schema
- FastAPI app with health endpoint
- Gitea Actions CI scaffold
- Docker / docker-compose scaffold
Not implemented yet:
- Kraken REST client
- Kraken native WebSocket client
- arbitrage detection engine
- trade execution
- dashboard beyond health/bootstrap page
Prerequisites
- Python 3.12+
- uv for env/package management
- Git
- Docker Desktop or Docker Engine
- Gitea account on git.allucanget.biz for push/CI/registry access
Optional:
- PowerShell 7 on Windows
Repository Setup
Clone repo:
git clone https://git.allucanget.biz/allucanget/arbitrade.git
Set-Location arbitrade
If repo already exists locally, confirm remote:
git remote -v
Expected origin:
https://git.allucanget.biz/allucanget/arbitrade.git
Local Development Setup
Create virtualenv with uv:
uv venv
Activate env on Windows:
.\.venv\Scripts\Activate.ps1
Install app + dev dependencies:
uv pip install -e .[dev]
Dependency source of truth:
- Runtime dependencies live in requirements/latest-runtime.in.
- Dev dependencies live in requirements/latest-dev.in.
- pyproject.toml reads both files dynamically during package install.
Create local env file:
Copy-Item .env.example .env
Minimum .env values:
APP_ENV=dev
APP_HOST=0.0.0.0
APP_PORT=8000
LOG_LEVEL=INFO
LOG_JSON=true
DUCKDB_PATH=./data/arbitrade.duckdb
FERNET_KEY=
KRAKEN_API_KEY=
KRAKEN_API_SECRET=
KRAKEN_API_KEY_PERMISSIONS=query,trade
Notes:
- Leave Kraken creds empty until Kraken integration lands.
- If Kraken creds are set, both key and secret are required.
- KRAKEN_API_KEY_PERMISSIONS must include query,trade and must not include withdrawal scope.
- FERNET_KEY optional. If empty, secret helper uses keyring-backed key generation.
- On Windows, app falls back to default asyncio loop. On non-Windows, uvloop installs automatically.
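The credential rules in the notes above can be expressed as a small validation function. This is a minimal sketch, not the project's actual settings code: the function name and the set of scope names treated as withdrawal access are assumptions made for illustration.

```python
from collections.abc import Mapping

REQUIRED_SCOPES = {"query", "trade"}
# Assumption: scope names that count as withdrawal access; the real list may differ.
FORBIDDEN_SCOPES = {"withdraw", "withdrawal"}


def validate_kraken_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the config passes."""
    problems: list[str] = []
    key = env.get("KRAKEN_API_KEY", "").strip()
    secret = env.get("KRAKEN_API_SECRET", "").strip()

    # Both empty is fine (integration not configured); exactly one set is an error.
    if bool(key) != bool(secret):
        problems.append("KRAKEN_API_KEY and KRAKEN_API_SECRET must be set together")

    raw = env.get("KRAKEN_API_KEY_PERMISSIONS", "")
    scopes = {part.strip().lower() for part in raw.split(",") if part.strip()}

    missing = REQUIRED_SCOPES - scopes
    if missing:
        problems.append(f"missing required scopes: {', '.join(sorted(missing))}")

    forbidden = FORBIDDEN_SCOPES & scopes
    if forbidden:
        problems.append(f"forbidden scopes present: {', '.join(sorted(forbidden))}")

    return problems
```

Passing os.environ (or a dict parsed from .env) returns every violation at once, which tends to be friendlier at startup than failing on the first one.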
Run App
Start app:
python -m arbitrade.main
Health endpoints:
- HTML: http://localhost:8000/
- JSON: http://localhost:8000/health
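For scripted checks (local smoke tests or post-deploy verification), the JSON health endpoint can be probed with the standard library. This sketch only inspects the HTTP status, because the response body shape is not documented here; the function name is hypothetical.

```python
import urllib.request


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), timeouts, and refused connections
        # are all OSError subclasses.
        return False
```

Call is_healthy() against the default local address, or pass the deployed base URL.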
Database
DuckDB used everywhere: local dev, tests, production.
Default database file:
./data/arbitrade.duckdb
Schema bootstrap runs automatically on app startup.
Current tables:
- schema_migrations
- opportunities
- trades
- portfolio_snapshots
Audit trail table:
- audit_events (append-only operational decision log)
Audit retention and compaction guidance:
- Keep at least 30 days of audit_events in active DB for incident triage.
- Archive older rows to a timestamped export file before deletion.
- Example monthly archive workflow:
COPY (
SELECT *
FROM audit_events
WHERE occurred_at < NOW() - INTERVAL 30 DAY
) TO 'data/audit_events_archive_YYYYMM.parquet' (FORMAT PARQUET);
DELETE FROM audit_events
WHERE occurred_at < NOW() - INTERVAL 30 DAY;
- Back up archive files and the main DuckDB file together.
- For production, run archive + backup as scheduled maintenance (cron/task scheduler).
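The example workflow calls NOW() in two separate statements. If they run as separate transactions, the cutoff can drift slightly between the COPY and the DELETE. One way to avoid that in a scheduled job is to compute the cutoff once and embed the same literal in both statements. The sketch below only builds the SQL strings. It assumes occurred_at compares cleanly against a TIMESTAMPTZ literal and that the archive file is named after the month of the run; both are assumptions, as is the function name.

```python
from datetime import datetime, timedelta, timezone


def build_archive_statements(now: datetime, retention_days: int = 30) -> tuple[str, str]:
    """Build COPY and DELETE statements that share one fixed cutoff.

    `now` must be timezone-aware; it is normalized to UTC.
    """
    now_utc = now.astimezone(timezone.utc)
    cutoff = now_utc - timedelta(days=retention_days)
    cutoff_literal = cutoff.strftime("%Y-%m-%d %H:%M:%S+00")
    # Assumption: YYYYMM in the archive name is the month the job runs in.
    archive_path = f"data/audit_events_archive_{now_utc:%Y%m}.parquet"

    copy_sql = (
        "COPY (SELECT * FROM audit_events "
        f"WHERE occurred_at < TIMESTAMPTZ '{cutoff_literal}') "
        f"TO '{archive_path}' (FORMAT PARQUET);"
    )
    delete_sql = (
        "DELETE FROM audit_events "
        f"WHERE occurred_at < TIMESTAMPTZ '{cutoff_literal}';"
    )
    return copy_sql, delete_sql
```

Run the COPY first and confirm the archive file exists before running the DELETE.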
Quality Checks
Run tests:
pytest -q
Run Ruff:
ruff check .
Run Black check:
black --check .
Run mypy:
mypy src
Run dependency vulnerability audit:
pip-audit -r requirements/latest-runtime.in
Run secret scan (worktree + git history):
python scripts/security_scan.py
Generate latency profile baseline:
python scripts/profile_latency.py --iterations 600 --output ops/performance/latency_baseline.json
Run latency regression guardrails:
python scripts/check_latency_regression.py --baseline ops/performance/latency_baseline.json --thresholds ops/performance/latency_thresholds.json --iterations 600
Install pre-commit hooks:
pre-commit install
Run hooks manually:
pre-commit run --all-files
Docker
Build locally:
docker build -t arbitrade:local .
Container dependency install flow:
- Docker installs runtime dependencies from requirements/latest-runtime.in.
- Docker then installs the package with --no-deps so dependency resolution is driven by requirements files.
Run with compose:
docker compose up --build
Compose mounts local data/ folder into container at /app/data.
Important:
- docker-compose.yml uses git.allucanget.biz/allucanget/arbitrade:latest as the default image reference.
Coolify Deployment (Nixpacks)
Use this when deploying directly from Git in Coolify without the Dockerfile path.
1) Create application in Coolify
- In Coolify, create a new Application from your Git repository.
- Branch: main (or your release branch).
- Build Pack: Nixpacks.
- Root Directory: .
2) Configure build and start behavior
Set these in Coolify application settings:
- Build Command: leave empty (let Nixpacks auto-detect Python).
- Install Command: leave empty (Nixpacks will install from pyproject.toml, which reads requirements/latest-runtime.in).
- Start Command: python -m arbitrade.main
- Port: 8000
3) Configure health check and networking
- Health Check Path: /health
- Exposed Port: 8000
- Use Coolify-generated domain or attach your own domain.
4) Configure persistent storage
Add a persistent volume in Coolify:
- Mount Path: /app/data
This preserves DuckDB and other runtime artifacts across restarts/redeploys.
5) Configure environment variables
Add runtime environment variables in Coolify (UI: Environment Variables):
- APP_ENV=prod
- APP_HOST=0.0.0.0
- APP_PORT=8000
- DUCKDB_PATH=/app/data/arbitrade.duckdb
- LOG_LEVEL=INFO
- LOG_JSON=true
- KRAKEN_API_KEY=...
- KRAKEN_API_SECRET=...
- KRAKEN_API_KEY_PERMISSIONS=query,trade
Recommended:
- Configure FERNET_KEY in Coolify secrets (do not commit it).
- Keep all exchange keys/secrets in Coolify secret variables only.
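A FERNET_KEY value can be generated locally and pasted into the Coolify secret. A Fernet key is 32 random bytes in URL-safe base64, which is also what Fernet.generate_key() in the cryptography package returns. The sketch below uses only the standard library and assumes the project's secret helper accepts a standard Fernet key; the function name is illustrative.

```python
import base64
import secrets


def generate_fernet_key() -> str:
    """Return a new Fernet-compatible key: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).decode("ascii")
```

Print the result once, store it in the secret manager, and avoid writing it to shell history or logs.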
6) Deploy and verify
- Trigger deploy in Coolify.
- Verify app boot logs show startup completed.
- Verify GET /health returns success on deployed URL.
Gitea CI / Registry Setup
CI file:
.gitea/workflows/ci.yml
Required Gitea Actions secrets:
- REGISTRY_USERNAME
- REGISTRY_TOKEN
- REGISTRY_NAMESPACE
Expected namespace now likely:
allucanget
Example registry login:
docker login git.allucanget.biz
Example pushed image tag shape:
git.allucanget.biz/allucanget/arbitrade:<tag>
Project Layout
arbitrade/
├── .gitea/workflows/ci.yml
├── .github/instructions/TODO.md
├── PLAN.md
├── pyproject.toml
├── src/arbitrade/
│ ├── api/
│ ├── config/
│ ├── storage/
│ ├── logging_setup.py
│ └── main.py
├── tests/
└── web/templates/
Next Work
Next planned implementation slice:
- Kraken REST client skeleton
- native Kraken WebSocket client
- in-memory order book cache
- latency instrumentation
Troubleshooting
PowerShell blocks activation script:
Set-ExecutionPolicy -Scope Process -ExecutionPolicy RemoteSigned
Then activate again:
.\.venv\Scripts\Activate.ps1
If app import fails, confirm editable install ran:
uv pip install -e .[dev]
If DuckDB file missing, start app once or create data/ directory manually.
Security Hardening
Threat model notes:
- Primary risk surfaces: environment secrets, dashboard auth credentials, exchange API key scope, and dependency supply chain.
- Assumed attacker model: leaked repository content, leaked CI logs/artifacts, or unauthorized dashboard access.
- High-impact outcomes to prevent: credential exfiltration, unauthorized withdrawals, and unsafe live-trading control changes.
Hardening checklist:
- Use least-privilege Kraken API keys: query + trade only; never enable withdrawal.
- Rotate API keys immediately if secret scan flags a potential exposure.
- Keep dashboard auth enabled in non-local environments and avoid default/shared credentials.
- Run pip-audit --skip-editable in CI; treat vulnerability findings as release blockers.
- Run python scripts/security_scan.py before release and after major merges.
- Store secrets in environment/secret manager; never commit .env or key material.
Performance Hardening
Profile scenarios:
- book_update_burst
- execution_spike
- reconnect_storm
Backtesting
Run a deterministic replay backtest from a JSONL event stream:
python scripts/backtest_replay.py --events path\to\replay.jsonl --starting-balances USD=1000.0
Replay event format:
{
"timestamp": "2026-06-01T12:00:00Z",
"symbol": "BTC/USD",
"bids": [[100.0, 1.0]],
"asks": [[101.0, 1.0]]
}
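In a JSONL file each event sits on a single line. A loader for the event shape above might look like the following sketch, which parses each line and sorts by timestamp. The ReplayEvent class, the load function, and the reading of each level as (price, size) are illustrative assumptions, not the project's actual types.

```python
import json
from collections.abc import Iterable
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ReplayEvent:  # hypothetical name, for illustration only
    timestamp: datetime
    symbol: str
    bids: list[tuple[float, float]]  # assumed to be (price, size) pairs
    asks: list[tuple[float, float]]


def parse_events(lines: Iterable[str]) -> list[ReplayEvent]:
    """Parse JSONL lines into events sorted by timestamp (stable for ties)."""
    events: list[ReplayEvent] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        raw = json.loads(line)
        # Normalize a trailing "Z" so fromisoformat accepts it on older Pythons.
        ts = datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00"))
        events.append(
            ReplayEvent(
                timestamp=ts,
                symbol=raw["symbol"],
                bids=[(float(p), float(q)) for p, q in raw["bids"]],
                asks=[(float(p), float(q)) for p, q in raw["asks"]],
            )
        )
    events.sort(key=lambda event: event.timestamp)
    return events
```

Passing an open file handle works directly, for example parse_events(open(path, encoding="utf-8")).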
Notes:
- Events are replayed in timestamp order.
- The replay engine reuses the production detector, pre-trade validation, trade limits, and execution sequencer.
- The simulated execution path applies configurable slippage and execution latency so reports include deterministic trade/miss statistics.
Latency baseline and threshold artifacts:
- ops/performance/latency_baseline.json
- ops/performance/latency_thresholds.json
CI guardrail:
.gitea/workflows/ci.yml runs scripts/check_latency_regression.py and fails on regression.
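The general idea behind a guardrail like this can be sketched as a comparison of fresh measurements against a baseline with an allowed margin. The data shapes below (scenario name mapped to a latency in milliseconds, plus a single percentage threshold) are hypothetical and do not describe the real schema of the baseline or thresholds files or the logic of scripts/check_latency_regression.py.

```python
def find_regressions(
    baseline_ms: dict[str, float],
    current_ms: dict[str, float],
    max_regression_pct: float,
) -> dict[str, tuple[float, float]]:
    """Return {scenario: (baseline, current)} for scenarios over the allowed margin."""
    failures: dict[str, tuple[float, float]] = {}
    for scenario, base in baseline_ms.items():
        current = current_ms.get(scenario)
        if current is None:
            continue  # scenario not measured in this run
        limit = base * (1 + max_regression_pct / 100)
        if current > limit:
            failures[scenario] = (base, current)
    return failures
```

A CI wrapper would typically exit non-zero when the returned dict is non-empty.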
Measured optimization impact (2026-06-01):
- MetricsCalculator.compute() switched from Python row scans to DuckDB SQL aggregates/quantiles.
- Benchmark (scripts/benchmark_metrics_compute.py):
  - Python scan avg: 12.623 ms
  - SQL aggregate avg: 11.039 ms
  - Speedup: 1.14x