From d5a94947dee6bc4abaa3be19ab85e01255f09dfd Mon Sep 17 00:00:00 2001
From: zwitschi
Date: Wed, 29 Apr 2026 18:25:53 +0200
Subject: [PATCH] feat: update documentation with project details, deployment instructions, and database concurrency management

Co-authored-by: Copilot
---
 docs/1-introduction-and-goals.md |   5 +-
 docs/5-building-block-view.md    |  24 +++----
 docs/7-deployment-view.md        | 108 +++++++++++++++++--------------
 docs/8-crosscutting-concepts.md  |  77 +++-------------------
 docs/8.1-openrouter.md           |  26 ++++++++
 docs/8.2-database.md             |  46 +++++++++++++
 6 files changed, 154 insertions(+), 132 deletions(-)
 create mode 100644 docs/8.1-openrouter.md
 create mode 100644 docs/8.2-database.md

diff --git a/docs/1-introduction-and-goals.md b/docs/1-introduction-and-goals.md
index 424eeeb..bc40fb6 100644
--- a/docs/1-introduction-and-goals.md
+++ b/docs/1-introduction-and-goals.md
@@ -4,7 +4,8 @@ Describes the relevant requirements and the driving forces that software archite
 
 ## Requirements Overview
 
-**Project name**: All You Can GET AI Biz
+**Project name**: All You Can GET AI
+**URL**: [https://ai.allucanget.biz](https://ai.allucanget.biz)
 **Purpose**: Provide AI‑powered text, image, and video generation services via a web application.
 
 Users can choose between different AI models for:
@@ -14,6 +15,8 @@ Users can choose between different AI models for:
 - Text‑to‑video generation
 - Image‑to‑video generation
 
+Users can create accounts, log in, and view their generation history in a gallery. An admin dashboard allows managing users, models, and video generation jobs.
+
 ## Quality Goals
 
 | Priority | Quality Goal | Scenario |
diff --git a/docs/5-building-block-view.md b/docs/5-building-block-view.md
index 5e6978b..a4364fe 100644
--- a/docs/5-building-block-view.md
+++ b/docs/5-building-block-view.md
@@ -5,21 +5,21 @@ Static decomposition of the system into building blocks (modules, components, su
 ## Level 1 – Whitebox Overall System
 
 ```text
-┌───────────────────────┐
-│   Frontend (Flask)    │
-└───────┬───────────────┘
+┌────────────────────────┐
+│   Frontend (Flask)     │
+└───────┬────────────────┘
         │ REST API calls
-┌───────▼───────────────┐
-│   FastAPI Backend     │
-│  ├─ Auth Service      │
-│  ├─ User Service      │
-│  ├─ AI Service        │
+┌───────▼────────────────┐
+│   FastAPI Backend      │
+│  ├─ Auth Service       │
+│  ├─ User Service       │
+│  ├─ AI Service         │
 │  └─ DB Service (DuckDB)│
-└───────┬───────────────┘
+└───────┬────────────────┘
         │ DB access
-┌───────▼───────────────┐
-│   DuckDB Database     │
-└───────────────────────┘
+┌───────▼────────────────┐
+│   DuckDB Database      │
+└────────────────────────┘
 ```
 
 **Motivation:** Separating the UI (Flask) from the API (FastAPI) allows independent scaling and testing of each layer.
diff --git a/docs/7-deployment-view.md b/docs/7-deployment-view.md
index 6c62bbc..461eb82 100644
--- a/docs/7-deployment-view.md
+++ b/docs/7-deployment-view.md
@@ -5,71 +5,79 @@ Describes:
 1. Technical infrastructure used to execute your system, with infrastructure elements like geographical locations, environments, computers, processors, channels and net topologies.
 2. Mapping of (software) building blocks to that infrastructure elements.
 
+**See**: [Coolify Deployment Guide](./deployment/coolify.md) for detailed instructions.
+
 ## Infrastructure Level 1
 
-```text
-┌────────────────────────────────────────────┐
-│                 Host / VM                  │
-│  ┌─────────────┐  ┌────────────────────┐   │
-│  │  frontend   │  │      backend       │   │
-│  │   (Flask)   │  │     (FastAPI)      │   │
-│  │   :12016    │  │       :12015       │   │
-│  └──────┬──────┘  └─────────┬──────────┘   │
-│         │                   │              │
-│         └────────┬──────────┘              │
-│                  │                         │
-│          ┌───────▼────────┐                │
-│          │  db (DuckDB)   │                │
-│          │  data/app.db   │                │
-│          └────────────────┘                │
-└────────────────────────────────────────────┘
+Hosted on a single VM running Docker containers, deployed via Coolify with Nixpacks to 192.168.88.18 for production.
+
+Containers run behind nginx at 192.168.88.11, which handles TLS termination and reverse proxying to the frontend on port 12016 and backend on port 12015. The database is a file on the host filesystem at `data/app.db` accessed by the backend service.
+
+```mermaid
+graph TD
+    Users[Users / Internet]
+    Nginx[nginx reverse proxy\nTLS termination]
+    Users -->|HTTPS| Nginx
+
+    subgraph Coolify Server
+        direction TB
+        subgraph AI Frontend
+            AI_Frontend[AI Frontend\nFlask\nServes HTML/CSS/JS UI]
+        end
+        subgraph AI Backend
+            AI_Backend[AI Backend\nFastAPI\nCommunicates with openrouter.ai API]
+            db[(DuckDB Database\nFile: data/app.db)]
+            AI_Backend --> db
+        end
+        AI_Frontend -->|BACKEND_URL:12015| AI_Backend
+    end
+    Nginx -->|12016| AI_Frontend
 ```
 
-**Motivation:** All three components run on a single VM (or as Docker containers) for simplicity and low operational overhead.
+**Motivation:** All three components run as Docker containers for simplicity and low operational overhead.
 
 **Quality and/or Performance Features:** The frontend and backend are stateless; DuckDB persists data on the host filesystem.
 
 **Mapping of Building Blocks to Infrastructure:**
 
-| Building Block  | Container / Process          | Port  |
-| --------------- | ---------------------------- | ----- |
-| Flask frontend  | `frontend`                   | 12016 |
-| FastAPI backend | `backend`                    | 12015 |
-| DuckDB          | File on host (`data/app.db`) | —     |
+| Building Block  | Container / Process          | Port            |
+| --------------- | ---------------------------- | --------------- |
+| Nginx           | `nginx`                      | 80/443 (public) |
+| Coolify Server  | `coolify`                    | —               |
+| Flask frontend  | `frontend`                   | 12016           |
+| FastAPI backend | `backend`                    | 12015           |
+| DuckDB          | File on host (`data/app.db`) | —               |
 
 ## Infrastructure Level 2
 
 ### Coolify with Nixpacks (Production)
 
-Both services are deployed as separate Nixpacks resources in Coolify:
+Both services are deployed as separate Nixpacks resources in Coolify, which results in two separate containers running on the same host. The database is a file on the host filesystem, mounted as a volume in the backend container.
 
-```text
-┌──────────────────────────────────────────────────────────┐
-│                      Coolify Server                      │
-│  ┌────────────────────────────┐                          │
-│  │ Backend Service (FastAPI)  │                          │
-│  │ - Base Dir: /backend       │                          │
-│  │ - Port: 12015              │                          │
-│  │ - Volume: /app/data        │                          │
-│  ├────────────────────────────┤                          │
-│  │ Frontend Service (Flask)   │                          │
-│  │ - Base Dir: /frontend      │                          │
-│  │ - Port: 12016 (public)     │                          │
-│  │ - BACKEND_URL: :12015      │                          │
-│  └────────────────────────────┘                          │
-│                ▲                                         │
-│  Coolify reverse proxy (TLS termination)                 │
-└──────────────────────────────────────────────────────────┘
-                 │
-         Users / Internet
+#### Frontend
+
+```mermaid
+graph TD
+    subgraph Coolify Server
+        direction TB
+        subgraph AI Frontend
+            AI_Frontend[AI Frontend\nNixpacks\nBase Dir: /frontend]
+        end
+    end
+    Users[Users / Internet] -->|HTTPS| AI_Frontend
 ```
 
-**Deployment Steps:**
+#### Backend
 
-1. Create backend Nixpacks service in Coolify with Base Directory `/backend`
-2. Create frontend Nixpacks service with Base Directory `/frontend`
-3. Set environment variables per service
-4. Attach domain to frontend on port `12016`
-5. Enable Auto HTTPS in Coolify
-
-**See**: [Coolify Deployment Guide](./deployment/coolify.md) for detailed instructions.
+```mermaid
+graph TD
+    subgraph Coolify Server
+        direction TB
+        subgraph AI Backend
+            AI_Backend[AI Backend\nNixpacks\nBase Dir: /backend]
+            db[(DuckDB Database\nVolume: /app/data)]
+            AI_Backend --> db
+        end
+    end
+    Frontend[Frontend Container] -->|BACKEND_URL:12015| AI_Backend
+```
diff --git a/docs/8-crosscutting-concepts.md b/docs/8-crosscutting-concepts.md
index 09aece1..bb3ebe5 100644
--- a/docs/8-crosscutting-concepts.md
+++ b/docs/8-crosscutting-concepts.md
@@ -4,6 +4,14 @@ Describes crosscutting concepts (practices, patterns, regulations or solution id
 
 > Pick **only** the most-needed topics for your system.
 
+## OpenRouter API Integration
+
+See [docs/8.1-openrouter.md](./8.1-openrouter.md) for details on how the backend integrates with OpenRouter for multi-modal AI generation, including image and video generation flows.
+
+## DuckDB Concurrency and Storage
+
+See [docs/8.2-database.md](./8.2-database.md) for details on how the backend handles concurrent access to DuckDB and manages the database file on the host filesystem.
+
 ## Security
 
 - All API endpoints (except `/auth/login`) require a valid JWT in the `Authorization: Bearer` header.
@@ -25,72 +33,3 @@ Describes crosscutting concepts (practices, patterns, regulations or solution id
 
 - All secrets (API keys, DB path, JWT secret) loaded from environment variables or `.env` file.
 - No secrets committed to source control.
-
-## DuckDB Concurrency and Storage
-
-### Single Writer Per Process
-
-DuckDB allows only one process to open the database file in read-write mode at a time. The FastAPI backend must be run with a single worker (`uvicorn --workers 1`). Running multiple workers against the same DuckDB file will cause startup errors.
-
-### asyncio.Lock for Writes
-
-All database write operations (`INSERT`, `UPDATE`, `DELETE`) in the FastAPI async context are wrapped in a single `asyncio.Lock` (`get_write_lock()` from `backend/app/db.py`). This prevents concurrent coroutines from issuing overlapping writes within the single process, which would otherwise raise DuckDB optimistic concurrency errors.
-
-Read operations (`SELECT`) do not require the lock — DuckDB's MVCC provides consistent read snapshots.
-
-### Schema
-
-```sql
-CREATE TABLE users (
-    id            UUID DEFAULT uuid() PRIMARY KEY,
-    email         VARCHAR NOT NULL UNIQUE,
-    password_hash VARCHAR NOT NULL,
-    role          VARCHAR DEFAULT 'user',
-    created_at    TIMESTAMP DEFAULT now(),
-    updated_at    TIMESTAMP DEFAULT now()
-);
-
-CREATE TABLE refresh_tokens (
-    jti        UUID DEFAULT uuid() PRIMARY KEY,
-    user_id    UUID NOT NULL,  -- soft FK to users.id
-    issued_at  TIMESTAMP DEFAULT now(),
-    expires_at TIMESTAMP NOT NULL,
-    revoked    BOOLEAN DEFAULT false
-);
-```
-
-> The `REFERENCES users(id)` foreign key is intentionally omitted from `refresh_tokens`. DuckDB fires FK checks on `UPDATE` of the parent table (including email changes), causing false constraint violations. Referential integrity is enforced manually: deleting a user also deletes their refresh tokens in the same write transaction.
-
-### Access Tokens
-
-Access tokens are **stateless** JWTs — not stored in the database. They are validated by signature and expiry claim only. The short TTL (15 minutes) limits the blast radius if a token is leaked.
-
-### Refresh Tokens
-
-Refresh tokens store a JTI (JWT ID) UUID in the `refresh_tokens` table. On each use the old JTI is revoked and a new one issued (rotation). On logout the JTI is immediately revoked. Expired and revoked tokens can be purged via `POST /admin/tokens/purge`.
-
-### Future: AI Generation History
-
-AI generation metadata (model, prompt, cost, result URLs) can be stored as JSON columns in a future `generation_history` table in DuckDB, enabling per-user analytics and usage dashboards at zero extra infrastructure cost.
-
-## OpenRouter API Integration
-
-### Image Generation
-
-Image generation uses two different OpenRouter endpoints depending on the model:
-
-- **Legacy endpoint** (`/images/generations`): Used by DALL-E 3 and similar models. Returns `data[].url` and `data[].b64_json`.
-- **Chat completions** (`/chat/completions` with `modalities: ["image"]`): Used by FLUX.2 Klein 4B and GPT-5 Image Mini. Returns `choices[0].message.images[].image_url.url` as base64 data URLs.
-
-The router auto-detects the model type and routes accordingly. Image configuration (`aspect_ratio`, `image_size`) is passed via `image_config` for chat-based models.
-
-### Video Generation
-
-Video generation uses OpenRouter's `/api/v1/videos` endpoint with a **submit-and-poll** pattern:
-
-1. `POST /api/v1/videos` with `model`, `prompt`, `aspect_ratio`, `resolution`, `duration_seconds`
-2. Response: `{"id": "job_id", "polling_url": "https://..."}` with `status: "queued"`
-3. Poll `GET polling_url` every 5 seconds until `status` is `"completed"` or `"failed"`
-4. Completed response includes `unsigned_urls: [str]` array with video download URLs
-
-Supported models: `openai/sora-2-pro`, `google/veo-3.1-fast`. Both text-to-video and image-to-video use the same `/api/v1/videos` endpoint (image-to-video includes `image_url` in the request body).
diff --git a/docs/8.1-openrouter.md b/docs/8.1-openrouter.md
new file mode 100644
index 0000000..0e21a19
--- /dev/null
+++ b/docs/8.1-openrouter.md
@@ -0,0 +1,26 @@
+# OpenRouter API Integration
+
+## Text Generation
+
+> [!warning]
+> TODO: Add more details on how the backend integrates with OpenRouter for text generation, including chat completions and single-prompt generation flows.
+
+## Image Generation
+
+Image generation uses two different OpenRouter endpoints depending on the model:
+
+- **Legacy endpoint** (`/images/generations`): Used by DALL-E 3 and similar models. Returns `data[].url` and `data[].b64_json`.
+- **Chat completions** (`/chat/completions` with `modalities: ["image"]`): Used by FLUX.2 Klein 4B and GPT-5 Image Mini. Returns `choices[0].message.images[].image_url.url` as base64 data URLs.
+
+The router auto-detects the model type and routes accordingly. Image configuration (`aspect_ratio`, `image_size`) is passed via `image_config` for chat-based models.
+
+## Video Generation
+
+Video generation uses OpenRouter's `/api/v1/videos` endpoint with a **submit-and-poll** pattern:
+
+1. `POST /api/v1/videos` with `model`, `prompt`, `aspect_ratio`, `resolution`, `duration_seconds`
+2. Response: `{"id": "job_id", "polling_url": "https://..."}` with `status: "queued"`
+3. Poll `GET polling_url` every 5 seconds until `status` is `"completed"` or `"failed"`
+4. Completed response includes `unsigned_urls: [str]` array with video download URLs
+
+Supported models: `openai/sora-2-pro`, `google/veo-3.1-fast`. Both text-to-video and image-to-video use the same `/api/v1/videos` endpoint (image-to-video includes `image_url` in the request body).
diff --git a/docs/8.2-database.md b/docs/8.2-database.md
new file mode 100644
index 0000000..8aa0772
--- /dev/null
+++ b/docs/8.2-database.md
@@ -0,0 +1,46 @@
+# DuckDB Concurrency and Storage
+
+## Single Writer Per Process
+
+DuckDB allows only one process to open the database file in read-write mode at a time. The FastAPI backend must be run with a single worker (`uvicorn --workers 1`). Running multiple workers against the same DuckDB file will cause startup errors.
+
+## asyncio.Lock for Writes
+
+All database write operations (`INSERT`, `UPDATE`, `DELETE`) in the FastAPI async context are wrapped in a single `asyncio.Lock` (`get_write_lock()` from `backend/app/db.py`). This prevents concurrent coroutines from issuing overlapping writes within the single process, which would otherwise raise DuckDB optimistic concurrency errors.
+
+Read operations (`SELECT`) do not require the lock — DuckDB's MVCC provides consistent read snapshots.
+
+## Schema
+
+```sql
+CREATE TABLE users (
+    id            UUID DEFAULT uuid() PRIMARY KEY,
+    email         VARCHAR NOT NULL UNIQUE,
+    password_hash VARCHAR NOT NULL,
+    role          VARCHAR DEFAULT 'user',
+    created_at    TIMESTAMP DEFAULT now(),
+    updated_at    TIMESTAMP DEFAULT now()
+);
+
+CREATE TABLE refresh_tokens (
+    jti        UUID DEFAULT uuid() PRIMARY KEY,
+    user_id    UUID NOT NULL,  -- soft FK to users.id
+    issued_at  TIMESTAMP DEFAULT now(),
+    expires_at TIMESTAMP NOT NULL,
+    revoked    BOOLEAN DEFAULT false
+);
+```
+
+> The `REFERENCES users(id)` foreign key is intentionally omitted from `refresh_tokens`. DuckDB fires FK checks on `UPDATE` of the parent table (including email changes), causing false constraint violations. Referential integrity is enforced manually: deleting a user also deletes their refresh tokens in the same write transaction.
+
+## Access Tokens
+
+Access tokens are **stateless** JWTs — not stored in the database. They are validated by signature and expiry claim only. The short TTL (15 minutes) limits the blast radius if a token is leaked.
+
+## Refresh Tokens
+
+Refresh tokens store a JTI (JWT ID) UUID in the `refresh_tokens` table. On each use the old JTI is revoked and a new one issued (rotation). On logout the JTI is immediately revoked. Expired and revoked tokens can be purged via `POST /admin/tokens/purge`.
+
+## Future: AI Generation History
+
+AI generation metadata (model, prompt, cost, result URLs) can be stored as JSON columns in a future `generation_history` table in DuckDB, enabling per-user analytics and usage dashboards at zero extra infrastructure cost.
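
The write-serialization rule in the "asyncio.Lock for Writes" section of `docs/8.2-database.md` can be illustrated with a minimal sketch. The patch names `get_write_lock()` but does not show its body, so the lazy-initialisation version below is an assumption, and `FakeConnection`, `create_user`, and `main` are hypothetical stand-ins that only simulate a DuckDB write so the example runs without a database.

```python
import asyncio
from typing import List, Optional

# Hypothetical stand-in for the helper in backend/app/db.py;
# the real implementation is not shown in this patch.
_write_lock: Optional[asyncio.Lock] = None


def get_write_lock() -> asyncio.Lock:
    """Return the single process-wide write lock, creating it lazily."""
    global _write_lock
    if _write_lock is None:
        _write_lock = asyncio.Lock()
    return _write_lock


class FakeConnection:
    """Illustrative stand-in for a DuckDB connection that counts overlapping writers."""

    def __init__(self) -> None:
        self.rows: List[str] = []
        self.active_writers = 0
        self.max_concurrent_writers = 0

    async def insert(self, value: str) -> None:
        self.active_writers += 1
        self.max_concurrent_writers = max(
            self.max_concurrent_writers, self.active_writers
        )
        await asyncio.sleep(0)  # yield to the event loop in the middle of the write
        self.rows.append(value)
        self.active_writers -= 1


async def create_user(conn: FakeConnection, email: str) -> None:
    # Every simulated INSERT/UPDATE/DELETE goes through the same lock.
    async with get_write_lock():
        await conn.insert(email)


async def main() -> FakeConnection:
    conn = FakeConnection()
    # Five coroutines try to write at once; the lock lets them in one at a time.
    await asyncio.gather(
        *(create_user(conn, f"user{i}@example.com") for i in range(5))
    )
    return conn


conn = asyncio.run(main())
print(conn.max_concurrent_writers, len(conn.rows))
```

With the lock in place, `max_concurrent_writers` never exceeds 1 even though all five coroutines are scheduled together. Removing the `async with` line lets the simulated writes overlap, which is the situation the documentation says would surface as DuckDB concurrency errors. Reads would simply skip the lock, as the section describes.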