Files
ai.allucanget.biz/docs/9-architectural-decisions.md
T

114 lines
6.9 KiB
Markdown

# 9. Architecture Decisions
Important, expensive, large scale or risky architecture decisions including rationales. With "decisions" we mean selecting one alternative based on given criteria.
Refer to section 4 (Solution Strategy) where the most important decisions are already captured. Avoid redundancy.
> Consider using [ADRs (Architecture Decision Records)](https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions) for every important decision.
## ADR-001: Use DuckDB as the embedded database
**Status:** accepted
**Context:** The application needs persistent storage for user data. A full RDBMS (PostgreSQL, MySQL) would require a separate server process and adds operational complexity.
**Decision:** Use DuckDB as an embedded, file-based database accessed in-process by the FastAPI backend.
**Consequences:** No separate DB server needed. Limited to single-writer access patterns. Suitable for the expected load.
---
## ADR-002: Use FastAPI for the backend
**Status:** accepted
**Context:** The backend needs async performance for concurrent AI generation requests and automatic OpenAPI documentation.
**Decision:** Use FastAPI with uvicorn as the ASGI server.
**Consequences:** Async endpoints enable high concurrency. Auto-generated OpenAPI docs simplify frontend integration and testing.
---
## ADR-003: Use Flask for the frontend
**Status:** accepted
**Context:** A lightweight server-side rendering layer is needed for the UI with minimal frontend complexity.
**Decision:** Use Flask with Jinja2 templates to serve HTML pages.
**Consequences:** Simple, familiar framework. No JavaScript build toolchain required. Frontend calls FastAPI over HTTP.
---
## ADR-004: Serialize DuckDB writes with asyncio.Lock
**Status:** accepted
**Context:** FastAPI runs async coroutines concurrently in a single process. DuckDB's optimistic concurrency model raises errors when multiple coroutines issue writes simultaneously to the same connection.
**Decision:** All write operations (`INSERT`, `UPDATE`, `DELETE`) acquire a single process-wide `asyncio.Lock` before executing. The lock is released immediately after the statement completes.
**Consequences:** Writes are serialised within the process, eliminating concurrency errors. Read performance is unaffected. Throughput is acceptable for the expected user load. If write throughput becomes a bottleneck in future, migrating to PostgreSQL is the preferred path.
---
## ADR-005: Use OpenRouter as the unified AI provider gateway
**Status:** accepted
**Context:** The application needs access to multiple AI model providers (OpenAI, Anthropic, Stability AI, Runway, etc.) for text, image, and video generation.
**Decision:** Route all AI generation requests through the [OpenRouter](https://openrouter.ai) API, which exposes an OpenAI-compatible REST interface for hundreds of models.
**Consequences:** Single API key and base URL for all model providers. Model switching requires only a change to the `model` field in the request payload. If OpenRouter is unavailable, all generation endpoints return `502 Bad Gateway`. Pricing and rate limits are governed by OpenRouter's policies per model.
---
## ADR-006: Use submit-and-poll pattern for video generation
**Status:** accepted
**Context:** OpenRouter's video generation models (Sora 2 Pro, Veo 3.1 Fast) do not return video URLs immediately. Video generation is a long-running operation (typically 30-120 seconds) that requires polling.
**Decision:** Use the `/api/v1/videos` endpoint with a two-step pattern: (1) `POST` to submit the job and receive a `polling_url`, (2) `GET` the `polling_url` every 5 seconds until `status` is `"completed"` or `"failed"`. The Flask frontend proxies polling requests via `GET /generate/video/status?polling_url=...` and the frontend JavaScript polls this endpoint automatically.
**Consequences:** The video generation endpoint returns immediately with `status: "queued"` and a `polling_url`. The frontend displays a "Processing..." message and polls for updates. When complete, the video is displayed in a `<video>` element. This adds complexity to the frontend but is necessary for long-running operations. If OpenRouter's polling endpoint is unavailable, the frontend shows an error after a timeout.
---
## ADR-007: Auto-detect image generation model type
**Status:** accepted
**Context:** OpenRouter supports image generation through two different endpoints: the legacy `/images/generations` endpoint (DALL-E 3) and the chat completions endpoint with `modalities: ["image"]` (FLUX.2 Klein 4B, GPT-5 Image Mini). These endpoints have different request/response formats.
**Decision:** The `/generate/image` router auto-detects the model type by checking if the model slug contains `"flux"` or `"gpt-5-image-mini"`. If so, it routes to `/chat/completions` with `modalities: ["image"]` and `image_config` (aspect_ratio, image_size). Otherwise, it uses `/images/generations` with `size` and `n`.
**Consequences:** Users can specify any image generation model in the form without needing to know which endpoint it uses. The router handles the routing transparently. Adding new image models requires only updating the detection logic if they use a different endpoint.
---
## ADR-008: Flask session-based auth with role caching
**Status:** accepted
**Context:** The Flask frontend needs to know the user's authentication state and role for route protection (`@login_required`, `@admin_required`) without making an extra API call on every request.
**Decision:** Store the JWT access token, refresh token, user email, and user role in the Flask server-side session cookie after login. The `@login_required` decorator checks for `access_token` in the session. The `@admin_required` decorator checks `session["user_role"] == "admin"`. This avoids an extra API call to `/users/me` on every request.
**Consequences:** The user role is cached in the session and may become stale if an admin changes a user's role while the user is logged in. The user must log out and log back in to see the updated role. This is acceptable for the expected usage pattern. The session cookie is signed (Flask's default) to prevent tampering.
---
## ADR-009: Separate generation pages in frontend
**Status:** accepted
**Context:** The original `/generate` page handled text, image, and video generation in a single form, which became unwieldy as more generation types were added.
**Decision:** Create separate Flask routes and Jinja2 templates for each generation type: `/generate/text`, `/generate/image`, `/generate/video`. The `/generate` route redirects to `/generate/text`. The navigation bar includes a "Generate" dropdown with links to each sub-page. The video page uses tabs for text-to-video and image-to-video.
**Consequences:** Each generation type has its own URL, making it bookmarkable and shareable. The navigation is clearer with a dropdown menu. Adding new generation types (e.g., audio) follows the same pattern. The `/generate` redirect provides a sensible default entry point.