Files
ai.allucanget.biz/docs/9-architectural-decisions.md

6.9 KiB

9. Architecture Decisions

Important, expensive, large scale or risky architecture decisions including rationales. With "decisions" we mean selecting one alternative based on given criteria.

Refer to section 4 (Solution Strategy) where the most important decisions are already captured. Avoid redundancy.

Consider using ADRs (Architecture Decision Records) for every important decision.

ADR-001: Use DuckDB as the embedded database

Status: accepted

Context: The application needs persistent storage for user data. A full RDBMS (PostgreSQL, MySQL) would require a separate server process and adds operational complexity.

Decision: Use DuckDB as an embedded, file-based database accessed in-process by the FastAPI backend.

Consequences: No separate DB server needed. Limited to single-writer access patterns. Suitable for the expected load.


ADR-002: Use FastAPI for the backend

Status: accepted

Context: The backend needs async performance for concurrent AI generation requests and automatic OpenAPI documentation.

Decision: Use FastAPI with uvicorn as the ASGI server.

Consequences: Async endpoints enable high concurrency. Auto-generated OpenAPI docs simplify frontend integration and testing.


ADR-003: Use Flask for the frontend

Status: accepted

Context: A lightweight server-side rendering layer is needed for the UI with minimal frontend complexity.

Decision: Use Flask with Jinja2 templates to serve HTML pages.

Consequences: Simple, familiar framework. No JavaScript build toolchain required. Frontend calls FastAPI over HTTP.


ADR-004: Serialize DuckDB writes with asyncio.Lock

Status: accepted

Context: FastAPI runs async coroutines concurrently in a single process. DuckDB's optimistic concurrency model raises errors when multiple coroutines issue writes simultaneously to the same connection.

Decision: All write operations (INSERT, UPDATE, DELETE) acquire a single process-wide asyncio.Lock before executing. The lock is released immediately after the statement completes.

Consequences: Writes are serialised within the process, eliminating concurrency errors. Read performance is unaffected. Throughput is acceptable for the expected user load. If write throughput becomes a bottleneck in future, migrating to PostgreSQL is the preferred path.


ADR-005: Use OpenRouter as the unified AI provider gateway

Status: accepted

Context: The application needs access to multiple AI model providers (OpenAI, Anthropic, Stability AI, Runway, etc.) for text, image, and video generation.

Decision: Route all AI generation requests through the OpenRouter API, which exposes an OpenAI-compatible REST interface for hundreds of models.

Consequences: Single API key and base URL for all model providers. Model switching requires only a change to the model field in the request payload. If OpenRouter is unavailable, all generation endpoints return 502 Bad Gateway. Pricing and rate limits are governed by OpenRouter's policies per model.


ADR-006: Use submit-and-poll pattern for video generation

Status: accepted

Context: OpenRouter's video generation models (Sora 2 Pro, Veo 3.1 Fast) do not return video URLs immediately. Video generation is a long-running operation (typically 30-120 seconds) that requires polling.

Decision: Use the /api/v1/videos endpoint with a two-step pattern: (1) POST to submit the job and receive a polling_url, (2) GET the polling_url every 5 seconds until status is "completed" or "failed". The Flask frontend proxies polling requests via GET /generate/video/status?polling_url=... and the frontend JavaScript polls this endpoint automatically.

Consequences: The video generation endpoint returns immediately with status: "queued" and a polling_url. The frontend displays a "Processing..." message and polls for updates. When complete, the video is displayed in a <video> element. This adds complexity to the frontend but is necessary for long-running operations. If OpenRouter's polling endpoint is unavailable, the frontend shows an error after a timeout.


ADR-007: Auto-detect image generation model type

Status: accepted

Context: OpenRouter supports image generation through two different endpoints: the legacy /images/generations endpoint (DALL-E 3) and the chat completions endpoint with modalities: ["image"] (FLUX.2 Klein 4B, GPT-5 Image Mini). These endpoints have different request/response formats.

Decision: The /generate/image router auto-detects the model type by checking if the model slug contains "flux" or "gpt-5-image-mini". If so, it routes to /chat/completions with modalities: ["image"] and image_config (aspect_ratio, image_size). Otherwise, it uses /images/generations with size and n.

Consequences: Users can specify any image generation model in the form without needing to know which endpoint it uses. The router handles the routing transparently. Adding new image models requires only updating the detection logic if they use a different endpoint.


ADR-008: Flask session-based auth with role caching

Status: accepted

Context: The Flask frontend needs to know the user's authentication state and role for route protection (@login_required, @admin_required) without making an extra API call on every request.

Decision: Store the JWT access token, refresh token, user email, and user role in the Flask server-side session cookie after login. The @login_required decorator checks for access_token in the session. The @admin_required decorator checks session["user_role"] == "admin". This avoids an extra API call to /users/me on every request.

Consequences: The user role is cached in the session and may become stale if an admin changes a user's role while the user is logged in. The user must log out and log back in to see the updated role. This is acceptable for the expected usage pattern. The session cookie is signed (Flask's default) to prevent tampering.


ADR-009: Separate generation pages in frontend

Status: accepted

Context: The original /generate page handled text, image, and video generation in a single form, which became unwieldy as more generation types were added.

Decision: Create separate Flask routes and Jinja2 templates for each generation type: /generate/text, /generate/image, /generate/video. The /generate route redirects to /generate/text. The navigation bar includes a "Generate" dropdown with links to each sub-page. The video page uses tabs for text-to-video and image-to-video.

Consequences: Each generation type has its own URL, making it bookmarkable and shareable. The navigation is clearer with a dropdown menu. Adding new generation types (e.g., audio) follows the same pattern. The /generate redirect provides a sensible default entry point.