diff --git a/docs/5-building-block-view.md b/docs/5-building-block-view.md
index d34062b..5e6978b 100644
--- a/docs/5-building-block-view.md
+++ b/docs/5-building-block-view.md
@@ -75,14 +75,19 @@ Operational endpoints for application management.
 
 Model listing and multi-modal generation via openrouter.ai.
 
-| Method | Path                         | Auth required | Description |
-| ------ | ---------------------------- | ------------- | ------------------------------------------------------ |
-| GET    | `/ai/models`                 | ✓             | List available OpenRouter models |
-| POST   | `/ai/chat`                   | ✓             | Multi-turn chat completion |
-| POST   | `/generate/text`             | ✓             | Single-prompt text generation (optional system prompt) |
-| POST   | `/generate/image`            | ✓             | Text-to-image generation |
-| POST   | `/generate/video`            | ✓             | Text-to-video generation |
-| POST   | `/generate/video/from-image` | ✓             | Image-to-video generation |
+| Method | Path                         | Auth required | Description |
+| ------ | ---------------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------- |
+| GET    | `/ai/models`                 | ✓             | List available OpenRouter models |
+| POST   | `/ai/chat`                   | ✓             | Multi-turn chat completion |
+| POST   | `/generate/text`             | ✓             | Single-prompt text generation (optional system prompt) |
+| POST   | `/generate/image`            | ✓             | Text-to-image (DALL-E via `/images/generations` or FLUX/GPT-5 Image Mini via `/chat/completions` with `modalities`) |
+| POST   | `/generate/video`            | ✓             | Text-to-video (Sora 2 Pro, Veo 3.1 Fast) — returns `polling_url` |
+| POST   | `/generate/video/from-image` | ✓             | Image-to-video — returns `polling_url` |
+| GET    | `/generate/video/status`     | ✓             | Poll video generation status via `polling_url` |
+
+**Video generation flow:** The `/generate/video` and `/generate/video/from-image` endpoints submit a job to OpenRouter's `/api/v1/videos` endpoint and return immediately with `status: "queued"` and a `polling_url`. Clients poll `/generate/video/status?polling_url=...` every 5 seconds until `status` is `"completed"` (returns `unsigned_urls`) or `"failed"`.
+
+**Image generation routing:** The router auto-detects the model type — models containing `"flux"` or `"gpt-5-image-mini"` are routed to `/chat/completions` with `modalities: ["image"]`, while others (e.g. DALL-E 3) use the legacy `/images/generations` endpoint.
 
 ### White Box DB Service (`db.py`)
 
diff --git a/docs/6-runtime-view.md b/docs/6-runtime-view.md
index b12a110..342ee23 100644
--- a/docs/6-runtime-view.md
+++ b/docs/6-runtime-view.md
@@ -28,30 +28,42 @@ Describes concrete behavior and interactions of the system's building blocks in
 
 ## Scenario 3: Image Generation
 
-1. User submits image generation form
-2. Flask POSTs to `POST /generate/image`
-3. AI Service calls openrouter.ai image model
-4. Image URL returned to Flask
-5. Flask renders page with generated image
+1. User submits image generation form with prompt, model, size, aspect ratio, and resolution
+2. Flask POSTs to `POST /generate/image` with JWT header
+3. Router auto-detects model type:
+   - **FLUX / GPT-5 Image Mini**: calls `/chat/completions` with `modalities: ["image"]` and `image_config`
+   - **DALL-E 3**: calls `/images/generations` with `size` and `n`
+4. Image URL (base64 data URL or hosted URL) returned to Flask
+5. Flask renders page with generated image(s)
+
+## Scenario 3a: Image Generation with Aspect Ratio & Resolution
+
+1. User selects aspect ratio (e.g. `16:9`) and resolution (`2K`) on the image generation form
+2. Flask POSTs `aspect_ratio` and `image_size` to `POST /generate/image`
+3. Backend passes these as `image_config` to the chat completions endpoint (for FLUX/GPT-5 Image Mini)
+4. Generated image respects the requested aspect ratio and resolution
 
 ## Scenario 4: Video Generation (Text-to-Video)
 
-1. User submits video generation form with prompt and model selection
+1. User submits video generation form with prompt, model, aspect ratio, resolution, and duration
 2. Flask POSTs to `POST /generate/video` with JWT header
 3. Auth Service validates JWT
-4. AI Service calls OpenRouter `/video/generations`
-5. OpenRouter returns a job response (`status: "queued"` or `"completed"`)
-6. FastAPI returns `VideoResponse` to Flask
-7. Flask renders result page; if status is `queued`, the UI may poll or notify asynchronously
+4. Backend calls OpenRouter `POST /api/v1/videos` with model, prompt, aspect_ratio, resolution, duration_seconds
+5. OpenRouter returns `{"id": "...", "polling_url": "..."}` with `status: "queued"`
+6. FastAPI returns `VideoResponse` with `polling_url` to Flask
+7. Flask renders result page with polling UI
+8. Frontend JavaScript polls `GET /generate/video/status?polling_url=...` every 5 seconds
+9. When `status` becomes `"completed"`, the response includes `unsigned_urls` — the video is displayed in a `