# OpenRouter API Integration

## Text Generation

> [!warning]
> TODO: Add more details on how the backend integrates with OpenRouter for text generation, including chat completions and single-prompt generation flows.

## Image Generation

Image generation uses two different OpenRouter endpoints depending on the model:

- **Legacy endpoint** (`/images/generations`): Used by DALL-E 3 and similar models. Returns `data[].url` and `data[].b64_json`.
- **Chat completions** (`/chat/completions` with `modalities: ["image"]`): Used by FLUX.2 Klein 4B and GPT-5 Image Mini. Returns `choices[0].message.images[].image_url.url` as base64 data URLs.

The router auto-detects the model type and routes accordingly. Image configuration (`aspect_ratio`, `image_size`) is passed via `image_config` for chat-based models.
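
A minimal sketch of that routing, assuming a `requests`-based backend. The endpoints, response fields, `modalities`, and `image_config` come from the notes above; the `generate_image` helper, the allowlist-based detection, and the exact model slugs are illustrative assumptions, not the backend's actual code.

```python
import base64
import requests

OPENROUTER_BASE = "https://openrouter.ai/api/v1"

# Assumed allowlist: the real backend's detection logic may differ, and these
# model slugs are illustrative guesses for the models named above.
CHAT_IMAGE_MODELS = {"black-forest-labs/flux.2-klein-4b", "openai/gpt-5-image-mini"}


def generate_image(api_key: str, model: str, prompt: str,
                   aspect_ratio: str = "1:1", image_size: str = "1024x1024") -> bytes:
    """Return raw image bytes from whichever endpoint the model requires."""
    headers = {"Authorization": f"Bearer {api_key}"}

    if model in CHAT_IMAGE_MODELS:
        # Chat-based models: request image output via modalities and pass
        # sizing options through image_config, as described above.
        resp = requests.post(
            f"{OPENROUTER_BASE}/chat/completions",
            headers=headers,
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "modalities": ["image"],
                "image_config": {"aspect_ratio": aspect_ratio, "image_size": image_size},
            },
            timeout=120,
        )
        resp.raise_for_status()
        # Images arrive as base64 data URLs: "data:image/png;base64,...."
        data_url = resp.json()["choices"][0]["message"]["images"][0]["image_url"]["url"]
        return base64.b64decode(data_url.split(",", 1)[1])

    # Legacy models (DALL-E 3 and similar) use /images/generations.
    resp = requests.post(
        f"{OPENROUTER_BASE}/images/generations",
        headers=headers,
        json={"model": model, "prompt": prompt},
        timeout=120,
    )
    resp.raise_for_status()
    item = resp.json()["data"][0]
    # Legacy responses expose data[].b64_json and/or data[].url.
    if item.get("b64_json"):
        return base64.b64decode(item["b64_json"])
    return requests.get(item["url"], timeout=120).content
```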

## Video Generation

Video generation uses OpenRouter's `/api/v1/videos` endpoint with a **submit-and-poll** pattern orchestrated by a background worker (a condensed worker sketch follows these steps):

1. User submits a video request via `POST /generate/video` (or `/generate/video/from-image`)
2. Backend inserts a row into `generated_videos` with `status: "queued"` and returns immediately
3. Background worker (`video_worker.py`) picks up queued jobs every 15 seconds:
   - Calls `POST /api/v1/videos` with `model`, `prompt`, `aspect_ratio`, `resolution`, `duration`
   - Receives `{"id": "job_id", "polling_url": "https://..."}` and updates DB to `status: "processing"`
   - Polls `GET polling_url` every 15 seconds until `status` is `"completed"` or `"failed"`
   - Updates DB with final status, `video_url`, and any `error` message
4. Frontend polls `GET /generate/video/{db_id}/status` every 5 seconds to show live updates
5. Once the job completes, the status response includes `video_url`, and the frontend displays the video in a `<video>` element
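
The condensed worker sketch, under stated assumptions: the in-memory `JOBS` list, `fetch_queued_jobs`, and `update_job` are hypothetical stand-ins for the real `generated_videos` table access in `video_worker.py`, while the endpoint, request fields, and polling contract are the ones listed in the steps above.

```python
import time
import requests

OPENROUTER_BASE = "https://openrouter.ai/api/v1"
POLL_INTERVAL = 15  # seconds, matching the worker schedule above

# Hypothetical in-memory stand-in for the generated_videos table.
JOBS: list[dict] = []


def fetch_queued_jobs() -> list[dict]:
    return [j for j in JOBS if j["status"] == "queued"]


def update_job(job_id: str, **fields) -> None:
    for j in JOBS:
        if j["id"] == job_id:
            j.update(fields)


def run_worker(api_key: str) -> None:
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        for job in fetch_queued_jobs():
            # Step 3: submit the generation request.
            resp = requests.post(
                f"{OPENROUTER_BASE}/videos",
                headers=headers,
                json={k: job[k] for k in
                      ("model", "prompt", "aspect_ratio", "resolution", "duration")},
                timeout=60,
            )
            resp.raise_for_status()
            submitted = resp.json()  # {"id": "job_id", "polling_url": "https://..."}
            update_job(job["id"], status="processing")

            # Poll until the job settles as "completed" or "failed".
            while True:
                result = requests.get(submitted["polling_url"],
                                      headers=headers, timeout=60).json()
                if result["status"] in ("completed", "failed"):
                    break
                time.sleep(POLL_INTERVAL)

            # Persist the outcome for the frontend's status endpoint to read.
            update_job(job["id"], status=result["status"],
                       video_url=result.get("video_url"),
                       error=result.get("error"))
        time.sleep(POLL_INTERVAL)
```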

Supported models: `openai/sora-2-pro`, `google/veo-3.1-fast`. Both text-to-video and image-to-video use the same `/api/v1/videos` endpoint (image-to-video includes `frame_images` with `first_frame` in the request body).
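
A sketch of what an image-to-video request body might look like. Only the field names `frame_images` and `first_frame` come from the notes above; their exact shape (object vs. list) and the data-URL encoding of the frame are assumptions.

```python
import base64
import requests

# Assumption: first_frame is supplied as a base64 data URL inside a
# frame_images object; the actual shape may differ.
with open("first_frame.png", "rb") as f:
    frame_b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "google/veo-3.1-fast",
    "prompt": "Slow pan across the scene in the reference frame",
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "duration": 5,  # assumed to be seconds
    "frame_images": {"first_frame": f"data:image/png;base64,{frame_b64}"},
}

resp = requests.post(
    "https://openrouter.ai/api/v1/videos",
    headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"},
    json=payload,
    timeout=60,
)
print(resp.json())  # expect {"id": ..., "polling_url": ...}
```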