diff --git a/docs/FEISHU_MODEL_API.md b/docs/FEISHU_MODEL_API.md
new file mode 100644
index 0000000..a31474a
--- /dev/null
+++ b/docs/FEISHU_MODEL_API.md
@@ -0,0 +1,255 @@
+# Model Service API (Feishu Edition)
+
+Interface notes for the story-to-video pipeline, formatted for publishing directly to a Feishu doc. Covers LLM storyboard generation, text-to-image, image-to-video, job queries, and WebSocket progress push.
+
+## Basics
+- **Base URL**: `https://api.example.com` (replace per deployment)
+- **Version**: `v1`
+- **Auth**: `Authorization: Bearer <token>`; optionally pass `X-Request-Id` for request tracing
+- **Encoding**: `Content-Type: application/json; charset=utf-8`
+- **Idempotency**: passing an `Idempotency-Key` is recommended to guard against duplicate submissions
+- **Default models**: `sd-turbo` for text-to-image, `svd-img2vid` for image-to-video
+
+### Common response envelope
+```json
+{
+  "code": 0,
+  "message": "ok",
+  "data": {}
+}
+```
+- `code`: 0 means success. Common error codes: 400 invalid parameters, 401 unauthorized, 404 not found, 429 rate limited, 500 internal error.
+
+### Task model
+- Long-running calls return a `job_id`; fetch progress via the query endpoint or a WebSocket subscription.
+- Status enum: `pending` / `running` / `succeeded` / `failed` / `canceled`.
+
+#### Generic job query
+- `GET /v1/jobs/{job_id}`
+
+Response example:
+```json
+{
+  "code": 0,
+  "message": "ok",
+  "data": {
+    "job_id": "job_xxx",
+    "type": "llm|t2i|i2v",
+    "status": "running",
+    "progress": 42,
+    "result": {},
+    "error": {}
+  }
+}
+```
+
+## Endpoint overview
+1. LLM: generate a storyboard `POST /v1/llm/storyboard`
+2. Text-to-image: generate keyframes `POST /v1/image/generate`
+3. Image-to-video: generate short clips `POST /v1/video/generate`
+4. Job query: `GET /v1/jobs/{job_id}` (documented under "Task model" above)
+5. WebSocket progress push (optional): `GET wss://api.example.com/v1/ws`
+
+---
+
+## 1) LLM: generate a storyboard
+- **URL**: `POST /v1/llm/storyboard`
+- **Purpose**: turn story text into storyboard JSON, including titles, prompts, narration, and BGM hints.
+
+Request body:
+```json
+{
+  "story_id": "story_001",
+  "story_text": "很久以前……",
+  "style": "movie",
+  "target_shots": 8,
+  "lang": "zh",
+  "extras": {
+    "tone": "warm",
+    "duration_hint_sec": 60
+  }
+}
+```
+
+Success response (job created):
+```json
+{
+  "code": 0,
+  "message": "ok",
+  "data": {
+    "job_id": "job_llm_123",
+    "status": "running"
+  }
+}
+```
+
+Job result (fetched via `GET /v1/jobs/{job_id}`):
+```json
+{
+  "code": 0,
+  "message": "ok",
+  "data": {
+    "job_id": "job_llm_123",
+    "status": "succeeded",
+    "result": {
+      "story_id": "story_001",
+      "shots": [
+        {
+          "shot_id": "shot_001",
+          "title": "清晨的街道",
+          "prompt": "cinematic morning street, soft light, ...",
+          "narration": "清晨,主角踏上旅程……",
+          "bgm_hint": "lofi calm",
+          "duration_sec": 5
+        }
+      ]
+    }
+  }
+}
+```
+
+---
+
+## 2) Text-to-image: generate keyframes
+- **URL**: `POST /v1/image/generate`
+- **Purpose**: generate keyframe images from storyboard prompts (SD Turbo by default).
+
+Request body:
+```json
+{
+  "shot_id": "shot_001",
+  "prompt": "cinematic morning street, soft light, ...",
+  "negative_prompt": "blurry, lowres",
+  "style": "movie",
+  "size": "1024x576",
+  "model": "sd-turbo",
+  "seed": 42,
+  "steps": 15,
+  "guidance_scale": 3.5,
+  "scheduler": "euler",
+  "safety_check": true
+}
+```
+
+Success response:
+```json
+{
+  "code": 0,
+  "message": "ok",
+  "data": {
+    "job_id": "job_t2i_456",
+    "status": "running"
+  }
+}
+```
+
+Job result:
+```json
+{
+  "code": 0,
+  "message": "ok",
+  "data": {
+    "job_id": "job_t2i_456",
+    "status": "succeeded",
+    "result": {
+      "shot_id": "shot_001",
+      "image_url": "https://cdn.example.com/shot_001.png",
+      "meta": {
+        "seed": 42,
+        "size": "1024x576",
+        "steps": 15,
+        "model": "sd-turbo"
+      }
+    }
+  }
+}
```
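+
+The submit-then-poll flow is identical for all three generation endpoints. A minimal client sketch (Python with `httpx`; the host, token, and payload are the placeholders used throughout this doc):
+
+```python
+import time
+
+import httpx
+
+BASE = "https://api.example.com"
+HEADERS = {"Authorization": "Bearer <token>", "Idempotency-Key": "demo-001"}
+
+with httpx.Client(base_url=BASE, headers=HEADERS, timeout=30.0) as client:
+    # Submit a text-to-image job for one shot.
+    resp = client.post("/v1/image/generate", json={
+        "shot_id": "shot_001",
+        "prompt": "cinematic morning street, soft light, ...",
+    })
+    resp.raise_for_status()
+    job_id = resp.json()["data"]["job_id"]
+
+    # Poll the generic job endpoint until a terminal status is reached.
+    while True:
+        job = client.get(f"/v1/jobs/{job_id}").json()["data"]
+        if job["status"] in ("succeeded", "failed", "canceled"):
+            break
+        time.sleep(2)
+
+print(job["status"], job.get("result"))
+```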
"code": 0, + "message": "ok", + "data": { + "job_id": "job_i2v_789", + "status": "running" + } +} +``` + +任务完成结果: +```json +{ + "code": 0, + "message": "ok", + "data": { + "job_id": "job_i2v_789", + "status": "succeeded", + "result": { + "shot_id": "shot_001", + "video_url": "https://cdn.example.com/shot_001.mp4", + "meta": { + "duration_sec": 4, + "fps": 24, + "resolution": "1280x720", + "model": "svd-img2vid" + } + } + } +} +``` + +--- + +## 4) WebSocket 进度推送(可选) +- **URL**:`GET wss://api.example.com/v1/ws` +- **鉴权**:沿用 HTTP Header。 +- **订阅**:传入 `job_id` 列表。 + +订阅示例: +```json +{ "action": "subscribe", "job_ids": ["job_llm_123","job_t2i_456","job_i2v_789"] } +``` + +推送示例: +```json +{ + "job_id": "job_i2v_789", + "status": "running", + "progress": 65, + "message": "rendering frames" +} +``` + +--- + +## FAQ +- **并行与合成**:每个镜头独立 job,可并行提交;最终合成由客户端/服务端 FFmpeg 处理。 +- **安全与审计**:可在文生图/图生视频前置安全审核(暴恐涉政等)。 +- **失败重试**:结合 `Idempotency-Key` 识别重复请求,失败可重提或按 `job_id` 重拉结果。 diff --git a/model/.dockerignore b/model/.dockerignore new file mode 100644 index 0000000..ff36840 --- /dev/null +++ b/model/.dockerignore @@ -0,0 +1,13 @@ +__pycache__ +*.pyc +*.pyo +*.pyd +*.swp +.env +venv +build +dist +.cache +.huggingface +models +weights diff --git a/model/API.md b/model/API.md new file mode 100644 index 0000000..c811280 --- /dev/null +++ b/model/API.md @@ -0,0 +1,136 @@ +# Model Node API Documentation + +This document specifies the HTTP API exposed by the model node FastAPI service. All endpoints are JSON-based and designed to be wired to real pipelines (Qwen/Ollama, Stable Diffusion Turbo, Stable-Video-Diffusion-Img2Vid, CosyVoice-mini). + +## General +- **Base URL (default docker compose)**: `http://localhost:8000` +- **Content type**: `application/json` +- **Authentication**: not required in the sample; add gateway/token in production. +- **Swagger UI**: `GET /docs` +- **OpenAPI JSON**: `GET /openapi.json` + +## Health +- **Endpoint**: `GET /health` +- **Purpose**: Liveness/readiness signal. +- **Response** + ```json + { + "status": "ok", + "ts": "2024-07-01T10:00:00.000000" + } + ``` + +## Storyboard (LLM) +- **Endpoint**: `POST /llm/storyboard` +- **Description**: Convert a free-form story into structured shots. +- **Request body** + ```json + { + "story": "夕阳下的海边散步", + "style": "pixar" + } + ``` + - `story` (string, required): user story text. + - `style` (string, optional): tone/visual hint. +- **Response 200** + ```json + { + "shots": [ + { + "title": "自动生成分镜", + "prompt": "pixar | 夕阳下的海边散步", + "narration": "夕阳下的海边散步", + "bgm": "lofi-chill" + } + ], + "generated_at": "2024-07-01T10:00:00.123456" + } + ``` + +## Text-to-Image +- **Endpoint**: `POST /sd_generate` +- **Description**: Generate a keyframe image from a prompt. +- **Request body** + ```json + { + "prompt": "sunset beach cinematic", + "style": "anime", + "width": 1024, + "height": 576 + } + ``` + - `prompt` (string, required): image description. + - `style` (string, optional): style preset/tag. + - `width` (int, optional, default 1024) + - `height` (int, optional, default 576) +- **Response 200** + ```json + { + "url": "https://example.com/keyframe.png", + "note": "Requested 1024x576 image in style=anime" + } + ``` + +## Image-to-Video (optional) +- **Endpoint**: `POST /img2vid` +- **Description**: Turn a keyframe into a short clip. +- **Request body** + ```json + { + "image_url": "https://example.com/keyframe.png", + "duration_seconds": 3.0, + "transition": "dissolve" + } + ``` + - `image_url` (string, required): source frame. 
+
+---
+
+## FAQ
+- **Parallelism and assembly**: each shot is an independent job and can be submitted in parallel; final assembly is done with FFmpeg on the client or server side.
+- **Safety and audit**: a content-safety review (violent, politically sensitive material, etc.) can be placed in front of text-to-image and image-to-video.
+- **Retrying failures**: use `Idempotency-Key` to detect duplicate requests; failed jobs can be resubmitted, or their results re-fetched by `job_id`.
diff --git a/model/.dockerignore b/model/.dockerignore
new file mode 100644
index 0000000..ff36840
--- /dev/null
+++ b/model/.dockerignore
@@ -0,0 +1,13 @@
+__pycache__
+*.pyc
+*.pyo
+*.pyd
+*.swp
+.env
+venv
+build
+dist
+.cache
+.huggingface
+models
+weights
diff --git a/model/API.md b/model/API.md
new file mode 100644
index 0000000..c811280
--- /dev/null
+++ b/model/API.md
@@ -0,0 +1,136 @@
+# Model Node API Documentation
+
+This document specifies the HTTP API exposed by the model node FastAPI service. All endpoints are JSON-based and designed to be wired to real pipelines (Qwen/Ollama, Stable Diffusion Turbo, Stable-Video-Diffusion-Img2Vid, CosyVoice-mini).
+
+## General
+- **Base URL (default docker compose)**: `http://localhost:8000`
+- **Content type**: `application/json`
+- **Authentication**: not required in the sample; add a gateway/token in production.
+- **Swagger UI**: `GET /docs`
+- **OpenAPI JSON**: `GET /openapi.json`
+
+## Health
+- **Endpoint**: `GET /health`
+- **Purpose**: Liveness/readiness signal.
+- **Response**
+  ```json
+  {
+    "status": "ok",
+    "ts": "2024-07-01T10:00:00.000000"
+  }
+  ```
+
+## Storyboard (LLM)
+- **Endpoint**: `POST /llm/storyboard`
+- **Description**: Convert a free-form story into structured shots.
+- **Request body**
+  ```json
+  {
+    "story": "夕阳下的海边散步",
+    "style": "pixar"
+  }
+  ```
+  - `story` (string, required): user story text.
+  - `style` (string, optional): tone/visual hint.
+- **Response 200**
+  ```json
+  {
+    "shots": [
+      {
+        "title": "自动生成分镜",
+        "prompt": "pixar | 夕阳下的海边散步",
+        "narration": "夕阳下的海边散步",
+        "bgm": "lofi-chill"
+      }
+    ],
+    "generated_at": "2024-07-01T10:00:00.123456"
+  }
+  ```
+
+## Text-to-Image
+- **Endpoint**: `POST /sd_generate`
+- **Description**: Generate a keyframe image from a prompt.
+- **Request body**
+  ```json
+  {
+    "prompt": "sunset beach cinematic",
+    "style": "anime",
+    "width": 1024,
+    "height": 576
+  }
+  ```
+  - `prompt` (string, required): image description.
+  - `style` (string, optional): style preset/tag.
+  - `width` (int, optional, default 1024)
+  - `height` (int, optional, default 576)
+- **Response 200**
+  ```json
+  {
+    "url": "https://example.com/keyframe.png",
+    "note": "Requested 1024x576 image in style=anime"
+  }
+  ```
+
+## Image-to-Video (optional)
+- **Endpoint**: `POST /img2vid`
+- **Description**: Turn a keyframe into a short clip.
+- **Request body**
+  ```json
+  {
+    "image_url": "https://example.com/keyframe.png",
+    "duration_seconds": 3.0,
+    "transition": "dissolve"
+  }
+  ```
+  - `image_url` (string, required): source frame.
+  - `duration_seconds` (float, optional, default 3.0): clip length.
+  - `transition` (string, optional): e.g., `dissolve`, `zoom`, `cut`.
+- **Response 200**
+  ```json
+  {
+    "url": "https://example.com/clip.mp4",
+    "note": "duration=3.0s transition=dissolve"
+  }
+  ```
+
+## Text-to-Speech
+- **Endpoint**: `POST /tts`
+- **Description**: Generate narration audio for a shot.
+- **Request body**
+  ```json
+  {
+    "text": "欢迎使用 StoryToVideo",
+    "voice": "female"
+  }
+  ```
+  - `text` (string, required): narration text.
+  - `voice` (string, optional): speaker/voice style.
+- **Response 200**
+  ```json
+  {
+    "url": "https://example.com/narration.wav",
+    "note": "voice=female"
+  }
+  ```
+
+## Error model
+- **Status codes**: `200` on success. FastAPI will emit `422` for validation errors.
+- **Example 422 response**
+  ```json
+  {
+    "detail": [
+      {
+        "loc": ["body", "story"],
+        "msg": "Field required",
+        "type": "missing"
+      }
+    ]
+  }
+  ```
+
+## Quick smoke tests
+- Health: `curl http://localhost:8000/health`
+- Storyboard: `curl -X POST http://localhost:8000/llm/storyboard -H "Content-Type: application/json" -d '{"story":"夕阳下的海边散步","style":"pixar"}'`
+- Text-to-Image: `curl -X POST http://localhost:8000/sd_generate -H "Content-Type: application/json" -d '{"prompt":"sunset beach cinematic","style":"anime"}'`
+- Img2Vid: `curl -X POST http://localhost:8000/img2vid -H "Content-Type: application/json" -d '{"image_url":"https://example.com/keyframe.png"}'`
+- TTS: `curl -X POST http://localhost:8000/tts -H "Content-Type: application/json" -d '{"text":"欢迎使用 StoryToVideo"}'`
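+
+The same flow as one Python script (a sketch using `httpx`, which this repo already pins; assumes the service is reachable at the default compose address):
+
+```python
+import httpx
+
+BASE = "http://localhost:8000"
+
+with httpx.Client(base_url=BASE, timeout=30.0) as client:
+    # 1) Story -> storyboard shots.
+    resp = client.post("/llm/storyboard",
+                       json={"story": "夕阳下的海边散步", "style": "pixar"})
+    resp.raise_for_status()
+    shot = resp.json()["shots"][0]
+
+    # 2) Shot prompt -> keyframe image.
+    resp = client.post("/sd_generate",
+                       json={"prompt": shot["prompt"], "width": 1024, "height": 576})
+    resp.raise_for_status()
+    keyframe_url = resp.json()["url"]
+
+    # 3) Keyframe -> short clip.
+    resp = client.post("/img2vid",
+                       json={"image_url": keyframe_url, "duration_seconds": 3.0})
+    resp.raise_for_status()
+    print("clip:", resp.json()["url"])
+
+    # 4) Narration text -> audio.
+    resp = client.post("/tts", json={"text": shot["narration"], "voice": "female"})
+    resp.raise_for_status()
+    print("audio:", resp.json()["url"])
+```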
diff --git a/model/Dockerfile b/model/Dockerfile
new file mode 100644
index 0000000..46f1658
--- /dev/null
+++ b/model/Dockerfile
@@ -0,0 +1,30 @@
+# CUDA 12.4 + cuDNN + Python 3.10 for GPU model serving
+FROM nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04
+
+ENV DEBIAN_FRONTEND=noninteractive \
+    PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1 \
+    PIP_NO_CACHE_DIR=1
+
+WORKDIR /workspace
+
+# System deps
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends \
+       python3.10 python3.10-venv python3-pip git curl ca-certificates \
+    && rm -rf /var/lib/apt/lists/*
+
+# Upgrade pip and install torch with CUDA 12.4 wheels
+RUN python3 -m pip install --upgrade pip \
+    && python3 -m pip install \
+       torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 \
+       --index-url https://download.pytorch.org/whl/cu124
+
+COPY requirements.txt ./
+RUN python3 -m pip install -r requirements.txt
+
+COPY main.py ./
+
+EXPOSE 8000
+
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
diff --git a/model/README.md b/model/README.md
index d8df77d..e055b8d 100644
--- a/model/README.md
+++ b/model/README.md
@@ -1,11 +1,10 @@
 # Model Node (Local GPU) / 模型节点
 
-Purpose: host generation capabilities decoupled from server; accessed via HTTP (FastAPI) and optionally exposed through FRP.
+Purpose: host generation capabilities decoupled from server; accessed via HTTP (FastAPI) and optionally exposed through FRP. 目的:承载生成能力,与服务端解耦,通过 FastAPI/HTTP 暴露,必要时用 FRP 打通。
 
 ## Suggested components / 推荐组件
-- LLM: Qwen2.5-0.5B via Ollama → story structure / storyboard JSON / narration draft.
-  文本生成分镜 JSON/旁白。
+- LLM: Qwen2.5-0.5B via Ollama → story structure / storyboard JSON / narration draft. 文本生成分镜 JSON/旁白。
 - T2I: Stable Diffusion Turbo (diffusers) → keyframes. / 关键帧生图
 - I2V (optional): Stable-Video-Diffusion-Img2Vid → short clips. / 图生视频(可选)
 - TTS: CosyVoice-mini → narration audio. / 旁白语音
 
@@ -24,6 +23,128 @@ async def sd_generate(req: dict):
     return {"url": "https://.../image.png"}
 ```
 
+## GPU Dockerized model node (RTX 4060 Laptop, CUDA 12.4)
+For the Windows 11 + RTX 4060 Laptop + recent NVIDIA driver (550+) setup shown in the screenshot, this adds a GPU model container that builds as-is. The container exposes the FastAPI model stub by default and ships with a GPU-enabled Ollama service for pulling Qwen2.5-0.5B.
+
+### 1) Host prerequisites / 主机前置
+- Windows 11 + WSL2 + Docker Desktop; NVIDIA driver 550+ (CUDA 12.4 capable).
+- Install `nvidia-container-toolkit` and make sure `nvidia-smi` works inside WSL:
+  ```bash
+  # inside WSL
+  distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
+  curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
+  curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
+    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+  sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
+  sudo nvidia-ctk runtime configure --runtime=docker
+  sudo systemctl restart docker
+  ```
+
+### 2) Build & run / 构建与运行
+```bash
+cd model
+# Build the CUDA 12.4 + PyTorch 2.4.0 model-serving image
+docker compose -f docker-compose.gpu.yml build
+# Start the model node (FastAPI) and the GPU Ollama service
+docker compose -f docker-compose.gpu.yml up -d
+```
+- FastAPI model stub: `http://localhost:8000` (health check `/health`; endpoints `/llm/storyboard`, `/sd_generate`, `/img2vid`, `/tts`). See the [full API docs](./API.md), or the condensed, Feishu-ready version at [docs/FEISHU_MODEL_API.md](../docs/FEISHU_MODEL_API.md).
+- Ollama: `http://localhost:11434`.
+
+### 3) Pull the Qwen model (inside the container)
+```bash
+# Exec into the ollama container and pull Qwen2.5 0.5B
+docker compose -f docker-compose.gpu.yml exec ollama ollama pull qwen2.5:0.5b
+```
+
+### 4) Mounts and caches
+- `./weights` is mounted at `/models` in the container, as a cache for HF/SD/SVD/CosyVoice weights.
+- Ollama weights persist in the compose `ollama` volume and survive restarts.
+
+### 5) Integration notes
+- Wire real inference into the TODOs in `model/main.py` (Qwen/Ollama, SD Turbo, SVD, CosyVoice); a starting point for the SD Turbo piece is sketched below.
+- To expose ports to the public internet, reuse the sample configs under `frp/` in this repo.
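+
+For the `sd_generate` TODO, a minimal SD-Turbo sketch with `diffusers` (the model id and one-step settings follow the upstream SD-Turbo docs; error handling, caching, and upload are omitted):
+
+```python
+import torch
+from diffusers import AutoPipelineForText2Image
+
+# Weights download into HF_HOME (/models/hf inside the container) on first use.
+pipe = AutoPipelineForText2Image.from_pretrained(
+    "stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16"
+).to("cuda")
+
+
+def generate_keyframe(prompt: str, width: int = 1024, height: int = 576):
+    # SD-Turbo is tuned for single-step, guidance-free sampling.
+    return pipe(prompt, width=width, height=height,
+                num_inference_steps=1, guidance_scale=0.0).images[0]
+```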
+
+### 6) Local automated tests
+- Install dependencies (pytest included): `pip install -r requirements.txt`
+- Run the tests from the repository root (they import the `model` package, so running them from inside `model/` fails to resolve the import):
+  ```bash
+  python -m pytest model/tests
+  ```
+
+### 7) Testing the API after the image is built
+Once the containers are up, verify from the host with `curl`/Postman/a browser:
+
+- Health check:
+  ```bash
+  curl http://localhost:8000/health
+  ```
+- Storyboard / LLM:
+  ```bash
+  curl -X POST http://localhost:8000/llm/storyboard \
+    -H "Content-Type: application/json" \
+    -d '{"story": "夕阳下的海边散步", "style": "pixar"}'
+  ```
+- Text-to-image (Stable Diffusion Turbo stub):
+  ```bash
+  curl -X POST http://localhost:8000/sd_generate \
+    -H "Content-Type: application/json" \
+    -d '{"prompt": "sunset beach cinematic", "style": "anime", "width": 1024, "height": 576}'
+  ```
+- Image-to-video (Stable Video Diffusion stub):
+  ```bash
+  curl -X POST http://localhost:8000/img2vid \
+    -H "Content-Type: application/json" \
+    -d '{"image_url": "https://example.com/keyframe.png", "duration_seconds": 3.0, "transition": "dissolve"}'
+  ```
+- Narration TTS (CosyVoice stub):
+  ```bash
+  curl -X POST http://localhost:8000/tts \
+    -H "Content-Type: application/json" \
+    -d '{"text": "欢迎使用 StoryToVideo", "voice": "female"}'
+  ```
+
+> You can also open `http://localhost:8000/docs` and use FastAPI's interactive Swagger UI for visual debugging; on errors, check container logs with `docker compose -f docker-compose.gpu.yml logs -f model-node`.
+
+### 8) Pushing the image to a registry (so frontend/backend teammates can reuse it)
+After building, push the image to Docker Hub or GHCR so others can deploy with a plain `pull`.
+
+- **Docker Hub example**
+  ```bash
+  # 1) Build (if not built yet)
+  docker compose -f docker-compose.gpu.yml build
+
+  # 2) Log in to Docker Hub
+  docker login
+
+  # 3) Tag and push (replace yourname with your registry namespace)
+  docker tag storytovideo-model-node:cuda12.4 yourname/storytovideo-model-node:cuda12.4
+  docker push yourname/storytovideo-model-node:cuda12.4
+  ```
+
+- **GitHub Container Registry (GHCR) example**
+  ```bash
+  export GH_USER="your-github-username"
+  echo "$GH_PAT" | docker login ghcr.io -u "$GH_USER" --password-stdin
+
+  docker tag storytovideo-model-node:cuda12.4 ghcr.io/$GH_USER/storytovideo-model-node:cuda12.4
+  docker push ghcr.io/$GH_USER/storytovideo-model-node:cuda12.4
+  ```
+
+- **Usage for frontend/backend teammates**: `docker pull <image>` directly, or swap the local build for the remote image in the deployment compose file:
+  ```yaml
+  services:
+    model-node:
+      image: ghcr.io/your-namespace/storytovideo-model-node:cuda12.4
+      runtime: nvidia
+      ports:
+        - "8000:8000"
+      volumes:
+        - ./weights:/models
+  ```
+  To pull from a private registry, run `docker login` beforehand or inject credentials in CI/CD.
+
 ## Deployment / 部署
 - Run on local GPU; package models separately from server. / 本地 GPU 运行,独立包模型。
 - Expose ports via `frpc` to cloud `frps`. / 用 frpc 将端口暴露给公网 frps。
diff --git a/model/__init__.py b/model/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/model/docker-compose.gpu.yml b/model/docker-compose.gpu.yml
new file mode 100644
index 0000000..5a2df90
--- /dev/null
+++ b/model/docker-compose.gpu.yml
@@ -0,0 +1,47 @@
+version: "3.8"
+
+services:
+  model-node:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    image: storytovideo-model-node:cuda12.4
+    runtime: nvidia
+    environment:
+      - NVIDIA_VISIBLE_DEVICES=all
+      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
+      - HF_HOME=/models/hf
+      - TRANSFORMERS_CACHE=/models/hf
+      - TORCH_CUDNN_V8_API_ENABLED=1
+    volumes:
+      - ./weights:/models
+    ports:
+      - "8000:8000"
+    command: ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+
+  ollama:
+    image: ollama/ollama:latest
+    runtime: nvidia
+    environment:
+      - OLLAMA_HOST=0.0.0.0
+    ports:
+      - "11434:11434"
+    volumes:
+      - ollama:/root/.ollama
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+
+volumes:
+  ollama:
diff --git a/model/main.py b/model/main.py
new file mode 100644
index 0000000..1a0b34b
--- /dev/null
+++ b/model/main.py
@@ -0,0 +1,103 @@
+"""Minimal FastAPI entrypoint for the model node.
+
+This keeps the endpoint names used in the docs and can be wired to real
+pipelines (Qwen via Ollama, Stable Diffusion, Stable Video Diffusion,
+CosyVoice, etc.). For now, it echoes the request to keep the container
+lightweight while providing a health surface for integration.
+"""
+from datetime import datetime
+from typing import List, Optional
+
+from fastapi import FastAPI
+from pydantic import BaseModel, Field
+
+
+class StoryboardRequest(BaseModel):
+    story: str = Field(..., description="User story text")
+    style: Optional[str] = Field(None, description="Tone or visual style hint")
+
+
+class Shot(BaseModel):
+    title: str
+    prompt: str
+    narration: str
+    bgm: Optional[str] = None
+
+
+class StoryboardResponse(BaseModel):
+    shots: List[Shot]
+    generated_at: datetime
+
+
+class SDRequest(BaseModel):
+    prompt: str
+    style: Optional[str] = None
+    width: int = 1024
+    height: int = 576
+
+
+class SDResponse(BaseModel):
+    url: str
+    note: Optional[str] = None
+
+
+class Img2VidRequest(BaseModel):
+    image_url: str
+    duration_seconds: float = 3.0
+    transition: Optional[str] = Field(None, description="e.g. dissolve, zoom")
+
+
+class Img2VidResponse(BaseModel):
+    url: str
+    note: Optional[str] = None
+
+
+class TTSRequest(BaseModel):
+    text: str
+    voice: Optional[str] = Field(None, description="voice name or speaker id")
+
+
+class TTSResponse(BaseModel):
+    url: str
+    note: Optional[str] = None
+
+
+app = FastAPI(title="StoryToVideo Model Node", version="0.1.0")
+
+
+@app.get("/health")
+def health() -> dict:
+    return {"status": "ok", "ts": datetime.utcnow().isoformat()}
+
+
+@app.post("/llm/storyboard", response_model=StoryboardResponse)
+def storyboard(req: StoryboardRequest) -> StoryboardResponse:
+    # TODO: wire to Qwen/Ollama
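+    # One possible wiring against the compose `ollama` service (left commented
+    # out so the stub stays deterministic; the prompt and parsing are
+    # assumptions, while the endpoint/payload follow the Ollama REST API):
+    #   import httpx
+    #   resp = httpx.post(
+    #       "http://ollama:11434/api/generate",
+    #       json={"model": "qwen2.5:0.5b", "stream": False,
+    #             "prompt": f"Split this story into shots as JSON: {req.story}"},
+    #       timeout=120.0,
+    #   )
+    #   text = resp.json()["response"]  # then parse into Shot objects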
dissolve, zoom") + + +class Img2VidResponse(BaseModel): + url: str + note: Optional[str] = None + + +class TTSRequest(BaseModel): + text: str + voice: Optional[str] = Field(None, description="voice name or speaker id") + + +class TTSResponse(BaseModel): + url: str + note: Optional[str] = None + + +app = FastAPI(title="StoryToVideo Model Node", version="0.1.0") + + +@app.get("/health") +def health() -> dict: + return {"status": "ok", "ts": datetime.utcnow().isoformat()} + + +@app.post("/llm/storyboard", response_model=StoryboardResponse) +def storyboard(req: StoryboardRequest) -> StoryboardResponse: + # TODO: wire to Qwen/Ollama + shot = Shot( + title="自动生成分镜", + prompt=f"{req.style or '默认风格'} | {req.story[:80]}", + narration=req.story, + bgm="lofi-chill" + ) + return StoryboardResponse(shots=[shot], generated_at=datetime.utcnow()) + + +@app.post("/sd_generate", response_model=SDResponse) +def sd_generate(req: SDRequest) -> SDResponse: + # TODO: wire to Stable Diffusion Turbo pipeline + note = f"Requested {req.width}x{req.height} image in style={req.style or 'default'}" + return SDResponse(url="https://example.com/keyframe.png", note=note) + + +@app.post("/img2vid", response_model=Img2VidResponse) +def img2vid(req: Img2VidRequest) -> Img2VidResponse: + # TODO: wire to Stable-Video-Diffusion-Img2Vid + note = f"duration={req.duration_seconds}s transition={req.transition or 'cut'}" + return Img2VidResponse(url="https://example.com/clip.mp4", note=note) + + +@app.post("/tts", response_model=TTSResponse) +def tts(req: TTSRequest) -> TTSResponse: + # TODO: wire to CosyVoice-mini + note = f"voice={req.voice or 'default'}" + return TTSResponse(url="https://example.com/narration.wav", note=note) diff --git a/model/requirements.txt b/model/requirements.txt new file mode 100644 index 0000000..b467ed1 --- /dev/null +++ b/model/requirements.txt @@ -0,0 +1,13 @@ +fastapi==0.111.0 +uvicorn[standard]==0.30.1 +pydantic==2.7.3 +transformers==4.42.4 +diffusers==0.29.2 +accelerate==0.31.0 +safetensors==0.4.3 +sentencepiece==0.2.0 +numpy==1.26.4 +pillow==10.3.0 +soundfile==0.12.1 +pytest==8.2.2 +httpx==0.27.0 diff --git a/model/tests/test_api.py b/model/tests/test_api.py new file mode 100644 index 0000000..303831c --- /dev/null +++ b/model/tests/test_api.py @@ -0,0 +1,62 @@ +from fastapi.testclient import TestClient + +from model import main + + +client = TestClient(main.app) + + +def test_health(): + resp = client.get("/health") + assert resp.status_code == 200 + payload = resp.json() + assert payload.get("status") == "ok" + assert "ts" in payload + + +def test_storyboard_endpoint(): + payload = {"story": "夕阳下的海边散步", "style": "pixar"} + resp = client.post("/llm/storyboard", json=payload) + assert resp.status_code == 200 + data = resp.json() + assert "shots" in data and isinstance(data["shots"], list) + assert data["shots"], "should return at least one shot" + first_shot = data["shots"][0] + assert first_shot["title"] + assert payload["story"] in first_shot["narration"] + + +def test_sd_generate_endpoint(): + payload = { + "prompt": "sunset beach cinematic", + "style": "anime", + "width": 1024, + "height": 576, + } + resp = client.post("/sd_generate", json=payload) + assert resp.status_code == 200 + data = resp.json() + assert data["url"].startswith("http") + assert "1024x576" in data["note"] + + +def test_img2vid_endpoint(): + payload = { + "image_url": "https://example.com/keyframe.png", + "duration_seconds": 3.0, + "transition": "dissolve", + } + resp = client.post("/img2vid", json=payload) + assert 
+    note = f"duration={req.duration_seconds}s transition={req.transition or 'cut'}"
+    return Img2VidResponse(url="https://example.com/clip.mp4", note=note)
+
+
+@app.post("/tts", response_model=TTSResponse)
+def tts(req: TTSRequest) -> TTSResponse:
+    # TODO: wire to CosyVoice-mini
+    note = f"voice={req.voice or 'default'}"
+    return TTSResponse(url="https://example.com/narration.wav", note=note)
diff --git a/model/requirements.txt b/model/requirements.txt
new file mode 100644
index 0000000..b467ed1
--- /dev/null
+++ b/model/requirements.txt
@@ -0,0 +1,13 @@
+fastapi==0.111.0
+uvicorn[standard]==0.30.1
+pydantic==2.7.3
+transformers==4.42.4
+diffusers==0.29.2
+accelerate==0.31.0
+safetensors==0.4.3
+sentencepiece==0.2.0
+numpy==1.26.4
+pillow==10.3.0
+soundfile==0.12.1
+pytest==8.2.2
+httpx==0.27.0
diff --git a/model/tests/test_api.py b/model/tests/test_api.py
new file mode 100644
index 0000000..303831c
--- /dev/null
+++ b/model/tests/test_api.py
@@ -0,0 +1,62 @@
+from fastapi.testclient import TestClient
+
+from model import main
+
+
+client = TestClient(main.app)
+
+
+def test_health():
+    resp = client.get("/health")
+    assert resp.status_code == 200
+    payload = resp.json()
+    assert payload.get("status") == "ok"
+    assert "ts" in payload
+
+
+def test_storyboard_endpoint():
+    payload = {"story": "夕阳下的海边散步", "style": "pixar"}
+    resp = client.post("/llm/storyboard", json=payload)
+    assert resp.status_code == 200
+    data = resp.json()
+    assert "shots" in data and isinstance(data["shots"], list)
+    assert data["shots"], "should return at least one shot"
+    first_shot = data["shots"][0]
+    assert first_shot["title"]
+    assert payload["story"] in first_shot["narration"]
+
+
+def test_sd_generate_endpoint():
+    payload = {
+        "prompt": "sunset beach cinematic",
+        "style": "anime",
+        "width": 1024,
+        "height": 576,
+    }
+    resp = client.post("/sd_generate", json=payload)
+    assert resp.status_code == 200
+    data = resp.json()
+    assert data["url"].startswith("http")
+    assert "1024x576" in data["note"]
+
+
+def test_img2vid_endpoint():
+    payload = {
+        "image_url": "https://example.com/keyframe.png",
+        "duration_seconds": 3.0,
+        "transition": "dissolve",
+    }
+    resp = client.post("/img2vid", json=payload)
+    assert resp.status_code == 200
+    data = resp.json()
+    assert data["url"].endswith(".mp4")
+    assert "duration=3.0" in data["note"]
+
+
+def test_tts_endpoint():
+    payload = {"text": "欢迎使用 StoryToVideo", "voice": "female"}
+    resp = client.post("/tts", json=payload)
+    assert resp.status_code == 200
+    data = resp.json()
+    assert data["url"].endswith(".wav")
+    assert "voice=female" in data["note"]
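+
+
+def test_storyboard_validation_error():
+    # API.md documents the 422 shape; a missing required "story" field
+    # should surface as a FastAPI validation error.
+    resp = client.post("/llm/storyboard", json={"style": "pixar"})
+    assert resp.status_code == 422
+    detail = resp.json()["detail"]
+    assert any(err["loc"] == ["body", "story"] for err in detail)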