Compare commits


11 Commits

1ab7bdef1c  chore: remove obsolete test files for IndexTTS2 and grok-4 response format (2026-04-07 18:13:39 +08:00)
6d93025453  fix: update gitignore paths from canto-backend/frontend to backend/frontend (2026-04-07 18:13:26 +08:00)
            Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
60489eab59  refactor: rename canto-backend → backend, canto-frontend → frontend (2026-04-07 18:11:00 +08:00)
            Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2fa9c1fcb6  refactor: rename backend/frontend dirs and remove NovelWriter submodule (2026-04-07 18:03:29 +08:00)
            - Rename qwen3-tts-backend → canto-backend
            - Rename qwen3-tts-frontend → canto-frontend
            - Remove NovelWriter embedded repo
            Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
777a7ec006  feat: update genre and subgenre labels to Chinese localization (2026-04-07 14:51:30 +08:00)
a144540cbe  feat: update emotion handling and adjust alpha levels in TTS and LLM services (2026-04-07 14:17:29 +08:00)
a8d6195cdb  feat: enhance logging for character updates and voice cache management (2026-04-07 11:38:35 +08:00)
b395cb0b98  Refactor localization files and remove Aliyun references (2026-04-07 11:37:47 +08:00)
2662b494c5  feat: add regenerate all previews functionality and update localization strings (2026-04-07 11:03:11 +08:00)
96b2eaf774  feat: add edit character dialog with localization support (2026-04-07 10:50:29 +08:00)
d170ba3362  feat: add DEV_MODE configuration and implement dev-token endpoint for authentication (2026-04-07 10:39:07 +08:00)
354 changed files with 880 additions and 2439 deletions

.gitignore

@@ -26,16 +26,16 @@ checkpoints/
 docker/models/
 docker/data/
 docker/.env
-qwen3-tts-frontend/node_modules/
-qwen3-tts-frontend/dist/
-qwen3-tts-frontend/.env
-qwen3-tts-frontend/.env.local
+frontend/node_modules/
+frontend/dist/
+frontend/.env
+frontend/.env.local
 CLAUDE.md
 样本.mp3
 aliyun.md
 /nginx.conf
 deploy.md
-qwen3-tts-backend/scripts
-qwen3-tts-backend/examples
-qwen3-tts-backend/qwen3-tts.service
-qwen3-tts-frontend/.env.production
+backend/scripts
+backend/examples
+backend/canto.service
+frontend/.env.production


@@ -1,348 +0,0 @@
# Qwen3-TTS WebUI

> **⚠️ Note:** This project is largely AI-generated and currently unstable. Stable versions will be published under [Releases](../../releases).

An **unofficial** text-to-speech web application built on Qwen3-TTS, supporting custom voices, voice design, and voice cloning through an intuitive web interface.

> This is an unofficial project. For the official Qwen3-TTS repository, see [QwenLM/Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS).

[English Documentation](./README.md)

## Features
- Custom voices: predefined speaker voices
- Voice design: create voices from natural-language descriptions
- Voice cloning: clone a voice from an uploaded audio sample
- **IndexTTS2**: high-quality voice cloning with emotion control (happy, angry, sad, afraid, surprised, etc.), powered by [IndexTTS2](https://github.com/iszhanjiawei/indexTTS2)
- Audiobook generation: upload an EPUB, let an LLM extract the characters and assign voices, and generate a multi-character audiobook; IndexTTS2 can be enabled per character
- Dual backends: switch between the local model and the Aliyun TTS API
- Multi-language UI: English, Simplified Chinese, Traditional Chinese, Japanese, Korean
- JWT authentication, async jobs, voice caching, dark mode
## Screenshots
### Desktop - Light Mode
![Light mode](./images/lightmode-english.png)
### Desktop - Dark Mode
![Dark mode](./images/darkmode-chinese.png)
### Mobile
<table>
<tr>
<td width="50%"><img src="./images/mobile-lightmode-custom.png" alt="Mobile light mode" /></td>
<td width="50%"><img src="./images/mobile-settings.png" alt="Mobile settings" /></td>
</tr>
</table>
### Audiobook Generation
![Audiobook overview](./images/audiobook-overview.png)
<table>
<tr>
<td width="50%"><img src="./images/audiobook-characters.png" alt="Audiobook character list" /></td>
<td width="50%"><img src="./images/audiobook-chapters.png" alt="Audiobook chapter list" /></td>
</tr>
</table>
## Tech Stack
**Backend**: FastAPI + SQLAlchemy + PyTorch + JWT
- Direct Qwen3-TTS model inference with PyTorch
- Async job processing with batch optimization
- Dual backend support: local model + Aliyun API

**Frontend**: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui
## Docker Deployment
Prebuilt images are published on Docker Hub: [bdim404/qwen3-tts-backend](https://hub.docker.com/r/bdim404/qwen3-tts-backend), [bdim404/qwen3-tts-frontend](https://hub.docker.com/r/bdim404/qwen3-tts-frontend)

**Prerequisites**: Docker, Docker Compose, NVIDIA GPU + [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)
```bash
git clone https://github.com/bdim404/Qwen3-TTS-WebUI.git
cd Qwen3-TTS-WebUI
# Download models into docker/models/ (see "Installation > 2. Download models" below)
mkdir -p docker/models docker/data
# Configure
cp docker/.env.example docker/.env
# Edit docker/.env and set SECRET_KEY
cd docker
# Pull the prebuilt images
docker compose pull
# Start (CPU only)
docker compose up -d
# Start (GPU accelerated)
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```
Open `http://localhost`. Default account: `admin` / `admin123456`
## Installation
### Requirements
- Python 3.9+ with CUDA support (for local model inference)
- Node.js 18+ (for the frontend)
- Git
### 1. Clone the repository
```bash
git clone https://github.com/bdim404/Qwen3-TTS-WebUI.git
cd Qwen3-TTS-WebUI
```
### 2. Download models
**Important**: the models are **not** downloaded automatically; you must download them manually.
For details, see the official repository: [Qwen3-TTS models](https://github.com/QwenLM/Qwen3-TTS)
Change into the model directory:
```bash
# Docker deployment
mkdir -p docker/models && cd docker/models
# Local deployment
cd qwen3-tts-backend && mkdir -p Qwen && cd Qwen
```
**Option 1: Download via ModelScope (recommended for users in mainland China)**
```bash
pip install -U modelscope
modelscope download --model Qwen/Qwen3-TTS-Tokenizer-12Hz --local_dir ./Qwen3-TTS-Tokenizer-12Hz
modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local_dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-Base --local_dir ./Qwen3-TTS-12Hz-1.7B-Base
```
Optional 0.6B models (smaller and faster):
```bash
modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --local_dir ./Qwen3-TTS-12Hz-0.6B-Base
```
**Option 2: Download via Hugging Face**
```bash
pip install -U "huggingface_hub[cli]"
hf download Qwen/Qwen3-TTS-Tokenizer-12Hz --local-dir ./Qwen3-TTS-Tokenizer-12Hz
hf download Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
hf download Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local-dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
hf download Qwen/Qwen3-TTS-12Hz-1.7B-Base --local-dir ./Qwen3-TTS-12Hz-1.7B-Base
```
Optional 0.6B models (smaller and faster):
```bash
hf download Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
hf download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir ./Qwen3-TTS-12Hz-0.6B-Base
```
**IndexTTS2 models (optional, for emotion-controlled voice cloning)**
IndexTTS2 is an optional feature. To use it, run the following in the same `Qwen/` directory:
```bash
# Download only the required files instead of the full repository
hf download IndexTeam/IndexTTS-2 \
  bpe.model config.yaml feat1.pt feat2.pt gpt.pth s2mel.pth wav2vec2bert_stats.pt \
  --local-dir ./IndexTTS2
```
Then install the indextts package:
```bash
git clone https://github.com/iszhanjiawei/indexTTS2.git
cd indexTTS2
pip install -e . --no-deps
cd ..
```
**Final directory layout:**

Docker deployment (`docker/models/`):
```
Qwen3-TTS-WebUI/
└── docker/
    └── models/
        ├── Qwen3-TTS-Tokenizer-12Hz/
        ├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
        ├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
        └── Qwen3-TTS-12Hz-1.7B-Base/
```

Local deployment (`qwen3-tts-backend/Qwen/`):
```
Qwen3-TTS-WebUI/
└── qwen3-tts-backend/
    └── Qwen/
        ├── Qwen3-TTS-Tokenizer-12Hz/
        ├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
        ├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
        ├── Qwen3-TTS-12Hz-1.7B-Base/
        └── IndexTTS2/          ← optional, for the IndexTTS2 feature
            ├── bpe.model
            ├── config.yaml
            ├── feat1.pt
            ├── feat2.pt
            ├── gpt.pth
            ├── s2mel.pth
            └── wav2vec2bert_stats.pt
```
### 3. Backend setup
```bash
cd qwen3-tts-backend
# Create a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install Qwen3-TTS
pip install qwen-tts
# Create the config file
cp .env.example .env
# Edit the config file:
#   Local model: set MODEL_BASE_PATH=./Qwen
#   Aliyun API only: set DEFAULT_BACKEND=aliyun
nano .env  # or any other editor
```
**Key backend settings** (`.env` file):
```env
MODEL_DEVICE=cuda:0       # use the GPU, or "cpu" for CPU
MODEL_BASE_PATH=./Qwen    # path to the downloaded models
DEFAULT_BACKEND=local     # 'local' for the local model, 'aliyun' for the API
DATABASE_URL=sqlite:///./qwen_tts.db
SECRET_KEY=your-secret-key-here  # change this!
```
Start the backend:
```bash
# Run directly with uvicorn
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# Or with conda, if you prefer
conda run -n qwen3-tts uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```
Verify the backend is running:
```bash
curl http://127.0.0.1:8000/health
```
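As an illustrative sketch only (the project's actual `config.py` uses pydantic's `BaseSettings`), the `.env` keys above map to runtime settings roughly like this; the defaults mirror the example values:

```python
import os

# Defaults mirror the .env example above; the real backend reads these via
# pydantic BaseSettings, so this stdlib version is illustrative only.
DEFAULTS = {
    "MODEL_DEVICE": "cuda:0",
    "MODEL_BASE_PATH": "./Qwen",
    "DEFAULT_BACKEND": "local",
    "DATABASE_URL": "sqlite:///./qwen_tts.db",
}

def load_settings(env=None):
    """Return settings, letting environment variables override defaults."""
    env = os.environ if env is None else env
    return {key: env.get(key, default) for key, default in DEFAULTS.items()}
```

Any key missing from the environment falls back to its default, which is why a fresh `.env` with only `SECRET_KEY` set still boots in local mode.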
### 4. Frontend setup
```bash
cd qwen3-tts-frontend
# Install dependencies
npm install
# Create the config file
cp .env.example .env
# Start the dev server
npm run dev
```
### 5. Access the app
Open `http://localhost:5173` in your browser.
**Default account**
- Username: `admin`
- Password: `admin123456`
- **Important**: change the password immediately after logging in!
### Production deployment
For production:
```bash
# Backend: run under gunicorn (or a similar process manager) with ASGI uvicorn workers
cd qwen3-tts-backend
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000
# Frontend: build the static files
cd qwen3-tts-frontend
npm run build
# Serve the 'dist' folder with nginx or another web server
```
## Configuration
### Backend configuration
Key settings in the backend `.env`:
```env
SECRET_KEY=your-secret-key
MODEL_DEVICE=cuda:0
MODEL_BASE_PATH=../Qwen
DATABASE_URL=sqlite:///./qwen_tts.db
DEFAULT_BACKEND=local
ALIYUN_REGION=beijing
ALIYUN_MODEL_FLASH=qwen3-tts-flash-realtime
ALIYUN_MODEL_VC=qwen3-tts-vc-realtime-2026-01-15
ALIYUN_MODEL_VD=qwen3-tts-vd-realtime-2026-01-15
```
**Backend options:**
- `DEFAULT_BACKEND`: the default TTS backend, either `local` or `aliyun`
- **Local mode**: uses the local Qwen3-TTS models (requires `MODEL_BASE_PATH`)
- **Aliyun mode**: uses the Aliyun TTS API (users configure an API key on the settings page)

**Aliyun configuration:**
- Users add their Aliyun API key on the settings page of the web UI
- API keys are encrypted before being stored in the database
- The superuser controls whether the local model is enabled for all users
- To obtain an Aliyun API key, visit the [Aliyun console](https://dashscope.console.aliyun.com/)
## Usage
### Switching backends
1. Log in to the web UI
2. Open the settings page
3. Configure your preferred backend:
   - **Local model**: select "Local model" (the superuser must have enabled it)
   - **Aliyun API**: select "Aliyun" and add your API key
4. The selected backend is used by default for all TTS operations
5. You can also override the backend for a single request via the API's `backend` parameter
### Managing the Aliyun API key
1. Find the "Aliyun API key" section on the settings page
2. Enter your Aliyun API key
3. Click "Update key" to save and verify it
4. The key is validated before it is saved
5. You can delete the key at any time with the delete button
## Acknowledgements
This project is built on the official [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS) repository, open-sourced by Alibaba's Qwen team. Many thanks to the Qwen team for releasing such a powerful text-to-speech model.
## License
Apache-2.0 license


@@ -69,7 +69,7 @@ def _char_to_response(c, db: Session) -> AudiobookCharacterResponse:
     if vd:
         vd_name = vd.name
         meta = vd.meta_data or {}
-        vd_speaker = meta.get('speaker') or vd.aliyun_voice_id or vd.instruct or None
+        vd_speaker = meta.get('speaker') or vd.instruct or None
     return AudiobookCharacterResponse(
         id=c.id,
         project_id=c.project_id,
@@ -80,7 +80,7 @@ def _char_to_response(c, db: Session) -> AudiobookCharacterResponse:
         voice_design_id=c.voice_design_id,
         voice_design_name=vd_name,
         voice_design_speaker=vd_speaker,
+        use_indextts2=c.use_indextts2 or False,
     )
@@ -561,7 +561,7 @@ async def regenerate_character_preview_endpoint(
     from core.audiobook_service import generate_character_preview
     try:
-        await generate_character_preview(project_id, char_id, current_user, db)
+        await generate_character_preview(project_id, char_id, current_user, db, force_recreate=True)
         return {"message": "Preview generated successfully"}
     except ValueError as e:
         raise HTTPException(status_code=400, detail=str(e))
@@ -740,14 +740,18 @@ async def update_character(
         description=data.description,
         instruct=data.instruct,
         voice_design_id=data.voice_design_id,
+        use_indextts2=data.use_indextts2,
     )
-    if data.instruct is not None and char.voice_design_id:
+    if (data.instruct is not None or data.gender is not None) and char.voice_design_id:
         voice_design = crud.get_voice_design(db, char.voice_design_id, current_user.id)
+        logger.info(f"update_character: char_id={char_id}, voice_design_id={char.voice_design_id}, found={voice_design is not None}")
         if voice_design:
-            voice_design.instruct = data.instruct
+            if data.instruct is not None:
+                voice_design.instruct = data.instruct
+            voice_design.voice_cache_id = None
             db.commit()
+            logger.info(f"update_character: cleared voice_cache_id for design {voice_design.id}")
     return _char_to_response(char, db)
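The last hunk above widens the cache-invalidation trigger: any edit to `instruct` or `gender` on a character with a linked voice design now clears `voice_cache_id`, forcing the next preview to regenerate. A minimal sketch of that rule (a hypothetical helper, not part of the codebase):

```python
def should_invalidate_voice_cache(instruct, gender, voice_design_id):
    """Clear the cached voice when instruct or gender changed on a
    character that has a linked voice design (mirrors the hunk above).
    None means "field not edited in this request"."""
    return voice_design_id is not None and (
        instruct is not None or gender is not None
    )
```

Previously only an `instruct` edit triggered invalidation; a gender-only edit left a stale cached voice in place.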


@@ -1,5 +1,5 @@
 from datetime import timedelta
-from typing import Annotated
+from typing import Annotated, Optional
 from fastapi import APIRouter, Depends, HTTPException, status, Request
 from fastapi.security import OAuth2PasswordBearer, OAuth2PasswordRequestForm
 from sqlalchemy.orm import Session
@@ -14,26 +14,34 @@ from core.security import (
     decode_access_token
 )
 from db.database import get_db
-from db.crud import get_user_by_username, get_user_by_email, create_user, change_user_password, get_user_preferences, update_user_preferences, can_user_use_local_model, can_user_use_nsfw, get_system_setting
-from schemas.user import User, UserCreate, Token, PasswordChange, AliyunKeyVerifyResponse, UserPreferences, UserPreferencesResponse
+from db.crud import get_user_by_username, get_user_by_email, create_user, change_user_password, get_user_preferences, update_user_preferences, can_user_use_nsfw, get_system_setting
+from schemas.user import User, UserCreate, Token, PasswordChange, UserPreferences, UserPreferencesResponse
 from schemas.audiobook import LLMConfigResponse

 router = APIRouter(prefix="/auth", tags=["authentication"])
-oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/auth/token")
+oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/auth/token", auto_error=not settings.DEV_MODE)
 limiter = Limiter(key_func=get_remote_address)

 async def get_current_user(
-    token: Annotated[str, Depends(oauth2_scheme)],
+    token: Annotated[Optional[str], Depends(oauth2_scheme)],
     db: Session = Depends(get_db)
 ) -> User:
+    if settings.DEV_MODE and not token:
+        user = get_user_by_username(db, username="admin")
+        if user:
+            return user
     credentials_exception = HTTPException(
         status_code=status.HTTP_401_UNAUTHORIZED,
         detail="Could not validate credentials",
         headers={"WWW-Authenticate": "Bearer"},
     )
+    if token is None:
+        raise credentials_exception
     username = decode_access_token(token)
     if username is None:
         raise credentials_exception
@@ -99,6 +107,16 @@ async def login(
     return {"access_token": access_token, "token_type": "bearer"}

+@router.get("/dev-token", response_model=Token)
+async def dev_token(db: Session = Depends(get_db)):
+    if not settings.DEV_MODE:
+        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Not available outside DEV_MODE")
+    user = get_user_by_username(db, username="admin")
+    if not user:
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Admin user not found")
+    access_token = create_access_token(data={"sub": user.username})
+    return {"access_token": access_token, "token_type": "bearer"}
+
 @router.get("/me", response_model=User)
 @limiter.limit("30/minute")
 async def get_current_user_info(
@@ -137,31 +155,6 @@ async def change_password(
     return user

-@router.get("/aliyun-key/verify", response_model=AliyunKeyVerifyResponse)
-@limiter.limit("10/minute")
-async def verify_aliyun_key(
-    request: Request,
-    current_user: Annotated[User, Depends(get_current_user)],
-    db: Session = Depends(get_db)
-):
-    from core.security import decrypt_api_key
-    from core.tts_service import AliyunTTSBackend
-    encrypted = get_system_setting(db, "aliyun_api_key")
-    if not encrypted:
-        return AliyunKeyVerifyResponse(valid=False, message="No Aliyun API key configured")
-    api_key = decrypt_api_key(encrypted)
-    if not api_key:
-        return AliyunKeyVerifyResponse(valid=False, message="Failed to decrypt API key")
-    aliyun_backend = AliyunTTSBackend(api_key=api_key, region=settings.ALIYUN_REGION)
-    health = await aliyun_backend.health_check()
-    if health.get("available", False):
-        return AliyunKeyVerifyResponse(valid=True, message="Aliyun API key is valid and working")
-    return AliyunKeyVerifyResponse(valid=False, message="Aliyun API key is not working.")
-
 @router.get("/preferences", response_model=UserPreferencesResponse)
 @limiter.limit("30/minute")
 async def get_preferences(
@@ -171,14 +164,10 @@ async def get_preferences(
 ):
     prefs = get_user_preferences(db, current_user.id)
-    available_backends = ["aliyun"]
-    if can_user_use_local_model(current_user):
-        available_backends.append("local")
     return {
-        "default_backend": prefs.get("default_backend", "aliyun"),
+        "default_backend": "local",
         "onboarding_completed": prefs.get("onboarding_completed", False),
-        "available_backends": available_backends
+        "available_backends": ["local"]
     }

 @router.put("/preferences")
@@ -189,13 +178,6 @@ async def update_preferences(
     current_user: Annotated[User, Depends(get_current_user)],
     db: Session = Depends(get_db)
 ):
-    if preferences.default_backend == "local":
-        if not can_user_use_local_model(current_user):
-            raise HTTPException(
-                status_code=status.HTTP_403_FORBIDDEN,
-                detail="Local model is not available. Please contact administrator."
-            )
     updated_user = update_user_preferences(
         db,
         current_user.id,
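The DEV_MODE hunks above add an unauthenticated fallback to `get_current_user` plus a `/auth/dev-token` endpoint. A hedged sketch of the token-resolution order they implement (a standalone function for illustration, not the project's code):

```python
def resolve_dev_user(token, dev_mode, find_user, decode_token):
    """Mirror the get_current_user logic from the diff: in DEV_MODE a
    missing token falls back to the admin user; otherwise the bearer
    token must decode to a known username or the request is rejected."""
    if dev_mode and not token:
        user = find_user("admin")
        if user:
            return user
    if token is None:
        raise PermissionError("Could not validate credentials")
    username = decode_token(token)
    if username is None:
        raise PermissionError("Could not validate credentials")
    return find_user(username)
```

Note the ordering matters: the DEV_MODE branch runs before the `token is None` check, which is why `auto_error=not settings.DEV_MODE` is needed on the OAuth2 scheme so FastAPI passes `None` through instead of raising 401 itself.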


@@ -70,14 +70,7 @@ async def process_custom_voice_job(
     logger.info(f"Processing custom-voice job {job_id} with backend {backend_type}")

-    user_api_key = None
-    if backend_type == "aliyun":
-        from db.crud import get_system_setting
-        encrypted = get_system_setting(db, "aliyun_api_key")
-        if encrypted:
-            user_api_key = decrypt_api_key(encrypted)
-    backend = await TTSServiceFactory.get_backend(backend_type, user_api_key)
+    backend = await TTSServiceFactory.get_backend()
     audio_bytes, sample_rate = await backend.generate_custom_voice(request_data)
@@ -133,19 +126,9 @@ async def process_voice_design_job(
     logger.info(f"Processing voice-design job {job_id} with backend {backend_type}")

-    user_api_key = None
-    if backend_type == "aliyun":
-        from db.crud import get_system_setting
-        encrypted = get_system_setting(db, "aliyun_api_key")
-        if encrypted:
-            user_api_key = decrypt_api_key(encrypted)
-    backend = await TTSServiceFactory.get_backend(backend_type, user_api_key)
-    if backend_type == "aliyun" and saved_voice_id:
-        audio_bytes, sample_rate = await backend.generate_voice_design(request_data, saved_voice_id)
-    else:
-        audio_bytes, sample_rate = await backend.generate_voice_design(request_data)
+    backend = await TTSServiceFactory.get_backend()
+    audio_bytes, sample_rate = await backend.generate_voice_design(request_data)

     timestamp = datetime.utcnow().strftime("%Y%m%d_%H%M%S")
     filename = f"{user_id}_{job_id}_{timestamp}.wav"
@@ -200,14 +183,6 @@ async def process_voice_clone_job(
     logger.info(f"Processing voice-clone job {job_id} with backend {backend_type}")

-    from core.security import decrypt_api_key
-    user_api_key = None
-    if backend_type == "aliyun":
-        from db.crud import get_system_setting
-        encrypted = get_system_setting(db, "aliyun_api_key")
-        if encrypted:
-            user_api_key = decrypt_api_key(encrypted)

     with open(ref_audio_path, 'rb') as f:
         ref_audio_data = f.read()
@@ -233,7 +208,7 @@ async def process_voice_clone_job(
         ref_audio_data = f.read()
     ref_audio_hash = cache_manager.get_audio_hash(ref_audio_data)

-    if request_data.get('x_vector_only_mode', False) and backend_type == "local":
+    if request_data.get('x_vector_only_mode', False):
         x_vector = None
         cache_id = None
@@ -287,9 +262,9 @@ async def process_voice_clone_job(
         logger.info(f"Job {job_id} completed (x_vector_only_mode)")
         return

-    backend = await TTSServiceFactory.get_backend(backend_type, user_api_key)
+    backend = await TTSServiceFactory.get_backend()

-    if voice_design_id and backend_type == "local":
+    if voice_design_id:
         from db.crud import get_voice_design
         design = get_voice_design(db, voice_design_id, user_id)
         cached = await cache_manager.get_cache_by_id(design.voice_cache_id, db)
@@ -339,34 +314,20 @@ async def create_custom_voice_job(
     current_user: User = Depends(get_current_user),
     db: Session = Depends(get_db)
 ):
-    from core.security import decrypt_api_key
-    from db.crud import get_user_preferences, can_user_use_local_model
+    from db.crud import can_user_use_local_model

-    user_prefs = get_user_preferences(db, current_user.id)
-    preferred_backend = user_prefs.get("default_backend", "aliyun")
-    can_use_local = can_user_use_local_model(current_user)
-    backend_type = req_data.backend if hasattr(req_data, 'backend') and req_data.backend else preferred_backend
-    if backend_type == "local" and not can_use_local:
+    if not can_user_use_local_model(current_user):
         raise HTTPException(
             status_code=status.HTTP_403_FORBIDDEN,
             detail="Local model is not available. Please contact administrator."
         )
-    if backend_type == "aliyun":
-        from db.crud import get_system_setting
-        if not get_system_setting(db, "aliyun_api_key"):
-            raise HTTPException(
-                status_code=status.HTTP_400_BAD_REQUEST,
-                detail="Aliyun API key not configured. Please contact administrator."
-            )
+    backend_type = "local"
     try:
         validate_text_length(req_data.text)
         language = validate_language(req_data.language)
-        speaker = validate_speaker(req_data.speaker, backend_type)
+        speaker = validate_speaker(req_data.speaker)
         params = validate_generation_params({
             'max_new_tokens': req_data.max_new_tokens,
@@ -430,48 +391,24 @@ async def create_voice_design_job(
     current_user: User = Depends(get_current_user),
     db: Session = Depends(get_db)
 ):
-    from core.security import decrypt_api_key
-    from db.crud import get_user_preferences, can_user_use_local_model, get_voice_design, update_voice_design_usage
+    from db.crud import can_user_use_local_model, get_voice_design, update_voice_design_usage

-    user_prefs = get_user_preferences(db, current_user.id)
-    preferred_backend = user_prefs.get("default_backend", "aliyun")
-    can_use_local = can_user_use_local_model(current_user)
-    backend_type = req_data.backend if hasattr(req_data, 'backend') and req_data.backend else preferred_backend
-    saved_voice_id = None
+    if not can_user_use_local_model(current_user):
+        raise HTTPException(
+            status_code=status.HTTP_403_FORBIDDEN,
+            detail="Local model is not available. Please contact administrator."
+        )
+    backend_type = "local"
     if req_data.saved_design_id:
         saved_design = get_voice_design(db, req_data.saved_design_id, current_user.id)
         if not saved_design:
             raise HTTPException(status_code=404, detail="Saved voice design not found")
-        if saved_design.backend_type != backend_type:
-            raise HTTPException(
-                status_code=400,
-                detail=f"Saved design backend ({saved_design.backend_type}) doesn't match current backend ({backend_type})"
-            )
         req_data.instruct = saved_design.instruct
-        saved_voice_id = saved_design.aliyun_voice_id
         update_voice_design_usage(db, req_data.saved_design_id, current_user.id)
-    if backend_type == "local" and not can_use_local:
-        raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="Local model is not available. Please contact administrator."
-        )
-    if backend_type == "aliyun":
-        from db.crud import get_system_setting
-        if not get_system_setting(db, "aliyun_api_key"):
-            raise HTTPException(
-                status_code=status.HTTP_400_BAD_REQUEST,
-                detail="Aliyun API key not configured. Please contact administrator."
-            )
     try:
         validate_text_length(req_data.text)
         language = validate_language(req_data.language)
@@ -553,29 +490,15 @@ async def create_voice_clone_job(
     current_user: User = Depends(get_current_user),
     db: Session = Depends(get_db)
 ):
-    from core.security import decrypt_api_key
-    from db.crud import get_user_preferences, can_user_use_local_model, get_voice_design
+    from db.crud import can_user_use_local_model, get_voice_design

-    user_prefs = get_user_preferences(db, current_user.id)
-    preferred_backend = user_prefs.get("default_backend", "aliyun")
-    can_use_local = can_user_use_local_model(current_user)
-    backend_type = backend if backend else preferred_backend
-    if backend_type == "local" and not can_use_local:
+    if not can_user_use_local_model(current_user):
         raise HTTPException(
             status_code=status.HTTP_403_FORBIDDEN,
             detail="Local model is not available. Please contact administrator."
         )
-    if backend_type == "aliyun":
-        from db.crud import get_system_setting
-        if not get_system_setting(db, "aliyun_api_key"):
-            raise HTTPException(
-                status_code=status.HTTP_400_BAD_REQUEST,
-                detail="Aliyun API key not configured. Please contact administrator."
-            )
+    backend_type = "local"

     ref_audio_data = None
     ref_audio_hash = None
@@ -600,9 +523,6 @@ async def create_voice_clone_job(
             if not design:
                 raise ValueError("Voice design not found")
-            if design.backend_type != backend_type:
-                raise ValueError(f"Voice design backend ({design.backend_type}) doesn't match request backend ({backend_type})")
             if not design.voice_cache_id:
                 raise ValueError("Voice design has no prepared clone prompt. Please call /voice-designs/{id}/prepare-clone first")


@@ -5,7 +5,6 @@ from slowapi import Limiter
 from slowapi.util import get_remote_address
 from api.auth import get_current_user
-from config import settings
 from core.security import get_password_hash
 from db.database import get_db
 from db.crud import (
@@ -17,7 +16,7 @@ from db.crud import (
     update_user,
     delete_user
 )
-from schemas.user import User, UserCreateByAdmin, UserUpdate, UserListResponse, AliyunKeyUpdate, AliyunKeyVerifyResponse
+from schemas.user import User, UserCreateByAdmin, UserUpdate, UserListResponse
 from schemas.audiobook import LLMConfigUpdate, LLMConfigResponse, NsfwSynopsisGenerationRequest, NsfwScriptGenerationRequest

 router = APIRouter(prefix="/users", tags=["users"])
@@ -181,63 +180,6 @@ async def delete_user_by_id(
     )

-@router.post("/system/aliyun-key")
-@limiter.limit("5/minute")
-async def set_system_aliyun_key(
-    request: Request,
-    key_data: AliyunKeyUpdate,
-    db: Session = Depends(get_db),
-    _: User = Depends(require_superuser)
-):
-    from core.security import encrypt_api_key
-    from core.tts_service import AliyunTTSBackend
-    from db.crud import set_system_setting
-    api_key = key_data.api_key.strip()
-    aliyun_backend = AliyunTTSBackend(api_key=api_key, region=settings.ALIYUN_REGION)
-    health = await aliyun_backend.health_check()
-    if not health.get("available", False):
-        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid Aliyun API key.")
-    set_system_setting(db, "aliyun_api_key", encrypt_api_key(api_key))
-    return {"message": "Aliyun API key updated"}
-
-@router.delete("/system/aliyun-key")
-@limiter.limit("5/minute")
-async def delete_system_aliyun_key(
-    request: Request,
-    db: Session = Depends(get_db),
-    _: User = Depends(require_superuser)
-):
-    from db.crud import delete_system_setting
-    delete_system_setting(db, "aliyun_api_key")
-    return {"message": "Aliyun API key deleted"}
-
-@router.get("/system/aliyun-key/verify", response_model=AliyunKeyVerifyResponse)
-@limiter.limit("10/minute")
-async def verify_system_aliyun_key(
-    request: Request,
-    db: Session = Depends(get_db),
-    _: User = Depends(require_superuser)
-):
-    from core.security import decrypt_api_key
-    from core.tts_service import AliyunTTSBackend
-    from db.crud import get_system_setting
-    encrypted = get_system_setting(db, "aliyun_api_key")
-    if not encrypted:
-        return AliyunKeyVerifyResponse(valid=False, message="No Aliyun API key configured")
-    api_key = decrypt_api_key(encrypted)
-    if not api_key:
-        return AliyunKeyVerifyResponse(valid=False, message="Failed to decrypt API key")
-    aliyun_backend = AliyunTTSBackend(api_key=api_key, region=settings.ALIYUN_REGION)
-    health = await aliyun_backend.health_check()
-    if health.get("available", False):
-        return AliyunKeyVerifyResponse(valid=True, message="Aliyun API key is valid and working")
-    return AliyunKeyVerifyResponse(valid=False, message="Aliyun API key is not working.")

 @router.put("/system/llm-config")
 @limiter.limit("10/minute")
 async def set_system_llm_config(


@@ -33,9 +33,7 @@ def to_voice_design_response(design) -> VoiceDesignResponse:
         id=design.id,
         user_id=design.user_id,
         name=design.name,
-        backend_type=design.backend_type,
         instruct=design.instruct,
-        aliyun_voice_id=design.aliyun_voice_id,
         meta_data=meta_data,
         preview_text=design.preview_text,
         ref_audio_path=design.ref_audio_path,
@@ -58,8 +56,6 @@ async def save_voice_design(
         user_id=current_user.id,
         name=data.name,
         instruct=data.instruct,
-        backend_type=data.backend_type,
-        aliyun_voice_id=data.aliyun_voice_id,
         meta_data=data.meta_data,
         preview_text=data.preview_text
     )
@@ -153,7 +149,6 @@ async def prepare_and_create_voice_design(
         user_id=current_user.id,
         name=data.name,
         instruct=data.instruct,
-        backend_type="local",
         meta_data=data.meta_data,
         preview_text=data.preview_text,
         voice_cache_id=cache_id,
@@ -200,12 +195,6 @@ async def prepare_voice_clone_prompt(
     if not design:
         raise HTTPException(status_code=404, detail="Voice design not found")
-    if design.backend_type != "local":
-        raise HTTPException(
-            status_code=400,
-            detail="Voice clone prompt preparation is only supported for local backend"
-        )
     if not can_user_use_local_model(current_user):
         raise HTTPException(
             status_code=403,


@@ -25,6 +25,7 @@ class Settings(BaseSettings):
     WORKERS: int = Field(default=1)
     LOG_LEVEL: str = Field(default="info")
     LOG_FILE: str = Field(default="./app.log")
+    DEV_MODE: bool = Field(default=False)
     RATE_LIMIT_PER_MINUTE: int = Field(default=50)
     RATE_LIMIT_PER_HOUR: int = Field(default=1000)
@@ -36,12 +37,6 @@ class Settings(BaseSettings):
     MAX_TEXT_LENGTH: int = Field(default=1000)
     MAX_AUDIO_SIZE_MB: int = Field(default=10)
-    ALIYUN_REGION: str = Field(default="beijing")
-    ALIYUN_MODEL_FLASH: str = Field(default="qwen3-tts-flash-realtime")
-    ALIYUN_MODEL_VC: str = Field(default="qwen3-tts-vc-realtime-2026-01-15")
-    ALIYUN_MODEL_VD: str = Field(default="qwen3-tts-vd-realtime-2026-01-15")
     DEFAULT_BACKEND: str = Field(default="local")
     AUDIOBOOK_PARSE_CONCURRENCY: int = Field(default=3)
@@ -60,7 +55,10 @@ class Settings(BaseSettings):
         return v
     def validate(self):
-        if self.SECRET_KEY == "your-secret-key-change-this-in-production":
+        if self.DEV_MODE:
+            import warnings
+            warnings.warn("DEV_MODE is enabled — authentication is bypassed. Do NOT use in production.")
+        elif self.SECRET_KEY == "your-secret-key-change-this-in-production":
             raise ValueError("Insecure default SECRET_KEY is not allowed. Please set a strong SECRET_KEY in environment.")
         Path(self.CACHE_DIR).mkdir(parents=True, exist_ok=True)
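Note: the `validate()` change above can be sketched as a standalone function (a hypothetical helper, not the actual `Settings` class) to show the intended behavior: `DEV_MODE` downgrades the insecure-default-key error to a warning.

```python
import warnings

DEFAULT_KEY = "your-secret-key-change-this-in-production"

def validate_secret(secret_key: str, dev_mode: bool) -> None:
    # Mirrors the diff: DEV_MODE warns instead of rejecting the default key.
    if dev_mode:
        warnings.warn("DEV_MODE is enabled - authentication is bypassed. Do NOT use in production.")
    elif secret_key == DEFAULT_KEY:
        raise ValueError("Insecure default SECRET_KEY is not allowed.")
```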


@@ -335,7 +335,7 @@ async def generate_ai_script(project_id: int, user: User, db: Session) -> None:
     crud.delete_audiobook_segments(db, project_id)
     crud.delete_audiobook_characters(db, project_id)
-    backend_type = user.user_preferences.get("default_backend", "aliyun") if user.user_preferences else "aliyun"
+    backend_type = "local"
     for char_data in characters_data:
         name = char_data.get("name", "旁白")
@@ -449,7 +449,7 @@ async def generate_ai_script_chapters(project_id: int, user: User, db: Session)
         for c in db_characters
     ]
     char_map = {c.name: c for c in db_characters}
-    backend_type = user.user_preferences.get("default_backend", "aliyun") if user.user_preferences else "aliyun"
+    backend_type = "local"
     ps.append_line(key, f"[AI剧本] 开始生成 {num_chapters} 章大纲...\n")
     ps.append_line(key, "")
@@ -618,7 +618,7 @@ async def continue_ai_script_chapters(project_id: int, additional_chapters: int,
         for c in db_characters
     ]
     char_map = {c.name: c for c in db_characters}
-    backend_type = user.user_preferences.get("default_backend", "aliyun") if user.user_preferences else "aliyun"
+    backend_type = "local"
     existing_chapters = crud.list_audiobook_chapters(db, project_id)
     existing_chapters_data = [
@@ -839,7 +839,7 @@ async def analyze_project(project_id: int, user: User, db: Session, turbo: bool
     crud.delete_audiobook_segments(db, project_id)
     crud.delete_audiobook_characters(db, project_id)
-    backend_type = user.user_preferences.get("default_backend", "aliyun") if user.user_preferences else "aliyun"
+    backend_type = "local"
     for char_data in characters_data:
         name = char_data.get("name", "旁白")
@@ -1437,7 +1437,7 @@ async def process_all(project_id: int, user: User, db: Session) -> None:
     logger.info(f"process_all: project={project_id} complete")
-async def generate_character_preview(project_id: int, char_id: int, user: User, db: Session) -> None:
+async def generate_character_preview(project_id: int, char_id: int, user: User, db: Session, force_recreate: bool = False) -> None:
     """Generate a short audio preview for a specific character."""
     project = crud.get_audiobook_project(db, project_id, user.id)
     if not project:
@@ -1470,21 +1470,17 @@ async def generate_character_preview(project_id: int, char_id: int, user: User,
         preview_text = f"你好,我是{preview_name}{preview_desc}"
     from core.tts_service import TTSServiceFactory
-    from core.security import decrypt_api_key
-    backend_type = user.user_preferences.get("default_backend", "aliyun") if user.user_preferences else "aliyun"
-    user_api_key = None
-    if backend_type == "aliyun":
-        encrypted = crud.get_system_setting(db, "aliyun_api_key")
-        if encrypted:
-            user_api_key = decrypt_api_key(encrypted)
-        elif user.aliyun_api_key:
-            user_api_key = decrypt_api_key(user.aliyun_api_key)
-    backend = await TTSServiceFactory.get_backend(backend_type, user_api_key)
+    backend = await TTSServiceFactory.get_backend()
     try:
-        if backend_type == "local" and not design.voice_cache_id:
+        if force_recreate and design.voice_cache_id:
+            design.voice_cache_id = None
+            db.commit()
+            db.refresh(design)
+            logger.info(f"Cleared voice_cache_id for char {char_id} (force_recreate)")
+        if not design.voice_cache_id:
             logger.info(f"Local voice cache missing for char {char_id}. Bootstrapping now...")
             from core.model_manager import ModelManager
             from core.cache_manager import VoiceCacheManager
@@ -1524,73 +1520,46 @@ async def generate_character_preview(project_id: int, char_id: int, user: User,
             db.commit()
             logger.info(f"Bootstrapped local voice cache for preview: design_id={design.id}, cache_id={cache_id}")
-        if backend_type == "aliyun" and not design.aliyun_voice_id:
-            from core.tts_service import AliyunTTSBackend
-            if isinstance(backend, AliyunTTSBackend):
-                try:
-                    voice_id = await backend._create_voice_design(
-                        instruct=_get_gendered_instruct(char.gender, design.instruct),
-                        preview_text=preview_text,
-                    )
-                    design.aliyun_voice_id = voice_id
-                    db.commit()
-                    logger.info(f"Bootstrapped aliyun voice_id for preview: design_id={design.id}, voice_id={voice_id}")
-                except Exception as e:
-                    logger.warning(f"Failed to bootstrap aliyun voice_id for preview, falling back to instruct: {e}")
-        if backend_type == "aliyun":
-            if design.aliyun_voice_id:
-                audio_bytes, _ = await backend.generate_voice_design(
-                    {"text": preview_text, "language": "zh"},
-                    saved_voice_id=design.aliyun_voice_id
-                )
-            else:
-                audio_bytes, _ = await backend.generate_voice_design({
-                    "text": preview_text,
-                    "language": "zh",
-                    "instruct": _get_gendered_instruct(char.gender, design.instruct),
-                })
-        else:
-            if design.voice_cache_id:
-                from core.cache_manager import VoiceCacheManager
-                cache_manager = await VoiceCacheManager.get_instance()
-                cache_result = await cache_manager.get_cache_by_id(design.voice_cache_id, db)
-                x_vector = cache_result['data'] if cache_result else None
-                if x_vector:
-                    audio_bytes, _ = await backend.generate_voice_clone(
-                        {
-                            "text": preview_text,
-                            "language": "Auto",
-                            "max_new_tokens": 512,
-                            "temperature": 0.3,
-                            "top_k": 10,
-                            "top_p": 0.9,
-                            "repetition_penalty": 1.05,
-                        },
-                        x_vector=x_vector
-                    )
-                else:
-                    audio_bytes, _ = await backend.generate_voice_design({
-                        "text": preview_text,
-                        "language": "Auto",
-                        "instruct": _get_gendered_instruct(char.gender, design.instruct),
-                        "max_new_tokens": 512,
-                        "temperature": 0.3,
-                        "top_k": 10,
-                        "top_p": 0.9,
-                        "repetition_penalty": 1.05,
-                    })
-            else:
-                audio_bytes, _ = await backend.generate_voice_design({
-                    "text": preview_text,
-                    "language": "Auto",
-                    "instruct": design.instruct,
-                    "max_new_tokens": 512,
-                    "temperature": 0.3,
-                    "top_k": 10,
-                    "top_p": 0.9,
-                    "repetition_penalty": 1.05,
-                })
+        if design.voice_cache_id:
+            from core.cache_manager import VoiceCacheManager
+            cache_manager = await VoiceCacheManager.get_instance()
+            cache_result = await cache_manager.get_cache_by_id(design.voice_cache_id, db)
+            x_vector = cache_result['data'] if cache_result else None
+            if x_vector:
+                audio_bytes, _ = await backend.generate_voice_clone(
+                    {
+                        "text": preview_text,
+                        "language": "Auto",
+                        "max_new_tokens": 512,
+                        "temperature": 0.3,
+                        "top_k": 10,
+                        "top_p": 0.9,
+                        "repetition_penalty": 1.05,
+                    },
+                    x_vector=x_vector
+                )
+            else:
+                audio_bytes, _ = await backend.generate_voice_design({
+                    "text": preview_text,
+                    "language": "Auto",
+                    "instruct": _get_gendered_instruct(char.gender, design.instruct),
+                    "max_new_tokens": 512,
+                    "temperature": 0.3,
+                    "top_k": 10,
+                    "top_p": 0.9,
+                    "repetition_penalty": 1.05,
+                })
+        else:
+            audio_bytes, _ = await backend.generate_voice_design({
+                "text": preview_text,
+                "language": "Auto",
+                "instruct": design.instruct,
+                "max_new_tokens": 512,
+                "temperature": 0.3,
+                "top_k": 10,
+                "top_p": 0.9,
+                "repetition_penalty": 1.05,
+            })
     with open(audio_path, "wb") as f:
         f.write(audio_bytes)
@@ -1672,7 +1641,7 @@ async def generate_ai_script_nsfw(project_id: int, user: User, db: Session) -> N
     crud.delete_audiobook_segments(db, project_id)
     crud.delete_audiobook_characters(db, project_id)
-    backend_type = user.user_preferences.get("default_backend", "aliyun") if user.user_preferences else "aliyun"
+    backend_type = "local"
     for char_data in characters_data:
         name = char_data.get("name", "旁白")


@@ -321,26 +321,17 @@ class LLMService:
     @staticmethod
     def _emotion_limits(violence_level: int, eroticism_level: int) -> tuple[str, str]:
-        v = violence_level / 10
-        e = eroticism_level / 10
-        female_happy = round(0.20 + 0.45 * e, 2)
-        angry = round(0.15 + 0.65 * v, 2)
-        sad = round(0.10 + 0.40 * v, 2)
-        fear = round(0.10 + 0.60 * v, 2)
-        hate = round(0.35 + 0.25 * max(v, e), 2)
-        low = round(0.35 + 0.45 * e, 2)
-        surprise = round(0.10 + 0.35 * max(v, e), 2)
-        limits = (
-            f"愤怒={angry}、悲伤={sad}、恐惧={fear}、厌恶={hate}、低沉={low}、惊讶={surprise};"
-            f"开心:男性角色上限=0.20,女性角色上限={female_happy}"
-        )
         guidance_parts = []
-        if violence_level >= 4:
-            guidance_parts.append(f"暴力程度{violence_level}/10:台词中的愤怒、恐惧、悲伤情绪必须强烈外露,不得克制")
-        if eroticism_level >= 4:
-            guidance_parts.append(f"色情程度{eroticism_level}/10:女性台词中的开心、低沉、挑逗情绪应充分表达")
+        if violence_level >= 7:
+            guidance_parts.append(f"暴力程度{violence_level}/10:激烈场景的愤怒/恐惧/悲伤强度应用7-10级,不得克制")
+        elif violence_level >= 4:
+            guidance_parts.append(f"暴力程度{violence_level}/10:台词中的愤怒/恐惧/悲伤情绪可用4-7级")
+        if eroticism_level >= 7:
+            guidance_parts.append(f"色情程度{eroticism_level}/10:女性台词中的开心/低沉情绪应用7-10级充分表达")
+        elif eroticism_level >= 4:
+            guidance_parts.append(f"色情程度{eroticism_level}/10:女性台词中的开心/低沉情绪可用4-7级")
         guidance = ";".join(guidance_parts)
-        return limits, guidance
+        return "", guidance
     async def generate_chapter_script(
         self,
@@ -383,11 +374,9 @@
             " 【角色名】"对话内容"(情感词:强度)\n\n"
             "情感标注规则:\n"
             "- 情感词可选:开心、愤怒、悲伤、恐惧、厌恶、低沉、惊讶\n"
-            "- 单一情感:(情感词:强度),如(开心:0.5)、(悲伤:0.3)\n"
-            "- 混合情感:(情感1:比重+情感2:比重),如(开心:0.6+悲伤:0.2)、(愤怒:0.3+恐惧:0.4)\n"
-            "- 混合情感时每个情感的比重独立设定,反映各自对情绪的贡献\n"
-            f"- 各情感比重上限(严格不超过):{limits_str}\n"
-            "- 鼓励使用低值(0.05~0.10)表达微弱、内敛或一闪而过的情绪,无需非强即无\n"
+            "- 每行只允许标注一个情感词,格式:(情感词:强度级别),强度为1~10的整数,10最强\n"
+            "- 示例:(开心:6)、(悲伤:3)、(愤怒:8)\n"
+            "- 鼓励使用低值(1~3)表达微弱、内敛或一闪而过的情绪,无需非强即无\n"
             "- 确实没有任何情绪色彩时可省略整个括号\n"
             + char_personality_str
             + narrator_rule
@@ -468,18 +457,15 @@
             "所有非对话的叙述文字归属于旁白角色。\n"
             "同时根据语境为每个片段判断是否有明显情绪,有则在 emo_text 中标注,无则留空。\n"
             "可选情绪词:开心、愤怒、悲伤、恐惧、厌恶、低沉、惊讶。\n"
-            "emo_text 格式规则:\n"
-            " 单一情感:直接填情感词,用 emo_alpha 设置强度,如 emo_text=\"开心\", emo_alpha=0.3\n"
-            " 混合情感:用 情感词:比重 格式拼接,emo_alpha 设为 1.0,如 emo_text=\"开心:0.6+悲伤:0.2\", emo_alpha=1.0\n"
-            "各情感比重上限(严格不超过):开心=0.20、愤怒=0.15、悲伤=0.1、恐惧=0.1、厌恶=0.35、低沉=0.35、惊讶=0.10。\n"
-            "鼓励用低值(0.05~0.10)表达微弱或内敛的情绪,不要非强即无;完全无情绪色彩时 emo_text 置空。\n"
+            "emo_text 只允许单一情感词,emo_alpha 为1~10的整数表示强度,10最强;完全无情绪色彩时 emo_text 置空,emo_alpha 为 0。\n"
+            "鼓励用低值(1~3)表达微弱或内敛的情绪,不要非强即无。\n"
             + personality_str
             + "同一角色的连续台词,情绪应尽量保持一致或仅有微弱变化,避免相邻片段间情绪跳跃。\n"
             "只输出JSON数组,不要有其他文字,格式如下:\n"
             '[{"character": "旁白", "text": "叙述文字", "emo_text": "", "emo_alpha": 0}, '
-            '{"character": "角色名", "text": "淡淡的问候", "emo_text": "开心", "emo_alpha": 0.08}, '
-            '{"character": "角色名", "text": "激动的欢呼", "emo_text": "开心", "emo_alpha": 0.18}, '
-            '{"character": "角色名", "text": "含泪的笑", "emo_text": "开心:0.12+悲伤:0.08", "emo_alpha": 1.0}]'
+            '{"character": "角色名", "text": "淡淡的问候", "emo_text": "开心", "emo_alpha": 3}, '
+            '{"character": "角色名", "text": "激动的欢呼", "emo_text": "开心", "emo_alpha": 8}, '
+            '{"character": "角色名", "text": "愤怒的质问", "emo_text": "愤怒", "emo_alpha": 7}]'
         )
         user_message = f"请解析以下章节文本:\n\n{chapter_text}"
         result = await self.stream_chat_json(system_prompt, user_message, on_token, max_tokens=16384, usage_callback=usage_callback)
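Note: the new annotation contract above (one emotion word per line, integer level 1 to 10) can be parsed with a few lines of Python. This parser is illustrative only, not part of the diff; the actual consumption of these levels happens in `IndexTTS2Backend`.

```python
import re

# Illustrative parser for the single-emotion annotation, e.g.
# 【角色名】"台词"(开心:6) → ("开心", 6). Accepts half- and full-width
# parentheses/colons, since the prompt text uses Chinese punctuation.
EMO_WORDS = "开心|愤怒|悲伤|恐惧|厌恶|低沉|惊讶"
_ANNOT = re.compile(rf"[((]({EMO_WORDS})[::](10|[1-9])[))]\s*$")

def parse_emotion(line: str) -> tuple[str, int]:
    m = _ANNOT.search(line)
    if not m:
        return "", 0  # no annotation → neutral, matching emo_alpha=0 in the JSON format
    return m.group(1), int(m.group(2))
```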

backend/core/tts_service.py (new file, 286 lines added)

@@ -0,0 +1,286 @@
import asyncio
import functools
import logging
from abc import ABC, abstractmethod
from typing import Tuple, Optional
logger = logging.getLogger(__name__)
class TTSBackend(ABC):
@abstractmethod
async def generate_custom_voice(self, params: dict) -> Tuple[bytes, int]:
pass
@abstractmethod
async def generate_voice_design(self, params: dict) -> Tuple[bytes, int]:
pass
@abstractmethod
async def generate_voice_clone(self, params: dict, ref_audio_bytes: bytes) -> Tuple[bytes, int]:
pass
@abstractmethod
async def health_check(self) -> dict:
pass
class LocalTTSBackend(TTSBackend):
def __init__(self):
self.model_manager = None
# Add a lock to prevent concurrent VRAM contention and CUDA errors on local GPU models
self._gpu_lock = asyncio.Lock()
async def initialize(self):
from core.model_manager import ModelManager
self.model_manager = await ModelManager.get_instance()
async def generate_custom_voice(self, params: dict) -> Tuple[bytes, int]:
await self.model_manager.load_model("custom-voice")
_, tts = await self.model_manager.get_current_model()
loop = asyncio.get_event_loop()
async with self._gpu_lock:
result = await loop.run_in_executor(
None,
functools.partial(
tts.generate_custom_voice,
text=params['text'],
language=params['language'],
speaker=params['speaker'],
instruct=params.get('instruct', ''),
max_new_tokens=params['max_new_tokens'],
temperature=params['temperature'],
top_k=params['top_k'],
top_p=params['top_p'],
repetition_penalty=params['repetition_penalty'],
)
)
import numpy as np
wavs, sample_rate = result if isinstance(result, tuple) else (result, 24000)
audio_data = wavs[0] if isinstance(wavs, list) else wavs
return self._numpy_to_bytes(audio_data), sample_rate
async def generate_voice_design(self, params: dict) -> Tuple[bytes, int]:
await self.model_manager.load_model("voice-design")
_, tts = await self.model_manager.get_current_model()
loop = asyncio.get_event_loop()
async with self._gpu_lock:
result = await loop.run_in_executor(
None,
functools.partial(
tts.generate_voice_design,
text=params['text'],
language=params['language'],
instruct=params['instruct'],
max_new_tokens=params['max_new_tokens'],
temperature=params['temperature'],
top_k=params['top_k'],
top_p=params['top_p'],
repetition_penalty=params['repetition_penalty'],
)
)
import numpy as np
wavs, sample_rate = result if isinstance(result, tuple) else (result, 24000)
audio_data = wavs[0] if isinstance(wavs, list) else wavs
return self._numpy_to_bytes(audio_data), sample_rate
async def generate_voice_clone(self, params: dict, ref_audio_bytes: bytes = None, x_vector=None) -> Tuple[bytes, int]:
from utils.audio import process_ref_audio
await self.model_manager.load_model("base")
_, tts = await self.model_manager.get_current_model()
loop = asyncio.get_event_loop()
async with self._gpu_lock:
if x_vector is None:
if ref_audio_bytes is None:
raise ValueError("Either ref_audio_bytes or x_vector must be provided")
ref_audio_array, ref_sr = process_ref_audio(ref_audio_bytes)
x_vector = await loop.run_in_executor(
None,
functools.partial(
tts.create_voice_clone_prompt,
ref_audio=(ref_audio_array, ref_sr),
ref_text=params.get('ref_text', ''),
x_vector_only_mode=False,
)
)
wavs, sample_rate = await loop.run_in_executor(
None,
functools.partial(
tts.generate_voice_clone,
text=params['text'],
language=params['language'],
voice_clone_prompt=x_vector,
max_new_tokens=params['max_new_tokens'],
temperature=params['temperature'],
top_k=params['top_k'],
top_p=params['top_p'],
repetition_penalty=params['repetition_penalty'],
)
)
import numpy as np
audio_data = wavs[0] if isinstance(wavs, list) else wavs
if isinstance(audio_data, list):
audio_data = np.array(audio_data)
return self._numpy_to_bytes(audio_data), sample_rate
async def health_check(self) -> dict:
return {
"available": self.model_manager is not None,
"current_model": self.model_manager.current_model_name if self.model_manager else None
}
@staticmethod
def _numpy_to_bytes(audio_array) -> bytes:
import numpy as np
import io
import wave
if isinstance(audio_array, list):
audio_array = np.array(audio_array)
audio_array = np.clip(audio_array, -1.0, 1.0)
audio_int16 = (audio_array * 32767).astype(np.int16)
buffer = io.BytesIO()
with wave.open(buffer, 'wb') as wav_file:
wav_file.setnchannels(1)
wav_file.setsampwidth(2)
wav_file.setframerate(24000)
wav_file.writeframes(audio_int16.tobytes())
buffer.seek(0)
return buffer.read()
class IndexTTS2Backend:
_gpu_lock = asyncio.Lock()
# Level 10 = these raw weights. Scale linearly: level N → N/10 * max
EMO_LEVEL_MAX: dict[str, float] = {
"开心": 0.75, "happy": 0.75,
"愤怒": 0.08, "angry": 0.08,
"悲伤": 0.90, "sad": 0.90,
"恐惧": 0.10, "fear": 0.10,
"厌恶": 0.50, "hate": 0.50,
"低沉": 0.35, "low": 0.35,
"惊讶": 0.35, "surprise": 0.35,
}
# Emotion keyword → index mapping
# Order: [happy, angry, sad, fear, hate, low, surprise, neutral]
_EMO_KEYWORDS = [
['', '开心', '快乐', '高兴', '欢乐', '愉快', 'happy', '热情', '兴奋', '愉悦', '激动'],
['', '愤怒', '生气', '', 'angry', '气愤', '愤慨'],
['', '悲伤', '难过', '忧郁', '伤心', '', 'sad', '感慨', '沉重', '沉痛', ''],
['', '恐惧', '害怕', '', 'fear', '担心', '紧张'],
['厌恶', '', 'hate', '讨厌', '反感'],
['低落', '沮丧', '消沉', 'low', '抑郁', '颓废'],
['惊喜', '惊讶', '意外', 'surprise', '', '吃惊', '震惊'],
]
@staticmethod
def _emo_text_to_vector(emo_text: str) -> Optional[list]:
tokens = [t.strip() for t in emo_text.split('+') if t.strip()]
matched = []
for tok in tokens:
if ':' in tok:
name_part, w_str = tok.rsplit(':', 1)
try:
weight: Optional[float] = float(w_str)
except ValueError:
weight = None
else:
name_part = tok
weight = None
name_lower = name_part.lower().strip()
for idx, words in enumerate(IndexTTS2Backend._EMO_KEYWORDS):
for word in words:
if word in name_lower:
matched.append((idx, weight))
break
if not matched:
return None
vec = [0.0] * 8
has_explicit = any(w is not None for _, w in matched)
if has_explicit:
for idx, w in matched:
vec[idx] = w if w is not None else 0.5
else:
score = 0.8 if len(matched) == 1 else 0.5
for idx, _ in matched:
vec[idx] = 0.2 if idx == 1 else score
return vec
async def generate(
self,
text: str,
spk_audio_prompt: str,
output_path: str,
emo_text: str = None,
emo_alpha: float = 0.6,
) -> bytes:
from core.model_manager import IndexTTS2ModelManager
manager = await IndexTTS2ModelManager.get_instance()
tts = await manager.get_model()
loop = asyncio.get_event_loop()
emo_vector = None
if emo_text and len(emo_text.strip()) > 0:
resolved_emo_text = emo_text
resolved_emo_alpha = emo_alpha
if emo_alpha is not None and emo_alpha > 1:
level = min(10, max(1, round(emo_alpha)))
name = emo_text.strip()
max_val = self.EMO_LEVEL_MAX.get(name)
if max_val is None:
name_lower = name.lower()
for key, val in self.EMO_LEVEL_MAX.items():
if key in name_lower or name_lower in key:
max_val = val
break
if max_val is None:
max_val = 0.20
weight = round(level / 10 * max_val, 4)
resolved_emo_text = f"{name}:{weight}"
resolved_emo_alpha = 1.0
raw_vector = self._emo_text_to_vector(resolved_emo_text)
if raw_vector is not None:
emo_vector = [v * resolved_emo_alpha for v in raw_vector]
logger.info(f"IndexTTS2 emo_text={repr(emo_text)} emo_alpha={emo_alpha} → resolved={repr(resolved_emo_text)} emo_vector={emo_vector}")
async with IndexTTS2Backend._gpu_lock:
await loop.run_in_executor(
None,
functools.partial(
tts.infer,
spk_audio_prompt=spk_audio_prompt,
text=text,
output_path=output_path,
emo_vector=emo_vector,
emo_alpha=1.0,
)
)
with open(output_path, 'rb') as f:
return f.read()
class TTSServiceFactory:
_local_backend: Optional[LocalTTSBackend] = None
@classmethod
async def get_backend(cls, backend_type: str = None, user_api_key: Optional[str] = None) -> TTSBackend:
if cls._local_backend is None:
cls._local_backend = LocalTTSBackend()
await cls._local_backend.initialize()
return cls._local_backend
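Note: the level-to-weight mapping in `IndexTTS2Backend.generate` above (level N scales to N/10 of the per-emotion maximum, falling back to 0.20 for unknown names) can be checked with a standalone sketch; `level_to_weight` is a hypothetical helper name, not a function in the diff.

```python
# Standalone replica of IndexTTS2Backend's level→weight scaling, for illustration.
EMO_LEVEL_MAX = {
    "开心": 0.75, "愤怒": 0.08, "悲伤": 0.90, "恐惧": 0.10,
    "厌恶": 0.50, "低沉": 0.35, "惊讶": 0.35,
}

def level_to_weight(name: str, level: float, default_max: float = 0.20) -> float:
    # Clamp the incoming level to the 1-10 integer range, as generate() does.
    lvl = min(10, max(1, round(level)))
    max_val = EMO_LEVEL_MAX.get(name, default_max)
    return round(lvl / 10 * max_val, 4)
```

For example, 愤怒 at level 7 yields 7/10 × 0.08 = 0.056, so the model receives `"愤怒:0.056"` with `emo_alpha` resolved to 1.0.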


@@ -114,21 +114,6 @@ def change_user_password(
     db.refresh(user)
     return user
-def update_user_aliyun_key(
-    db: Session,
-    user_id: int,
-    encrypted_api_key: Optional[str]
-) -> Optional[User]:
-    user = get_user_by_id(db, user_id)
-    if not user:
-        return None
-    user.aliyun_api_key = encrypted_api_key
-    user.updated_at = datetime.utcnow()
-    db.commit()
-    db.refresh(user)
-    return user
 def create_job(db: Session, user_id: int, job_type: str, input_data: Dict[str, Any]) -> Job:
     job = Job(
         user_id=user_id,
@@ -244,8 +229,11 @@ def delete_cache_entry(db: Session, cache_id: int, user_id: int) -> bool:
 def get_user_preferences(db: Session, user_id: int) -> dict:
     user = get_user_by_id(db, user_id)
     if not user or not user.user_preferences:
-        return {"default_backend": "aliyun", "onboarding_completed": False}
-    return user.user_preferences
+        return {"default_backend": "local", "onboarding_completed": False}
+    prefs = dict(user.user_preferences)
+    if prefs.get("default_backend") == "aliyun":
+        prefs["default_backend"] = "local"
+    return prefs
 def update_user_preferences(db: Session, user_id: int, preferences: dict) -> Optional[User]:
     user = get_user_by_id(db, user_id)
@@ -276,7 +264,7 @@ def update_system_setting(db: Session, key: str, value: dict) -> SystemSettings:
     return setting
 def can_user_use_local_model(user: User) -> bool:
-    return user.is_superuser or user.can_use_local_model
+    return True
 def can_user_use_nsfw(user: User) -> bool:
     return user.is_superuser or user.can_use_nsfw
@@ -286,8 +274,6 @@ def create_voice_design(
     user_id: int,
     name: str,
     instruct: str,
-    backend_type: str,
-    aliyun_voice_id: Optional[str] = None,
     meta_data: Optional[Dict[str, Any]] = None,
     preview_text: Optional[str] = None,
     voice_cache_id: Optional[int] = None,
@@ -297,9 +283,7 @@
     design = VoiceDesign(
         user_id=user_id,
         name=name,
-        backend_type=backend_type,
         instruct=instruct,
-        aliyun_voice_id=aliyun_voice_id,
         meta_data=meta_data,
         preview_text=preview_text,
         voice_cache_id=voice_cache_id,
@@ -331,8 +315,6 @@ def list_voice_designs(
         VoiceDesign.user_id == user_id,
         VoiceDesign.is_active == True
     )
-    if backend_type:
-        query = query.filter(VoiceDesign.backend_type == backend_type)
     return query.order_by(VoiceDesign.last_used.desc()).offset(skip).limit(limit).all()
 def count_voice_designs(
@@ -340,13 +322,10 @@
     user_id: int,
     backend_type: Optional[str] = None
 ) -> int:
-    query = db.query(VoiceDesign).filter(
+    return db.query(VoiceDesign).filter(
         VoiceDesign.user_id == user_id,
         VoiceDesign.is_active == True
-    )
-    if backend_type:
-        query = query.filter(VoiceDesign.backend_type == backend_type)
-    return query.count()
+    ).count()
 def delete_voice_design(db: Session, design_id: int, user_id: int) -> bool:
     design = get_voice_design(db, design_id, user_id)
@@ -609,7 +588,6 @@ def update_audiobook_character(
     description: Optional[str] = None,
     instruct: Optional[str] = None,
     voice_design_id: Optional[int] = None,
-    use_indextts2: Optional[bool] = None,
 ) -> Optional[AudiobookCharacter]:
     char = db.query(AudiobookCharacter).filter(AudiobookCharacter.id == char_id).first()
     if not char:
@@ -624,8 +602,6 @@
         char.instruct = instruct
     if voice_design_id is not None:
         char.voice_design_id = voice_design_id
-    if use_indextts2 is not None:
-        char.use_indextts2 = use_indextts2
     db.commit()
     db.refresh(char)
     return char
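Note: the read-time migration in `get_user_preferences` above can be isolated as a pure function (hypothetical helper name, for illustration): legacy `"aliyun"` defaults are rewritten to `"local"` on read, without touching the stored row.

```python
def normalize_preferences(stored):
    # Mirrors get_user_preferences: legacy "aliyun" default_backend values are
    # rewritten to "local" on read; the stored dict itself is left untouched.
    if not stored:
        return {"default_backend": "local", "onboarding_completed": False}
    prefs = dict(stored)
    if prefs.get("default_backend") == "aliyun":
        prefs["default_backend"] = "local"
    return prefs
```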


@@ -34,13 +34,12 @@ class User(Base):
     hashed_password = Column(String(255), nullable=False)
     is_active = Column(Boolean, default=True, nullable=False)
     is_superuser = Column(Boolean, default=False, nullable=False)
-    aliyun_api_key = Column(Text, nullable=True)
     llm_api_key = Column(Text, nullable=True)
     llm_base_url = Column(String(500), nullable=True)
     llm_model = Column(String(200), nullable=True)
     can_use_local_model = Column(Boolean, default=False, nullable=False)
     can_use_nsfw = Column(Boolean, default=False, nullable=False)
-    user_preferences = Column(JSON, nullable=True, default=lambda: {"default_backend": "aliyun", "onboarding_completed": False})
+    user_preferences = Column(JSON, nullable=True, default=lambda: {"default_backend": "local", "onboarding_completed": False})
     created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
     updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False)
@@ -105,9 +104,7 @@ class VoiceDesign(Base):
     id = Column(Integer, primary_key=True, index=True)
     user_id = Column(Integer, ForeignKey("users.id"), nullable=False, index=True)
     name = Column(String(100), nullable=False)
-    backend_type = Column(String(20), nullable=False, index=True)
     instruct = Column(Text, nullable=False)
-    aliyun_voice_id = Column(String(255), nullable=True)
     meta_data = Column(JSON, nullable=True)
     preview_text = Column(Text, nullable=True)
     ref_audio_path = Column(String(500), nullable=True)
@@ -121,7 +118,6 @@
     user = relationship("User", back_populates="voice_designs")
     __table_args__ = (
-        Index('idx_user_backend', 'user_id', 'backend_type'),
         Index('idx_user_active', 'user_id', 'is_active'),
     )
@@ -176,8 +172,6 @@ class AudiobookCharacter(Base):
     description = Column(Text, nullable=True)
     instruct = Column(Text, nullable=True)
     voice_design_id = Column(Integer, ForeignKey("voice_designs.id"), nullable=True)
-    use_indextts2 = Column(Boolean, default=False, nullable=False)
     project = relationship("AudiobookProject", back_populates="characters")
     voice_design = relationship("VoiceDesign")
     segments = relationship("AudiobookSegment", back_populates="character")


@@ -1,4 +1,4 @@
-upstream qwen_tts_backend {
+upstream canto_backend {
     server 127.0.0.1:8000;
 }
@@ -13,7 +13,7 @@ server {
     proxy_send_timeout 300s;
     location / {
-        proxy_pass http://qwen_tts_backend;
+        proxy_pass http://canto_backend;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
@@ -34,20 +34,20 @@ server {
     }
     location /outputs/ {
-        alias /opt/qwen3-tts-backend/outputs/;
+        alias /opt/canto-backend/outputs/;
         autoindex off;
         add_header Cache-Control "public, max-age=3600";
         add_header Content-Disposition "attachment";
     }
     location /health {
-        proxy_pass http://qwen_tts_backend/health;
+        proxy_pass http://canto_backend/health;
         proxy_set_header Host $host;
         access_log off;
     }
     location /metrics {
-        proxy_pass http://qwen_tts_backend/metrics;
+        proxy_pass http://canto_backend/metrics;
         proxy_set_header Host $host;
         allow 127.0.0.1;
         deny all;


@@ -1,15 +1,15 @@
 [Unit]
-Description=Qwen3-TTS Backend API Service
+Description=Canto Backend API Service
 After=network.target
 [Service]
 Type=simple
 User=qwen-tts
 Group=qwen-tts
-WorkingDirectory=/opt/qwen3-tts-backend
-Environment="PATH=/opt/conda/envs/qwen3-tts/bin:/usr/local/bin:/usr/bin:/bin"
-EnvironmentFile=/opt/qwen3-tts-backend/.env
-ExecStart=/opt/conda/envs/qwen3-tts/bin/python main.py
+WorkingDirectory=/opt/canto-backend
+Environment="PATH=/opt/conda/envs/canto/bin:/usr/local/bin:/usr/bin:/bin"
+EnvironmentFile=/opt/canto-backend/.env
+ExecStart=/opt/conda/envs/canto/bin/python main.py
 Restart=on-failure
 RestartSec=10s
 StandardOutput=append:/var/log/qwen-tts/app.log

Some files were not shown because too many files have changed in this diff.