feat: Enhance README with project description, installation instructions, and acknowledgments
README.md | 168
@@ -1,6 +1,8 @@
 # Qwen3-TTS WebUI
 
-A text-to-speech web application based on Qwen3-TTS, supporting custom voice, voice design, and voice cloning.
+**Unofficial** text-to-speech web application based on Qwen3-TTS, supporting custom voice, voice design, and voice cloning with an intuitive interface.
 
+> This is an unofficial project. For the official Qwen3-TTS repository, please visit [QwenLM/Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS).
+
 [中文文档](./README.zh.md)
 
@@ -48,41 +50,173 @@ A text-to-speech web application based on Qwen3-TTS, supporting custom voice, voice design, and voice cloning.
 
 ## Tech Stack
 
-Backend: FastAPI + SQLAlchemy + PyTorch + JWT
-Frontend: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui
+**Backend**: FastAPI + SQLAlchemy + PyTorch + JWT
+- Direct PyTorch inference with Qwen3-TTS models
+- Async task processing with batch optimization
+- Local model support + Aliyun API integration
 
-## Quick Start
+**Frontend**: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui
 
-### Backend
+## Installation
 
+### Prerequisites
+
+- Python 3.9+ with CUDA support (for local model inference)
+- Node.js 18+ (for frontend)
+- Git
+
+### 1. Clone Repository
+
+```bash
+git clone https://github.com/yourusername/Qwen3-TTS-webUI.git
+cd Qwen3-TTS-webUI
+```
+
+### 2. Download Models
+
+**Important**: Models are **NOT** automatically downloaded. You need to manually download them first.
+
+For more details, visit the official repository: [Qwen3-TTS Models](https://github.com/QwenLM/Qwen3-TTS)
+
+Navigate to the backend directory:
+```bash
+cd qwen3-tts-backend
+mkdir -p Qwen && cd Qwen
+```
+
+**Option 1: Download through ModelScope (Recommended for users in Mainland China)**
+
+```bash
+pip install -U modelscope
+
+modelscope download --model Qwen/Qwen3-TTS-Tokenizer-12Hz --local_dir ./Qwen3-TTS-Tokenizer-12Hz
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local_dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-Base --local_dir ./Qwen3-TTS-12Hz-1.7B-Base
+```
+
+Optional 0.6B models (smaller, faster):
+```bash
+modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
+modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --local_dir ./Qwen3-TTS-12Hz-0.6B-Base
+```
+
+**Option 2: Download through Hugging Face**
+
+```bash
+pip install -U "huggingface_hub[cli]"
+
+huggingface-cli download Qwen/Qwen3-TTS-Tokenizer-12Hz --local-dir ./Qwen3-TTS-Tokenizer-12Hz
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local-dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base --local-dir ./Qwen3-TTS-12Hz-1.7B-Base
+```
+
+Optional 0.6B models (smaller, faster):
+```bash
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir ./Qwen3-TTS-12Hz-0.6B-Base
+```
+
+**Final directory structure:**
+```
+Qwen3-TTS-webUI/
+├── qwen3-tts-backend/
+│   └── Qwen/
+│       ├── Qwen3-TTS-Tokenizer-12Hz/
+│       ├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
+│       ├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
+│       └── Qwen3-TTS-12Hz-1.7B-Base/
+```
+
+### 3. Backend Setup
+
 ```bash
 cd qwen3-tts-backend
+
+# Create virtual environment
 python -m venv venv
-source venv/bin/activate
+source venv/bin/activate  # On Windows: venv\Scripts\activate
+
+# Install dependencies
 pip install -r requirements.txt
+
+# Install Qwen3-TTS
+pip install qwen-tts
+
+# Create configuration file
 cp .env.example .env
-# Edit .env to configure MODEL_BASE_PATH and DEFAULT_BACKEND
-# For local model: Ensure MODEL_BASE_PATH points to Qwen model directory
-# For Aliyun: Set DEFAULT_BACKEND=aliyun and configure API key in web settings
-uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+
+# Edit .env file
+# For local model: Set MODEL_BASE_PATH=./Qwen
+# For Aliyun API only: Set DEFAULT_BACKEND=aliyun
+nano .env  # or use your preferred editor
 ```
 
-### Frontend
+**Important Backend Configuration** (`.env`):
+
+```env
+MODEL_DEVICE=cuda:0     # Use GPU (or cpu for CPU-only)
+MODEL_BASE_PATH=./Qwen  # Path to your downloaded models
+DEFAULT_BACKEND=local   # Use 'local' for local models, 'aliyun' for API
+DATABASE_URL=sqlite:///./qwen_tts.db
+SECRET_KEY=your-secret-key-here  # Change this!
+```
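The `SECRET_KEY` above is a placeholder. One quick way to generate a strong value (a sketch using Python's standard library; any sufficiently long random string works):

```shell
# Print a random 64-character hex string suitable for SECRET_KEY
python3 -c "import secrets; print(secrets.token_hex(32))"
```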
+
+Start the backend server:
+```bash
+# Using uvicorn directly
+uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+
+# Or using conda (if you prefer)
+conda run -n qwen3-tts uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+```
+
+Verify backend is running:
+```bash
+curl http://127.0.0.1:8000/health
+```
+
+### 4. Frontend Setup
+
 ```bash
 cd qwen3-tts-frontend
+
+# Install dependencies
 npm install
+
+# Create configuration file
 cp .env.example .env
-# Edit .env to configure VITE_API_URL
+
+# Edit .env to set backend URL
+echo "VITE_API_URL=http://localhost:8000" > .env
+
+# Start development server
 npm run dev
 ```
 
-Visit `http://localhost:5173`
+### 5. Access the Application
 
-**First Time Setup**: On first run, a default superuser account will be automatically created:
+Open your browser and visit: `http://localhost:5173`
+
+**Default Credentials**:
 - Username: `admin`
 - Password: `admin123456`
-- **IMPORTANT**: Please change the password immediately after first login for security!
+- **IMPORTANT**: Change the password immediately after first login!
 
+### Production Build
+
+For production deployment:
+
+```bash
+# Backend: Use gunicorn or another ASGI-capable server
+cd qwen3-tts-backend
+gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000
+
+# Frontend: Build static files
+cd qwen3-tts-frontend
+npm run build
+# Serve the 'dist' folder with nginx or another web server
+```
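For the "serve the `dist` folder with nginx" step, a minimal site config could look like the following sketch. The root path, server name, and the `/api/` proxy prefix are illustrative assumptions, not taken from this repository, and would need to match how the frontend was built:

```nginx
server {
    listen 80;
    server_name example.com;                  # placeholder

    # Serve the built frontend (placeholder path to 'npm run build' output)
    root /var/www/qwen3-tts-frontend/dist;
    index index.html;

    location / {
        try_files $uri $uri/ /index.html;     # SPA history-mode fallback
    }

    # Proxy API calls to the backend
    # (assumes the frontend was built with a matching VITE_API_URL)
    location /api/ {
        proxy_pass http://127.0.0.1:8000/;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```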
+
 ## Configuration
 
@@ -164,6 +298,10 @@ All TTS endpoints support an optional `backend` parameter to specify the TTS backend:
 - `backend: "aliyun"` - Use Aliyun TTS API
 - If not specified, uses the user's default backend setting
 
+## Acknowledgments
+
+This project is built upon the excellent work of the official [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS) repository by the Qwen Team at Alibaba Cloud. Special thanks to the Qwen Team for open-sourcing such a powerful text-to-speech model.
+
 ## License
 
 Apache-2.0 license
README.zh.md | 168
@@ -1,6 +1,8 @@
 # Qwen3-TTS WebUI
 
-基于 Qwen3-TTS 的文本转语音 Web 应用,支持自定义语音、语音设计和语音克隆。
+**非官方** 基于 Qwen3-TTS 的文本转语音 Web 应用,支持自定义语音、语音设计和语音克隆,提供直观的 Web 界面。
 
+> 这是一个非官方项目。如需查看官方 Qwen3-TTS 仓库,请访问 [QwenLM/Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS)。
+
 [English Documentation](./README.md)
 
@@ -48,41 +50,173 @@
 
 ## 技术栈
 
-后端:FastAPI + SQLAlchemy + PyTorch + JWT
-前端:React 19 + TypeScript + Vite + Tailwind + Shadcn/ui
+**后端**: FastAPI + SQLAlchemy + PyTorch + JWT
+- 使用 PyTorch 直接推理 Qwen3-TTS 模型
+- 异步任务处理与批量优化
+- 支持本地模型 + 阿里云 API 双后端
 
-## 快速开始
+**前端**: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui
 
-### 后端
+## 安装部署
 
+### 环境要求
+
+- Python 3.9+ 并支持 CUDA(用于本地模型推理)
+- Node.js 18+(用于前端)
+- Git
+
+### 1. 克隆仓库
+
+```bash
+git clone https://github.com/yourusername/Qwen3-TTS-webUI.git
+cd Qwen3-TTS-webUI
+```
+
+### 2. 下载模型
+
+**重要**: 模型**不会**自动下载,需要手动下载。
+
+详细信息请访问官方仓库:[Qwen3-TTS 模型](https://github.com/QwenLM/Qwen3-TTS)
+
+进入后端目录:
+```bash
+cd qwen3-tts-backend
+mkdir -p Qwen && cd Qwen
+```
+
+**方式一:通过 ModelScope 下载(推荐中国大陆用户)**
+
+```bash
+pip install -U modelscope
+
+modelscope download --model Qwen/Qwen3-TTS-Tokenizer-12Hz --local_dir ./Qwen3-TTS-Tokenizer-12Hz
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local_dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-Base --local_dir ./Qwen3-TTS-12Hz-1.7B-Base
+```
+
+可选的 0.6B 模型(更小、更快):
+```bash
+modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
+modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --local_dir ./Qwen3-TTS-12Hz-0.6B-Base
+```
+
+**方式二:通过 Hugging Face 下载**
+
+```bash
+pip install -U "huggingface_hub[cli]"
+
+huggingface-cli download Qwen/Qwen3-TTS-Tokenizer-12Hz --local-dir ./Qwen3-TTS-Tokenizer-12Hz
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local-dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base --local-dir ./Qwen3-TTS-12Hz-1.7B-Base
+```
+
+可选的 0.6B 模型(更小、更快):
+```bash
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir ./Qwen3-TTS-12Hz-0.6B-Base
+```
+
+**最终目录结构:**
+```
+Qwen3-TTS-webUI/
+├── qwen3-tts-backend/
+│   └── Qwen/
+│       ├── Qwen3-TTS-Tokenizer-12Hz/
+│       ├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
+│       ├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
+│       └── Qwen3-TTS-12Hz-1.7B-Base/
+```
+
+### 3. 后端配置
+
 ```bash
 cd qwen3-tts-backend
+
+# 创建虚拟环境
 python -m venv venv
-source venv/bin/activate
+source venv/bin/activate  # Windows: venv\Scripts\activate
+
+# 安装依赖
 pip install -r requirements.txt
+
+# 安装 Qwen3-TTS
+pip install qwen-tts
+
+# 创建配置文件
 cp .env.example .env
-# 编辑 .env 配置 MODEL_BASE_PATH 和 DEFAULT_BACKEND
-# 本地模型:确保 MODEL_BASE_PATH 指向 Qwen 模型目录
-# 阿里云:设置 DEFAULT_BACKEND=aliyun 并在 Web 设置页面配置 API 密钥
-uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+
+# 编辑配置文件
+# 本地模型:设置 MODEL_BASE_PATH=./Qwen
+# 仅阿里云 API:设置 DEFAULT_BACKEND=aliyun
+nano .env  # 或使用其他编辑器
 ```
 
-### 前端
+**重要的后端配置** (`.env` 文件):
+
+```env
+MODEL_DEVICE=cuda:0     # 使用 GPU(或 cpu 使用 CPU)
+MODEL_BASE_PATH=./Qwen  # 已下载模型的路径
+DEFAULT_BACKEND=local   # 使用本地模型用 'local',API 用 'aliyun'
+DATABASE_URL=sqlite:///./qwen_tts.db
+SECRET_KEY=your-secret-key-here  # 请修改此项!
+```
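上面的 `SECRET_KEY` 只是占位符。可以用 Python 标准库快速生成一个足够强的随机值(仅为示例,任何足够长的随机字符串均可):

```shell
# 生成 64 位十六进制随机字符串,用作 SECRET_KEY
python3 -c "import secrets; print(secrets.token_hex(32))"
```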
+
+启动后端服务:
+```bash
+# 使用 uvicorn 直接启动
+uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+
+# 或使用 conda(如果你喜欢)
+conda run -n qwen3-tts uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+```
+
+验证后端是否运行:
+```bash
+curl http://127.0.0.1:8000/health
+```
+
+### 4. 前端配置
+
 ```bash
 cd qwen3-tts-frontend
+
+# 安装依赖
 npm install
+
+# 创建配置文件
 cp .env.example .env
-# 编辑 .env 配置 VITE_API_URL
+
+# 编辑 .env 设置后端地址
+echo "VITE_API_URL=http://localhost:8000" > .env
+
+# 启动开发服务器
 npm run dev
 ```
 
-访问 `http://localhost:5173`
+### 5. 访问应用
 
-**首次运行**: 第一次运行时会自动初始化一个超级管理员账户:
+在浏览器中打开:`http://localhost:5173`
+
+**默认账号**:
 - 用户名:`admin`
 - 密码:`admin123456`
-- **重要**: 强烈建议登录后立刻修改密码!
+- **重要**: 登录后请立即修改密码!
 
+### 生产环境部署
+
+部署到生产环境:
+
+```bash
+# 后端:使用 gunicorn 或其他支持 ASGI 的服务器
+cd qwen3-tts-backend
+gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000
+
+# 前端:构建静态文件
+cd qwen3-tts-frontend
+npm run build
+# 使用 nginx 或其他 Web 服务器提供 'dist' 文件夹
+```
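对于上面"使用 nginx 提供 `dist` 文件夹"这一步,一个最小化的 nginx 站点配置示意如下。其中的路径、域名和 `/api/` 代理前缀均为假设,并非本仓库内容,需要与前端构建时的设置保持一致:

```nginx
server {
    listen 80;
    server_name example.com;                  # 占位符

    # 提供构建好的前端(占位路径,指向 'npm run build' 的输出)
    root /var/www/qwen3-tts-frontend/dist;
    index index.html;

    location / {
        try_files $uri $uri/ /index.html;     # 单页应用的路由回退
    }

    # 将 API 请求反向代理到后端
    # (假设前端构建时 VITE_API_URL 与此一致)
    location /api/ {
        proxy_pass http://127.0.0.1:8000/;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```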
+
 ## 配置
 
@@ -164,6 +298,10 @@ GET /jobs/{id}/download - 下载结果
 - `backend: "aliyun"` - 使用阿里云 TTS API
 - 如果不指定,则使用用户的默认后端设置
 
+## 特别鸣谢
+
+本项目基于阿里云 Qwen 团队开源的 [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS) 官方仓库构建。特别感谢 Qwen 团队开源如此强大的文本转语音模型。
+
 ## 许可证
 
 Apache-2.0 license