feat: Integrate IndexTTS2 model and update related schemas and frontend components

2026-03-12 13:30:53 +08:00
parent e5b5a16364
commit 8aec4f6f44
151 changed files with 40077 additions and 85 deletions


@@ -13,7 +13,8 @@
- Custom Voice: Predefined speaker voices
- Voice Design: Create voices from natural language descriptions
- Voice Cloning: Clone voices from uploaded audio
-- Audiobook Generation: Upload EPUB files and generate multi-character audiobooks with LLM-powered character extraction and voice assignment
+- **IndexTTS2**: High-quality voice cloning with emotion control (happy, angry, sad, fear, surprise, etc.) powered by [IndexTTS2](https://github.com/iszhanjiawei/indexTTS2)
+- Audiobook Generation: Upload EPUB files and generate multi-character audiobooks with LLM-powered character extraction and voice assignment; supports IndexTTS2 per character
- Dual Backend Support: Switch between local model and Aliyun TTS API
- Multi-language Support: English, 简体中文, 繁體中文, 日本語, 한국어
- JWT auth, async tasks, voice cache, dark mode
@@ -148,6 +149,25 @@ hf download Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-0.
hf download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir ./Qwen3-TTS-12Hz-0.6B-Base
```
+**IndexTTS2 Model (optional, for emotion-controlled voice cloning)**
+IndexTTS2 is an optional feature. Only download these files if you want to use it. Navigate to the same `Qwen/` directory and run:
+```bash
+# Only the required files; no need to download the full repository
+hf download IndexTeam/IndexTTS-2 \
+  bpe.model config.yaml feat1.pt feat2.pt gpt.pth s2mel.pth wav2vec2bert_stats.pt \
+  --local-dir ./IndexTTS2
+```
+Then install the indextts package:
+```bash
+git clone https://github.com/iszhanjiawei/indexTTS2.git
+cd indexTTS2
+pip install -e . --no-deps
+cd ..
+```
**Final directory structure:**
Docker deployment (`docker/models/`):
@@ -169,7 +189,15 @@ Qwen3-TTS-webUI/
├── Qwen3-TTS-Tokenizer-12Hz/
├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
-└── Qwen3-TTS-12Hz-1.7B-Base/
+├── Qwen3-TTS-12Hz-1.7B-Base/
+└── IndexTTS2/                ← optional, for IndexTTS2 feature
+    ├── bpe.model
+    ├── config.yaml
+    ├── feat1.pt
+    ├── feat2.pt
+    ├── gpt.pth
+    ├── s2mel.pth
+    └── wav2vec2bert_stats.pt
```
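If you opted into IndexTTS2, a quick check that all seven files from the tree above are present can save a failed startup later. The following is a minimal sketch, not part of the repository: `check_indextts2` is a hypothetical helper, and the default path `./IndexTTS2` is an assumption taken from the download command, so pass your actual model directory if it differs.

```bash
# Hypothetical helper: report which IndexTTS2 files are missing.
# The body runs in a subshell, so its variables do not leak and
# `exit` only ends the helper, not the calling shell.
check_indextts2() (
  dir="${1:-./IndexTTS2}"   # assumed default; override with the first argument
  missing=0
  for f in bpe.model config.yaml feat1.pt feat2.pt gpt.pth s2mel.pth wav2vec2bert_stats.pt; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  exit "$missing"           # 0 when every file exists, 1 otherwise
)
```

For example, `check_indextts2 docker/models/IndexTTS2 && echo "IndexTTS2 files present"` prints the confirmation only when nothing is missing.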
### 3. Backend Setup