bdim/Canto

bdim 9cd268db8a Add Apache License 2.0

Added the Apache License 2.0 to the project.

2026-01-26 15:37:54 +08:00

qwen3-tts-backend

init commit

2026-01-26 15:34:31 +08:00

qwen3-tts-frontend

init commit

2026-01-26 15:34:31 +08:00

qwen_tts

init commit

2026-01-26 15:34:31 +08:00

.gitignore

init commit

2026-01-26 15:34:31 +08:00

LICENSE

Add Apache License 2.0

2026-01-26 15:37:54 +08:00

README.md

init commit

2026-01-26 15:34:31 +08:00

README.zh.md

init commit

2026-01-26 15:34:31 +08:00

README.md

Qwen3-TTS WebUI

A text-to-speech web application based on Qwen3-TTS, supporting custom voice, voice design, and voice cloning.

Chinese documentation

Features

Custom Voice: Predefined speaker voices
Voice Design: Create voices from natural language descriptions
Voice Cloning: Clone voices from uploaded audio
JWT auth, async tasks, voice cache, dark mode

Tech Stack

Backend: FastAPI + SQLAlchemy + PyTorch + JWT
Frontend: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui

Quick Start

Backend

cd qwen3-tts-backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env to configure MODEL_BASE_PATH etc.
uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Frontend

cd qwen3-tts-frontend
npm install
cp .env.example .env
# Edit .env to configure VITE_API_URL
npm run dev

Once both servers are running, visit http://localhost:5173 (the Vite dev server's default address).

Configuration

Key settings in the backend .env:

SECRET_KEY=your-secret-key
MODEL_DEVICE=cuda:0
MODEL_BASE_PATH=../Qwen
DATABASE_URL=sqlite:///./qwen_tts.db
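
To illustrate how these keys might be consumed, here is a minimal sketch that reads them from an environment mapping using only the standard library. The `Settings` dataclass and `load_settings` helper are hypothetical names, not taken from the backend's code. The defaults mirror the sample values above, and rejecting the placeholder `SECRET_KEY` is an assumption about sensible behavior rather than something the project documents.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the four backend keys shown above."""
    secret_key: str
    model_device: str
    model_base_path: str
    database_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Assumption: refuse to start with a missing or placeholder secret.
    secret_key = env.get("SECRET_KEY", "")
    if not secret_key or secret_key == "your-secret-key":
        raise ValueError("SECRET_KEY must be set to a real secret value")

    # Fallbacks below are the sample values from the README, not verified defaults.
    return Settings(
        secret_key=secret_key,
        model_device=env.get("MODEL_DEVICE", "cuda:0"),
        model_base_path=env.get("MODEL_BASE_PATH", "../Qwen"),
        database_url=env.get("DATABASE_URL", "sqlite:///./qwen_tts.db"),
    )
```

Passing an explicit mapping, for example `load_settings({"SECRET_KEY": "abc"})`, keeps the helper easy to exercise without touching the real environment.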

Frontend .env:

VITE_API_URL=http://localhost:8000

API

POST /auth/register          - Register
POST /auth/token             - Login
POST /tts/custom-voice       - Custom voice
POST /tts/voice-design       - Voice design
POST /tts/voice-clone        - Voice cloning
GET  /jobs                   - Job list
GET  /jobs/{id}/download     - Download result
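
The typical flow (log in, submit a job, download the result) can be sketched with the standard library as follows. Only the paths and HTTP methods come from the list above. The form-encoded login body, the `Bearer` authorization header, the JSON payload shape, and all helper names are assumptions, so check the backend's actual schema before relying on them. The helpers only build `urllib.request.Request` objects; nothing is sent until one is passed to `urllib.request.urlopen`. Voice cloning is omitted because it involves an audio upload, whose request format is not described here.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # same value as VITE_API_URL above


def login_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST /auth/token request.

    Assumption: credentials are sent form-encoded as username/password.
    """
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth/token", data=body, method="POST"
    )


def tts_request(kind: str, payload: dict, token: str) -> urllib.request.Request:
    """Build a POST /tts/<kind> request for the text-only endpoints.

    Assumption: the body is JSON and the token is sent as a Bearer header.
    """
    if kind not in {"custom-voice", "voice-design"}:
        raise ValueError(f"unsupported TTS endpoint: {kind}")
    return urllib.request.Request(
        f"{BASE_URL}/tts/{kind}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def download_request(job_id: int, token: str) -> urllib.request.Request:
    """Build the GET /jobs/{id}/download request for a finished job."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/download",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Since tasks are asynchronous, a client would normally poll `GET /jobs` until the job is finished before requesting the download.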

License

Apache License 2.0 (see the LICENSE file)

Languages

Python 86.4%

TypeScript 12.5%

Cuda 0.4%

C 0.3%

CSS 0.2%

Other 0.1%