# Qwen3-TTS WebUI

A text-to-speech web application based on Qwen3-TTS, supporting custom voice, voice design, and voice cloning.

[中文文档 (Chinese documentation)](./README.zh.md)

## Features

- Custom Voice: Predefined speaker voices
- Voice Design: Create voices from natural language descriptions
- Voice Cloning: Clone voices from uploaded audio
- JWT authentication, asynchronous tasks, voice caching, dark mode

## Tech Stack

- Backend: FastAPI + SQLAlchemy + PyTorch + JWT
- Frontend: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui

## Quick Start

### Backend

```bash
cd qwen3-tts-backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env to configure MODEL_BASE_PATH etc.
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

### Frontend

```bash
cd qwen3-tts-frontend
npm install
cp .env.example .env
# Edit .env to configure VITE_API_URL
npm run dev
```

With both servers running, open `http://localhost:5173` in a browser.

## Configuration

Key settings in the backend `.env`:

```env
SECRET_KEY=your-secret-key
MODEL_DEVICE=cuda:0
MODEL_BASE_PATH=../Qwen
DATABASE_URL=sqlite:///./qwen_tts.db
```
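
The `your-secret-key` value above is a placeholder. Assuming `SECRET_KEY` is the secret used to sign JWTs, a random value can be generated with Python's standard library. This is a minimal sketch; `generate_secret_key` is a hypothetical helper name and is not part of this project.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of entropy."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into the backend .env file.
    print(f"SECRET_KEY={generate_secret_key()}")
```

Each call produces a different value, so generate the key once and keep it stable. If the key does sign the tokens, changing it would likely invalidate tokens that were already issued.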

Frontend `.env`:

```env
VITE_API_URL=http://localhost:8000
```

## API

```text
POST /auth/register          - Register
POST /auth/token             - Login
POST /tts/custom-voice       - Custom voice
POST /tts/voice-design       - Voice design
POST /tts/voice-clone        - Voice cloning
GET  /jobs                   - Job list
GET  /jobs/{id}/download     - Download result
```
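
The endpoints above can be called from a script as well as from the web UI. The sketch below uses only the Python standard library to build a login request and a job-list request. The README does not document request or response schemas, so several details are assumptions: that `/auth/token` accepts a form-encoded `username` and `password` (a common FastAPI OAuth2 pattern), that protected routes expect an `Authorization: Bearer <token>` header, and that responses are JSON. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base URL of the backend; matches VITE_API_URL in the frontend .env.
API_URL = "http://localhost:8000"


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build a POST /auth/token request.

    Assumption: the endpoint accepts an OAuth2-style form body.
    """
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}/auth/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_jobs_request(token: str) -> urllib.request.Request:
    """Build a GET /jobs request.

    Assumption: the JWT is sent as a bearer token.
    """
    return urllib.request.Request(
        f"{API_URL}/jobs",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def send(request: urllib.request.Request):
    """Send a request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(build_login_request(...))` would return the login response. The field that holds the token is not documented here, so inspect the response before passing a token to `build_jobs_request`.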

## License

MIT