Compare commits

2 Commits

Author SHA1 Message Date
d12c1223f9 chore: update dev.sh
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 10:29:18 +08:00
1cb8122b93 feat: strip down to audiobook-only, remove TTS/voice pages
Remove Home, VoiceManagement pages and all related components
(TTS forms, voice clone, history sidebar, onboarding), contexts
(App, History, Job), and hooks. Route / now redirects to /audiobook.
Also drop README, GitHub Actions workflows, screenshots, and add dev.sh.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 10:26:34 +08:00
30 changed files with 12 additions and 3532 deletions
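
The routing effect described in the second commit message ("Route / now redirects to /audiobook") can be modeled as a small pure function. This is a minimal sketch, not code from the repository: `resolveRoute`, `RouteContext`, `Resolution`, and `REMOVED_PATHS` are hypothetical names introduced for illustration, and only behavior visible in the hunks below is modeled.

```typescript
// Illustrative model only: `resolveRoute`, `RouteContext`, and `REMOVED_PATHS`
// are hypothetical names, not identifiers from the repository.
type Resolution =
  | { kind: "redirect"; to: string }
  | { kind: "render"; path: string }
  | { kind: "unregistered"; path: string };

interface RouteContext {
  isPublicRoute: boolean;   // true for a route wrapped in PublicRoute
  isAuthenticated: boolean; // the value useAuth() would report
}

const LANDING_PATH = "/audiobook";

// Only the route whose removal is visible in this diff is listed.
const REMOVED_PATHS: ReadonlySet<string> = new Set(["/voices"]);

function resolveRoute(path: string, ctx: RouteContext): Resolution {
  // "/" no longer renders Home; it redirects unconditionally.
  if (path === "/") return { kind: "redirect", to: LANDING_PATH };
  // Routes deleted by the commit have no element left to render.
  if (REMOVED_PATHS.has(path)) return { kind: "unregistered", path };
  // PublicRoute now sends signed-in users to the audiobook page instead of "/".
  if (ctx.isPublicRoute && ctx.isAuthenticated) {
    return { kind: "redirect", to: LANDING_PATH };
  }
  // ProtectedRoute and SuperAdminRoute checks are outside the diff and not modeled.
  return { kind: "render", path };
}
```

Under these assumptions, `resolveRoute("/", ...)` yields a redirect to `/audiobook` regardless of authentication state, while `/voices` is reported as unregistered because its route element was deleted.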

3
.github/FUNDING.yml vendored

@@ -1,3 +0,0 @@
# These are supported funding model platforms
github: bdim404


@@ -1,34 +0,0 @@
name: Publish Backend Image
on:
push:
branches: [main]
paths:
- 'qwen3-tts-backend/**'
- 'qwen_tts/**'
- 'docker/backend/**'
jobs:
build-and-push:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: docker/backend/Dockerfile
push: true
tags: bdim404/qwen3-tts-backend:latest
cache-from: type=gha,scope=backend
cache-to: type=gha,mode=max,scope=backend


@@ -1,33 +0,0 @@
name: Publish Frontend Image
on:
push:
branches: [main]
paths:
- 'qwen3-tts-frontend/**'
- 'docker/frontend/**'
jobs:
build-and-push:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: docker/frontend/Dockerfile
push: true
tags: bdim404/qwen3-tts-frontend:latest
cache-from: type=gha,scope=frontend
cache-to: type=gha,mode=max,scope=frontend

348
README.md

@@ -1,348 +0,0 @@
# Qwen3-TTS WebUI
> **⚠️ Notice:** This project is largely AI-generated and is currently in an unstable state. Stable releases will be published in the [Releases](../../releases) section.
**Unofficial** text-to-speech web application based on Qwen3-TTS, supporting custom voice, voice design, and voice cloning with an intuitive interface.
> This is an unofficial project. For the official Qwen3-TTS repository, please visit [QwenLM/Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS).
[中文文档](./README.zh.md)
## Features
- Custom Voice: Predefined speaker voices
- Voice Design: Create voices from natural language descriptions
- Voice Cloning: Clone voices from uploaded audio
- **IndexTTS2**: High-quality voice cloning with emotion control (happy, angry, sad, fear, surprise, etc.) powered by [IndexTTS2](https://github.com/iszhanjiawei/indexTTS2)
- Audiobook Generation: Upload EPUB files and generate multi-character audiobooks with LLM-powered character extraction and voice assignment; supports IndexTTS2 per character
- Dual Backend Support: Switch between local model and Aliyun TTS API
- Multi-language Support: English, 简体中文, 繁體中文, 日本語, 한국어
- JWT auth, async tasks, voice cache, dark mode
## Interface Preview
### Desktop - Light Mode
![Light Mode](./images/lightmode-english.png)
### Desktop - Dark Mode
![Dark Mode](./images/darkmode-chinese.png)
### Mobile
<table>
<tr>
<td width="50%"><img src="./images/mobile-lightmode-custom.png" alt="Mobile Light Mode" /></td>
<td width="50%"><img src="./images/mobile-settings.png" alt="Mobile Settings" /></td>
</tr>
</table>
### Audiobook Generation
![Audiobook Overview](./images/audiobook-overview.png)
<table>
<tr>
<td width="50%"><img src="./images/audiobook-characters.png" alt="Audiobook Characters" /></td>
<td width="50%"><img src="./images/audiobook-chapters.png" alt="Audiobook Chapters" /></td>
</tr>
</table>
## Tech Stack
**Backend**: FastAPI + SQLAlchemy + PyTorch + JWT
- Direct PyTorch inference with Qwen3-TTS models
- Async task processing with batch optimization
- Local model support + Aliyun API integration
**Frontend**: React 19 + TypeScript + Vite + Tailwind + Shadcn/ui
## Docker Deployment
Pre-built images are available on Docker Hub: [bdim404/qwen3-tts-backend](https://hub.docker.com/r/bdim404/qwen3-tts-backend), [bdim404/qwen3-tts-frontend](https://hub.docker.com/r/bdim404/qwen3-tts-frontend)
**Prerequisites**: Docker, Docker Compose, NVIDIA GPU + [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)
```bash
git clone https://github.com/bdim404/Qwen3-TTS-WebUI.git
cd Qwen3-TTS-WebUI
# Download models to docker/models/ (see Installation > Download Models below)
mkdir -p docker/models docker/data
# Configure
cp docker/.env.example docker/.env
# Edit docker/.env and set SECRET_KEY
cd docker
# Pull pre-built images
docker compose pull
# Start (CPU only)
docker compose up -d
# Start (with GPU)
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```
Access the application at `http://localhost`. Default credentials: `admin` / `admin123456`
## Installation
### Prerequisites
- Python 3.9+ with CUDA support (for local model inference)
- Node.js 18+ (for frontend)
- Git
### 1. Clone Repository
```bash
git clone https://github.com/bdim404/Qwen3-TTS-WebUI.git
cd Qwen3-TTS-WebUI
```
### 2. Download Models
**Important**: Models are **NOT** automatically downloaded. You need to manually download them first.
For more details, visit the official repository: [Qwen3-TTS Models](https://github.com/QwenLM/Qwen3-TTS)
Navigate to the models directory:
```bash
# Docker deployment
mkdir -p docker/models && cd docker/models
# Local deployment
cd qwen3-tts-backend && mkdir -p Qwen && cd Qwen
```
**Option 1: Download through ModelScope (Recommended for users in Mainland China)**
```bash
pip install -U modelscope
modelscope download --model Qwen/Qwen3-TTS-Tokenizer-12Hz --local_dir ./Qwen3-TTS-Tokenizer-12Hz
modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local_dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-Base --local_dir ./Qwen3-TTS-12Hz-1.7B-Base
```
Optional 0.6B models (smaller, faster):
```bash
modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local_dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --local_dir ./Qwen3-TTS-12Hz-0.6B-Base
```
**Option 2: Download through Hugging Face**
```bash
pip install -U "huggingface_hub[cli]"
hf download Qwen/Qwen3-TTS-Tokenizer-12Hz --local-dir ./Qwen3-TTS-Tokenizer-12Hz
hf download Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-1.7B-CustomVoice
hf download Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign --local-dir ./Qwen3-TTS-12Hz-1.7B-VoiceDesign
hf download Qwen/Qwen3-TTS-12Hz-1.7B-Base --local-dir ./Qwen3-TTS-12Hz-1.7B-Base
```
Optional 0.6B models (smaller, faster):
```bash
hf download Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local-dir ./Qwen3-TTS-12Hz-0.6B-CustomVoice
hf download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir ./Qwen3-TTS-12Hz-0.6B-Base
```
**IndexTTS2 Model (optional, for emotion-controlled voice cloning)**
IndexTTS2 is an optional feature. Only download these files if you want to use it. Navigate to the same `Qwen/` directory and run:
```bash
# Only the required files — no need to download the full repository
hf download IndexTeam/IndexTTS-2 \
bpe.model config.yaml feat1.pt feat2.pt gpt.pth s2mel.pth wav2vec2bert_stats.pt \
--local-dir ./IndexTTS2
```
Then install the indextts package:
```bash
git clone https://github.com/iszhanjiawei/indexTTS2.git
cd indexTTS2
pip install -e . --no-deps
cd ..
```
**Final directory structure:**
Docker deployment (`docker/models/`):
```
Qwen3-TTS-WebUI/
└── docker/
└── models/
├── Qwen3-TTS-Tokenizer-12Hz/
├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
└── Qwen3-TTS-12Hz-1.7B-Base/
```
Local deployment (`qwen3-tts-backend/Qwen/`):
```
Qwen3-TTS-WebUI/
└── qwen3-tts-backend/
└── Qwen/
├── Qwen3-TTS-Tokenizer-12Hz/
├── Qwen3-TTS-12Hz-1.7B-CustomVoice/
├── Qwen3-TTS-12Hz-1.7B-VoiceDesign/
├── Qwen3-TTS-12Hz-1.7B-Base/
└── IndexTTS2/ ← optional, for IndexTTS2 feature
├── bpe.model
├── config.yaml
├── feat1.pt
├── feat2.pt
├── gpt.pth
├── s2mel.pth
└── wav2vec2bert_stats.pt
```
### 3. Backend Setup
```bash
cd qwen3-tts-backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install Qwen3-TTS
pip install qwen-tts
# Create configuration file
cp .env.example .env
# Edit .env file
# For local model: Set MODEL_BASE_PATH=./Qwen
# For Aliyun API only: Set DEFAULT_BACKEND=aliyun
nano .env # or use your preferred editor
```
**Important Backend Configuration** (`.env`):
```env
MODEL_DEVICE=cuda:0 # Use GPU (or cpu for CPU-only)
MODEL_BASE_PATH=./Qwen # Path to your downloaded models
DEFAULT_BACKEND=local # Use 'local' for local models, 'aliyun' for API
DATABASE_URL=sqlite:///./qwen_tts.db
SECRET_KEY=your-secret-key-here # Change this!
```
Start the backend server:
```bash
# Using uvicorn directly
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# Or using conda (if you prefer)
conda run -n qwen3-tts uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```
Verify backend is running:
```bash
curl http://127.0.0.1:8000/health
```
### 4. Frontend Setup
```bash
cd qwen3-tts-frontend
# Install dependencies
npm install
# Create configuration file
cp .env.example .env
# Start development server
npm run dev
```
### 5. Access the Application
Open your browser and visit: `http://localhost:5173`
**Default Credentials**:
- Username: `admin`
- Password: `admin123456`
- **IMPORTANT**: Change the password immediately after first login!
### Production Build
For production deployment:
```bash
# Backend: Run under gunicorn with Uvicorn workers (FastAPI is an ASGI app)
cd qwen3-tts-backend
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000
# Frontend: Build static files
cd qwen3-tts-frontend
npm run build
# Serve the 'dist' folder with nginx or another web server
```
## Configuration
### Backend Configuration
Backend `.env` key settings:
```env
SECRET_KEY=your-secret-key
MODEL_DEVICE=cuda:0
MODEL_BASE_PATH=./Qwen
DATABASE_URL=sqlite:///./qwen_tts.db
DEFAULT_BACKEND=local
ALIYUN_REGION=beijing
ALIYUN_MODEL_FLASH=qwen3-tts-flash-realtime
ALIYUN_MODEL_VC=qwen3-tts-vc-realtime-2026-01-15
ALIYUN_MODEL_VD=qwen3-tts-vd-realtime-2026-01-15
```
**Backend Options:**
- `DEFAULT_BACKEND`: Default TTS backend, options: `local` or `aliyun`
- **Local Mode**: Uses local Qwen3-TTS model (requires `MODEL_BASE_PATH` configuration)
- **Aliyun Mode**: Uses Aliyun TTS API (requires users to configure their API keys in settings)
**Aliyun Configuration:**
- Users need to add their Aliyun API keys in the web interface settings page
- API keys are encrypted and stored securely in the database
- Superuser can enable/disable local model access for all users
- To obtain an Aliyun API key, visit the [Aliyun Console](https://dashscope.console.aliyun.com/)
## Usage
### Switching Between Backends
1. Log in to the web interface
2. Navigate to the Settings page
3. Configure your preferred backend:
- **Local Model**: Select "本地模型" (Local Model). This requires the local model to be enabled by a superuser
- **Aliyun API**: Select "阿里云" (Aliyun) and add your API key
4. The selected backend will be used for all TTS operations by default
5. You can also specify a different backend per request using the `backend` parameter in the API
### Managing Aliyun API Key
1. On the Settings page, find the "阿里云 API 密钥" (Aliyun API Key) section
2. Enter your Aliyun API key
3. Click "更新密钥" (Update Key) to save and validate
4. The system will verify the key before saving
5. You can delete the key anytime using the delete button
## Acknowledgments
This project is built upon the excellent work of the official [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS) repository by the Qwen Team at Alibaba Cloud. Special thanks to the Qwen Team for open-sourcing such a powerful text-to-speech model.
## License
Apache-2.0 license

7
dev.sh Executable file

@@ -0,0 +1,7 @@
#!/bin/bash
trap 'kill 0' EXIT
(cd qwen3-tts-backend && /home/bdim/miniconda3/envs/qwen3-tts/bin/uvicorn main:app --host 0.0.0.0 --port 8000 --reload 2>&1 | sed 's/^/[backend] /') &
(cd qwen3-tts-frontend && npm run dev 2>&1 | sed 's/^/[frontend] /') &
wait

Binary file not shown (deleted image, 164 KiB).
Binary file not shown (deleted image, 188 KiB).
Binary file not shown (deleted image, 209 KiB).
Binary file not shown (deleted image, 317 KiB).
Binary file not shown (deleted image, 356 KiB).
Binary file not shown (deleted image, 113 KiB).
Binary file not shown (deleted image, 127 KiB).


@@ -4,18 +4,13 @@ import { Toaster } from 'sonner'
import { ThemeProvider } from '@/contexts/ThemeContext'
import { AuthProvider, useAuth } from '@/contexts/AuthContext'
import { UserPreferencesProvider } from '@/contexts/UserPreferencesContext'
-import { AppProvider } from '@/contexts/AppContext'
-import { JobProvider } from '@/contexts/JobContext'
-import { HistoryProvider } from '@/contexts/HistoryContext'
import ErrorBoundary from '@/components/ErrorBoundary'
import LoadingScreen from '@/components/LoadingScreen'
import { SuperAdminRoute } from '@/components/SuperAdminRoute'
const Login = lazy(() => import('@/pages/Login'))
-const Home = lazy(() => import('@/pages/Home'))
const Settings = lazy(() => import('@/pages/Settings'))
const UserManagement = lazy(() => import('@/pages/UserManagement'))
-const VoiceManagement = lazy(() => import('@/pages/VoiceManagement'))
const Audiobook = lazy(() => import('@/pages/Audiobook'))
const AdminStats = lazy(() => import('@/pages/AdminStats'))
@@ -49,7 +44,7 @@ function PublicRoute({ children }: { children: React.ReactNode }) {
}
if (isAuthenticated) {
-return <Navigate to="/" replace />
+return <Navigate to="/audiobook" replace />
}
return <>{children}</>
@@ -73,20 +68,7 @@ function App() {
</PublicRoute>
}
/>
-<Route
-path="/"
-element={
-<ProtectedRoute>
-<AppProvider>
-<HistoryProvider>
-<JobProvider>
-<Home />
-</JobProvider>
-</HistoryProvider>
-</AppProvider>
-</ProtectedRoute>
-}
-/>
+<Route path="/" element={<Navigate to="/audiobook" replace />} />
<Route
path="/settings"
element={
@@ -103,14 +85,6 @@ function App() {
</SuperAdminRoute>
}
/>
-<Route
-path="/voices"
-element={
-<ProtectedRoute>
-<VoiceManagement />
-</ProtectedRoute>
-}
-/>
<Route
path="/audiobook"
element={


@@ -1,44 +0,0 @@
import { useState } from 'react'
import { Tabs, TabsContent, TabsList, TabsTrigger } from '@/components/ui/tabs'
import { Upload, Mic } from 'lucide-react'
import { FileUploader } from '@/components/FileUploader'
import { AudioRecorder } from '@/components/AudioRecorder'
interface AudioInputSelectorProps {
value: File | null
onChange: (file: File | null) => void
error?: string
}
export function AudioInputSelector({ value, onChange, error }: AudioInputSelectorProps) {
const [activeTab, setActiveTab] = useState<string>('upload')
const handleTabChange = (newTab: string) => {
onChange(null)
setActiveTab(newTab)
}
return (
<Tabs value={activeTab} onValueChange={handleTabChange} className="w-full">
<TabsList className="grid w-full grid-cols-2">
<TabsTrigger value="upload" className="flex items-center gap-2">
<Upload className="h-4 w-4" />
</TabsTrigger>
<TabsTrigger value="record" className="flex items-center gap-2">
<Mic className="h-4 w-4" />
</TabsTrigger>
</TabsList>
<TabsContent value="upload" className="mt-4">
<FileUploader value={value} onChange={onChange} error={error} />
</TabsContent>
<TabsContent value="record" className="mt-4">
<AudioRecorder onChange={onChange} />
{error && <p className="text-sm text-destructive mt-2">{error}</p>}
</TabsContent>
</Tabs>
)
}


@@ -1,178 +0,0 @@
import { useEffect, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { Button } from '@/components/ui/button'
import { Mic, Trash2, RotateCcw, FileAudio } from 'lucide-react'
import { toast } from 'sonner'
import { useAudioRecorder } from '@/hooks/useAudioRecorder'
import { useAudioValidation } from '@/hooks/useAudioValidation'
interface AudioRecorderProps {
onChange: (file: File | null) => void
}
export function AudioRecorder({ onChange }: AudioRecorderProps) {
const { t } = useTranslation('voice')
const {
isRecording,
recordingDuration,
audioBlob,
error: recorderError,
isSupported,
startRecording,
stopRecording,
clearRecording,
} = useAudioRecorder()
const { validateAudioFile } = useAudioValidation()
const [audioInfo, setAudioInfo] = useState<{ duration: number; size: number } | null>(null)
const [validationError, setValidationError] = useState<string | null>(null)
useEffect(() => {
if (recorderError) {
toast.error(recorderError)
}
}, [recorderError])
useEffect(() => {
if (audioBlob) {
handleValidateRecording(audioBlob)
}
}, [audioBlob])
const handleValidateRecording = async (blob: Blob) => {
const file = new File([blob], 'recording.wav', { type: 'audio/wav' })
const result = await validateAudioFile(file)
console.log('Recording validation result:', {
valid: result.valid,
duration: result.duration,
recordingDuration: recordingDuration,
error: result.error
})
if (result.valid && result.duration) {
onChange(file)
setAudioInfo({ duration: result.duration, size: file.size })
setValidationError(null)
} else {
setValidationError(result.error || t('recordingValidationFailed'))
clearRecording()
onChange(null)
}
}
const handleMouseDown = () => {
if (!isRecording && !audioBlob) {
startRecording()
}
}
const handleMouseUp = () => {
if (isRecording) {
stopRecording()
}
}
const handleReset = (e: React.MouseEvent) => {
e.preventDefault()
e.stopPropagation()
clearRecording()
setAudioInfo(null)
setValidationError(null)
onChange(null)
}
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === ' ' && !isRecording && !audioBlob) {
e.preventDefault()
startRecording()
}
}
const handleKeyUp = (e: React.KeyboardEvent) => {
if (e.key === ' ' && isRecording) {
e.preventDefault()
stopRecording()
}
}
if (!isSupported) {
return (
<div className="p-4 border rounded bg-muted text-muted-foreground text-sm">
{t('browserNotSupported')}
</div>
)
}
if (audioBlob && audioInfo) {
return (
<div className="space-y-2">
<div className="flex items-center gap-2 p-3 border rounded">
<FileAudio className="h-5 w-5 text-muted-foreground" />
<div className="flex-1 min-w-0">
<p className="text-sm font-medium">{t('recordingComplete')}</p>
<p className="text-xs text-muted-foreground">
{(audioInfo.size / 1024 / 1024).toFixed(2)} MB · {audioInfo.duration.toFixed(1)} {t('seconds')}
</p>
</div>
<Button
type="button"
variant="ghost"
size="icon"
onClick={handleReset}
onMouseDown={(e) => e.stopPropagation()}
onTouchStart={(e) => e.stopPropagation()}
>
<Trash2 className="h-4 w-4" />
</Button>
</div>
</div>
)
}
return (
<div className="space-y-2">
<Button
type="button"
variant={isRecording ? 'default' : 'outline'}
className={`w-full h-24 select-none ${isRecording ? 'animate-pulse' : ''}`}
onMouseDown={handleMouseDown}
onMouseUp={handleMouseUp}
onMouseLeave={handleMouseUp}
onTouchStart={handleMouseDown}
onTouchEnd={handleMouseUp}
onKeyDown={handleKeyDown}
onKeyUp={handleKeyUp}
>
<div className="flex flex-col items-center gap-2">
<Mic className="h-8 w-8" />
{isRecording ? (
<>
<span className="text-lg font-semibold">{recordingDuration.toFixed(1)}s</span>
<span className="text-xs">{t('releaseToFinish')}</span>
</>
) : (
<span>{t('holdToRecord')}</span>
)}
</div>
</Button>
{validationError && (
<div className="flex items-center justify-between p-2 border border-destructive rounded bg-destructive/10">
<p className="text-sm text-destructive">{validationError}</p>
<Button
type="button"
variant="ghost"
size="sm"
onClick={handleReset}
onMouseDown={(e) => e.stopPropagation()}
onTouchStart={(e) => e.stopPropagation()}
>
<RotateCcw className="h-4 w-4" />
</Button>
</div>
)}
</div>
)
}


@@ -1,174 +0,0 @@
import { memo, useState } from 'react'
import { useTranslation } from 'react-i18next'
import type { Job } from '@/types/job'
import { Badge } from '@/components/ui/badge'
import { Button } from '@/components/ui/button'
import {
AlertDialog,
AlertDialogAction,
AlertDialogCancel,
AlertDialogContent,
AlertDialogDescription,
AlertDialogFooter,
AlertDialogHeader,
AlertDialogTitle,
AlertDialogTrigger,
} from '@/components/ui/alert-dialog'
import { Trash2, AlertCircle, Loader2, Clock, Eye } from 'lucide-react'
import { getRelativeTime, cn } from '@/lib/utils'
import { JobDetailDialog } from '@/components/JobDetailDialog'
interface HistoryItemProps {
job: Job
onDelete: (id: number) => void
}
const jobTypeBadgeVariant = {
custom_voice: 'default' as const,
voice_design: 'secondary' as const,
voice_clone: 'outline' as const,
}
const HistoryItem = memo(({ job, onDelete }: HistoryItemProps) => {
const { t } = useTranslation('job')
const { t: tCommon } = useTranslation('common')
const [detailDialogOpen, setDetailDialogOpen] = useState(false)
const jobTypeLabel = {
custom_voice: t('typeCustomVoice'),
voice_design: t('typeVoiceDesign'),
voice_clone: t('typeVoiceClone'),
}
const getLanguageDisplay = (lang: string | undefined) => {
if (!lang || lang === 'Auto') return t('autoDetect')
return lang
}
const handleCardClick = (e: React.MouseEvent) => {
if ((e.target as HTMLElement).closest('button')) return
setDetailDialogOpen(true)
}
return (
<div
className={cn(
"relative border rounded-lg p-4 pb-14 space-y-3 hover:bg-accent/50 transition-colors cursor-pointer",
job.status === 'failed' && "border-destructive/50"
)}
onClick={handleCardClick}
>
<div className="flex items-start justify-between gap-2">
<Badge variant={jobTypeBadgeVariant[job.type]}>
{jobTypeLabel[job.type]}
</Badge>
<div className="flex items-center gap-1.5 text-xs text-muted-foreground whitespace-nowrap">
<span>{getRelativeTime(job.created_at)}</span>
<Eye className="w-3.5 h-3.5" />
</div>
</div>
<div className="space-y-2 text-sm">
{job.parameters?.text && (
<div>
<span className="text-muted-foreground">{t('synthesisText')}: </span>
<span className="line-clamp-2">{job.parameters.text}</span>
</div>
)}
<div className="text-muted-foreground">
{t('language')}{getLanguageDisplay(job.parameters?.language)}
</div>
{job.type === 'custom_voice' && job.parameters?.speaker && (
<div className="text-muted-foreground">
{t('speaker')}{job.parameters.speaker}
</div>
)}
{job.type === 'voice_design' && job.parameters?.instruct && (
<div>
<span className="text-muted-foreground">{t('voiceDescription')}: </span>
<span className="text-xs line-clamp-2">{job.parameters.instruct}</span>
</div>
)}
{job.type === 'voice_clone' && job.parameters?.ref_text && (
<div>
<span className="text-muted-foreground">{t('referenceText')}: </span>
<span className="text-xs line-clamp-1">{job.parameters.ref_text}</span>
</div>
)}
</div>
{job.status === 'processing' && (
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<Loader2 className="w-4 h-4 animate-spin" />
<span>{t('statusProcessing')}</span>
</div>
)}
{job.status === 'pending' && (
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<Clock className="w-4 h-4" />
<span>{t('statusPending')}</span>
</div>
)}
{job.status === 'failed' && job.error_message && (
<div className="flex items-start gap-2 p-2 bg-destructive/10 rounded-md">
<AlertCircle className="w-4 h-4 text-destructive mt-0.5 shrink-0" />
<span className="text-sm text-destructive">{job.error_message}</span>
</div>
)}
<div className="absolute bottom-3 right-3">
<AlertDialog>
<AlertDialogTrigger asChild>
<Button
variant="ghost"
size="sm"
className="min-h-[44px] md:min-h-[36px] text-muted-foreground hover:[&_svg]:text-destructive"
>
<Trash2 className="w-4 h-4" />
</Button>
</AlertDialogTrigger>
<AlertDialogContent>
<AlertDialogHeader>
<AlertDialogTitle>{t('deleteJob')}</AlertDialogTitle>
<AlertDialogDescription>
{t('deleteJobConfirm')}
</AlertDialogDescription>
</AlertDialogHeader>
<AlertDialogFooter>
<AlertDialogCancel>{tCommon('cancel')}</AlertDialogCancel>
<AlertDialogAction
onClick={() => onDelete(job.id)}
className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
>
{tCommon('delete')}
</AlertDialogAction>
</AlertDialogFooter>
</AlertDialogContent>
</AlertDialog>
</div>
<JobDetailDialog
job={job}
open={detailDialogOpen}
onOpenChange={setDetailDialogOpen}
/>
</div>
)
}, (prevProps, nextProps) => {
return (
prevProps.job.id === nextProps.job.id &&
prevProps.job.status === nextProps.job.status &&
prevProps.job.updated_at === nextProps.job.updated_at &&
prevProps.job.error_message === nextProps.job.error_message
)
})
HistoryItem.displayName = 'HistoryItem'
export { HistoryItem }


@@ -1,110 +0,0 @@
import { useRef, useEffect } from 'react'
import { useTranslation } from 'react-i18next'
import { Link } from 'react-router-dom'
import { useHistoryContext } from '@/contexts/HistoryContext'
import { HistoryItem } from '@/components/HistoryItem'
import { ScrollArea } from '@/components/ui/scroll-area'
import { Sheet, SheetContent } from '@/components/ui/sheet'
import { Button } from '@/components/ui/button'
import { Loader2, FileAudio, RefreshCw } from 'lucide-react'
interface HistorySidebarProps {
open: boolean
onOpenChange: (open: boolean) => void
}
function HistorySidebarContent() {
const { t } = useTranslation('job')
const { jobs, loading, loadingMore, hasMore, loadMore, deleteJob, error, retry } = useHistoryContext()
const observerTarget = useRef<HTMLDivElement>(null)
useEffect(() => {
const observer = new IntersectionObserver(
(entries) => {
if (entries[0].isIntersecting && hasMore && !loadingMore) {
loadMore()
}
},
{ threshold: 0.5 }
)
if (observerTarget.current) {
observer.observe(observerTarget.current)
}
return () => observer.disconnect()
}, [hasMore, loadingMore, loadMore])
return (
<div className="flex flex-col h-full">
<div className="px-4 pt-4 pb-3">
<Link to="/" className="flex items-center gap-2 mb-6">
<img src="/qwen.svg" alt="Qwen" className="h-6 w-6" />
<h1 className="text-xl font-bold cursor-pointer hover:opacity-80 transition-opacity">
Qwen3-TTS-WebUI
</h1>
</Link>
<h2 className="text-lg font-semibold">{t('historyTitle')}</h2>
<p className="text-sm text-muted-foreground">{t('historyCount', { count: jobs.length })}</p>
</div>
<ScrollArea className="flex-1">
<div className="p-4 space-y-4">
{loading ? (
<div className="flex items-center justify-center py-8">
<Loader2 className="w-6 h-6 animate-spin text-muted-foreground" />
</div>
) : error ? (
<div className="flex flex-col items-center justify-center py-8 space-y-4">
<p className="text-sm text-destructive text-center">{error}</p>
<Button onClick={retry} variant="outline" size="sm">
<RefreshCw className="w-4 h-4 mr-2" />
{t('retry')}
</Button>
</div>
) : jobs.length === 0 ? (
<div className="flex flex-col items-center justify-center py-12 space-y-3">
<FileAudio className="w-12 h-12 text-muted-foreground/50" />
<p className="text-sm font-medium text-muted-foreground">{t('noHistory')}</p>
<p className="text-xs text-muted-foreground text-center">
{t('historyDescription')}
</p>
</div>
) : (
<>
{jobs.map((job) => (
<HistoryItem
key={job.id}
job={job}
onDelete={deleteJob}
/>
))}
{hasMore && (
<div ref={observerTarget} className="py-4 flex justify-center">
<Loader2 className="w-5 h-5 animate-spin text-muted-foreground" />
</div>
)}
</>
)}
</div>
</ScrollArea>
</div>
)
}
export function HistorySidebar({ open, onOpenChange }: HistorySidebarProps) {
return (
<>
<aside className="hidden lg:block w-[320px] h-full bg-muted/30">
<HistorySidebarContent />
</aside>
<Sheet open={open} onOpenChange={onOpenChange}>
<SheetContent side="left" className="w-full sm:max-w-md p-0">
<HistorySidebarContent />
</SheetContent>
</Sheet>
</>
)
}


@@ -1,5 +1,5 @@
-import { Menu, LogOut, Users, Settings, Globe, Home, Mic, BookOpen, BarChart2 } from 'lucide-react'
-import { Link, useLocation } from 'react-router-dom'
+import { LogOut, Users, Settings, Globe, BookOpen, BarChart2 } from 'lucide-react'
+import { Link } from 'react-router-dom'
import { useTranslation } from 'react-i18next'
import { Button } from '@/components/ui/button'
import {
@@ -12,43 +12,13 @@ import { ThemeToggle } from '@/components/ThemeToggle'
import { useAuth } from '@/contexts/AuthContext'
import { useUserPreferences } from '@/contexts/UserPreferencesContext'
-interface NavbarProps {
-onToggleSidebar?: () => void
-}
-export function Navbar({ onToggleSidebar }: NavbarProps) {
+export function Navbar() {
const { logout, user } = useAuth()
const { changeLanguage } = useUserPreferences()
const { t, i18n } = useTranslation(['nav', 'constants'])
-const location = useLocation()
return (
<nav className="h-16 flex items-center justify-end px-4 gap-2">
-{onToggleSidebar && (
-<Button
-variant="ghost"
-size="icon"
-onClick={onToggleSidebar}
-className="lg:hidden mr-auto"
->
-<Menu className="h-5 w-5" />
-</Button>
-)}
-{location.pathname !== '/' && (
-<Link to="/">
-<Button variant="ghost" size="icon">
-<Home className="h-5 w-5" />
-</Button>
-</Link>
-)}
-<Link to="/voices">
-<Button variant="ghost" size="icon">
-<Mic className="h-5 w-5" />
-</Button>
-</Link>
<Link to="/audiobook">
<Button variant="ghost" size="icon">
<BookOpen className="h-5 w-5" />


@@ -1,207 +0,0 @@
import { useState } from 'react'
import { useForm } from 'react-hook-form'
import { zodResolver } from '@hookform/resolvers/zod'
import * as z from 'zod'
import { toast } from 'sonner'
import { useTranslation } from 'react-i18next'
import {
Dialog,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogTitle,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import {
Form,
FormControl,
FormField,
FormItem,
FormLabel,
FormMessage,
} from '@/components/ui/form'
import { Input } from '@/components/ui/input'
import { RadioGroup, RadioGroupItem } from '@/components/ui/radio-group'
import { Label } from '@/components/ui/label'
import { authApi } from '@/lib/api'
import { useUserPreferences } from '@/contexts/UserPreferencesContext'
const createApiKeySchema = (t: (key: string) => string) => z.object({
api_key: z.string().min(1, t('auth:validation.apiKeyRequired')),
})
interface OnboardingDialogProps {
open: boolean
onComplete: () => void
}
export function OnboardingDialog({ open, onComplete }: OnboardingDialogProps) {
const { t } = useTranslation(['onboarding', 'auth', 'common'])
const [step, setStep] = useState(1)
const [selectedBackend, setSelectedBackend] = useState<'local' | 'aliyun'>('aliyun')
const [isLoading, setIsLoading] = useState(false)
const { updatePreferences, refetchPreferences, isBackendAvailable } = useUserPreferences()
const apiKeySchema = createApiKeySchema(t)
type ApiKeyFormValues = z.infer<typeof apiKeySchema>
const form = useForm<ApiKeyFormValues>({
resolver: zodResolver(apiKeySchema),
defaultValues: {
api_key: '',
},
})
const handleSkip = async () => {
try {
await updatePreferences({
default_backend: 'local',
onboarding_completed: true,
})
toast.success(t('onboarding:skipSuccess'))
onComplete()
} catch (error) {
toast.error(t('onboarding:operationFailed'))
}
}
const handleNextStep = () => {
if (selectedBackend === 'local') {
handleComplete('local')
} else {
setStep(2)
}
}
const handleComplete = async (backend: 'local' | 'aliyun') => {
try {
setIsLoading(true)
await updatePreferences({
default_backend: backend,
onboarding_completed: true,
})
toast.success(backend === 'local' ? t('onboarding:configComplete') : t('onboarding:configCompleteAliyun'))
onComplete()
} catch (error) {
toast.error(t('onboarding:saveFailed'))
} finally {
setIsLoading(false)
}
}
const handleVerifyAndComplete = async (data: ApiKeyFormValues) => {
try {
setIsLoading(true)
await authApi.setAliyunKey(data.api_key)
await refetchPreferences()
await handleComplete('aliyun')
} catch (error: any) {
toast.error(error.message || t('onboarding:verifyFailed'))
} finally {
setIsLoading(false)
}
}
return (
<Dialog open={open} onOpenChange={() => {}}>
<DialogContent className="sm:max-w-[500px]" onInteractOutside={(e) => e.preventDefault()}>
<DialogHeader>
<DialogTitle>
{step === 1 ? t('onboarding:welcome') : t('onboarding:configureApiKey')}
</DialogTitle>
<DialogDescription>
{step === 1
? t('onboarding:selectBackendDescription')
: t('onboarding:enterApiKeyDescription')}
</DialogDescription>
</DialogHeader>
{step === 1 && (
<>
<div className="space-y-4 py-4">
<RadioGroup value={selectedBackend} onValueChange={(v) => setSelectedBackend(v as 'local' | 'aliyun')}>
<div className={`flex items-center space-x-3 border rounded-lg p-4 ${isBackendAvailable('local') ? 'hover:bg-accent/50 cursor-pointer' : 'opacity-50 cursor-not-allowed'}`}>
<RadioGroupItem value="local" id="local" disabled={!isBackendAvailable('local')} />
<Label htmlFor="local" className={`flex-1 ${isBackendAvailable('local') ? 'cursor-pointer' : 'cursor-not-allowed'}`}>
<div className="font-medium">{t('onboarding:localModel')}</div>
<div className="text-sm text-muted-foreground">
{isBackendAvailable('local') ? t('onboarding:localModelDescription') : t('onboarding:localModelNoPermission')}
</div>
</Label>
</div>
<div className="flex items-center space-x-3 border rounded-lg p-4 hover:bg-accent/50 cursor-pointer">
<RadioGroupItem value="aliyun" id="aliyun" />
<Label htmlFor="aliyun" className="flex-1 cursor-pointer">
<div className="font-medium">{t('onboarding:aliyunApi')}<span className="ml-2 text-xs text-primary">{t('onboarding:aliyunApiRecommended')}</span></div>
<div className="text-sm text-muted-foreground">{t('onboarding:aliyunApiDescription')}</div>
</Label>
</div>
</RadioGroup>
</div>
<DialogFooter>
{isBackendAvailable('local') && (
<Button type="button" variant="outline" onClick={handleSkip}>
{t('onboarding:skipConfig')}
</Button>
)}
<Button type="button" onClick={handleNextStep}>
{t('onboarding:nextStep')}
</Button>
</DialogFooter>
</>
)}
{step === 2 && (
<Form {...form}>
<form onSubmit={form.handleSubmit(handleVerifyAndComplete)} className="space-y-4">
<FormField
control={form.control}
name="api_key"
render={({ field }) => (
<FormItem>
<FormLabel>{t('onboarding:apiKey')}</FormLabel>
<FormControl>
<Input
type="password"
placeholder="sk-xxxxxxxxxxxxxxxx"
disabled={isLoading}
{...field}
/>
</FormControl>
<FormMessage />
<p className="text-sm text-muted-foreground mt-2">
<a
href="https://help.aliyun.com/zh/model-studio/qwen-tts-realtime?spm=a2ty_o06.30285417.0.0.2994c921szHZj2"
target="_blank"
rel="noopener noreferrer"
className="text-primary hover:underline"
>
{t('onboarding:howToGetApiKey')}
</a>
</p>
</FormItem>
)}
/>
<DialogFooter>
<Button
type="button"
variant="outline"
onClick={() => setStep(1)}
disabled={isLoading}
>
{t('onboarding:back')}
</Button>
<Button type="submit" disabled={isLoading}>
{isLoading ? t('onboarding:verifying') : t('onboarding:verifyAndComplete')}
</Button>
</DialogFooter>
</form>
</Form>
)}
</DialogContent>
</Dialog>
)
}


@@ -1,55 +0,0 @@
import { memo, useMemo } from 'react'
import { Button } from '@/components/ui/button'
import { Shuffle } from 'lucide-react'
interface Preset {
label: string
[key: string]: any
}
interface PresetSelectorProps<T extends Preset> {
presets: readonly T[]
onSelect: (preset: T) => void
}
const PresetSelectorInner = <T extends Preset>({ presets, onSelect }: PresetSelectorProps<T>) => {
const presetButtons = useMemo(() => {
return presets.map((preset, index) => (
<Button
key={`${preset.label}-${index}`}
type="button"
variant="outline"
size="sm"
onClick={() => onSelect(preset)}
className="text-xs md:text-sm px-2 py-0.5 h-5"
>
{preset.label}
</Button>
))
}, [presets, onSelect])
const handleRandomSelect = () => {
// Nothing to pick from when the preset list is empty.
if (presets.length === 0) return
const randomIndex = Math.floor(Math.random() * presets.length)
onSelect(presets[randomIndex])
}
return (
<div className="flex items-center gap-1.5 mt-1">
<div className="flex flex-wrap gap-1 flex-1">
{presetButtons}
</div>
<Button
type="button"
variant="ghost"
size="icon"
onClick={handleRandomSelect}
className="h-6 w-6 flex-shrink-0"
title="Random selection"
>
<Shuffle className="h-3 w-3" />
</Button>
</div>
)
}
export const PresetSelector = memo(PresetSelectorInner) as typeof PresetSelectorInner


@@ -1,665 +0,0 @@
import { useForm } from 'react-hook-form'
import { zodResolver } from '@hookform/resolvers/zod'
import * as z from 'zod'
import { useEffect, useState, forwardRef, useImperativeHandle, useMemo } from 'react'
import { useTranslation } from 'react-i18next'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Textarea } from '@/components/ui/textarea'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue, SelectGroup, SelectLabel } from '@/components/ui/select'
import { Dialog, DialogContent, DialogDescription, DialogHeader, DialogTitle, DialogTrigger, DialogFooter } from '@/components/ui/dialog'
import { Label } from '@/components/ui/label'
import { Globe2, User, Type, Sparkles, Play, Settings, Zap } from 'lucide-react'
import { toast } from 'sonner'
import { IconLabel } from '@/components/IconLabel'
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from '@/components/ui/tooltip'
import { ttsApi, jobApi, voiceDesignApi } from '@/lib/api'
import { useJobPolling } from '@/hooks/useJobPolling'
import { useHistoryContext } from '@/contexts/HistoryContext'
import { useUserPreferences } from '@/contexts/UserPreferencesContext'
import { LoadingState } from '@/components/LoadingState'
import { AudioPlayer } from '@/components/AudioPlayer'
import { PresetSelector } from '@/components/PresetSelector'
import type { Language, UnifiedSpeakerItem } from '@/types/tts'
type FormData = {
text: string
language: string
speaker: string
instruct?: string
max_new_tokens?: number
temperature?: number
top_k?: number
top_p?: number
repetition_penalty?: number
}
export interface CustomVoiceFormHandle {
loadParams: (params: any) => void
}
const CustomVoiceForm = forwardRef<CustomVoiceFormHandle>((_props, ref) => {
const { t } = useTranslation('tts')
const { t: tCommon } = useTranslation('common')
const { t: tErrors } = useTranslation('errors')
const { t: tConstants } = useTranslation('constants')
const PRESET_INSTRUCTS = useMemo(() => tConstants('presetInstructs', { returnObjects: true }) as Array<{ label: string; instruct: string; text: string }>, [tConstants])
const formSchema = z.object({
text: z.string().min(1, tErrors('validation.required', { field: tErrors('fieldNames.text') })).max(1000, tErrors('validation.maxLength', { field: tErrors('fieldNames.text'), max: 1000 })),
language: z.string().min(1, tErrors('validation.required', { field: tErrors('fieldNames.language') })),
speaker: z.string().min(1, tErrors('validation.required', { field: tErrors('fieldNames.speaker') })),
instruct: z.string().optional(),
max_new_tokens: z.number().min(128).max(4096).optional(),
temperature: z.number().min(0.1).max(2).optional(),
top_k: z.number().min(1).max(100).optional(),
top_p: z.number().min(0).max(1).optional(),
repetition_penalty: z.number().min(1).max(2).optional(),
})
const [languages, setLanguages] = useState<Language[]>([])
const [unifiedSpeakers, setUnifiedSpeakers] = useState<UnifiedSpeakerItem[]>([])
const [selectedSpeakerId, setSelectedSpeakerId] = useState<string>('')
const [isLoading, setIsLoading] = useState(false)
const [advancedOpen, setAdvancedOpen] = useState(false)
const [indexTTS2Open, setIndexTTS2Open] = useState(false)
const [selectedEmotion, setSelectedEmotion] = useState('none')
const [emoText, setEmoText] = useState('')
const [emoAlpha, setEmoAlpha] = useState(0.6)
const EMOTION_PRESETS = [
// Labels are translated for display; emo_text is passed to the IndexTTS2 request unchanged, so it stays as-is.
{ value: 'none', label: 'No emotion control', emo_text: '', emo_alpha: 0.5 },
{ value: 'happy', label: 'Happy', emo_text: '开心', emo_alpha: 0.6 },
{ value: 'angry', label: 'Angry', emo_text: '愤怒', emo_alpha: 0.15 },
{ value: 'sad', label: 'Sad', emo_text: '悲伤', emo_alpha: 0.4 },
{ value: 'fear', label: 'Fear', emo_text: '恐惧', emo_alpha: 0.4 },
{ value: 'hate', label: 'Disgust', emo_text: '厌恶', emo_alpha: 0.6 },
{ value: 'low', label: 'Somber', emo_text: '低沉', emo_alpha: 0.6 },
{ value: 'surprise', label: 'Surprise', emo_text: '惊讶', emo_alpha: 0.3 },
{ value: 'neutral', label: 'Neutral', emo_text: '中性', emo_alpha: 0.5 },
]
const [isIndexTTS2Loading, setIsIndexTTS2Loading] = useState(false)
const [tempAdvancedParams, setTempAdvancedParams] = useState({
max_new_tokens: 2048,
temperature: 0.9,
top_k: 50,
top_p: 1.0,
repetition_penalty: 1.05
})
const { currentJob, isPolling, isCompleted, startPolling, elapsedTime } = useJobPolling()
const { refresh } = useHistoryContext()
const { preferences } = useUserPreferences()
const selectedSpeaker = useMemo(() =>
unifiedSpeakers.find(s => s.id === selectedSpeakerId),
[unifiedSpeakers, selectedSpeakerId]
)
const isInstructDisabled = selectedSpeaker?.source === 'saved-design'
const {
register,
handleSubmit,
setValue,
watch,
formState: { errors },
} = useForm<FormData>({
resolver: zodResolver(formSchema),
defaultValues: {
text: '',
language: 'Auto',
speaker: '',
instruct: '',
max_new_tokens: 2048,
temperature: 0.9,
top_k: 50,
top_p: 1.0,
repetition_penalty: 1.05,
},
})
useImperativeHandle(ref, () => ({
loadParams: (params: any) => {
setValue('text', params.text || '')
setValue('language', params.language || 'Auto')
setValue('speaker', params.speaker || '')
if (params.speaker) {
const item = unifiedSpeakers.find(s =>
s.source === 'builtin' && s.id === params.speaker
)
if (item) {
setSelectedSpeakerId(item.id)
}
}
setValue('instruct', params.instruct || '')
setValue('max_new_tokens', params.max_new_tokens || 2048)
setValue('temperature', params.temperature || 0.9)
setValue('top_k', params.top_k || 50)
setValue('top_p', params.top_p || 1.0)
setValue('repetition_penalty', params.repetition_penalty || 1.05)
}
}))
useEffect(() => {
const fetchData = async () => {
try {
const backend = preferences?.default_backend || 'local'
const [langs, builtinSpeakers, savedDesigns] = await Promise.all([
ttsApi.getLanguages(),
ttsApi.getSpeakers(backend),
voiceDesignApi.list(backend)
])
const designItems: UnifiedSpeakerItem[] = savedDesigns.designs.map(d => ({
id: `design-${d.id}`,
displayName: `${d.name}`,
description: d.instruct.substring(0, 60) + (d.instruct.length > 60 ? '...' : ''),
source: 'saved-design',
designId: d.id,
instruct: d.instruct,
backendType: d.backend_type,
hasRefAudio: !!d.ref_audio_path,
}))
const builtinItems: UnifiedSpeakerItem[] = builtinSpeakers.map(s => ({
id: s.name,
displayName: s.name,
description: s.description,
source: 'builtin'
}))
setLanguages(langs)
setUnifiedSpeakers([...designItems, ...builtinItems])
} catch (error) {
toast.error(t('loadDataFailed'))
}
}
fetchData()
}, [preferences?.default_backend, t])
useEffect(() => {
if (selectedSpeaker?.source === 'saved-design' && selectedSpeaker.instruct) {
setValue('instruct', selectedSpeaker.instruct)
}
}, [selectedSpeakerId, selectedSpeaker, setValue])
const onSubmit = async (data: FormData) => {
setIsLoading(true)
try {
const selectedItem = unifiedSpeakers.find(s => s.id === selectedSpeakerId)
let result
if (selectedItem?.source === 'saved-design') {
if (selectedItem.backendType === 'local') {
const formData = new FormData()
formData.append('text', data.text)
formData.append('language', data.language)
formData.append('voice_design_id', String(selectedItem.designId))
formData.append('max_new_tokens', String(data.max_new_tokens ?? 2048))
formData.append('temperature', String(data.temperature ?? 0.9))
formData.append('backend', 'local')
const token = localStorage.getItem('token')
const baseURL = import.meta.env.VITE_API_URL || ''
const response = await fetch(`${baseURL}/tts/voice-clone`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
},
body: formData,
})
if (!response.ok) {
throw new Error('Failed to create voice clone job')
}
result = await response.json()
} else {
result = await ttsApi.createVoiceDesignJob({
text: data.text,
language: data.language,
saved_design_id: selectedItem.designId,
max_new_tokens: data.max_new_tokens ?? 2048,
temperature: data.temperature ?? 0.9,
})
}
} else {
result = await ttsApi.createCustomVoiceJob({
text: data.text,
language: data.language,
speaker: data.speaker,
instruct: data.instruct,
max_new_tokens: data.max_new_tokens ?? 2048,
temperature: data.temperature ?? 0.9,
})
}
toast.success(t('taskCreated'))
startPolling(result.job_id)
try {
await refresh()
} catch {}
} catch (error) {
toast.error(t('taskCreateFailed'))
} finally {
setIsLoading(false)
}
}
const handleIndexTTS2Submit = async () => {
const selectedItem = unifiedSpeakers.find(s => s.id === selectedSpeakerId)
if (!selectedItem?.designId) return
setIsIndexTTS2Loading(true)
try {
const text = watch('text')
if (!text) { toast.error(t('textPlaceholder')); return }
const result = await ttsApi.indextts2FromDesign({
voice_design_id: selectedItem.designId,
text,
emo_text: emoText || undefined,
emo_alpha: emoAlpha,
})
toast.success(t('taskCreated'))
setIndexTTS2Open(false)
startPolling(result.job_id)
try { await refresh() } catch {}
} catch {
toast.error(t('taskCreateFailed'))
} finally {
setIsIndexTTS2Loading(false)
}
}
const memoizedAudioUrl = useMemo(() => {
if (!currentJob) return ''
return jobApi.getAudioUrl(currentJob.id, currentJob.audio_url)
}, [currentJob?.id, currentJob?.audio_url])
return (
<form onSubmit={handleSubmit(onSubmit)} className="space-y-2">
<div className="space-y-0.5">
<IconLabel icon={Globe2} tooltip={t('languageLabel')} required />
<Select
value={watch('language')}
onValueChange={(value: string) => setValue('language', value)}
>
<SelectTrigger>
<SelectValue />
</SelectTrigger>
<SelectContent>
{languages.map((lang) => (
<SelectItem key={lang.code} value={lang.code}>
{tConstants(`languages.${lang.code}`, { defaultValue: lang.name })}
</SelectItem>
))}
</SelectContent>
</Select>
{errors.language && (
<p className="text-sm text-destructive">{errors.language.message}</p>
)}
</div>
<div className="space-y-0.5">
<IconLabel icon={User} tooltip={t('speakerLabel')} required />
<Select
value={selectedSpeakerId}
onValueChange={(value: string) => {
const newSpeaker = unifiedSpeakers.find(s => s.id === value)
const previousSource = selectedSpeaker?.source
if (newSpeaker) {
setSelectedSpeakerId(value)
setValue('speaker', newSpeaker.id)
if (newSpeaker.source === 'builtin' && previousSource === 'saved-design') {
setValue('instruct', '')
}
}
}}
>
<SelectTrigger>
<SelectValue placeholder={t('speakerPlaceholder')}>
{selectedSpeakerId && (() => {
const item = unifiedSpeakers.find(s => s.id === selectedSpeakerId)
if (!item) return null
if (item.source === 'saved-design') {
return item.displayName
}
return `${item.displayName} - ${item.description}`
})()}
</SelectValue>
</SelectTrigger>
<SelectContent>
{unifiedSpeakers.filter(s => s.source === 'saved-design').length > 0 && (
<SelectGroup>
<SelectLabel className="text-xs text-muted-foreground">{t('myVoiceDesigns')}</SelectLabel>
{unifiedSpeakers
.filter(s => s.source === 'saved-design')
.map(item => (
<SelectItem key={item.id} value={item.id}>
<div className="flex flex-col">
<span className="font-medium">{item.displayName}</span>
<span className="text-xs text-muted-foreground">{item.description}</span>
</div>
</SelectItem>
))}
</SelectGroup>
)}
<SelectGroup>
<SelectLabel className="text-xs text-muted-foreground">{t('builtinSpeakers')}</SelectLabel>
{unifiedSpeakers
.filter(s => s.source === 'builtin')
.map(item => (
<SelectItem key={item.id} value={item.id}>
{item.displayName} - {item.description}
</SelectItem>
))}
</SelectGroup>
</SelectContent>
</Select>
{errors.speaker && (
<p className="text-sm text-destructive">{errors.speaker.message}</p>
)}
</div>
<div className="space-y-0.5">
<IconLabel icon={Type} tooltip={t('textLabel')} required />
<Textarea
{...register('text')}
placeholder={t('textPlaceholder')}
className="min-h-[40px] md:min-h-[60px]"
/>
{errors.text && (
<p className="text-sm text-destructive">{errors.text.message}</p>
)}
</div>
<div className="space-y-0.5">
<IconLabel icon={Sparkles} tooltip={t('instructLabel')} />
<Textarea
{...register('instruct')}
placeholder={isInstructDisabled
? t('instructPlaceholderDesign')
: t('instructPlaceholderDefault')
}
className="min-h-[40px] md:min-h-[60px]"
disabled={isInstructDisabled}
/>
{!isInstructDisabled && (
<PresetSelector
presets={PRESET_INSTRUCTS}
onSelect={(preset) => {
setValue('instruct', preset.instruct)
if (preset.text) {
setValue('text', preset.text)
}
}}
/>
)}
{errors.instruct && (
<p className="text-sm text-destructive">{errors.instruct.message}</p>
)}
</div>
<Dialog open={advancedOpen} onOpenChange={(open) => {
if (open) {
setTempAdvancedParams({
max_new_tokens: watch('max_new_tokens') || 2048,
temperature: watch('temperature') || 0.9,
top_k: watch('top_k') || 50,
top_p: watch('top_p') || 1.0,
repetition_penalty: watch('repetition_penalty') || 1.05
})
}
setAdvancedOpen(open)
}}>
<DialogTrigger asChild>
<Button type="button" variant="outline" className="w-full">
<Settings className="mr-2 h-4 w-4" />
{t('advancedOptions')}
</Button>
</DialogTrigger>
<DialogContent className="sm:max-w-[500px]">
<DialogHeader>
<DialogTitle>{t('advancedOptionsTitle')}</DialogTitle>
<DialogDescription>{t('advancedOptionsDescription')}</DialogDescription>
</DialogHeader>
<div className="space-y-4 py-4">
<div className="space-y-2">
<Label htmlFor="dialog-max_new_tokens">
{t('advancedParams.maxNewTokens.label')}
</Label>
<Input
id="dialog-max_new_tokens"
type="number"
min={128}
max={4096}
value={tempAdvancedParams.max_new_tokens}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
max_new_tokens: parseInt(e.target.value) || 2048
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.maxNewTokens.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-temperature">
{t('advancedParams.temperature.label')}
</Label>
<Input
id="dialog-temperature"
type="number"
min={0.1}
max={2}
step={0.1}
value={tempAdvancedParams.temperature}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
temperature: parseFloat(e.target.value) || 0.9
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.temperature.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-top_k">
{t('advancedParams.topK.label')}
</Label>
<Input
id="dialog-top_k"
type="number"
min={1}
max={100}
value={tempAdvancedParams.top_k}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
top_k: parseInt(e.target.value) || 50
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.topK.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-top_p">
{t('advancedParams.topP.label')}
</Label>
<Input
id="dialog-top_p"
type="number"
min={0}
max={1}
step={0.1}
value={tempAdvancedParams.top_p}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
top_p: parseFloat(e.target.value) || 1.0
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.topP.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-repetition_penalty">
{t('advancedParams.repetitionPenalty.label')}
</Label>
<Input
id="dialog-repetition_penalty"
type="number"
min={1}
max={2}
step={0.01}
value={tempAdvancedParams.repetition_penalty}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
repetition_penalty: parseFloat(e.target.value) || 1.05
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.repetitionPenalty.description')}
</p>
</div>
</div>
<DialogFooter>
<Button
type="button"
variant="outline"
onClick={() => {
setTempAdvancedParams({
max_new_tokens: watch('max_new_tokens') || 2048,
temperature: watch('temperature') || 0.9,
top_k: watch('top_k') || 50,
top_p: watch('top_p') || 1.0,
repetition_penalty: watch('repetition_penalty') || 1.05
})
setAdvancedOpen(false)
}}
>
{tCommon('cancel')}
</Button>
<Button
type="button"
onClick={() => {
setValue('max_new_tokens', tempAdvancedParams.max_new_tokens)
setValue('temperature', tempAdvancedParams.temperature)
setValue('top_k', tempAdvancedParams.top_k)
setValue('top_p', tempAdvancedParams.top_p)
setValue('repetition_penalty', tempAdvancedParams.repetition_penalty)
setAdvancedOpen(false)
}}
>
{tCommon('ok')}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
{selectedSpeaker?.source === 'saved-design' && selectedSpeaker.hasRefAudio && (
<Dialog open={indexTTS2Open} onOpenChange={setIndexTTS2Open}>
<DialogTrigger asChild>
<Button type="button" variant="outline" className="w-full">
<Zap className="mr-2 h-4 w-4" />
</Button>
</DialogTrigger>
<DialogContent className="sm:max-w-[480px]">
<DialogHeader>
<DialogTitle></DialogTitle>
<DialogDescription>使 IndexTTS2 </DialogDescription>
</DialogHeader>
<div className="space-y-4 py-4">
<div className="space-y-2">
<Label></Label>
<Select
value={selectedEmotion}
onValueChange={(value) => {
const preset = EMOTION_PRESETS.find(p => p.value === value)
if (preset) {
setSelectedEmotion(value)
setEmoText(preset.emo_text)
setEmoAlpha(preset.emo_alpha)
}
}}
>
<SelectTrigger>
<SelectValue />
</SelectTrigger>
<SelectContent>
{EMOTION_PRESETS.map(p => (
<SelectItem key={p.value} value={p.value}>
{p.label}
{p.value !== 'none' && (
<span className="ml-2 text-xs text-muted-foreground"> {p.emo_alpha}</span>
)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
{selectedEmotion !== 'none' && (
<div className="space-y-2">
<Label>{emoAlpha.toFixed(2)}</Label>
<Input
type="range"
min={0}
max={1}
step={0.05}
value={emoAlpha}
onChange={e => setEmoAlpha(parseFloat(e.target.value))}
/>
{(selectedEmotion === 'angry') && (
<p className="text-xs text-muted-foreground"> 0.2</p>
)}
</div>
)}
</div>
<DialogFooter>
<Button type="button" variant="outline" onClick={() => setIndexTTS2Open(false)}>{tCommon('cancel')}</Button>
<Button type="button" onClick={handleIndexTTS2Submit} disabled={isIndexTTS2Loading || isPolling}>
<Zap className="mr-2 h-4 w-4" />
{isIndexTTS2Loading ? t('creating') : 'Synthesize'}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
)}
<TooltipProvider>
<Tooltip>
<TooltipTrigger asChild>
<Button type="submit" className="w-full" disabled={isLoading || isPolling}>
<Play className="mr-2 h-4 w-4" />
{isLoading ? t('creating') : t('generate')}
</Button>
</TooltipTrigger>
<TooltipContent>
<p>{t('generate')}</p>
</TooltipContent>
</Tooltip>
</TooltipProvider>
{isPolling && <LoadingState elapsedTime={elapsedTime} />}
{isCompleted && currentJob && (
<div className="space-y-4 pt-4 border-t">
<AudioPlayer
audioUrl={memoizedAudioUrl}
jobId={currentJob.id}
text={currentJob.parameters?.text}
/>
</div>
)}
</form>
)
})
export default CustomVoiceForm


@@ -1,433 +0,0 @@
import { useForm, Controller } from 'react-hook-form'
import { zodResolver } from '@hookform/resolvers/zod'
import * as z from 'zod'
import { useEffect, useState, useMemo } from 'react'
import { useTranslation } from 'react-i18next'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Textarea } from '@/components/ui/textarea'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { Dialog, DialogContent, DialogDescription, DialogHeader, DialogTitle, DialogTrigger, DialogFooter } from '@/components/ui/dialog'
import { Checkbox } from '@/components/ui/checkbox'
import { Label } from '@/components/ui/label'
import { Settings, Globe2, Type, Play, FileText, Mic, ArrowRight, ArrowLeft } from 'lucide-react'
import { toast } from 'sonner'
import { IconLabel } from '@/components/IconLabel'
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from '@/components/ui/tooltip'
import { ttsApi, jobApi } from '@/lib/api'
import { useJobPolling } from '@/hooks/useJobPolling'
import { useHistoryContext } from '@/contexts/HistoryContext'
import { LoadingState } from '@/components/LoadingState'
import { AudioPlayer } from '@/components/AudioPlayer'
import { FileUploader } from '@/components/FileUploader'
import { AudioRecorder } from '@/components/AudioRecorder'
import { PresetSelector } from '@/components/PresetSelector'
import type { Language } from '@/types/tts'
import { Tabs, TabsContent, TabsList, TabsTrigger } from '@/components/ui/tabs'
type FormData = {
text: string
language?: string
ref_audio: File
ref_text?: string
use_cache?: boolean
x_vector_only_mode?: boolean
max_new_tokens?: number
temperature?: number
top_k?: number
top_p?: number
repetition_penalty?: number
}
function VoiceCloneForm() {
const { t } = useTranslation('tts')
const { t: tCommon } = useTranslation('common')
const { t: tVoice } = useTranslation('voice')
const { t: tErrors } = useTranslation('errors')
const { t: tConstants } = useTranslation('constants')
const PRESET_REF_TEXTS = useMemo(() => tConstants('presetRefTexts', { returnObjects: true }) as Array<{ label: string; text: string }>, [tConstants])
const formSchema = z.object({
text: z.string().min(1, tErrors('validation.required', { field: tErrors('fieldNames.text') })).max(1000, tErrors('validation.maxLength', { field: tErrors('fieldNames.text'), max: 1000 })),
language: z.string().optional(),
ref_audio: z.instanceof(File, { message: tErrors('validation.required', { field: tErrors('fieldNames.reference_audio') }) }),
ref_text: z.string().optional(),
use_cache: z.boolean().optional(),
x_vector_only_mode: z.boolean().optional(),
max_new_tokens: z.number().min(128).max(4096).optional(),
temperature: z.number().min(0.1).max(2).optional(),
top_k: z.number().min(1).max(100).optional(),
top_p: z.number().min(0).max(1).optional(),
repetition_penalty: z.number().min(1).max(2).optional(),
})
const [languages, setLanguages] = useState<Language[]>([])
const [isLoading, setIsLoading] = useState(false)
const [advancedOpen, setAdvancedOpen] = useState(false)
const [step, setStep] = useState<1 | 2>(1)
const [inputTab, setInputTab] = useState<'upload' | 'record'>('upload')
const [tempAdvancedParams, setTempAdvancedParams] = useState({
max_new_tokens: 2048
})
const { currentJob, isPolling, isCompleted, startPolling, elapsedTime } = useJobPolling()
const { refresh } = useHistoryContext()
const {
register,
handleSubmit,
setValue,
watch,
control,
trigger,
formState: { errors },
} = useForm<FormData>({
resolver: zodResolver(formSchema),
defaultValues: {
text: '',
language: 'Auto',
ref_text: '',
use_cache: true,
x_vector_only_mode: false,
max_new_tokens: 2048,
temperature: 0.9,
top_k: 50,
top_p: 1.0,
repetition_penalty: 1.05,
} as Partial<FormData>,
})
useEffect(() => {
const fetchData = async () => {
try {
const langs = await ttsApi.getLanguages()
setLanguages(langs)
} catch (error) {
toast.error(t('loadDataFailed'))
}
}
fetchData()
}, [t])
useEffect(() => {
if (inputTab === 'record' && PRESET_REF_TEXTS.length > 0) {
setValue('ref_text', PRESET_REF_TEXTS[0].text)
} else if (inputTab === 'upload') {
setValue('ref_text', '')
}
}, [inputTab, setValue, PRESET_REF_TEXTS])
const handleNextStep = async () => {
// Validate step 1 fields
const valid = await trigger(['ref_audio', 'ref_text'])
if (valid) {
setStep(2)
}
}
const onSubmit = async (data: FormData) => {
setIsLoading(true)
try {
const result = await ttsApi.createVoiceCloneJob({
...data,
ref_audio: data.ref_audio,
})
toast.success(t('taskCreated'))
startPolling(result.job_id)
try {
await refresh()
} catch { }
} catch (error) {
toast.error(t('taskCreateFailed'))
} finally {
setIsLoading(false)
}
}
const memoizedAudioUrl = useMemo(() => {
if (!currentJob) return ''
return jobApi.getAudioUrl(currentJob.id, currentJob.audio_url)
}, [currentJob?.id, currentJob?.audio_url])
return (
<form onSubmit={handleSubmit(onSubmit)} className="space-y-4">
{/* Steps Indicator */}
<div className="flex items-center justify-center space-x-4 mb-6">
<div className={`flex items-center space-x-2 ${step === 1 ? 'text-primary' : 'text-muted-foreground'}`}>
<div className={`w-8 h-8 rounded-full flex items-center justify-center border-2 ${step === 1 ? 'border-primary bg-primary/10' : 'border-muted'}`}>1</div>
<span className="text-sm font-medium">{tVoice('step1Title')}</span>
</div>
<div className="w-8 h-[2px] bg-muted" />
<div className={`flex items-center space-x-2 ${step === 2 ? 'text-primary' : 'text-muted-foreground'}`}>
<div className={`w-8 h-8 rounded-full flex items-center justify-center border-2 ${step === 2 ? 'border-primary bg-primary/10' : 'border-muted'}`}>2</div>
<span className="text-sm font-medium">{tVoice('step2Title')}</span>
</div>
</div>
<div className={step === 1 ? 'block' : 'hidden'}>
{/* Step 1: Input Selection */}
<Tabs value={inputTab} onValueChange={(v) => setInputTab(v as any)} className="w-full">
<TabsList className="grid w-full grid-cols-2">
<TabsTrigger value="upload" className="flex items-center gap-2">
<FileText className="h-4 w-4" />
{tVoice('uploadTab')}
</TabsTrigger>
<TabsTrigger value="record" className="flex items-center gap-2">
<Mic className="h-4 w-4" />
{tVoice('recordTab')}
</TabsTrigger>
</TabsList>
<TabsContent value="upload" className="space-y-4 mt-4">
<div className="space-y-0.5">
<Label>{tVoice('refAudioLabel')}</Label>
<Controller
name="ref_audio"
control={control}
render={({ field }) => (
<FileUploader
value={field.value}
onChange={field.onChange}
error={errors.ref_audio?.message}
/>
)}
/>
</div>
<div className="space-y-0.5">
<Label>{tVoice('refTextLabel')}</Label>
<Textarea
{...register('ref_text')}
placeholder={tVoice('refTextPlaceholder')}
className="min-h-[100px]"
/>
<PresetSelector
presets={PRESET_REF_TEXTS}
onSelect={(preset) => setValue('ref_text', preset.text)}
/>
</div>
<Button type="button" className="w-full mt-6" onClick={handleNextStep}>
{tVoice('nextStep')}
<ArrowRight className="ml-2 h-4 w-4" />
</Button>
</TabsContent>
<TabsContent value="record" className="space-y-4 mt-4">
<div className="space-y-2">
<Label className="text-base font-medium">{tVoice('readPrompt')}</Label>
<div className="grid grid-cols-3 gap-2">
{PRESET_REF_TEXTS.map((preset, i) => {
const isSelected = watch('ref_text') === preset.text
return (
<div
key={i}
className={`p-3 border rounded-lg hover:bg-accent cursor-pointer transition-colors text-sm text-center ${
isSelected ? 'border-primary bg-primary/10' : ''
}`}
onClick={() => setValue('ref_text', preset.text)}
>
<div className="font-medium">{preset.label}</div>
</div>
)
})}
</div>
<div className="space-y-0.5 pt-2">
<Label>{tVoice('currentRefText')}</Label>
<Textarea
{...register('ref_text')}
placeholder={tVoice('currentRefTextPlaceholder')}
className="min-h-[80px]"
/>
</div>
</div>
{/* Mobile-friendly Bottom Recorder Area */}
<div className="fixed bottom-0 left-0 right-0 p-4 bg-background border-t z-50 md:relative md:border-t-0 md:bg-transparent md:p-0 md:z-0">
<div className="space-y-3">
{watch('ref_audio') && (
<Button type="button" className="w-full" onClick={handleNextStep}>
{tVoice('nextStep')}
<ArrowRight className="ml-2 h-4 w-4" />
</Button>
)}
<Controller
name="ref_audio"
control={control}
render={({ field }) => (
<AudioRecorder
onChange={field.onChange}
/>
)}
/>
{errors.ref_audio && (
<p className="text-sm text-destructive mt-2 text-center md:text-left">{errors.ref_audio.message}</p>
)}
</div>
</div>
{/* Spacer for mobile to prevent content being hidden behind fixed footer */}
<div className="h-24 md:hidden" />
</TabsContent>
</Tabs>
</div>
<div className={step === 2 ? 'block space-y-4' : 'hidden'}>
{/* Step 2: Synthesis Options */}
<div className="space-y-0.5">
<IconLabel icon={Globe2} tooltip={tVoice('languageOptional')} />
<Select
value={watch('language')}
onValueChange={(value: string) => setValue('language', value)}
>
<SelectTrigger>
<SelectValue />
</SelectTrigger>
<SelectContent>
{languages.map((lang) => (
<SelectItem key={lang.code} value={lang.code}>
{tConstants(`languages.${lang.code}`, { defaultValue: lang.name })}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
<div className="space-y-0.5">
<IconLabel icon={Type} tooltip={t('textLabel')} required />
<Textarea
{...register('text')}
placeholder={t('textPlaceholder')}
className="min-h-[120px]"
/>
<PresetSelector
presets={PRESET_REF_TEXTS}
onSelect={(preset) => setValue('text', preset.text)}
/>
{errors.text && (
<p className="text-sm text-destructive">{errors.text.message}</p>
)}
</div>
<div className="flex flex-col sm:flex-row gap-4 pt-2">
<div className="flex items-center space-x-2">
<Checkbox
id="x_vector_only_mode"
checked={watch('x_vector_only_mode')}
onCheckedChange={(c) => setValue('x_vector_only_mode', c as boolean)}
/>
<Label htmlFor="x_vector_only_mode" className="text-sm font-normal cursor-pointer">
{tVoice('fastMode')}
</Label>
</div>
<div className="flex items-center space-x-2">
<Checkbox
id="use_cache"
checked={watch('use_cache')}
onCheckedChange={(c) => setValue('use_cache', c as boolean)}
/>
<Label htmlFor="use_cache" className="text-sm font-normal cursor-pointer">
{tVoice('useCache')}
</Label>
</div>
</div>
<Dialog open={advancedOpen} onOpenChange={(open) => {
if (open) {
setTempAdvancedParams({
max_new_tokens: watch('max_new_tokens') || 2048
})
}
setAdvancedOpen(open)
}}>
<DialogTrigger asChild>
<Button type="button" variant="outline" className="w-full">
<Settings className="mr-2 h-4 w-4" />
{t('advancedOptions')}
</Button>
</DialogTrigger>
<DialogContent className="sm:max-w-[500px]">
<DialogHeader>
<DialogTitle>{t('advancedOptionsTitle')}</DialogTitle>
<DialogDescription>{t('advancedOptionsDescription')}</DialogDescription>
</DialogHeader>
<div className="space-y-4 py-4">
<div className="space-y-2">
<Label htmlFor="dialog-max_new_tokens">
{t('advancedParams.maxNewTokens.label')}
</Label>
<Input
id="dialog-max_new_tokens"
type="number"
min={128}
max={4096}
value={tempAdvancedParams.max_new_tokens}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
max_new_tokens: parseInt(e.target.value) || 2048
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.maxNewTokens.description')}
</p>
</div>
</div>
<DialogFooter>
<Button
type="button"
variant="outline"
onClick={() => {
setAdvancedOpen(false)
}}
>
{tCommon('cancel')}
</Button>
<Button
type="button"
onClick={() => {
setValue('max_new_tokens', tempAdvancedParams.max_new_tokens)
setAdvancedOpen(false)
}}
>
{tCommon('ok')}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
<div className="flex gap-3 pt-4">
<Button type="button" variant="outline" onClick={() => setStep(1)} className="w-1/3">
<ArrowLeft className="mr-2 h-4 w-4" />
{tVoice('prevStep')}
</Button>
<TooltipProvider>
<Tooltip>
<TooltipTrigger asChild>
<Button type="submit" className="flex-1" disabled={isLoading || isPolling}>
<Play className="mr-2 h-4 w-4" />
{isLoading ? t('creating') : t('generate')}
</Button>
</TooltipTrigger>
<TooltipContent>
<p>{t('generate')}</p>
</TooltipContent>
</Tooltip>
</TooltipProvider>
</div>
</div>
{isPolling && <LoadingState elapsedTime={elapsedTime} />}
{isCompleted && currentJob && (
<div className="space-y-4 pt-4 border-t">
<AudioPlayer
audioUrl={memoizedAudioUrl}
jobId={currentJob.id}
text={currentJob.parameters?.text}
/>
</div>
)}
</form>
)
}
export default VoiceCloneForm

@@ -1,476 +0,0 @@
import { useForm } from 'react-hook-form'
import { zodResolver } from '@hookform/resolvers/zod'
import * as z from 'zod'
import { useEffect, useState, forwardRef, useImperativeHandle, useMemo } from 'react'
import { useTranslation } from 'react-i18next'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Textarea } from '@/components/ui/textarea'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { Dialog, DialogContent, DialogDescription, DialogHeader, DialogTitle, DialogTrigger, DialogFooter } from '@/components/ui/dialog'
import { Label } from '@/components/ui/label'
import { Settings, Globe2, Type, Play, Palette, Save } from 'lucide-react'
import { toast } from 'sonner'
import { IconLabel } from '@/components/IconLabel'
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from '@/components/ui/tooltip'
import { ttsApi, jobApi, voiceDesignApi } from '@/lib/api'
import { useJobPolling } from '@/hooks/useJobPolling'
import { useHistoryContext } from '@/contexts/HistoryContext'
import { useUserPreferences } from '@/contexts/UserPreferencesContext'
import { LoadingState } from '@/components/LoadingState'
import { AudioPlayer } from '@/components/AudioPlayer'
import { PresetSelector } from '@/components/PresetSelector'
import type { Language } from '@/types/tts'
type FormData = {
text: string
language: string
instruct: string
max_new_tokens?: number
temperature?: number
top_k?: number
top_p?: number
repetition_penalty?: number
}
export interface VoiceDesignFormHandle {
loadParams: (params: any) => void
}
const VoiceDesignForm = forwardRef<VoiceDesignFormHandle>((_props, ref) => {
const { t } = useTranslation('tts')
const { t: tCommon } = useTranslation('common')
const { t: tErrors } = useTranslation('errors')
const { t: tConstants } = useTranslation('constants')
const PRESET_VOICE_DESIGNS = useMemo(() => tConstants('presetVoiceDesigns', { returnObjects: true }) as Array<{ label: string; instruct: string; text: string }>, [tConstants])
const formSchema = z.object({
text: z.string().min(1, tErrors('validation.required', { field: tErrors('fieldNames.text') })).max(1000, tErrors('validation.maxLength', { field: tErrors('fieldNames.text'), max: 1000 })),
language: z.string().min(1, tErrors('validation.required', { field: tErrors('fieldNames.language') })),
instruct: z.string().min(10, tErrors('validation.minLength', { field: tErrors('fieldNames.instruct'), min: 10 })).max(500, tErrors('validation.maxLength', { field: tErrors('fieldNames.instruct'), max: 500 })),
max_new_tokens: z.number().min(128).max(4096).optional(),
temperature: z.number().min(0.1).max(2).optional(),
top_k: z.number().min(1).max(100).optional(),
top_p: z.number().min(0).max(1).optional(),
repetition_penalty: z.number().min(1).max(2).optional(),
})
const [languages, setLanguages] = useState<Language[]>([])
const [isLoading, setIsLoading] = useState(false)
const [advancedOpen, setAdvancedOpen] = useState(false)
const [tempAdvancedParams, setTempAdvancedParams] = useState({
max_new_tokens: 2048,
temperature: 0.3,
top_k: 20,
top_p: 0.7,
repetition_penalty: 1.05
})
const [showSaveDialog, setShowSaveDialog] = useState(false)
const [saveDesignName, setSaveDesignName] = useState('')
const [isPreparing, setIsPreparing] = useState(false)
const { currentJob, isPolling, isCompleted, startPolling, elapsedTime } = useJobPolling()
const { refresh } = useHistoryContext()
const { preferences } = useUserPreferences()
const {
register,
handleSubmit,
setValue,
watch,
formState: { errors },
} = useForm<FormData>({
resolver: zodResolver(formSchema),
defaultValues: {
text: '',
language: 'Auto',
instruct: '',
max_new_tokens: 2048,
temperature: 0.3,
top_k: 20,
top_p: 0.7,
repetition_penalty: 1.05,
},
})
useImperativeHandle(ref, () => ({
loadParams: (params: any) => {
setValue('text', params.text || '')
setValue('language', params.language || 'Auto')
setValue('instruct', params.instruct || '')
setValue('max_new_tokens', params.max_new_tokens || 2048)
setValue('temperature', params.temperature || 0.3)
setValue('top_k', params.top_k || 20)
// Use ?? so an explicit top_p of 0 (allowed by the schema) is not replaced by the default.
setValue('top_p', params.top_p ?? 0.7)
setValue('repetition_penalty', params.repetition_penalty || 1.05)
}
}))
useEffect(() => {
const fetchData = async () => {
try {
const langs = await ttsApi.getLanguages()
setLanguages(langs)
} catch (error) {
toast.error(t('loadDataFailed'))
}
}
fetchData()
}, [t])
const onSubmit = async (data: FormData) => {
setIsLoading(true)
try {
const result = await ttsApi.createVoiceDesignJob(data)
toast.success(t('taskCreated'))
startPolling(result.job_id)
try {
await refresh()
} catch {}
} catch (error) {
toast.error(t('taskCreateFailed'))
} finally {
setIsLoading(false)
}
}
const handleSaveDesign = async () => {
const instruct = watch('instruct')
if (!instruct || instruct.length < 10) {
toast.error(t('fillDesignDescription'))
return
}
if (!saveDesignName.trim()) {
toast.error(t('fillDesignName'))
return
}
const backend = preferences?.default_backend || 'local'
const text = watch('text')
const designData = {
name: saveDesignName,
instruct: instruct,
backend_type: backend,
preview_text: text || `${saveDesignName}'s voice`
}
try {
if (backend === 'local') {
setIsPreparing(true)
try {
await voiceDesignApi.prepareAndCreate(designData)
toast.success(t('designSaved'))
} finally {
setIsPreparing(false)
}
} else {
await voiceDesignApi.create(designData)
toast.success(t('designSaved'))
}
setShowSaveDialog(false)
setSaveDesignName('')
} catch (error) {
toast.error(t('saveFailed'))
}
}
const memoizedAudioUrl = useMemo(() => {
if (!currentJob) return ''
return jobApi.getAudioUrl(currentJob.id, currentJob.audio_url)
}, [currentJob?.id, currentJob?.audio_url])
return (
<form onSubmit={handleSubmit(onSubmit)} className="space-y-2">
<div className="space-y-0.5">
<IconLabel icon={Globe2} tooltip={t('languageLabel')} required />
<Select
value={watch('language')}
onValueChange={(value: string) => setValue('language', value)}
>
<SelectTrigger>
<SelectValue />
</SelectTrigger>
<SelectContent>
{languages.map((lang) => (
<SelectItem key={lang.code} value={lang.code}>
{tConstants(`languages.${lang.code}`, { defaultValue: lang.name })}
</SelectItem>
))}
</SelectContent>
</Select>
{errors.language && (
<p className="text-sm text-destructive">{errors.language.message}</p>
)}
</div>
<div className="space-y-0.5">
<IconLabel icon={Type} tooltip={t('textLabel')} required />
<Textarea
{...register('text')}
placeholder={t('textPlaceholder')}
className="min-h-[40px] md:min-h-[60px]"
/>
{errors.text && (
<p className="text-sm text-destructive">{errors.text.message}</p>
)}
</div>
<div className="space-y-0.5">
<IconLabel icon={Palette} tooltip={t('designDescriptionLabel')} required />
<Textarea
{...register('instruct')}
placeholder={t('designDescriptionPlaceholder')}
className="min-h-[40px] md:min-h-[60px]"
/>
<PresetSelector
presets={PRESET_VOICE_DESIGNS}
onSelect={(preset) => {
setValue('instruct', preset.instruct)
if (preset.text) {
setValue('text', preset.text)
}
}}
/>
{errors.instruct && (
<p className="text-sm text-destructive">{errors.instruct.message}</p>
)}
</div>
<Dialog open={showSaveDialog} onOpenChange={setShowSaveDialog}>
<DialogContent className="sm:max-w-[425px]">
<DialogHeader>
<DialogTitle>{t('saveDesignTitle')}</DialogTitle>
<DialogDescription>{t('saveDesignDescription')}</DialogDescription>
</DialogHeader>
<div className="space-y-4 py-4">
<div className="space-y-2">
<Label htmlFor="design-name">{t('designNameLabel')}</Label>
<Input
id="design-name"
placeholder={t('designNamePlaceholder')}
value={saveDesignName}
onChange={(e) => setSaveDesignName(e.target.value)}
onKeyDown={(e) => {
if (e.key === 'Enter') {
e.preventDefault()
handleSaveDesign()
}
}}
/>
</div>
<div className="space-y-2">
<Label>{t('designDescriptionLabel')}</Label>
<p className="text-sm text-muted-foreground">{watch('instruct')}</p>
</div>
</div>
<DialogFooter>
<Button type="button" variant="outline" onClick={() => {
setShowSaveDialog(false)
setSaveDesignName('')
}}>
{tCommon('cancel')}
</Button>
<Button type="button" onClick={handleSaveDesign} disabled={isPreparing}>
{isPreparing ? t('preparing') : tCommon('save')}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
<Dialog open={advancedOpen} onOpenChange={(open) => {
if (open) {
setTempAdvancedParams({
max_new_tokens: watch('max_new_tokens') || 2048,
temperature: watch('temperature') || 0.3,
top_k: watch('top_k') || 20,
top_p: watch('top_p') ?? 0.7,
repetition_penalty: watch('repetition_penalty') || 1.05
})
}
setAdvancedOpen(open)
}}>
<DialogTrigger asChild>
<Button type="button" variant="outline" className="w-full">
<Settings className="mr-2 h-4 w-4" />
{t('advancedOptions')}
</Button>
</DialogTrigger>
<DialogContent className="sm:max-w-[500px]">
<DialogHeader>
<DialogTitle>{t('advancedOptionsTitle')}</DialogTitle>
<DialogDescription>{t('advancedOptionsDescription')}</DialogDescription>
</DialogHeader>
<div className="space-y-4 py-4">
<div className="space-y-2">
<Label htmlFor="dialog-max_new_tokens">
{t('advancedParams.maxNewTokens.label')}
</Label>
<Input
id="dialog-max_new_tokens"
type="number"
min={128}
max={4096}
value={tempAdvancedParams.max_new_tokens}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
max_new_tokens: parseInt(e.target.value) || 2048
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.maxNewTokens.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-temperature">
{t('advancedParams.temperature.label')}
</Label>
<Input
id="dialog-temperature"
type="number"
min={0.1}
max={2}
step={0.1}
value={tempAdvancedParams.temperature}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
temperature: parseFloat(e.target.value) || 0.3
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.temperature.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-top_k">
{t('advancedParams.topK.label')}
</Label>
<Input
id="dialog-top_k"
type="number"
min={1}
max={100}
value={tempAdvancedParams.top_k}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
top_k: parseInt(e.target.value) || 20
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.topK.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-top_p">
{t('advancedParams.topP.label')}
</Label>
<Input
id="dialog-top_p"
type="number"
min={0}
max={1}
step={0.1}
value={tempAdvancedParams.top_p}
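onChange={(e) => {
// Fall back only when the input is not a number, so a top_p of 0 can be entered.
const value = parseFloat(e.target.value)
setTempAdvancedParams({
...tempAdvancedParams,
top_p: Number.isNaN(value) ? 0.7 : value
})
}}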
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.topP.description')}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="dialog-repetition_penalty">
{t('advancedParams.repetitionPenalty.label')}
</Label>
<Input
id="dialog-repetition_penalty"
type="number"
min={1}
max={2}
step={0.01}
value={tempAdvancedParams.repetition_penalty}
onChange={(e) => setTempAdvancedParams({
...tempAdvancedParams,
repetition_penalty: parseFloat(e.target.value) || 1.05
})}
/>
<p className="text-sm text-muted-foreground">
{t('advancedParams.repetitionPenalty.description')}
</p>
</div>
</div>
<DialogFooter>
<Button
type="button"
variant="outline"
onClick={() => {
setTempAdvancedParams({
max_new_tokens: watch('max_new_tokens') || 2048,
temperature: watch('temperature') || 0.3,
top_k: watch('top_k') || 20,
top_p: watch('top_p') ?? 0.7,
repetition_penalty: watch('repetition_penalty') || 1.05
})
setAdvancedOpen(false)
}}
>
{tCommon('cancel')}
</Button>
<Button
type="button"
onClick={() => {
setValue('max_new_tokens', tempAdvancedParams.max_new_tokens)
setValue('temperature', tempAdvancedParams.temperature)
setValue('top_k', tempAdvancedParams.top_k)
setValue('top_p', tempAdvancedParams.top_p)
setValue('repetition_penalty', tempAdvancedParams.repetition_penalty)
setAdvancedOpen(false)
}}
>
{tCommon('ok')}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
<TooltipProvider>
<Tooltip>
<TooltipTrigger asChild>
<Button type="submit" className="w-full" disabled={isLoading || isPolling}>
<Play className="mr-2 h-4 w-4" />
{isLoading ? t('creating') : t('generate')}
</Button>
</TooltipTrigger>
<TooltipContent>
<p>{t('generate')}</p>
</TooltipContent>
</Tooltip>
</TooltipProvider>
{isPolling && <LoadingState elapsedTime={elapsedTime} />}
{isCompleted && currentJob && (
<div className="space-y-4 pt-4 border-t">
<AudioPlayer
audioUrl={memoizedAudioUrl}
jobId={currentJob.id}
text={currentJob.parameters?.text}
/>
<Button
type="button"
variant="outline"
className="w-full"
onClick={() => setShowSaveDialog(true)}
>
<Save className="mr-2 h-4 w-4" />
{t('saveDesignButton')}
</Button>
</div>
)}
</form>
)
})
export default VoiceDesignForm
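
Several handlers in this form pick a default with `||`, which treats a legitimate `0` as missing. The sketch below contrasts `||` with `??` and shows a parse-with-fallback that keeps an explicit zero. The names `withOr`, `withNullish`, and `parseNumberOr` are hypothetical and do not appear in the original code.

```typescript
// Hypothetical helpers for illustration only; not part of the original component.

// `||` replaces every falsy value, so a valid 0 is lost.
function withOr(value: number | undefined, fallback: number): number {
  return value || fallback
}

// `??` replaces only null and undefined, so 0 survives.
function withNullish(value: number | undefined, fallback: number): number {
  return value ?? fallback
}

// Parse the text of a numeric input; fall back only when it is not a number.
function parseNumberOr(raw: string, fallback: number): number {
  const parsed = Number.parseFloat(raw)
  return Number.isNaN(parsed) ? fallback : parsed
}
```

This matters most for `top_p`, whose schema allows `0`; for fields whose minimum is above zero the two operators behave the same on valid input.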

@@ -1,101 +0,0 @@
import { createContext, useContext, useState, useEffect, useMemo, useCallback, type ReactNode } from 'react'
import { ttsApi } from '@/lib/api'
import type { Language, Speaker } from '@/types/tts'
interface AppContextType {
currentTab: string
setCurrentTab: (tab: string) => void
languages: Language[]
speakers: Speaker[]
isLoadingConfig: boolean
}
interface CacheEntry<T> {
data: T
timestamp: number
}
const CACHE_DURATION = 5 * 60 * 1000
const cache: {
languages: CacheEntry<Language[]> | null
speakers: CacheEntry<Speaker[]> | null
} = {
languages: null,
speakers: null,
}
const isCacheValid = <T,>(entry: CacheEntry<T> | null): boolean => {
if (!entry) return false
return Date.now() - entry.timestamp < CACHE_DURATION
}
const AppContext = createContext<AppContextType | undefined>(undefined)
export function AppProvider({ children }: { children: ReactNode }) {
const [currentTab, setCurrentTabState] = useState('custom-voice')
const [languages, setLanguages] = useState<Language[]>([])
const [speakers, setSpeakers] = useState<Speaker[]>([])
const [isLoadingConfig, setIsLoadingConfig] = useState(true)
const setCurrentTab = useCallback((tab: string) => {
setCurrentTabState(tab)
}, [])
useEffect(() => {
const loadConfig = async () => {
try {
let languagesData: Language[]
let speakersData: Speaker[]
if (isCacheValid(cache.languages)) {
languagesData = cache.languages!.data
} else {
languagesData = await ttsApi.getLanguages()
cache.languages = { data: languagesData, timestamp: Date.now() }
}
if (isCacheValid(cache.speakers)) {
speakersData = cache.speakers!.data
} else {
speakersData = await ttsApi.getSpeakers()
cache.speakers = { data: speakersData, timestamp: Date.now() }
}
setLanguages(languagesData)
setSpeakers(speakersData)
} catch (error) {
console.error('Failed to load config:', error)
} finally {
setIsLoadingConfig(false)
}
}
loadConfig()
}, [])
const value = useMemo(
() => ({
currentTab,
setCurrentTab,
languages,
speakers,
isLoadingConfig,
}),
[currentTab, setCurrentTab, languages, speakers, isLoadingConfig]
)
return (
<AppContext.Provider value={value}>
{children}
</AppContext.Provider>
)
}
export function useApp() {
const context = useContext(AppContext)
if (!context) {
throw new Error('useApp must be used within AppProvider')
}
return context
}
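
The module-level cache in this context reuses language and speaker lists for five minutes. A minimal sketch of the same time-based validity check, written as a small synchronous helper with an injectable clock so it can be exercised without waiting. `createTtlCache` and its parameters are assumptions made for illustration, not names from the original code, and the real loader is asynchronous.

```typescript
// Hypothetical helper for illustration; the original keeps two plain cache slots instead.
interface CacheEntry<T> {
  data: T
  timestamp: number
}

function createTtlCache<T>(ttlMs: number, now: () => number = Date.now) {
  let entry: CacheEntry<T> | null = null
  return {
    // Return the cached value while it is younger than ttlMs, otherwise reload it.
    get(load: () => T): T {
      if (entry !== null && now() - entry.timestamp < ttlMs) {
        return entry.data
      }
      const data = load()
      entry = { data, timestamp: now() }
      return data
    },
  }
}
```

Passing the clock as a parameter keeps the expiry rule (`now - timestamp < ttl`) testable with fixed timestamps.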

@@ -1,126 +0,0 @@
import { createContext, useContext, useState, useEffect, useCallback, useMemo, type ReactNode } from 'react'
import { jobApi } from '@/lib/api'
import type { Job } from '@/types/job'
import { toast } from 'sonner'
interface HistoryContextType {
jobs: Job[]
total: number
loading: boolean
loadingMore: boolean
hasMore: boolean
error: string | null
loadMore: () => Promise<void>
refresh: () => Promise<void>
retry: () => Promise<void>
deleteJob: (id: number) => Promise<void>
}
const HistoryContext = createContext<HistoryContextType | undefined>(undefined)
export function HistoryProvider({ children }: { children: ReactNode }) {
const [jobs, setJobs] = useState<Job[]>([])
const [total, setTotal] = useState(0)
const [loading, setLoading] = useState(true)
const [loadingMore, setLoadingMore] = useState(false)
const [error, setError] = useState<string | null>(null)
const [skip, setSkip] = useState(0)
const limit = 20
const hasMore = jobs.length < total
const loadJobs = useCallback(async (currentSkip: number, isLoadMore = false) => {
try {
if (isLoadMore) {
setLoadingMore(true)
} else {
setLoading(true)
}
setError(null)
const response = await jobApi.listJobs(currentSkip, limit)
if (isLoadMore) {
setJobs(prev => [...prev, ...response.jobs])
} else {
setJobs(response.jobs)
}
setTotal(response.total)
} catch (error: any) {
const errorMessage = error.message || 'Failed to load history'
setError(errorMessage)
toast.error(errorMessage)
} finally {
setLoading(false)
setLoadingMore(false)
}
}, [])
const loadMore = useCallback(async () => {
if (loadingMore || !hasMore) return
const newSkip = skip + limit
setSkip(newSkip)
await loadJobs(newSkip, true)
}, [skip, loadingMore, hasMore, loadJobs])
const refresh = useCallback(async () => {
setSkip(0)
await loadJobs(0, false)
}, [loadJobs])
const retry = useCallback(async () => {
setSkip(0)
await loadJobs(0, false)
}, [loadJobs])
const deleteJob = useCallback(async (id: number) => {
const previousJobs = [...jobs]
const previousTotal = total
setJobs(prev => prev.filter(job => job.id !== id))
setTotal(prev => prev - 1)
try {
await jobApi.deleteJob(id)
toast.success('Deleted successfully')
} catch (error) {
setJobs(previousJobs)
setTotal(previousTotal)
toast.error('Delete failed')
}
}, [jobs, total])
useEffect(() => {
loadJobs(0, false)
}, [loadJobs])
const value = useMemo(
() => ({
jobs,
total,
loading,
loadingMore,
hasMore,
error,
loadMore,
refresh,
retry,
deleteJob,
}),
[jobs, total, loading, loadingMore, hasMore, error, loadMore, refresh, retry, deleteJob]
)
return (
<HistoryContext.Provider value={value}>
{children}
</HistoryContext.Provider>
)
}
export function useHistoryContext() {
const context = useContext(HistoryContext)
if (!context) {
throw new Error('useHistoryContext must be used within HistoryProvider')
}
return context
}
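
The provider above pages through jobs with a `skip`/`limit` pair and derives `hasMore` from `jobs.length < total`. A rough sketch of that bookkeeping as pure functions, assuming illustrative names (`PageState`, `hasMore`, `applyPage`) that are not in the original code:

```typescript
// Hypothetical pure model of the skip/limit bookkeeping; names are illustrative.
interface PageState<T> {
  items: T[]
  total: number
  skip: number
}

// More pages remain while fewer items are loaded than the server reports.
function hasMore<T>(state: PageState<T>): boolean {
  return state.items.length < state.total
}

// Append a page when loading more; replace the list on refresh or first load.
function applyPage<T>(
  state: PageState<T>,
  page: T[],
  total: number,
  skip: number,
  append: boolean
): PageState<T> {
  return { items: append ? [...state.items, ...page] : page, total, skip }
}
```

Keeping the merge rule separate from the fetch makes the append-versus-replace distinction easy to check on its own.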

@@ -1,137 +0,0 @@
import { createContext, useContext, useState, useCallback, useMemo, useRef, useEffect, type ReactNode } from 'react'
import { toast } from 'sonner'
import { jobApi } from '@/lib/api'
import type { Job, JobStatus } from '@/types/job'
import { POLL_INTERVAL } from '@/lib/constants'
import { useHistoryContext } from '@/contexts/HistoryContext'
interface JobContextType {
currentJob: Job | null
status: JobStatus | null
error: string | null
elapsedTime: number
startJob: (jobId: number) => void
stopJob: () => void
resetJob: () => void
loadCompletedJob: (job: Job) => void
}
const JobContext = createContext<JobContextType | undefined>(undefined)
export function JobProvider({ children }: { children: ReactNode }) {
const [currentJob, setCurrentJob] = useState<Job | null>(null)
const [status, setStatus] = useState<JobStatus | null>(null)
const [error, setError] = useState<string | null>(null)
const [elapsedTime, setElapsedTime] = useState(0)
const { refresh: historyRefresh } = useHistoryContext()
const pollIntervalRef = useRef<ReturnType<typeof setInterval> | null>(null)
const timeIntervalRef = useRef<ReturnType<typeof setInterval> | null>(null)
const clearIntervals = useCallback(() => {
if (pollIntervalRef.current) {
clearInterval(pollIntervalRef.current)
pollIntervalRef.current = null
}
if (timeIntervalRef.current) {
clearInterval(timeIntervalRef.current)
timeIntervalRef.current = null
}
}, [])
const stopJob = useCallback(() => {
clearIntervals()
setCurrentJob(null)
setStatus(null)
setError(null)
setElapsedTime(0)
}, [clearIntervals])
const resetJob = useCallback(() => {
setError(null)
}, [])
const loadCompletedJob = useCallback((job: Job) => {
setCurrentJob(job)
setStatus(job.status)
setError(job.error_message || null)
setElapsedTime(0)
}, [])
const startJob = useCallback((jobId: number) => {
clearIntervals()
// Reset state for new job
setCurrentJob(null)
setStatus('pending')
setError(null)
setElapsedTime(0)
const poll = async () => {
try {
const job = await jobApi.getJob(jobId)
setCurrentJob(job)
setStatus(job.status)
if (job.status === 'completed') {
clearIntervals()
toast.success('Job completed!')
// historyRefresh is async, so a synchronous try/catch cannot see its rejection.
void historyRefresh().catch(() => {})
} else if (job.status === 'failed') {
clearIntervals()
setError(job.error_message || 'Job failed')
toast.error(job.error_message || 'Job failed')
// historyRefresh is async, so a synchronous try/catch cannot see its rejection.
void historyRefresh().catch(() => {})
}
} catch (error: any) {
clearIntervals()
const message = error.response?.data?.detail || 'Failed to fetch job status'
setError(message)
toast.error(message)
}
}
poll()
pollIntervalRef.current = setInterval(poll, POLL_INTERVAL)
timeIntervalRef.current = setInterval(() => {
setElapsedTime((prev) => prev + 1)
}, 1000)
}, [historyRefresh, clearIntervals])
useEffect(() => {
return () => {
clearIntervals()
}
}, [clearIntervals])
const value = useMemo(
() => ({
currentJob,
status,
error,
elapsedTime,
startJob,
stopJob,
resetJob,
loadCompletedJob,
}),
[currentJob, status, error, elapsedTime, startJob, stopJob, resetJob, loadCompletedJob]
)
return (
<JobContext.Provider value={value}>
{children}
</JobContext.Provider>
)
}
export function useJob() {
const context = useContext(JobContext)
if (!context) {
throw new Error('useJob must be used within JobProvider')
}
return context
}

@@ -1,107 +0,0 @@
import { useState, useEffect, useCallback } from 'react'
import { jobApi } from '@/lib/api'
import type { Job } from '@/types/job'
import { toast } from 'sonner'
interface UseHistoryReturn {
jobs: Job[]
total: number
loading: boolean
loadingMore: boolean
hasMore: boolean
error: string | null
loadMore: () => Promise<void>
refresh: () => Promise<void>
retry: () => Promise<void>
deleteJob: (id: number) => Promise<void>
}
export function useHistory(): UseHistoryReturn {
const [jobs, setJobs] = useState<Job[]>([])
const [total, setTotal] = useState(0)
const [loading, setLoading] = useState(true)
const [loadingMore, setLoadingMore] = useState(false)
const [error, setError] = useState<string | null>(null)
const [skip, setSkip] = useState(0)
const limit = 20
const hasMore = jobs.length < total
const loadJobs = useCallback(async (currentSkip: number, isLoadMore = false) => {
try {
if (isLoadMore) {
setLoadingMore(true)
} else {
setLoading(true)
}
setError(null)
const response = await jobApi.listJobs(currentSkip, limit)
if (isLoadMore) {
setJobs(prev => [...prev, ...response.jobs])
} else {
setJobs(response.jobs)
}
setTotal(response.total)
} catch (error: any) {
const errorMessage = error.message || 'Failed to load history'
setError(errorMessage)
toast.error(errorMessage)
} finally {
setLoading(false)
setLoadingMore(false)
}
}, [])
const loadMore = useCallback(async () => {
if (loadingMore || !hasMore) return
const newSkip = skip + limit
setSkip(newSkip)
await loadJobs(newSkip, true)
}, [skip, loadingMore, hasMore, loadJobs])
const refresh = useCallback(async () => {
setSkip(0)
await loadJobs(0, false)
}, [loadJobs])
const retry = useCallback(async () => {
setSkip(0)
await loadJobs(0, false)
}, [loadJobs])
const deleteJob = useCallback(async (id: number) => {
const previousJobs = [...jobs]
const previousTotal = total
setJobs(prev => prev.filter(job => job.id !== id))
setTotal(prev => prev - 1)
try {
await jobApi.deleteJob(id)
toast.success('Deleted successfully')
} catch (error) {
setJobs(previousJobs)
setTotal(previousTotal)
toast.error('Delete failed')
}
}, [jobs, total])
useEffect(() => {
loadJobs(0, false)
}, [loadJobs])
return {
jobs,
total,
loading,
loadingMore,
hasMore,
error,
loadMore,
refresh,
retry,
deleteJob,
}
}

@@ -1,19 +0,0 @@
import { useJob } from '@/contexts/JobContext'
export function useJobPolling() {
const { currentJob, status, error, elapsedTime, startJob, stopJob, resetJob, loadCompletedJob } = useJob()
return {
currentJob,
status,
error,
elapsedTime,
isPolling: status === 'processing' || status === 'pending',
isCompleted: status === 'completed',
isFailed: status === 'failed',
startPolling: startJob,
stopPolling: stopJob,
resetError: resetJob,
loadCompletedJob,
}
}

@@ -1,97 +0,0 @@
import { useState, useRef, lazy, Suspense } from 'react'
import { useTranslation } from 'react-i18next'
import { Navbar } from '@/components/Navbar'
import { Card, CardContent } from '@/components/ui/card'
import { Tabs, TabsContent, TabsList, TabsTrigger } from '@/components/ui/tabs'
import { User, Palette, Copy } from 'lucide-react'
import type { CustomVoiceFormHandle } from '@/components/tts/CustomVoiceForm'
import type { VoiceDesignFormHandle } from '@/components/tts/VoiceDesignForm'
import { HistorySidebar } from '@/components/HistorySidebar'
import { OnboardingDialog } from '@/components/OnboardingDialog'
import FormSkeleton from '@/components/FormSkeleton'
import LoadingScreen from '@/components/LoadingScreen'
import { useUserPreferences } from '@/contexts/UserPreferencesContext'
const CustomVoiceForm = lazy(() => import('@/components/tts/CustomVoiceForm'))
const VoiceDesignForm = lazy(() => import('@/components/tts/VoiceDesignForm'))
const VoiceCloneForm = lazy(() => import('@/components/tts/VoiceCloneForm'))
function Home() {
const { t } = useTranslation('nav')
const [currentTab, setCurrentTab] = useState('custom-voice')
const [sidebarOpen, setSidebarOpen] = useState(false)
const { preferences } = useUserPreferences()
const customVoiceFormRef = useRef<CustomVoiceFormHandle>(null)
const voiceDesignFormRef = useRef<VoiceDesignFormHandle>(null)
if (!preferences) {
return <LoadingScreen />
}
const showOnboarding = !preferences.onboarding_completed
return (
<div className="h-screen overflow-hidden flex bg-background">
<OnboardingDialog
open={showOnboarding}
onComplete={() => {}}
/>
<HistorySidebar
open={sidebarOpen}
onOpenChange={setSidebarOpen}
/>
<div className="flex-1 flex flex-col overflow-hidden bg-muted/30">
<Navbar onToggleSidebar={() => setSidebarOpen(!sidebarOpen)} />
<main className="flex-1 overflow-y-auto flex items-start md:items-center justify-center lg:rounded-tl-2xl bg-background">
<div className="w-full container mx-auto p-3 md:p-6 max-w-[800px] md:max-w-[700px]">
<Tabs value={currentTab} onValueChange={setCurrentTab}>
<TabsList className="grid w-full grid-cols-3 h-9 mb-3">
<TabsTrigger value="custom-voice" variant="default">
<User className="h-4 w-4 md:mr-2" />
<span className="hidden md:inline">{t('customVoiceTab')}</span>
</TabsTrigger>
<TabsTrigger value="voice-design" variant="secondary">
<Palette className="h-4 w-4 md:mr-2" />
<span className="hidden md:inline">{t('voiceDesignTab')}</span>
</TabsTrigger>
<TabsTrigger value="voice-clone" variant="outline">
<Copy className="h-4 w-4 md:mr-2" />
<span className="hidden md:inline">{t('voiceCloneTab')}</span>
</TabsTrigger>
</TabsList>
<Card>
<CardContent className="pt-6 px-3 md:px-6 pb-6">
<TabsContent value="custom-voice" className="mt-0">
<Suspense fallback={<FormSkeleton />}>
<CustomVoiceForm ref={customVoiceFormRef} />
</Suspense>
</TabsContent>
<TabsContent value="voice-design" className="mt-0">
<Suspense fallback={<FormSkeleton />}>
<VoiceDesignForm ref={voiceDesignFormRef} />
</Suspense>
</TabsContent>
<TabsContent value="voice-clone" className="mt-0">
<Suspense fallback={<FormSkeleton />}>
<VoiceCloneForm />
</Suspense>
</TabsContent>
</CardContent>
</Card>
</Tabs>
</div>
</main>
</div>
</div>
)
}
export default Home

@@ -1,124 +0,0 @@
import { useState, useEffect } from 'react'
import { toast } from 'sonner'
import { useTranslation } from 'react-i18next'
import { Trash2, Cpu, Cloud } from 'lucide-react'
import { Navbar } from '@/components/Navbar'
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
import { Button } from '@/components/ui/button'
import { Badge } from '@/components/ui/badge'
import {
  AlertDialog,
  AlertDialogAction,
  AlertDialogCancel,
  AlertDialogContent,
  AlertDialogDescription,
  AlertDialogFooter,
  AlertDialogHeader,
  AlertDialogTitle,
} from '@/components/ui/alert-dialog'
import { voiceDesignApi } from '@/lib/api'
import type { VoiceDesign } from '@/types/voice-design'

export default function VoiceManagement() {
  const { t } = useTranslation(['voice', 'common'])
  const [voices, setVoices] = useState<VoiceDesign[]>([])
  const [isLoading, setIsLoading] = useState(true)
  const [deleteTarget, setDeleteTarget] = useState<VoiceDesign | null>(null)
  const [isDeleting, setIsDeleting] = useState(false)

  const load = async () => {
    try {
      setIsLoading(true)
      const res = await voiceDesignApi.list()
      setVoices(res.designs)
    } catch {
      toast.error(t('voice:loadFailed'))
    } finally {
      setIsLoading(false)
    }
  }

  useEffect(() => { load() }, [])

  const handleDelete = async () => {
    if (!deleteTarget) return
    try {
      setIsDeleting(true)
      await voiceDesignApi.delete(deleteTarget.id)
      toast.success(t('voice:voiceDeleted'))
      setDeleteTarget(null)
      await load()
    } catch {
      toast.error(t('voice:deleteFailed'))
    } finally {
      setIsDeleting(false)
    }
  }

  return (
    <div className="min-h-screen bg-background">
      <Navbar />
      <div className="container mx-auto p-4 sm:p-6 max-w-[800px]">
        <Card>
          <CardHeader>
            <CardTitle>{t('voice:myVoices')}</CardTitle>
          </CardHeader>
          <CardContent>
            {isLoading ? (
              <div className="text-center text-muted-foreground py-8">{t('common:loading')}</div>
            ) : voices.length === 0 ? (
              <div className="text-center text-muted-foreground py-8">{t('voice:noVoices')}</div>
            ) : (
              <div className="divide-y">
                {voices.map((voice) => (
                  <div key={voice.id} className="flex items-start justify-between py-4 gap-4">
                    <div className="flex-1 min-w-0">
                      <div className="flex items-center gap-2 flex-wrap">
                        <span className="font-medium truncate">{voice.name}</span>
                        <Badge variant="outline" className="shrink-0 gap-1">
                          {voice.backend_type === 'local'
                            ? <><Cpu className="h-3 w-3" />{t('voice:local')}</>
                            : <><Cloud className="h-3 w-3" />{t('voice:aliyun')}</>
                          }
                        </Badge>
                      </div>
                      <p className="text-sm text-muted-foreground mt-1 line-clamp-2">{voice.instruct}</p>
                      <p className="text-xs text-muted-foreground mt-1">
                        {t('voice:createdAt')}: {new Date(voice.created_at).toLocaleDateString()}
                      </p>
                    </div>
                    <Button
                      variant="ghost"
                      size="icon"
                      className="shrink-0 text-destructive hover:text-destructive"
                      onClick={() => setDeleteTarget(voice)}
                    >
                      <Trash2 className="h-4 w-4" />
                    </Button>
                  </div>
                ))}
              </div>
            )}
          </CardContent>
        </Card>
      </div>

      <AlertDialog open={!!deleteTarget} onOpenChange={(open) => !open && setDeleteTarget(null)}>
        <AlertDialogContent>
          <AlertDialogHeader>
            <AlertDialogTitle>{t('voice:deleteVoice')}</AlertDialogTitle>
            <AlertDialogDescription>
              {t('voice:deleteConfirmDesc', { name: deleteTarget?.name })}
            </AlertDialogDescription>
          </AlertDialogHeader>
          <AlertDialogFooter>
            <AlertDialogCancel>{t('common:cancel')}</AlertDialogCancel>
            <AlertDialogAction onClick={handleDelete} disabled={isDeleting}>
              {isDeleting ? t('voice:deleting') : t('common:delete')}
            </AlertDialogAction>
          </AlertDialogFooter>
        </AlertDialogContent>
      </AlertDialog>
    </div>
  )
}