Commit Graph

56 Commits

SHA1 Message Date
475df0c9ca Refactor code structure for improved readability and maintainability 2026-03-12 15:37:08 +08:00
bbcfc0e8d3 feat: Add emotion handling to audiobook segments with emo_text and emo_alpha attributes 2026-03-12 14:34:20 +08:00
c79ffac6d9 fix: Enhance emotion vector calculation in IndexTTS2Backend with emo_alpha adjustment 2026-03-12 13:50:21 +08:00
8aec4f6f44 feat: Integrate IndexTTS2 model and update related schemas and frontend components 2026-03-12 13:30:53 +08:00
29bd45e0e0 fix: Remove enable_thinking parameter from stream_chat methods 2026-03-11 19:09:17 +08:00
4f0d9f5ed6 fix: Adjust chunk size in parse_one_chapter to 1500 and add enable_thinking parameter to LLMService methods 2026-03-11 19:05:03 +08:00
b6d4d2d5f2 feat: Enhance stream_chat methods to accept max_tokens parameter for improved token management 2026-03-11 18:47:22 +08:00
f9a0e2bcc4 refactor: Simplify SQLite checks by introducing a variable for database type 2026-03-11 18:04:16 +08:00
d9082b12a8 feat: Validate LLM configuration by sending a test request during API key update. 2026-03-11 17:32:54 +08:00
14def62d3b feat: introduce new feature with database persistence and refine cancel event resolution logic. 2026-03-11 16:50:52 +08:00
0d8756ebab feat: Implement generation cancellation for projects, update project status handling, and mark chapters as done upon segment completion. 2026-03-11 16:37:33 +08:00
44c39f1456 Removed direct instantiation of ProgressStore in audiobook service and added new feature documentation. 2026-03-11 16:30:51 +08:00
ffd3d6675d feat: Implement gender-specific TTS instructions, refactor async database session handling for character creation and preview generation, and add Aliyun voice design creation. 2026-03-11 15:58:14 +08:00
d3c6297a09 feat: Implement character voice preview playback and regeneration, and add a turbo mode status indicator for audiobook projects. 2026-03-11 15:36:43 +08:00
5dded459fc feat: Implement startup logic to reset stale audiobook chapter parsing and segment generation statuses to pending. 2026-03-11 14:42:00 +08:00
264b511228 feat: Implement functionality to retry only failed audiobook chapters and refine UI for batch operations. 2026-03-11 14:37:41 +08:00
d96089a2aa feat: Automatically delete associated source files when an audiobook project is removed. 2026-03-11 14:28:11 +08:00
b7b6f5ef8e feat: Implement batch cancellation for audiobook processing with enhanced frontend progress display. 2026-03-11 14:22:35 +08:00
a0047d5c29 feat: Add batch processing for audiobook chapters including parse, generate, and combined process actions. 2026-03-11 14:08:09 +08:00
2e005b0084 feat(audiobook): add gender field to audiobook character model and update related functionality 2026-03-10 20:23:03 +08:00
1db41b6278 feat(audiobook): enhance chapter expansion functionality in ProjectCard component 2026-03-10 18:05:31 +08:00
bf7c73e57c feat(audiobook): change audio format from MP3 to WAV for project downloads and merging 2026-03-10 17:56:46 +08:00
006aa0c85f feat(audiobook): add turbo mode for project analysis and enhance log streaming with chapter support 2026-03-10 17:01:50 +08:00
3c30afc476 feat(audiobook): implement chapter management with CRUD operations and enhance project detail responses 2026-03-10 16:42:32 +08:00
01b6f4633e feat(audiobook): implement log streaming for project status updates and enhance progress tracking 2026-03-10 16:27:01 +08:00
230274bbc3 feat(audiobook): refactor background tasks to use asyncio for project analysis and generation 2026-03-10 16:13:35 +08:00
5037857dd4 Refactor audiobook service to extract chapters from EPUB files, implement chapter chunking, and enhance project analysis and generation flow 2026-03-09 19:04:13 +08:00
a68a343536 feat(llm_service): enhance chat_json error handling and improve character extraction prompt 2026-03-09 12:42:03 +08:00
6fec2eb937 feat(audiobook): implement character voice bootstrapping and enhance polling during project status transitions 2026-03-09 12:39:02 +08:00
e1dbb79564 refactor(tts_service): simplify audio data handling in LocalTTSBackend 2026-03-09 11:53:16 +08:00
9b6691bffe feat(audiobook): add endpoint to retrieve audio for a specific segment 2026-03-09 11:48:47 +08:00
a3d7d318e0 feat(audiobook): implement audiobook project management features 2026-03-09 11:39:36 +08:00
28218e6616 feat: update requirements.txt to include additional dependencies for torch, numpy, pydub, and requests 2026-03-09 10:43:46 +08:00
c880fb8949 feat: add Aliyun region configuration to .env.example 2026-03-06 16:33:24 +08:00
4081fe3754 feat: remove Aliyun region configuration from .env.example 2026-03-06 16:21:36 +08:00
38e00fd38c feat: add Docker deployment support and fix /users/me endpoint
- Add docker/ directory with Dockerfile for backend and frontend
- Backend: pytorch/pytorch CUDA base image with all qwen_tts deps
- Frontend: multi-stage nginx build with /api/ proxy to backend
- docker-compose.yml (CPU) + docker-compose.gpu.yml (GPU overlay)
- Fix /users/me returning 404 due to missing route (was caught by /{user_id})
- Update .gitignore to exclude docker/models, docker/data, docker/.env
- Update README and README.zh.md with Docker deployment instructions

Images: bdim404/qwen3-tts-backend, bdim404/qwen3-tts-frontend

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 15:15:27 +08:00
964ebb824c feat: Add voice management functionality with delete capability and UI integration 2026-03-06 14:35:59 +08:00
ad90e5f96c feat: Implement prepare-and-create endpoint for voice design creation and update related API and frontend logic 2026-03-06 14:23:15 +08:00
5e1e3e0668 refactor: Consolidate cache file loading logic and enhance cache saving for different data types 2026-03-06 14:07:22 +08:00
a93754f449 feat: Enhance API interactions and improve job handling with new request validation and error management 2026-03-06 12:03:41 +08:00
3844e825cd fix: Update repository clone URL and adjust huggingface-cli commands in README files 2026-02-13 11:57:00 +08:00
4f535b20e5 refactor: Remove cache and metrics endpoints, and clean up voice design CRUD operations 2026-02-04 19:41:08 +08:00
73aacd174a Remove file logging handler 2026-02-04 19:05:51 +08:00
9e5d12c9fb feat: Add voice design support for voice cloning and enhance cache management 2026-02-04 17:52:24 +08:00
ddaa0abfc7 feat: Implement voice design management with CRUD operations and integrate into frontend 2026-02-04 13:57:20 +08:00
6c25dd9dd9 feat: Add systemd service, configure API for proxy deployment, and enhance mobile audio playback with token authentication. 2026-02-03 21:53:41 +08:00
18fd1f0fa5 feat: Integrate Aliyun TTS backend with dynamic speaker validation and listing, and adjust toast notification position.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 19:09:10 +08:00
47e1411390 Update Local Permission Assignments 2026-02-03 18:48:25 +08:00
d8a9f277be feat: improve accessibility and mobile responsiveness across UI components
Enhanced dialog accessibility by adding descriptive text to all dialog components. Implemented responsive layouts for mobile devices including card-based user table, adaptive navigation bar, and improved dialog spacing. Fixed Aliyun TTS health check to use WebSocket-based connectivity testing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 18:03:38 +08:00
244ff94c6a feat: enhance audio processing and error handling in TTS backend; refactor user dialog form validation 2026-02-03 17:37:14 +08:00