VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
$ npx skills add OpenBMB/VoxCPM
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/multiplatform CPU, AMD, NVIDIA GPU PyTorch support, handling, and auto-stitching
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
$ npx skills add OpenBMB/VoxCPM

🚀 Clone a voice in 5 seconds to generate arbitrary speech in real-time
$ npx skills add babysor/MockingBird

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
$ npx skills add coqui-ai/TTS

A simple, high-quality voice conversion tool focused on ease of use and performance.
$ npx skills add IAHispano/Applio

Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.
$ npx skills add devnen/Chatterbox-TTS-Server

Controllable and fast Text-to-Speech for over 7000 languages!
$ npx skills add DigitalPhonetics/IMS-Toucan

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
$ npx skills add unslothai/unsloth

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
$ npx skills add RVC-Boss/GPT-SoVITS

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
$ npx skills add jaywalnut310/vits

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
$ npx skills add yl4579/StyleTTS2

Translate the video from one language to another and embed dubbing & subtitles.
$ npx skills add jianchang512/pyvideotrans

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
$ npx skills add index-tts/index-tts

Foundational model for human-like, expressive TTS
$ npx skills add metavoiceio/metavoice-src

A TTS model capable of generating ultra-realistic dialogue in one pass.
$ npx skills add nari-labs/dia

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
$ npx skills add open-mmlab/Amphion

AI-powered multi-voice audiobook generator: LLM script annotation, voice cloning, voice design, LoRA training, per-line style control, and export to MP3, chaptered M4B, or Audacity multi-track. Built on Qwen3-TTS.
$ npx skills add Finrandojin/alexandria-audiobook

How to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Kokoro FastAPI if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.