Alternatives

Diffwave alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.


Current skill

Diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

888 stars

Glow Tts

Similarity 123 | Trust 75 | Promising 55

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

714 stars | Pushed Jul 12, 2022 | media-automation | Python | Voice

$ npx skills add jaywalnut310/glow-tts

Transformer TTS

Similarity 123 | Trust 74 | Promising 55

A Pytorch Implementation of "Neural Speech Synthesis with Transformer Network"

690 stars | Pushed Nov 8, 2023 | media-automation | Python | Voice

$ npx skills add soobinseo/Transformer-TTS

Vits2 Pytorch

Similarity 122 | Trust 75 | Needs review 54

Unofficial VITS2 TTS implementation in PyTorch

549 stars | Pushed Mar 28, 2024 | media-automation | Python | Voice

$ npx skills add p0p4k/vits2_pytorch

Diffusers

Similarity 120 | Trust 98 | Excellent 100

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

34K stars | Pushed Jun 16, 2026 | media-automation | Python | Image Generation

$ npx skills add huggingface/diffusers

E2 Tts Pytorch

Similarity 120 | Trust 85 | Strong 74

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

517 stars | Pushed Dec 20, 2025 | media-automation | Python | Voice

$ npx skills add lucidrains/e2-tts-pytorch

Kur

Similarity 117 | Trust 74 | Promising 55

Descriptive Deep Learning

822 stars | Pushed Feb 5, 2024 | media-automation | Python | Speech

$ npx skills add deepgram/kur

Vits2

Similarity 117 | Trust 77 | Needs review 54

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

641 stars | Pushed Sep 11, 2023 | media-automation | Jupyter Notebook | Voice

$ npx skills add daniilrobnikov/vits2

GPA

Similarity 116 | Trust 88 | Excellent 87

[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny model!

866 stars | Pushed May 25, 2026 | media-automation | Python | Voice

$ npx skills add AutoArk/GPA

Voicebox Pytorch

Similarity 116 | Trust 79 | Promising 55

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

691 stars | Pushed Oct 1, 2024 | media-automation | Python | Voice

$ npx skills add lucidrains/voicebox-pytorch

#10

AI-powered multi-voice audiobook generator — LLM script annotation, voice cloning, voice design, LoRA training, per-line style control, and export to MP3, chaptered M4B, or Audacity multi-track. Built on Qwen3-TTS.

682 stars | Pushed Jun 4, 2026 | media-automation | Python | Voice

$ npx skills add Finrandojin/alexandria-audiobook

#11

Pandrator

Similarity 115 | Trust 89 | Excellent 85

Turn PDFs and EPUBs into audiobooks; subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, including voice-cloning (instant, RVC-enhanced, XTTS fine-tuning) and LLM processing. It aspires to be a user-friendly app with a GUI, an installer and all-in-one packages.

572 stars | Pushed Jun 14, 2026 | media-automation | Python | Voice

$ npx skills add lukaszliniewicz/Pandrator

#12

Willow Inference Server

Similarity 114 | Trust 85 | Strong 74

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

504 stars | Pushed Feb 12, 2026 | media-automation | Python | Speech

$ npx skills add toverainc/willow-inference-server

#13

Vui

Similarity 113 | Trust 82 | Strong 81

Real-time voice assistant — WebRTC streaming, faster-whisper ASR, local LLM, Vui Nano (300M) TTS. OpenAI Realtime API compatible. Voice cloning, barge-in, ~9× realtime on a 4090. Apache 2.0.

701 stars | Pushed Jun 12, 2026 | media-automation | Python | Voice

$ npx skills add fluxions-ai/vui

#14

FireRed Image Edit

Similarity 113 | Trust 93 | Excellent 96

FireRed-Image-Edit is a powerful image editing foundation model achieving open-source state-of-the-art performance with precise instruction following, high-fidelity generation, superior identity consistency, and seamless multi-element fusion.

1.3K stars | Pushed Apr 3, 2026 | media-automation | Python | Image Generation

$ npx skills add FireRedTeam/FireRed-Image-Edit

#15

MASR

Similarity 113 | Trust 83 | Promising 68

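A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with a variety of data augmentation methods.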

723 stars | Pushed Dec 17, 2025 | media-automation | Python | Speech

$ npx skills add yeyupiaoling/MASR

#16

Chatterbox Tts API

Similarity 112 | Trust 82 | Strong 75

Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate voice cloned speech anywhere the OpenAI API is used (e.g. Open WebUI, AnythingLLM, etc.)

613 stars | Pushed Dec 23, 2025 | media-automation | Python | Voice

$ npx skills add travisvn/chatterbox-tts-api

How to choose

When should you switch?

Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Diffwave if it already passes your workflow test and repository review.
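
One way to apply the trust and maintenance criteria is to filter the listing before reviewing anything by hand. The sketch below is illustrative only: the `Candidate` type, the `shortlist` helper, the thresholds, and the reference date are assumptions made for this example and are not part of any skills tooling. The trust scores and push dates are copied from the listing above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one listed skill (not an official schema)."""
    repo: str
    trust: int
    last_push: date


# A few entries copied from the listing above.
CANDIDATES = [
    Candidate("jaywalnut310/glow-tts", 75, date(2022, 7, 12)),
    Candidate("huggingface/diffusers", 98, date(2026, 6, 16)),
    Candidate("lucidrains/e2-tts-pytorch", 85, date(2025, 12, 20)),
    Candidate("AutoArk/GPA", 88, date(2026, 5, 25)),
    Candidate("deepgram/kur", 74, date(2024, 2, 5)),
]


def shortlist(candidates, min_trust, max_age_days, today):
    """Keep candidates that meet the trust floor and were pushed recently,
    ordered from highest to lowest trust."""
    kept = [
        c for c in candidates
        if c.trust >= min_trust and (today - c.last_push).days <= max_age_days
    ]
    return sorted(kept, key=lambda c: c.trust, reverse=True)


if __name__ == "__main__":
    # The thresholds and reference date are arbitrary choices for illustration.
    picks = shortlist(CANDIDATES, min_trust=80, max_age_days=365, today=date(2026, 7, 1))
    for c in picks:
        print(f"{c.repo}: trust {c.trust}, last push {c.last_push.isoformat()}")
```

A filter like this only narrows the list. Workflow fit and install readiness still need a sandbox test and a manual repository review.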

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.