Alternatives

Mlx Audio alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.

Current skill

Mlx Audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Quality 100 | Trust 94 | 7.4K stars
#1

TheWhisper

Similarity 139 | Trust 84 | Excellent 87

Optimized Whisper models for streaming and on-device use

888 stars | Last push Jun 15, 2026 | media-automation | Python | Speech
$ npx skills add TheStageAI/TheWhisper
#2

Parlor

Similarity 138 | Trust 92 | Excellent 100

On-device, real-time multimodal AI. Have natural voice and vision conversations with an AI that runs entirely on your machine. Powered by Gemma 4 E2B and Kokoro.

1.9K stars | Last push Jun 4, 2026 | media-automation | HTML | Speech
$ npx skills add fikrikarim/parlor
#3

Voice Pro

Similarity 138 | Trust 91 | Excellent 95

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

11K stars | Last push Dec 5, 2025 | media-automation | Python | Speech
$ npx skills add abus-aikorea/voice-pro
#4

AI Waifu Vtuber

Similarity 134 | Trust 87 | Excellent 97

AI Vtuber for streaming on YouTube/Twitch

1.1K stars | Last push May 31, 2026 | media-automation | Python | Speech
$ npx skills add ardha27/AI-Waifu-Vtuber
#5

WhisperX

Similarity 133 | Trust 95 | Excellent 100

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

23K stars | Last push Jun 3, 2026 | media-automation | Python | Speech
$ npx skills add m-bain/whisperX
#6

Faster Whisper

Similarity 132 | Trust 89 | Excellent 99

Faster Whisper transcription with CTranslate2

24K stars | Last push Nov 19, 2025 | media-automation | Python | Speech
$ npx skills add SYSTRAN/faster-whisper
#7

Speechbrain

Similarity 132 | Trust 93 | Excellent 100

A PyTorch-based Speech Toolkit

12K stars | Last push Jun 15, 2026 | media-automation | Python | Speech
$ npx skills add speechbrain/speechbrain
#8

Speech Recognition

Similarity 131 | Trust 94 | Excellent 100

Speech recognition module for Python, supporting several engines and APIs, online and offline.

9.0K stars | Last push Jun 16, 2026 | media-automation | Python | Speech
$ npx skills add Uberi/speech_recognition
#9

Open Speech Corpora

Similarity 131 | Trust 83 | Strong 72

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

1.4K stars | Last push Jun 6, 2024 | media-automation | Speech | Claude Code
$ npx skills add coqui-ai/open-speech-corpora
#10

ComfyUI Custom Nodes AlekPet

Similarity 129 | Trust 89 | Excellent 97

Custom nodes that extend the capabilities of ComfyUI

1.5K stars | Last push May 9, 2026 | media-automation | JavaScript | Speech
$ npx skills add AlekPet/ComfyUI_Custom_Nodes_AlekPet
#11

Stt

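Similarity 129 | Trust 88 | Excellent 98

Voice Recognition to Text Tool: an offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

4.6K stars | Last push Jan 22, 2026 | media-automation | Python | Speech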
$ npx skills add jianchang512/stt
#12

SoniTranslate

Similarity 129 | Trust 89 | Excellent 97

Synchronized Translation for Videos. Video dubbing

1.4K stars | Last push Apr 27, 2026 | media-automation | Python | Voice
$ npx skills add R3gm/SoniTranslate
#13

Whisper.Cpp

Similarity 128 | Trust 93 | Excellent 100

Port of OpenAI's Whisper model in C/C++

51K stars | Last push Jun 22, 2026 | media-automation | C++ | Speech
$ npx skills add ggml-org/whisper.cpp
#14

Mlx Tune

Similarity 127 | Trust 90 | Excellent 100

Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning, natively on MLX. Unsloth-compatible API.

1.3K stars | Last push May 31, 2026 | media-automation | Python | Speech
$ npx skills add ARahim3/mlx-tune
#15

Willow Inference Server

Similarity 127 | Trust 83 | Strong 74

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

504 stars | Last push Feb 12, 2026 | media-automation | Python | Speech
$ npx skills add toverainc/willow-inference-server
#16

Quillman

Similarity 127 | Trust 90 | Excellent 100

A voice chat app

1.2K stars | Last push May 28, 2026 | media-automation | Python | Speech
$ npx skills add modal-labs/quillman

How to choose

When should you switch?

Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Mlx Audio if it already passes your workflow test and repository review.
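
Two of these criteria, trust score and maintenance freshness, can be checked mechanically. The sketch below is one possible way to do that, not a tool the directory provides. The trust scores and push dates are copied from the listing above. The `Skill` class, the `reasons_to_switch` helper, the 90-day freshness window, and the reference date are all assumptions made for illustration. Install path and platform fit still need a manual check.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Skill:
    """Hypothetical record holding the fields shown on a listing card."""
    name: str
    trust: int
    last_push: Optional[date] = None  # None when the page shows no push date


def reasons_to_switch(candidate: Skill, current: Skill, today: date,
                      max_age_days: int = 90) -> List[str]:
    """Return the criteria (if any) on which the candidate looks preferable.

    The 90-day window is an assumed threshold, not one the page defines.
    """
    reasons = []
    if candidate.trust > current.trust:
        reasons.append("higher trust score")
    if (candidate.last_push is not None
            and (today - candidate.last_push).days <= max_age_days):
        reasons.append("recent maintenance")
    return reasons


# Values taken from the listing; the reference date is an assumption.
current = Skill("Mlx Audio", trust=94)
today = date(2026, 6, 30)
candidates = [
    Skill("WhisperX", trust=95, last_push=date(2026, 6, 3)),
    Skill("Faster Whisper", trust=89, last_push=date(2025, 11, 19)),
    Skill("Open Speech Corpora", trust=83, last_push=date(2024, 6, 6)),
]

for skill in candidates:
    print(skill.name, reasons_to_switch(skill, current, today))
```

An empty list means the candidate does not beat Mlx Audio on either automated check, so any reason to switch would have to come from install readiness or platform fit.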

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.