A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
$ npx skills add Blaizzy/mlx-audio
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
On-device, real-time multimodal AI. Have natural voice and vision conversations with an AI that runs entirely on your machine. Powered by Gemma 4 E2B and Kokoro.
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
$ npx skills add Blaizzy/mlx-audio
Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning, natively on MLX. Unsloth-compatible API.
$ npx skills add ARahim3/mlx-tune
AI speech toolkit for Apple Silicon: ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML
$ npx skills add soniqo/speech-swift
Optimized Whisper models for streaming and on-device use
$ npx skills add TheStageAI/TheWhisper
Port of OpenAI's Whisper model in C/C++
$ npx skills add ggml-org/whisper.cpp
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
$ npx skills add m-bain/whisperX
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
$ npx skills add PaddlePaddle/PaddleSpeech
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
$ npx skills add alphacep/vosk-api
Faster Whisper transcription with CTranslate2
$ npx skills add SYSTRAN/faster-whisper
A PyTorch-based Speech Toolkit
$ npx skills add speechbrain/speechbrain
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
$ npx skills add openvinotoolkit/openvino
Speech recognition module for Python, supporting several engines and APIs, online and offline.
$ npx skills add Uberi/speech_recognition
End-to-End Speech Processing Toolkit
$ npx skills add espnet/espnet
A Deep-Learning-Based Chinese Speech Recognition System
$ npx skills add nl8590687/ASRT_SpeechRecognition
Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.
$ npx skills add FunAudioLLM/SenseVoice
💬 Speech recognition for your site
$ npx skills add TalAter/annyang
How to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Parlor if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.