Alternatives

Vosk API alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.

Current skill

Vosk API

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Quality 100 · Trust 93 · Stars 15K
#1

Speech To Text Benchmark

Similarity 139 · Trust 81 · Strong 80

speech to text benchmark framework

693 stars · Mar 19, 2026 push · media-automation · Python · Speech
$ npx skills add Picovoice/speech-to-text-benchmark
#2

Speechbrain

Similarity 134 · Trust 93 · Excellent 100

A PyTorch-based Speech Toolkit

12K stars · Jun 15, 2026 push · media-automation · Python · Speech
$ npx skills add speechbrain/speechbrain
#3

Pytorch Kaldi

Similarity 131 · Trust 80 · Promising 69

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

2.4K stars · Mar 14, 2022 push · media-automation · Python · Speech
$ npx skills add mravanelli/pytorch-kaldi
#4

Whisper Diarization

Similarity 130 · Trust 89 · Excellent 99

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

5.6K stars · Feb 23, 2026 push · media-automation · Jupyter Notebook · Speech
$ npx skills add MahmoudAshraf97/whisper-diarization
#5

Whisper Finetune

Similarity 129 · Trust 91 · Excellent 96

Fine-tunes the Whisper speech recognition model, with support for training without timestamp data, training with timestamp data, and training without speech data. Also accelerates inference and supports web, Windows desktop, and Android deployment.

1.2K stars · May 8, 2026 push · media-automation · C · Speech
$ npx skills add yeyupiaoling/Whisper-Finetune
#6

Speech Emotion Analyzer

Similarity 129 · Trust 82 · Strong 72

A neural network model that detects five different male/female emotions from speech audio. (Deep Learning, NLP, Python)

1.4K stars · Feb 7, 2023 push · media-automation · Jupyter Notebook · Speech
$ npx skills add MiteshPuthran/Speech-Emotion-Analyzer
#7

MASR

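Similarity 129 · Trust 81 · Promising 68

A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.

723 stars · Dec 17, 2025 push · media-automation · Python · Speech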
$ npx skills add yeyupiaoling/MASR
#8

PaddlePaddle DeepSpeech

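Similarity 128 · Trust 80 · Promising 68

Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is mature and delivers good recognition results. It supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.

762 stars · Dec 17, 2025 push · media-automation · Python · Speech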
$ npx skills add yeyupiaoling/PaddlePaddle-DeepSpeech
#9

WhisperX

Similarity 127 · Trust 95 · Excellent 100

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

23K stars · Jun 3, 2026 push · media-automation · Python · Speech
$ npx skills add m-bain/whisperX
#10

DeepLearningExamples

Similarity 127 · Trust 85 · Strong 79

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

15K stars · Aug 12, 2024 push · media-automation · Jupyter Notebook · Speech
$ npx skills add NVIDIA/DeepLearningExamples
#11

PaddleSpeech

Similarity 126 · Trust 95 · Excellent 100

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

13K stars · Jun 21, 2026 push · media-automation · Python · Speech
$ npx skills add PaddlePaddle/PaddleSpeech
#12

Faster Whisper

Similarity 126 · Trust 89 · Excellent 99

Faster Whisper transcription with CTranslate2

24K stars · Nov 19, 2025 push · media-automation · Python · Speech
$ npx skills add SYSTRAN/faster-whisper
#13

Openvino

Similarity 126 · Trust 93 · Excellent 100

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

10K stars · Jun 23, 2026 push · media-automation · C++ · Speech
$ npx skills add openvinotoolkit/openvino
#14

Speech Swift

Similarity 126 · Trust 86 · Excellent 87

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML

894 stars · Jun 14, 2026 push · media-automation · Swift · Speech
$ npx skills add soniqo/speech-swift
#15

Espnet

Similarity 125 · Trust 92 · Excellent 100

End-to-End Speech Processing Toolkit

9.9K stars · Jun 23, 2026 push · media-automation · Python · Speech
$ npx skills add espnet/espnet
#16

Vosk Android Demo

Similarity 125 · Trust 86 · Strong 84

Offline speech recognition for Android with Vosk library.

1.1K stars · Dec 8, 2025 push · media-automation · Java · Speech
$ npx skills add alphacep/vosk-android-demo

How to choose

When should you switch?

Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Vosk API if it already passes your workflow test and repository review.
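
The trust and maintenance criteria above can be turned into a rough shortlist filter. The sketch below is only an illustration: the `Candidate` class, the `shortlist` function, the 365-day freshness window, and the reference date are all hypothetical choices, not part of any skills tooling. The trust scores and push dates are copied from a few entries in the listing, and the trust floor of 93 matches the Vosk API trust score shown on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    name: str
    trust: int       # trust score as shown in the listing
    last_push: date  # last push date as shown in the listing


def shortlist(candidates, min_trust, as_of, max_age_days):
    """Keep candidates that meet the trust floor and were pushed within
    max_age_days of as_of, ordered by trust (highest first)."""
    kept = [
        c for c in candidates
        if c.trust >= min_trust and (as_of - c.last_push).days <= max_age_days
    ]
    return sorted(kept, key=lambda c: c.trust, reverse=True)


# A few entries copied from the listing (not the full set).
CANDIDATES = [
    Candidate("Speechbrain", 93, date(2026, 6, 15)),
    Candidate("Pytorch Kaldi", 80, date(2022, 3, 14)),
    Candidate("WhisperX", 95, date(2026, 6, 3)),
    Candidate("Faster Whisper", 89, date(2025, 11, 19)),
]

if __name__ == "__main__":
    # Assumed reference date and thresholds; adjust to your own review policy.
    picks = shortlist(
        CANDIDATES, min_trust=93, as_of=date(2026, 6, 30), max_age_days=365
    )
    for c in picks:
        print(c.name, c.trust, c.last_push.isoformat())
```

A filter like this only narrows the list. Install path and platform fit still need a manual check of each repository.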

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.