speech to text benchmark framework
$ npx skills add Picovoice/speech-to-text-benchmark
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
speech to text benchmark framework
$ npx skills add Picovoice/speech-to-text-benchmark
A PyTorch-based Speech Toolkit
$ npx skills add speechbrain/speechbrain
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
$ npx skills add mravanelli/pytorch-kaldi
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
$ npx skills add MahmoudAshraf97/whisper-diarization
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
$ npx skills add yeyupiaoling/Whisper-Finetune
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
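$ npx skills add MiteshPuthran/Speech-Emotion-Analyzer
A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.
$ npx skills add yeyupiaoling/MASR
Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is complete and recognition quality is good. Supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.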
$ npx skills add yeyupiaoling/PaddlePaddle-DeepSpeech
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
$ npx skills add m-bain/whisperX
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
$ npx skills add NVIDIA/DeepLearningExamples
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
$ npx skills add PaddlePaddle/PaddleSpeech
Faster Whisper transcription with CTranslate2
$ npx skills add SYSTRAN/faster-whisper
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
$ npx skills add openvinotoolkit/openvino
AI speech toolkit for Apple Silicon: ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML
$ npx skills add soniqo/speech-swift
End-to-End Speech Processing Toolkit
$ npx skills add espnet/espnet
Offline speech recognition for Android with Vosk library.
$ npx skills add alphacep/vosk-android-demo
How to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Vosk API if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.