Vosk-api

Vosk is an offline, open-source speech recognition toolkit that supports 20+ languages and provides fast, low-latency transcription on hardware ranging from a Raspberry Pi to a server.

Vosk is an offline speech recognition toolkit designed for flexibility, speed, and privacy. It supports more than 20 languages and dialects, including English, Chinese, Spanish, Russian, and Arabic, and gives developers a lightweight yet capable option for real-time transcription and voice interaction.

With models as small as 50 MB, Vosk provides continuous large-vocabulary recognition, a streaming API that returns partial results as audio arrives, speaker identification, and customizable vocabularies. It suits voice interfaces for chatbots, smart home devices, and virtual assistants, as well as subtitling for media content.
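As a rough illustration of the streaming API and customizable vocabularies mentioned above, here is a minimal Python sketch. It assumes the `vosk` package is installed and a model has been unpacked locally; the directory `model`, the file `speech.wav`, the example phrases, and the helper names (`build_grammar`, `transcribe_chunks`, `wav_chunks`, `main`) are all hypothetical. Restricting recognition to a phrase list is assumed to require a model that supports a dynamic vocabulary, which is typically true of the small models.

```python
import json
import wave


def build_grammar(phrases):
    """Serialize a phrase list into the JSON string passed to KaldiRecognizer.

    "[unk]" gives the recognizer a token for out-of-vocabulary speech.
    """
    return json.dumps(list(phrases) + ["[unk]"])


def transcribe_chunks(recognizer, chunks):
    """Feed audio chunks to a recognizer and yield (kind, text) pairs.

    `recognizer` only needs the KaldiRecognizer methods used below.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # An utterance ended: Result() returns JSON with a "text" field.
            yield "final", json.loads(recognizer.Result()).get("text", "")
        else:
            # Mid-utterance: PartialResult() returns JSON with a "partial" field.
            yield "partial", json.loads(recognizer.PartialResult()).get("partial", "")
    # Flush whatever audio is still buffered.
    yield "final", json.loads(recognizer.FinalResult()).get("text", "")


def wav_chunks(path, frames_per_chunk=4000):
    """Yield raw PCM chunks from a WAV file (16-bit mono PCM is assumed)."""
    with wave.open(path, "rb") as wf:
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            yield data


def main(model_dir="model", wav_path="speech.wav"):
    # Imported here so the helpers above can be used without Vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)
    with wave.open(wav_path, "rb") as wf:
        sample_rate = wf.getframerate()
    # The optional third argument restricts recognition to the listed phrases.
    grammar = build_grammar(["turn on the light", "turn off the light"])
    recognizer = KaldiRecognizer(model, sample_rate, grammar)
    for kind, text in transcribe_chunks(recognizer, wav_chunks(wav_path)):
        print(kind, text)
```

Calling `main()` with real paths prints partial hypotheses while an utterance is in progress and a final text when it ends. Keeping `transcribe_chunks` independent of the audio source means the same loop could be fed from a microphone or a network stream instead of a file.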

Vosk is well-suited for developers, educators, researchers, and creators who need reliable voice recognition without internet access. It works across platforms, from low-power devices like Raspberry Pi and Android smartphones to large server clusters.

The toolkit includes bindings for multiple programming languages such as Python, Java, C#, C++, Node.js, Rust, and Go, making integration easy across diverse tech stacks.

Whether you're building an AI assistant, transcribing interviews, or powering hands-free control systems, Vosk provides multilingual speech recognition that runs entirely offline and adapts to a wide range of hardware.
