🤖

Whisper (OpenAI)

OpenAI's open-source speech recognition model: one of the most accurate free transcription tools available, covering 99 languages.

Audio & Music | Free | ★ 4.7

OpenAI Whisper is an open-source automatic speech recognition (ASR) model that delivers professional-grade transcription accuracy. Because the code and model weights are freely available, it can be run locally on your own hardware at no cost, or accessed via the OpenAI API at $0.006 per minute.

There is no subscription: you either run it locally (free, though the larger models realistically need a GPU) or pay per minute via the API. Numerous third-party apps (including Descript, Otter.ai, and Fireflies) reportedly use Whisper under the hood.
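To make the pay-per-minute option concrete, here is a minimal Python sketch that estimates the hosted cost from the $0.006 per minute figure quoted above and sends one file to the API. The function names are illustrative, the estimate assumes billing is simply proportional to duration, and the API call assumes the `openai` Python package is installed with `OPENAI_API_KEY` set in the environment.

```python
RATE_USD_PER_MINUTE = 0.006  # hosted API price quoted in this listing


def estimate_api_cost(duration_seconds: float,
                      rate_per_minute: float = RATE_USD_PER_MINUTE) -> float:
    """Estimate the API cost in USD for one audio file.

    Assumption: cost scales linearly with duration; actual billing
    granularity may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return round(duration_seconds / 60 * rate_per_minute, 4)


def transcribe_via_api(path: str) -> str:
    """Send one audio file to the hosted Whisper model and return its text.

    Assumes the `openai` package is installed and OPENAI_API_KEY is set.
    """
    # Imported lazily so the cost estimator works without the package.
    from openai import OpenAI

    client = OpenAI()
    with open(path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    return transcript.text
```

At the quoted rate, `estimate_api_cost(3600)` puts a one-hour recording at about $0.36, which is a useful baseline when comparing hosted access with the cost of running a GPU locally.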

Whisper large-v3 supports 99 languages and handles diverse accents, technical terminology, and noisy audio well. The original Whisper models were trained on 680,000 hours of multilingual audio, which makes the family robust to real-world conditions that trip up models trained on narrower data. For developers, it's a common default for adding transcription to an application.

Use cases include: building custom transcription apps, local transcription for privacy-sensitive content, podcast and video transcription, accessibility tools, and voice command systems. The turbo model offers a 7.7x speed improvement over large-v3 with minimal quality loss.
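For the podcast and video use case, a local run could look roughly like the sketch below. It assumes the open-source `openai-whisper` package (imported as `whisper`) and `ffmpeg` are installed, and that the installed version recognises the `turbo` model name. The helper names `format_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are made up for this example; they convert the `segments` list returned by `model.transcribe` into SRT subtitle text.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn Whisper-style segments (dicts with start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, model_name: str = "turbo") -> str:
    """Transcribe a file locally and return the result as SRT subtitles."""
    # Imported lazily: requires `pip install openai-whisper` and ffmpeg on PATH.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_srt(result["segments"])
```

Calling `transcribe_to_srt("episode.mp3")` (a placeholder file name) keeps the audio on the local machine, which is the point of the privacy-sensitive use case; swapping `model_name` for a smaller model trades accuracy for speed on weaker hardware.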

Pros: Free to self-host, high accuracy, support for 99 languages, fast turbo model available, no usage limits when self-hosted, open source.
Cons: Requires technical setup to self-host, API pricing for hosted access, no built-in speaker diarisation (who said what).
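Because Whisper does not label speakers, a common workaround is to run a separate diarisation tool and merge its output with Whisper's timestamps. The sketch below shows only the merge step, under stated assumptions: `speaker_turns` is a hypothetical list of `(start, end, speaker)` tuples from whatever diarisation tool you choose, and each Whisper segment is assigned to the speaker whose turn overlaps it the most.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the length in seconds of the overlap between two time ranges."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, speaker_turns, unknown="UNKNOWN"):
    """Label each transcript segment with the best-overlapping speaker.

    segments: dicts with "start", "end", "text" (the shape Whisper returns).
    speaker_turns: hypothetical (start, end, speaker) tuples from a
        separate diarisation tool; Whisper does not produce these.
    """
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn_start, turn_end, speaker in speaker_turns:
            shared = overlap(seg["start"], seg["end"], turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        # Copy the segment so the caller's data is left unchanged.
        labelled.append({**seg, "speaker": best_speaker})
    return labelled
```

Segments that overlap no speaker turn keep the `unknown` label, which makes gaps in the diarisation output easy to spot. Assigning by maximum overlap is a simple heuristic; it will mislabel segments in which two people talk over each other.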

Best for: Developers building transcription features, privacy-conscious users who need local processing, researchers, and teams building multilingual voice applications.