Parakeet V3 vs Whisper: 10x Faster, Better Accuracy (Benchmark)

March 7, 2026 · 6 min read · Whisper Notes Team

TL;DR

| | Parakeet V3 | Whisper Large V3 |
|---|---|---|
| Speed | 10× | |
| Supported languages | 25 | 100+ |
| English error rate (WER) | 6.32% | 7.44% |
| Avg error rate, 25 langs (WER) | 12.0% | 12.6% |
| Hallucinations | None | On silence |
| Best for | English & European | Asian, Arabic, 100+ |

* Speed: 35-minute audio file on Apple Silicon, measured against Whisper Large V3 Turbo. English WER: Hugging Face Open ASR Leaderboard. 25-language average: FLEURS benchmark.

Starting with version 1.3.2, Whisper Notes for Mac ships with NVIDIA Parakeet TDT 0.6B as the default speech engine. It's 10x faster than Whisper Large V3 Turbo for English, and more accurate. Whisper models are still available if you need other languages.

Why We Switched the Default

Whisper is great, but it was designed as a general-purpose model. It handles 100+ languages, translates, generates timestamps — a Swiss Army knife. The trade-off is speed. For English dictation, where you just want words on screen fast, it's overkill.

Here's the thing that bugged me: when using Fn-key system-wide dictation with Whisper, finishing a ~1 minute utterance meant waiting 3–5 seconds for the transcript to appear. That pause breaks the flow. You stop talking, you wait, you stare at the cursor — it kills the magic of voice typing.

Parakeet changed that completely. It is fast enough that the transcript appears the instant you stop speaking. Speak, and the words are just there. Once you experience that seamless, zero-wait flow, it's really hard to go back to Whisper.

How Fast Is Parakeet V3?

Numbers speak louder than words. Here's a real-world comparison using a 35-minute audio file on the same Mac:

| Model | Time to transcribe 35 min of audio |
|---|---|
| Whisper Large V3 Turbo | 3 minutes |
| Parakeet TDT 0.6B v3 | 18 seconds |

That's 10x faster. And because the model is smaller (600M vs 800M parameters), it uses less memory and less battery too.
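The 10x figure follows directly from the two timings in the table. As a quick check, here is the arithmetic in Python, assuming the rounded values reported above (3 minutes and 18 seconds for 35 minutes of audio):

```python
# Timings from the table above, in seconds (rounded values as reported).
AUDIO_SECONDS = 35 * 60
WHISPER_TURBO_SECONDS = 3 * 60
PARAKEET_SECONDS = 18

# How many times faster Parakeet finished the same file.
speedup = WHISPER_TURBO_SECONDS / PARAKEET_SECONDS

# Seconds of audio each model gets through per second of compute.
whisper_realtime = AUDIO_SECONDS / WHISPER_TURBO_SECONDS
parakeet_realtime = AUDIO_SECONDS / PARAKEET_SECONDS

print(f"Speedup: {speedup:.0f}x")
print(f"Whisper Large V3 Turbo: {whisper_realtime:.1f}x real time")
print(f"Parakeet TDT 0.6B v3: {parakeet_realtime:.1f}x real time")
```

On this one file, that works out to roughly 12x real time for Whisper Large V3 Turbo and roughly 117x real time for Parakeet.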

What Makes Parakeet v3 So Fast

Whisper listens to audio the way you'd read a book out loud — word by word, frame by frame, never skipping ahead. Even during silence, it's still processing, still guessing what comes next. That's thorough, but slow.

Parakeet takes a fundamentally different approach. It compresses the audio signal 8x before processing, so the model sees only what matters. Then, instead of grinding through every single frame, it predicts not just what word you said, but how long that word lasts — and jumps ahead. Silence? Skipped entirely. A long vowel? One prediction instead of dozens.

The result is a model that spends its compute on the words and skips the gaps. That's how it manages to be 10x faster with fewer parameters while still posting a lower English error rate.
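To make the "predict the word and how long it lasts, then jump ahead" idea concrete, here is a toy sketch in Python. It is not NVIDIA's implementation: the per-frame predictions are hard-coded rather than produced by a neural network, and the function names (`make_predictions`, `decode_every_frame`, `decode_with_durations`) are made up for illustration. It only compares how many decoder steps a generic frame-by-frame loop needs against a loop that trusts a predicted duration.

```python
from typing import List, Optional, Tuple

# One prediction per frame: (token, duration_in_frames). None means silence.
# These values are invented for illustration; a real model computes them from audio.
Prediction = Tuple[Optional[str], int]


def make_predictions(segments: List[Prediction]) -> List[Prediction]:
    """Expand (token, length) segments into one prediction per frame."""
    frames: List[Prediction] = []
    for token, length in segments:
        for offset in range(length):
            # At each frame, store how many frames of this segment remain.
            frames.append((token, length - offset))
    return frames


def decode_every_frame(frames: List[Prediction]) -> Tuple[List[str], int]:
    """Visit every frame and collapse repeats. Returns (tokens, decoder steps)."""
    tokens: List[str] = []
    previous: Optional[str] = None
    steps = 0
    for token, _ in frames:
        steps += 1
        if token is not None and token != previous:
            tokens.append(token)
        previous = token
    return tokens, steps


def decode_with_durations(frames: List[Prediction]) -> Tuple[List[str], int]:
    """Use the predicted duration to jump ahead. Returns (tokens, decoder steps)."""
    tokens: List[str] = []
    steps = 0
    t = 0
    while t < len(frames):
        token, duration = frames[t]
        steps += 1
        if token is not None:
            tokens.append(token)
        t += max(1, duration)  # always advance by at least one frame
    return tokens, steps


# "hello" for 4 frames, 6 frames of silence, "world" for 3 frames.
frames = make_predictions([("hello", 4), (None, 6), ("world", 3)])
print(decode_every_frame(frames))     # (['hello', 'world'], 13)
print(decode_with_durations(frames))  # (['hello', 'world'], 3)
```

Both loops produce the same two words, but the duration-aware loop takes 3 steps instead of 13, because the silence and the held word each cost a single step.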

Benchmarks: Parakeet v3 vs Whisper

[Figure: Word Error Rate comparison of Parakeet TDT 0.6B v3, Whisper Large V3, and Seamless M4T across multiple benchmark datasets. Parakeet v3 matches or beats models 2-4x its size across FLEURS, CoVoST, and MLS benchmarks.]

On the Hugging Face Open ASR Leaderboard, Parakeet v3 posts the lowest average WER of the four models below with only 600M parameters, less than half of Whisper Large V3's 1.55B:

| Model | Parameters | Avg WER | Speed (RTFx) |
|---|---|---|---|
| Parakeet TDT 0.6B v3 | 0.6B | 6.32% | 3,333x |
| Canary 1B v2 | 1.0B | 7.15% | 749x |
| Whisper Large V3 | 1.55B | 7.44% | 146x |
| Whisper Large V3 Turbo | 0.8B | 7.6% | 350x |

Lower WER = fewer errors. Higher RTFx = faster. Parakeet wins on both. With 600M parameters, it's also the smallest model on that list — which means it runs beautifully on Apple Silicon with minimal memory and battery drain.
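For readers unfamiliar with the two metrics, here is a minimal Python sketch of how they are usually defined: WER as word-level edit distance divided by the number of reference words, and RTFx as audio duration divided by processing time. The function names and the example sentence are made up for illustration, and real benchmarks also normalize punctuation and casing more carefully than this does.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] = edits to turn the reference words seen so far into hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio processed per second of compute."""
    return audio_seconds / processing_seconds


# One wrong word out of five reference words.
print(word_error_rate("the quick brown fox jumps", "the quick brown box jumps"))

# Relative speed implied by the leaderboard RTFx values in the table above.
print(f"Parakeet v3 vs Whisper Large V3: {3333 / 146:.1f}x")
```

The last line divides the two RTFx values from the table, which gives a ratio of about 23x between Parakeet v3 and Whisper Large V3 on the leaderboard's hardware.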

Multilingual WER: All 25 Languages

The leaderboard above covers English only. Here's the full picture: how the three models available in Whisper Notes compare across all 25 languages Parakeet supports, measured on the FLEURS benchmark. Lower WER = fewer transcription errors. The better value between Large V3 and Parakeet is marked with an asterisk (*) in each row:

| Language | Whisper Small | Whisper Large V3 | Parakeet V3 |
|---|---|---|---|
| Bulgarian | 37.3 | 12.9 | 12.6* |
| Croatian | 33.4 | 11.1* | 12.5 |
| Czech | 37.6 | 11.3 | 11.0* |
| Danish | 32.8 | 12.6* | 18.4 |
| Dutch | 16.4 | 5.6* | 7.5 |
| English | 6.1 | 4.3* | 4.9 |
| Estonian | 51.3 | 19.1 | 17.7* |
| Finnish | 24.0 | 7.7* | 13.2 |
| French | 15.0 | 6.3 | 5.2* |
| German | 10.2 | 4.3* | 5.0 |
| Greek | 30.8 | 27.0 | 20.7* |
| Hungarian | 38.9 | 14.1* | 15.7 |
| Italian | 9.8 | 2.3* | 3.0 |
| Latvian | 53.2 | 18.3* | 22.8 |
| Lithuanian | 65.6 | 22.3 | 20.4* |
| Maltese | 92.2 | 68.9 | 20.5* |
| Polish | 14.7 | 4.7* | 7.3 |
| Portuguese | 7.3 | 3.7* | 4.8 |
| Romanian | 29.8 | 8.2* | 12.4 |
| Russian | 11.4 | 4.2* | 5.5 |
| Slovak | 33.3 | 8.4* | 8.8 |
| Slovenian | 49.3 | 19.9* | 24.0 |
| Spanish | 5.6 | 3.1* | 3.5 |
| Swedish | 20.8 | 7.9* | 15.1 |
| Ukrainian | 19.3 | 6.5* | 6.8 |
| Average | 29.8 | 12.6 | 12.0* |

WER (%) on FLEURS. Whisper Small data from Radford et al.; Large V3 and Parakeet V3 data from NVIDIA Canary-1B-v2 paper.

Whisper Large V3 edges ahead on most individual languages (18 of 25), which is no surprise given that it is about 2.5x larger. But Parakeet V3 comes out slightly ahead on average (12.0% vs 12.6%), wins Greek, French, Estonian, and Maltese clearly, and crushes Whisper Small across the board (60% fewer errors on average). The real story isn't a fraction of a percent in WER. It's the total package: Large V3-level accuracy at 23x the speed (3,333x vs 146x RTFx on the Open ASR Leaderboard), with roughly 40% of the memory (0.6B vs 1.55B parameters), no hallucinated text during silence, and everything running locally on your Mac.
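The averages and the per-language split can be re-derived from the table itself. The short Python check below copies the FLEURS numbers from the table and recomputes the three averages, the languages where Parakeet V3 beats Whisper Large V3, and the error reduction relative to Whisper Small. The variable names are ours and the script does nothing beyond arithmetic on the published values.

```python
# WER (%) on FLEURS, copied from the table above:
# language -> (Whisper Small, Whisper Large V3, Parakeet V3)
FLEURS_WER = {
    "Bulgarian": (37.3, 12.9, 12.6), "Croatian": (33.4, 11.1, 12.5),
    "Czech": (37.6, 11.3, 11.0), "Danish": (32.8, 12.6, 18.4),
    "Dutch": (16.4, 5.6, 7.5), "English": (6.1, 4.3, 4.9),
    "Estonian": (51.3, 19.1, 17.7), "Finnish": (24.0, 7.7, 13.2),
    "French": (15.0, 6.3, 5.2), "German": (10.2, 4.3, 5.0),
    "Greek": (30.8, 27.0, 20.7), "Hungarian": (38.9, 14.1, 15.7),
    "Italian": (9.8, 2.3, 3.0), "Latvian": (53.2, 18.3, 22.8),
    "Lithuanian": (65.6, 22.3, 20.4), "Maltese": (92.2, 68.9, 20.5),
    "Polish": (14.7, 4.7, 7.3), "Portuguese": (7.3, 3.7, 4.8),
    "Romanian": (29.8, 8.2, 12.4), "Russian": (11.4, 4.2, 5.5),
    "Slovak": (33.3, 8.4, 8.8), "Slovenian": (49.3, 19.9, 24.0),
    "Spanish": (5.6, 3.1, 3.5), "Swedish": (20.8, 7.9, 15.1),
    "Ukrainian": (19.3, 6.5, 6.8),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


small_avg = mean(row[0] for row in FLEURS_WER.values())
large_avg = mean(row[1] for row in FLEURS_WER.values())
parakeet_avg = mean(row[2] for row in FLEURS_WER.values())

# Languages where Parakeet V3 has the lower WER of the two larger models.
parakeet_wins = sorted(
    lang for lang, (_, large, parakeet) in FLEURS_WER.items() if parakeet < large
)

# Relative error reduction of Parakeet V3 compared with Whisper Small.
reduction_vs_small = 1 - parakeet_avg / small_avg

print(f"Averages: small={small_avg:.1f}, large={large_avg:.1f}, parakeet={parakeet_avg:.1f}")
print(f"Parakeet wins {len(parakeet_wins)} of {len(FLEURS_WER)}: {', '.join(parakeet_wins)}")
print(f"Error reduction vs Whisper Small: {reduction_vs_small:.0%}")
```

Most of Parakeet's lead in the average comes from Maltese (20.5 vs 68.9), which is worth keeping in mind when reading the 12.0% vs 12.6% headline.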

No More Hallucinations

If you've used Whisper for dictation, you've probably seen it hallucinate during silence — repeating phrases, inventing words, or spitting out "Subtitles by Amara.org" from nowhere. This happens because Whisper's autoregressive decoder always expects to produce text, even when there's nothing to transcribe.

NVIDIA trained Parakeet on 36,000 hours of pure non-speech audio (background noise, coughs, silence) paired with empty string targets. The model learned what silence sounds like and stays quiet. For "always-on" system-wide dictation, this is a game-changer — no more garbage text appearing when you pause to think.
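To picture what "non-speech audio paired with empty string targets" means, here is a small illustrative Python sketch. It uses the JSON-lines manifest layout common in ASR training pipelines (one record per clip with a file path, a duration, and a transcript). The file paths and durations are hypothetical, and this is not NVIDIA's actual data pipeline.

```python
import json

# Hypothetical training examples: one speech clip and two non-speech clips.
# The non-speech clips carry an empty transcript, so the correct output is "nothing".
examples = [
    {"audio_filepath": "speech/hello.wav", "duration": 2.1, "text": "hello world"},
    {"audio_filepath": "noise/keyboard.wav", "duration": 5.0, "text": ""},
    {"audio_filepath": "noise/cough.wav", "duration": 1.3, "text": ""},
]

# One JSON object per line.
manifest = "\n".join(json.dumps(example) for example in examples)
print(manifest)

# Clips that teach the model to stay silent.
non_speech = [example for example in examples if example["text"] == ""]
print(f"{len(non_speech)} of {len(examples)} clips have an empty target")
```

A model trained on enough clips like the last two gets penalized whenever it emits text for noise or silence, which is the behavior described above.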

Languages Parakeet Supports

Parakeet v3 supports 25 languages: Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, and Ukrainian.

That covers most of Europe, but it does not support Chinese, Japanese, Korean, Arabic, or Hindi. That's why we kept the Whisper models as downloadable options. If you dictate in Japanese or Mandarin, pick Whisper Large V3 Turbo from the model picker. For English and European languages, Parakeet v3 is simply the better engine.

[Figure: Whisper Notes Mac model picker showing Parakeet V3 as the default, with Whisper Small and Whisper Large V3 Turbo as downloadable options, all running locally.]

Model Picker in Whisper Notes

Open Settings to switch between models:

  • Parakeet V3 (default) — Fastest, best for English & European languages
  • Whisper Small — Lightweight, 100+ languages
  • Whisper Large V3 Turbo — Most accurate multi-language model

All models run 100% locally on your Mac. No internet, no cloud, no data leaves your device.

What About Parakeet V2?

If you used V2 before, you might wonder how it compares. V2 was an English-only model, and its English accuracy is actually slightly better than V3's (6.05% vs 6.32% WER). V3 trades that small margin for 25-language support. Both score a lower English WER than Whisper Large V3 (7.44%).

| | Parakeet V2 | Parakeet V3 | Whisper Large V3 |
|---|---|---|---|
| English WER | 6.05% | 6.32% | 7.44% |
| Languages | English only | 25 | 100+ |

In short: if you only need English, both V2 and V3 are great. V3 is the default in Whisper Notes because multilingual support matters to most users — and the English accuracy difference is negligible.

Try It

Parakeet v3 is available now in the Mac version — just download the latest DMG. Update: Parakeet is now available in the latest iOS version as well.

Questions or feedback? Email support@whispernotes.app.