Question 1

Is my audio or video uploaded anywhere?

Accepted Answer

No. The speech-recognition AI and all processing run inside your browser on your own device — there are no servers, no upload, no analytics. After the one-time model download you can disconnect from the internet and it still works. For private recordings (interviews, meetings, voice memos) that matters: cloud transcribers send your audio to their servers; this one never does.

Question 2

How does speech recognition run in my browser with nothing uploaded?

Accepted Answer

It uses Whisper, OpenAI’s open-source speech-recognition model, compiled to run on your device via WebAssembly. The first time you transcribe, the model (~63 MB) is downloaded once and cached; from then on every transcription happens locally and works offline. Your browser decodes the audio and passes it to the model; the file is never sent anywhere.
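To make the "browser decodes the audio" step more concrete: Whisper models take 16 kHz mono audio, so decoded audio usually has to be downmixed and resampled before recognition. The sketch below is an illustration under that assumption, not this tool's actual code. The names `downmixToMono` and `resampleLinear` are hypothetical, and linear interpolation is a simplification (a production resampler would apply a low-pass filter first).

```typescript
// Hypothetical helpers for illustration; not the tool's actual implementation.
const WHISPER_SAMPLE_RATE = 16000; // Whisper models expect 16 kHz mono input

// Average all channels into a single mono channel.
function downmixToMono(channels: Float32Array[]): Float32Array {
  if (channels.length === 0) return new Float32Array(0);
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

// Resample by linear interpolation (simplified: no anti-aliasing filter).
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number = WHISPER_SAMPLE_RATE
): Float32Array {
  if (fromRate === toRate || input.length === 0) return input.slice();
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), input.length - 1);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}
```

In a browser, the input channels would typically come from `AudioContext.decodeAudioData` followed by `AudioBuffer.getChannelData`, both of which run locally.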

Question 3

Can I make subtitles (.srt / .vtt)?

Accepted Answer

Yes. The transcript is timestamped per phrase, so you can export standard SubRip (.srt) or WebVTT (.vtt) subtitle files ready to drop into a video editor or upload alongside a video. You can also export a plain-text transcript or copy it.
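For readers curious what those files contain, here is a minimal sketch of turning timestamped phrases into SubRip and WebVTT text. The `Segment` shape and the function names are hypothetical; the tool's internal format is not documented here. SRT numbers each cue and writes milliseconds after a comma, while WebVTT starts with a `WEBVTT` header and uses a period.

```typescript
// Hypothetical segment shape; the tool's internal format is an assumption.
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT).
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SubRip: numbered cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const range = `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}`;
      return `${i + 1}\n${range}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}

// WebVTT: "WEBVTT" header, then cues separated by blank lines.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) => {
    const range = `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}`;
    return `${range}\n${seg.text.trim()}\n`;
  });
  return "WEBVTT\n\n" + cues.join("\n");
}
```

Calling `toSrt` or `toVtt` on the list of phrases yields a string that can be saved as a `.srt` or `.vtt` file.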

Question 4

Does it work on video files too?

Accepted Answer

Yes — drop in a video and it reads the audio track. You get a player you can scrub, with each transcript line clickable to jump to that moment. (Very large videos use more memory; if a format’s audio can’t be decoded in your browser, extract the audio first.)
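The click-to-jump behaviour can be sketched roughly as follows. This is an assumption about how such a feature is commonly wired up, not the tool's actual code. `TimedLine`, `Seekable` and `seekToLine` are hypothetical names; `currentTime` and `duration` are the standard media-element properties.

```typescript
// Hypothetical shapes for illustration.
interface TimedLine {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Minimal view of a media element; HTMLVideoElement and HTMLAudioElement
// both provide these two properties.
interface Seekable {
  currentTime: number;
  duration: number; // NaN until the media's metadata has loaded
}

// Move playback to the start of the given transcript line,
// clamped to the playable range. Returns the time actually set.
function seekToLine(player: Seekable, line: TimedLine): number {
  const max = Number.isFinite(player.duration) ? player.duration : line.start;
  const target = Math.min(Math.max(0, line.start), max);
  player.currentTime = target;
  return target;
}
```

In a page, each transcript line's click handler would call `seekToLine(videoElement, line)`.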

Question 5

How accurate is it, and what are the limits?

Accepted Answer

This uses the compact English model (Whisper tiny.en), chosen so it runs fast and privately on your device. It handles clear English speech well; heavy accents, background noise, music, crosstalk and very long files are harder, and it is English-only. You can edit the transcript before exporting. It trades some of a giant cloud model’s peak accuracy for total privacy and offline use.

Question 6

Is there a length limit or watermark?

Accepted Answer

No watermark and no hard length cap. It runs on your CPU in the browser, though, so a long recording takes a while (a progress bar tracks it) and uses more memory. Short clips are quick; hour-long files take longer than they would on a cloud GPU, but stay completely private.
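To show why long files are slower and how a progress bar can track them: Whisper works on 30-second windows of audio, so a recording is handled as a series of chunks. The sketch below assumes 16 kHz mono samples and uses hypothetical names (`planChunks`, `progressPercent`); the tool's real chunking and progress logic may differ.

```typescript
// Sample-index range of one chunk; `end` is exclusive.
interface ChunkRange {
  start: number;
  end: number;
}

// Split a recording into fixed-length chunks (default: 30 s at 16 kHz).
function planChunks(
  totalSamples: number,
  sampleRate = 16000,
  chunkSeconds = 30
): ChunkRange[] {
  const chunkSize = Math.floor(sampleRate * chunkSeconds);
  if (chunkSize <= 0) throw new Error("chunk size must be positive");
  const ranges: ChunkRange[] = [];
  for (let start = 0; start < totalSamples; start += chunkSize) {
    ranges.push({ start, end: Math.min(start + chunkSize, totalSamples) });
  }
  return ranges;
}

// Whole-number percentage for a progress bar.
function progressPercent(chunksDone: number, chunksTotal: number): number {
  if (chunksTotal <= 0) return 100;
  const done = Math.max(0, Math.min(chunksDone, chunksTotal));
  return Math.round((done / chunksTotal) * 100);
}
```

Under these assumptions a one-hour file is 120 chunks, and holding it as 32-bit float samples takes 3600 × 16000 × 4 bytes, about 230 MB, which is why long recordings use noticeably more memory.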

AI Transcriber

Frequently asked questions
