Audio to Text Converter — Free Online Transcription
Drag in any audio file (MP3, WAV, M4A, FLAC, OGG) and get back searchable text with timestamps. Up to 99.5% accuracy on clear recordings of sermons, interviews, podcasts, lectures, and meetings. The first 10 minutes are free; after that, Standard transcription is $0.006/min.
- Processing: about 5 min for a 45-min file
- Accuracy: 99.0–99.5%
- Formats: MP3 · WAV · M4A · FLAC · OGG
What this is
Audio-to-text conversion (also called speech-to-text or transcription) uses a neural network to turn recorded speech into written words. Modern AI models like OpenAI Whisper achieve near-human accuracy on clear voice recordings at a tiny fraction of the cost of professional human transcription.
This page is the universal entry point: upload anything from a podcast MP3 to a lecture WAV to an iPhone Voice Memo M4A, and the same backend handles it. If you have a specific format in mind, we have dedicated guides for MP3, WAV, M4A, MP4 video, and YouTube links.
Step-by-step: audio file to text
1. Open the upload tool
   Go to /transcribe in the nav. No signup needed for your first 10 minutes of audio.
2. Drop the audio file in
   Drag and drop from Finder, Explorer, Dropbox, or Google Drive. Up to 25 MB per file on the free tier; 500 MB per file on Pro. Accepted: MP3, WAV, M4A, AAC, FLAC, OGG, OPUS, AIFF.
3. Pick Standard or Premium
   Standard ($0.006/min, OpenAI Whisper) is the right call for most single-speaker recordings. Premium ($0.02/min, ElevenLabs Scribe) adds speaker diarization, which labels who said what. Pick Premium if you're transcribing a panel, interview, or roundtable.
4. Let the engine run
   Processing takes ~10% of audio length. A 45-minute file transcribes in about 5 minutes. You can close the tab; results land in your dashboard and an email goes out when done.
5. Review the transcript in the browser
   Hit play in the inline player: each word is clickable and seeks the audio to that timestamp. Edit typos directly, and use search to jump to specific phrases.
6. Download the output
   Choose .txt (plain text), .srt (subtitle format), .vtt (HTML5 captions), or .docx (Word document with formatted timestamps). Pro accounts also get JSON with word-level timestamps for custom processing.
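If you want to do your own processing on the word-level JSON, the sketch below shows one way to turn word timestamps into SRT cues. The JSON shape used here (a list of objects with `word`, `start`, and `end` in seconds) is an assumption for illustration, not the documented export schema, and `words_to_srt` is a hypothetical helper name.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=8):
    """Group words into cues of up to max_words and render SRT text.

    `words` is a list of dicts with the assumed keys "word", "start",
    and "end" (seconds); a real export may use different names.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        # Each cue: sequence number, time range, text, then a blank line.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


sample = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "world", "start": 0.5, "end": 0.9},
]
print(words_to_srt(sample))
```

Grouping by a fixed word count is the simplest choice; splitting on pauses or sentence punctuation usually reads better on screen.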
Supported audio formats
| Format | Free tier max | Pro max | Best for |
|---|---|---|---|
| MP3 | 25 MB ≈ 45 min at ~75 kbps | 500 MB | Podcasts, sermons, lectures (most common) |
| WAV | 25 MB ≈ 2.5 min of CD-quality stereo | 500 MB | Studio masters, mixer recordings |
| M4A / AAC | 25 MB | 500 MB | iPhone Voice Memos, QuickTime, Zoom |
| FLAC | 25 MB | 500 MB | Lossless without WAV bulk |
| OGG / OPUS | 25 MB | 500 MB | Discord exports, WhatsApp voice notes |
| AIFF | 25 MB | 500 MB | Apple Logic Pro masters |
| MP4 / MOV (video) | 25 MB | 500 MB | Audio extracted automatically |
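How many minutes fit under a size cap depends on bitrate, so the durations in the table are approximations. The sketch below is a rough estimate that assumes a constant bitrate, 1 MB = 1,000,000 bytes, and no container overhead; the function names are illustrative.

```python
def minutes_that_fit(size_mb, bitrate_kbps):
    """Minutes of audio that fit in size_mb at a constant bitrate.

    Assumes 1 MB = 1,000,000 bytes and ignores container overhead.
    """
    bits = size_mb * 1_000_000 * 8
    return bits / (bitrate_kbps * 1000) / 60


def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed PCM audio, the usual WAV payload."""
    return sample_rate_hz * bit_depth * channels / 1000


cd_wav = pcm_bitrate_kbps(44_100, 16, 2)        # 1411.2 kbps
print(round(minutes_that_fit(25, cd_wav), 1))   # WAV, CD-quality stereo
print(round(minutes_that_fit(25, 128), 1))      # MP3 at 128 kbps
print(round(minutes_that_fit(25, 74), 1))       # MP3 at about 74 kbps
```

Under these assumptions a 25 MB MP3 holds about 45 minutes only at roughly 74 kbps; at 128 kbps the same cap is closer to 26 minutes, so re-encoding voice recordings at a lower bitrate is one way to stay under the free-tier limit.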
Tips that lift accuracy 1–3%
- Mic placement is everything. A $20 lavalier within 6 inches of the speaker beats a $2,000 condenser at 6 feet, every time. The biggest accuracy gains are made before you hit record, not in post-processing.
- Trim silence at the start and end with Audacity (Effect → Truncate Silence) before upload. Smaller files upload faster and you don't pay to transcribe dead air.
- Normalize the audio if levels are quiet. Audacity Effect → Normalize → -1.0 dB. This makes a measurable accuracy difference on under-gained recordings.
- Mono, not stereo, for voice. The model collapses to mono internally, but a properly summed mono export skips one step and gives marginally cleaner input.
- Two voices speaking over each other? Use Premium tier. Standard treats overlapping speech as a single stream, so words from the two speakers tend to get merged or dropped.
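For readers who script their audio prep instead of using Audacity, the three clean-up tips above (trim, normalize, sum to mono) can be sketched on raw samples. This is a simplified illustration on lists of floats in the range -1.0 to 1.0, not Audacity's algorithms; the 0.01 silence threshold is an arbitrary assumption.

```python
def to_mono(left, right):
    """Sum a stereo pair to mono by averaging the two channels."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def normalize_peak(samples, target_db=-1.0):
    """Scale samples so the loudest peak sits at target_db dBFS."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    target = 10 ** (target_db / 20)  # -1 dB is about 0.891 of full scale
    gain = target / peak
    return [s * gain for s in samples]


def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples quieter than threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


left = [0.0, 0.0, 0.2, -0.4, 0.0]
right = [0.0, 0.0, 0.2, -0.2, 0.0]
prepared = normalize_peak(trim_silence(to_mono(left, right)))
print(prepared)
```

Trimming before normalizing keeps the gain calculation focused on the speech itself; in practice you would read and write the samples with an audio library of your choice.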
Audio-to-text pricing vs alternatives
| Service | Cost / 45-min audio | Accuracy | Output formats | Free tier |
|---|---|---|---|---|
| Sermon Transcription (Std) | $0.27 | 99.0–99.5% | .txt .srt .vtt .docx | 10 min free |
| Sermon Transcription (Premium) | $0.90 | 99.5%+ with diarization | .txt .srt .vtt .docx + JSON | 10 min free |
| Rev AI | $11.25 | 90–95% | .txt .srt .vtt + JSON | 5 hours free |
| Rev human | $67.50 | 99%+ | .txt .docx | None |
| Otter Pro | ~$0.64 effective | 90–95% | .txt .docx .srt | 300 min / mo |
| HappyScribe AI | ~$9.00 | 85–92% | .txt .srt .vtt | None |
Pricing as of early 2026. Rev AI $0.25/min; Rev human $1.50/min. Otter Pro $16.99/mo ÷ 1,200 included min × 45 min. HappyScribe AI ~$0.20/min.
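The per-file figures in the table follow directly from the per-minute rates quoted above. A small sketch of that arithmetic, using only the rates stated on this page (the function names are illustrative):

```python
FILE_MINUTES = 45

# Per-minute rates quoted on this page (USD).
per_minute = {
    "Sermon Transcription (Std)": 0.006,
    "Sermon Transcription (Premium)": 0.02,
    "Rev AI": 0.25,
    "Rev human": 1.50,
    "HappyScribe AI": 0.20,
}


def cost_per_file(rate_per_min, minutes=FILE_MINUTES):
    """Cost of one file at a flat per-minute rate."""
    return round(rate_per_min * minutes, 2)


def subscription_cost_per_file(monthly_fee, included_minutes, minutes=FILE_MINUTES):
    """Effective cost when a flat monthly fee covers a pool of minutes."""
    return round(monthly_fee / included_minutes * minutes, 2)


for name, rate in per_minute.items():
    print(f"{name}: ${cost_per_file(rate):.2f}")
print(f"Otter Pro: ${subscription_cost_per_file(16.99, 1200):.2f}")
```

The subscription figure is only an effective rate: it assumes you use the full monthly allowance, so lighter users pay more per file than the table suggests.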
Audio to text FAQ
What audio formats can I convert to text?
MP3, WAV, M4A (AAC), FLAC, OGG, AIFF, and OPUS upload natively. Video formats (MP4, MOV, MKV, WebM, AVI) work too — we strip the audio track on upload. If your format isn't listed, convert it to MP3 with VLC or HandBrake first.
Is it really free to convert audio to text?
Yes — the first 10 minutes of audio are free for every user, no credit card and no signup required. After that, Standard tier is $0.006/min (about $0.27 per 45-minute file) and Premium is $0.02/min (about $0.90).
How accurate is the audio-to-text conversion?
99.0–99.5% on clear voice recordings. The model is biased toward English speech but handles Spanish, French, German, Mandarin, Portuguese, and 90+ other languages with similar accuracy. Heavily accented speech, reverb-heavy rooms, and background music are the main accuracy killers.
How long does audio-to-text conversion take?
Roughly one-tenth of the audio length. A 10-minute audio file converts in about 1 minute. A 45-minute sermon in about 5. A 90-minute podcast episode in about 10.
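As a quick sanity check on those figures, here is the one-tenth rule as a sketch. The 0.10 factor is the approximate ratio stated above, not a guaranteed speed, and actual times will vary with load and file type.

```python
def estimated_processing_minutes(audio_minutes, speed_factor=0.10):
    """Rough processing time, assuming it scales linearly with audio length."""
    return audio_minutes * speed_factor


for minutes in (10, 45, 90):
    estimate = estimated_processing_minutes(minutes)
    print(f"{minutes}-minute file: about {estimate:.1f} min")
```

The rule gives 1, 4.5, and 9 minutes for the three examples, which the answer above rounds to 1, 5, and 10.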
What output formats do I get?
Plain text (.txt) for blog posts and search indexing. SRT (.srt) for video captions. WebVTT (.vtt) for HTML5 video players. Word document (.docx) with timestamps every 30 seconds. JSON with word-level timestamps available on Pro accounts.
Can I batch-convert multiple audio files at once?
Yes — Pro accounts support folder upload. Drop in a folder of mixed MP3/WAV/M4A files and each is processed in parallel. Results land in your dashboard, downloadable individually or as a zipped bundle.
Convert your audio to text now
MP3, WAV, M4A, FLAC, OGG. First 10 minutes free. No signup until you go beyond that.
Start free
Related
- Alternative: Sermon Transcription vs Otter.ai. No file-length cap, no monthly subscription.
- Alternative: Sermon Transcription vs HappyScribe. Higher accuracy on religious content.
- Comparison: Best AI Transcription Software. Six tools tested on real audio.
- Guide: Free vs Paid Sermon Transcription. When free tools cost you more than paid.