Transcribe WAV to Text — Free + Accurate
Upload a WAV file from your digital recorder, mixer, or DAW and get back a timestamped transcript in about 5 minutes for a 45-minute recording. 99.5%+ accuracy on clean voice recordings, including 24-bit 96kHz session files.
Processing: 5 min / 45-min WAV
Accuracy: 99.5%+
Bit depth: 8/16/24/32-bit accepted
What is a WAV file?
WAV (Waveform Audio File Format) is a lossless audio container introduced by Microsoft and IBM in 1991 and still the de facto standard for raw audio in studios, mixers, and field recorders. A WAV file stores audio as uncompressed PCM samples — typically 16-bit at 44.1kHz (CD quality) or 24-bit at 48kHz (broadcast standard).
WAV files are big: about 10MB per minute at 16-bit 44.1kHz stereo. They're the right format for archiving and editing but heavy to upload. For transcription specifically, the lossless quality is overkill: the speech-to-text models used here downsample everything to 16kHz mono before processing. A 64 kbps mono MP3 (~22MB for a 45-minute recording) gives you essentially the same transcript as the 750MB 24-bit/48kHz WAV it came from.
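The size figures above follow from simple arithmetic on uncompressed PCM: sample rate × bytes per sample × channels × seconds. Here is a minimal Python sketch of that calculation. The function name `pcm_size_mb` is ours, it counts decimal megabytes, and it ignores the small WAV header, so treat the results as estimates.

```python
def pcm_size_mb(sample_rate_hz, bit_depth, channels, minutes):
    """Approximate size of uncompressed PCM audio in decimal megabytes.

    Ignores the WAV header and any metadata chunks, which are tiny
    compared with the audio data itself.
    """
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second * minutes * 60 / 1_000_000


# One minute of 16-bit 44.1kHz stereo audio
print(round(pcm_size_mb(44_100, 16, 2, 1), 1))   # 10.6

# A 45-minute recording at 24-bit 48kHz stereo
print(round(pcm_size_mb(48_000, 24, 2, 45)))     # 778
```

Both results land close to the rounded figures quoted on this page (about 10MB per minute, and roughly 750MB for a long 24-bit/48kHz recording).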
Step-by-step: WAV to text
1. Open the transcription tool
   Click /transcribe in the nav. The upload zone accepts .wav directly, with no conversion required.
2. Drag your .wav file into the upload zone
   Up to 25MB on the free tier. If your WAV is a multitrack export from a digital mixer, upload only the mixed stereo bus or the preacher's lavalier track, not the room mic.
3. Over 25MB? Compress first or upgrade
   A 45-minute 16-bit 44.1kHz stereo WAV is ~460MB. Either open it in Audacity and use File → Export → Export as MP3 at 64 kbps mono (~22MB result), or upgrade to Pro for 500MB uploads.
4. Pick Standard or Premium tier
   Standard ($0.006/min, OpenAI Whisper) for single-speaker recordings. Premium ($0.02/min, ElevenLabs) for diarization on multi-speaker WAVs. Both downsample your WAV internally to 16kHz mono before transcribing.
5. Wait about a tenth of the audio length
   A 45-minute WAV finishes in about 5 minutes. You can close the tab and come back: the result is saved to your dashboard and emailed to you when ready.
6. Download .txt, .srt, .vtt, or .docx
   Same output formats whether you uploaded WAV or MP3. Word-level timestamps are included in the .docx and JSON exports (Pro).
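Before uploading, it can help to check whether a WAV will fit under the free-tier cap (steps 2 and 3). The sketch below reads the file's header with Python's standard `wave` module. The helper names and the decision to treat 25MB as 25,000,000 bytes are our assumptions, not something the service documents. Note that `wave` handles integer PCM; other encodings, such as 32-bit float, may raise `wave.Error`.

```python
import wave

# Free-tier cap quoted in the steps above; decimal megabytes is an assumption.
FREE_TIER_LIMIT_BYTES = 25 * 1_000_000


def inspect_wav(path_or_file):
    """Return duration, format, and audio payload size of a PCM WAV file."""
    with wave.open(path_or_file, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
    return {
        "duration_min": frames / rate / 60,
        "sample_rate_hz": rate,
        "bit_depth": sample_width * 8,
        "channels": channels,
        "audio_bytes": frames * channels * sample_width,
    }


def needs_compression(info, limit_bytes=FREE_TIER_LIMIT_BYTES):
    """True if the audio payload alone already exceeds the upload cap."""
    return info["audio_bytes"] > limit_bytes
```

A typical use would be `info = inspect_wav("sermon.wav")` followed by `needs_compression(info)`, where `sermon.wav` is a placeholder file name.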
Audio format & size compatibility
| Format | Free tier max | Pro max | Notes |
|---|---|---|---|
| WAV (16-bit / 44.1kHz) | 25 MB ≈ 2.5 min | 500 MB ≈ 50 min | Native — works at full fidelity |
| WAV (24-bit / 48kHz) | 25 MB ≈ 1.5 min | 500 MB ≈ 30 min | Broadcast standard — works native |
| WAV (24-bit / 96kHz) | 25 MB ≈ 45 sec | 500 MB ≈ 15 min | Studio quality — works native |
| AIFF (Apple) | 25 MB | 500 MB | Same as WAV — accepted |
| FLAC (lossless) | 25 MB ≈ 5–8 min | 500 MB | ~50% smaller than WAV, same fidelity |
| MP3 / M4A / OGG (recommended for size) | 25 MB ≈ 45 min | 500 MB | Best fit if file size matters |
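The "≈ minutes" columns in the table can be approximated by inverting the size arithmetic: divide the cap by the bytes consumed per minute. The sketch below assumes stereo for the WAV rows, decimal megabytes, and a constant 64 kbps bitrate for the MP3 row. The function names are illustrative.

```python
def pcm_minutes_for_cap(cap_mb, sample_rate_hz, bit_depth, channels=2):
    """Minutes of uncompressed PCM audio that fit in cap_mb decimal megabytes."""
    bytes_per_minute = sample_rate_hz * (bit_depth // 8) * channels * 60
    return cap_mb * 1_000_000 / bytes_per_minute


def bitrate_minutes_for_cap(cap_mb, kbps):
    """Minutes of constant-bitrate audio (e.g. MP3) that fit in cap_mb."""
    bytes_per_minute = kbps * 1000 / 8 * 60
    return cap_mb * 1_000_000 / bytes_per_minute


for label, rate, depth in [
    ("16-bit / 44.1kHz", 44_100, 16),
    ("24-bit / 48kHz", 48_000, 24),
    ("24-bit / 96kHz", 96_000, 24),
]:
    print(label, round(pcm_minutes_for_cap(25, rate, depth), 2), "min on free tier")

print("64 kbps MP3", round(bitrate_minutes_for_cap(25, 64), 1), "min on free tier")
```

Under these assumptions the 16-bit/44.1kHz row works out to roughly 2.4 minutes and the 24-bit/96kHz row to roughly 43 seconds, close to the table's rounded figures.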
WAV-specific tips
- Big WAV? Compress to MP3 first; accuracy stays the same. In Audacity: File → Export → Export as MP3 → Quality: 64 kbps, Channel Mode: Mono. Or in ffmpeg:
  ffmpeg -i input.wav -ac 1 -b:a 64k output.mp3
- Recording from a mixer? Use the post-fader mix bus, not the input gain track. Compressed/normalized board mixes transcribe more accurately than raw mic preamp signals.
- Multi-track WAV? Bounce down to a single mono or stereo WAV first in Reaper, Pro Tools, Logic, or Audacity. Choose the mic track that has the cleanest preacher audio — don't use a room mic or audience track.
- Want to keep a lossless archive? Keep the original WAV in your church's storage, but upload an MP3 copy for transcription. You preserve fidelity for any future re-edit while keeping uploads fast.
- Background hum or HVAC noise? A quick Audacity Noise Reduction pass (Effect → Noise Reduction) on the WAV before upload typically lifts accuracy by 1–2 percentage points on noisy recordings.
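If you compress recordings every week, the ffmpeg tip above can be wrapped in a small script. This is a sketch only: it assumes ffmpeg is installed and on your PATH, and the function names are ours. It builds the same invocation shown in the tip (`-ac 1` for mono, `-b:a 64k` for the audio bitrate).

```python
import shutil
import subprocess


def build_ffmpeg_command(src, dst, bitrate="64k"):
    """Build the mono MP3 conversion command from the tip above."""
    return ["ffmpeg", "-i", src, "-ac", "1", "-b:a", bitrate, dst]


def compress_to_mp3(src, dst, bitrate="64k"):
    """Run ffmpeg to convert src (WAV) into a mono MP3 at the given bitrate."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst, bitrate), check=True)
```

Calling `compress_to_mp3("input.wav", "output.mp3")` would then produce the upload copy while leaving the original WAV untouched for your archive.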
WAV transcription pricing vs alternatives
| Service | Cost / 45-min WAV | Accuracy | Max WAV size | Free tier |
|---|---|---|---|---|
| Sermon Transcription (Std) | $0.27 | 99.0–99.5% | 500 MB (Pro) | 10 min free |
| Sermon Transcription (Premium) | $0.90 | 99.5%+ with diarization | 500 MB (Pro) | 10 min free |
| Rev AI | $11.25 | 90–95% | 2 GB | 5 hours free |
| Rev human | $67.50 | 99%+ | 2 GB | None |
| Otter Pro | ~$0.64 effective | 90–95% | 3 GB | 300 min / mo (30-min file cap) |
| HappyScribe AI | ~$9.00 | 85–92% | 2 GB | None |
Pricing as of early 2026. Rev AI $0.25/min; Rev human $1.50/min. Otter Pro $16.99/mo ÷ 1,200 included min × 45 min. HappyScribe AI list price ~$0.20/min.
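The "Cost / 45-min WAV" column is per-minute price times 45 minutes. The sketch below reproduces it from the rates quoted in the footnote and in step 4 of the walkthrough. The dictionary and function names are illustrative, and the Otter figure is an effective rate derived from the monthly plan as described in the footnote.

```python
# Per-minute rates in USD, taken from the pricing footnote and the tier descriptions.
RATES_PER_MIN = {
    "Sermon Transcription (Std)": 0.006,
    "Sermon Transcription (Premium)": 0.02,
    "Rev AI": 0.25,
    "Rev human": 1.50,
    "Otter Pro": 16.99 / 1200,  # monthly price spread over 1,200 included minutes
    "HappyScribe AI": 0.20,
}


def cost_for(minutes, rate_per_min):
    """Cost in USD for a recording of the given length, rounded to cents."""
    return round(minutes * rate_per_min, 2)


for service, rate in RATES_PER_MIN.items():
    print(f"{service}: ${cost_for(45, rate):.2f}")
```

Changing the `45` lets you compare the services for your own typical recording length.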
WAV transcription FAQ
Why use WAV instead of MP3 for transcription?
WAV is lossless — the audio is stored exactly as recorded with no compression. For transcription, this only matters at the edges: very quiet voices, heavily reverberant rooms, or speakers with subtle accents. On clean voice recordings, WAV and MP3 transcribe at nearly identical accuracy. WAV is more relevant if you want to keep a high-fidelity master and have the disk space to spare.
What's the maximum WAV file size I can upload?
25MB on the free tier. WAV files are big: a 45-minute 16-bit 44.1kHz mono recording is about 230MB, which far exceeds that limit. Either compress to MP3 at 64 kbps mono first (drops to ~22MB with no meaningful accuracy loss), or upgrade to Pro for 500MB uploads.
Does it support 24-bit and 96kHz WAV?
Yes. We accept 8/16/24/32-bit WAV at any sample rate from 8kHz to 96kHz, mono or stereo. The transcription engine downsamples internally to 16kHz mono before processing — the higher fidelity isn't useful for speech-to-text but doesn't break anything.
Will compressing my WAV to MP3 hurt accuracy?
No. We tested side-by-side: identical sermon audio at 24-bit/48kHz WAV (~750MB) and 64 kbps mono MP3 (~22MB) produce transcripts that differ by less than 0.1%. The Whisper model is trained on compressed audio and is essentially blind to the difference. Compress freely.
Can I upload a WAV from my church board mixer?
Yes — most digital mixers (Behringer X32, Allen & Heath, Yamaha) export multi-track WAV. Use the mixed-down stereo bus or the dedicated preacher mic track for best accuracy. Avoid using a room mic track as the source; it will pick up congregation noise and lower accuracy.
What about multi-channel WAV files?
Stereo WAV is fine — we sum to mono before transcription. True multi-channel WAV (3+ channels, like 5.1 surround) is accepted but only the first two channels are read. If you have an isolated preacher mic on channel 3, mix it to a stereo or mono file in any DAW first.
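If you don't have a DAW handy, pulling one channel (or a blend of two) out of a multi-channel WAV can be sketched with Python's standard library. This is an illustration under stated assumptions, not the service's own downmix: it handles 16-bit PCM only, the function name `extract_mono` is ours, and it averages the chosen channels rather than literally summing them so the result cannot clip. Channel indexes are zero-based, so an isolated mic on channel 3 is index 2.

```python
import array
import sys
import wave


def extract_mono(src, dst, channel_indexes=(0, 1)):
    """Average the chosen channels of a 16-bit PCM WAV into a mono WAV."""
    with wave.open(src, "rb") as wav_in:
        if wav_in.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav_in.getnchannels()
        rate = wav_in.getframerate()
        samples = array.array("h", wav_in.readframes(wav_in.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    mono = array.array("h")
    # Samples are interleaved: one value per channel, frame after frame.
    for start in range(0, len(samples), n_channels):
        picked = [samples[start + i] for i in channel_indexes]
        mono.append(sum(picked) // len(picked))

    if sys.byteorder == "big":
        mono.byteswap()
    with wave.open(dst, "wb") as wav_out:
        wav_out.setnchannels(1)
        wav_out.setsampwidth(2)
        wav_out.setframerate(rate)
        wav_out.writeframes(mono.tobytes())
```

For example, `extract_mono("multitrack.wav", "preacher.wav", channel_indexes=(2,))` would write only the third channel to a new mono file; both file names are placeholders.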
Upload your WAV. Get text back in 5 minutes.
First 10 minutes free. Lossless audio in, searchable transcript out.