36 Answers

Sermon Transcription FAQ

Every question pastors and church tech teams ask before signing up — answered plainly.

Pricing

How much does sermon transcription cost?

The Standard tier costs $0.006 per minute of audio, about $0.27 for a 45-minute sermon. The Premium tier, which adds speaker identification, costs $0.02 per minute, or $0.90 for the same sermon. The Pro Monthly plan is $29/month and includes 1,000 Standard and 300 Premium minutes. The first 10 minutes are free, with no credit card required.
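The per-sermon figures follow directly from the per-minute rates. As a minimal sketch (the function and dictionary names are illustrative, not part of any published API, and the 10 free minutes are ignored), the arithmetic looks like this:

```python
from decimal import Decimal

# Per-minute pay-as-you-go rates quoted in the answer above, in US dollars.
RATES = {
    "standard": Decimal("0.006"),
    "premium": Decimal("0.02"),
}

def sermon_cost(minutes: float, tier: str = "standard") -> Decimal:
    """Return the pay-as-you-go cost for `minutes` of audio, rounded to cents."""
    rate = RATES[tier]
    return (Decimal(str(minutes)) * rate).quantize(Decimal("0.01"))

if __name__ == "__main__":
    print(sermon_cost(45))             # 0.27
    print(sermon_cost(45, "premium"))  # 0.90
```

Decimal is used instead of float so that fractions of a cent round predictably.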

Is there a free plan?

Yes. Every new account gets 10 minutes of free transcription with no credit card required. After that, pay only for what you use at $0.006 per minute on Standard.

Are there setup fees or minimum charges?

No. There are no setup fees, no monthly minimums, and no contracts on the pay-as-you-go tier. You're charged per minute of audio transcribed.

Do you offer nonprofit or denomination-wide discounts?

Standard pricing is already 250× cheaper than Rev.com's human transcription tier ($0.006 vs. $1.50 per minute). For institutional volume above 5,000 minutes per month, contact hello@sermon-transcription.com for custom pricing.

Is Sermon Transcription cheaper than Rev.com?

Yes. Rev.com charges $1.50/min for human transcription or $0.25/min for AI; Sermon Transcription's Standard tier is $0.006/min. That is 250× cheaper than Rev's human tier and about 42× cheaper than its AI tier. A 45-minute sermon costs $0.27, versus $67.50 for Rev.com human transcription or $11.25 for Rev.com AI.

Accuracy

How accurate is AI sermon transcription?

Standard tier (OpenAI Whisper) achieves about 98% word accuracy on clear audio (a word error rate of about 1.8%). Premium tier (ElevenLabs) achieves 99.5%. Both match or exceed professional human transcribers on typical sermon audio.
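Word error rate (WER) is the usual way to score a transcript: the number of word substitutions, deletions, and insertions needed to turn the transcript into a hand-checked reference, divided by the number of words in the reference. Word accuracy is then 1 minus WER. The sketch below is a generic way to measure it on your own audio, not the benchmark behind the figures above; the function name and the lowercase, whitespace-split tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the transcript
            insertion = curr[j - 1] + 1  # extra word in the transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

if __name__ == "__main__":
    reference = "for by grace you have been saved through faith"
    transcript = "for by grace you have been saved through fate"
    print(word_error_rate(reference, transcript))  # one wrong word out of nine
```

Punctuation is not stripped here, so normalize both texts the same way before comparing them.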

Does it handle theological vocabulary correctly?

Yes. Modern AI transcription handles standard theological terminology (sanctification, propitiation, justification, imputation) and Bible book names (Habakkuk, Ecclesiastes, Zephaniah) accurately. For unusual proper nouns, we use prompt seeding: the names are supplied to the model ahead of time to bias it toward the correct spelling.

What if my audio quality is poor?

AI transcription tolerates reasonable noise, but the best accuracy comes from a lapel mic close to the speaker. If your audio has heavy noise, music bleed, or multiple overlapping speakers, use the Premium tier for better results.

Can it transcribe whispered or quiet sermon moments?

Yes. Both tiers handle quiet passages, prayers, and pastoral asides as long as they're audibly captured by the recording.

Features

What file formats are supported?

MP3, MP4, WAV, M4A, MOV, AAC, FLAC, OGG, WebM, and most other major audio and video container formats. If a standard player can play it, we can transcribe it.

What's the maximum file size?

The underlying APIs limit individual requests to 25 MB. Larger files are split into chunks automatically, so a typical 60-minute sermon at standard quality uploads without any manual splitting.
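How the service splits files internally is not described here. As a rough illustration of the arithmetic involved, the sketch below plans equal-length chunks that each stay under the 25 MB request limit. It assumes a roughly constant bitrate (so size scales with duration) and binary megabytes; the function name and the 10% safety margin are hypothetical.

```python
import math

MAX_REQUEST_BYTES = 25 * 1024 * 1024  # 25 MB per request (binary megabytes assumed)

def plan_chunks(file_bytes: int, duration_s: float, safety: float = 0.9) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds for equal-length chunks under the size limit.

    Assumes a roughly constant bitrate, so file size scales linearly with duration.
    """
    budget = MAX_REQUEST_BYTES * safety  # leave headroom for container overhead
    count = max(1, math.ceil(file_bytes / budget))
    length = duration_s / count
    return [(i * length, min((i + 1) * length, duration_s)) for i in range(count)]

if __name__ == "__main__":
    # A 60-minute MP3 at 128 kbps is 16,000 bytes per second, or 57,600,000 bytes.
    for start, end in plan_chunks(57_600_000, 3600):
        print(f"{start:7.1f}s to {end:7.1f}s")
```

With these numbers the plan is three 20-minute chunks. A production splitter would also move each cut to a nearby pause so that no word is split across two requests.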

What output formats do I get?

Every transcription returns plain text (.txt), SubRip Subtitle (.srt), WebVTT (.vtt), and optional verbose JSON with word-level timestamps. Premium tier output includes speaker labels.
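The two caption formats are close relatives: WebVTT needs a `WEBVTT` header line and uses a period instead of a comma before the milliseconds. Both files are already returned, so conversion is rarely needed; the sketch below (the function name is illustrative) only shows how small the difference is.

```python
import re

# SRT timestamps use a comma before milliseconds (00:01:02,500); WebVTT uses a period.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")

def srt_to_vtt(srt_text: str) -> str:
    """Convert SubRip text to WebVTT: add the header and switch the millisecond separator."""
    lines = []
    for line in srt_text.strip().splitlines():
        if "-->" in line:  # only timing lines are rewritten; caption text is left alone
            line = _SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"

if __name__ == "__main__":
    sample = "1\n00:00:00,000 --> 00:00:02,500\nGrace and peace to you.\n"
    print(srt_to_vtt(sample))
```

The SRT cue numbers are kept as they are, since WebVTT accepts them as optional cue identifiers.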

Does it support speaker identification?

Yes, in the Premium tier (powered by ElevenLabs Audio Intelligence). Standard tier does not include diarization natively.

Can I get word-level timestamps?

Yes. The verbose JSON response includes timestamps for every individual word, enabling karaoke-style captions, exact-quote linking, and rapid clip extraction.
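As an illustration of what word-level timestamps make possible, the sketch below groups words into short SRT cues. The input shape, a list of objects with `word`, `start`, and `end` (in seconds), is an assumption about the verbose JSON; check the actual response for the real field names. The function names are illustrative.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group word-level timestamps into numbered SRT cues of at most `max_words` words."""
    cues = []
    for index, offset in enumerate(range(0, len(words), max_words), start=1):
        group = words[offset:offset + max_words]
        text = " ".join(w["word"].strip() for w in group)
        # Each cue runs from the start of its first word to the end of its last word.
        timing = f"{format_srt_time(group[0]['start'])} --> {format_srt_time(group[-1]['end'])}"
        cues.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(cues) + "\n"

if __name__ == "__main__":
    sample = [
        {"word": "For", "start": 0.0, "end": 0.3},
        {"word": "by", "start": 0.3, "end": 0.5},
        {"word": "grace", "start": 0.5, "end": 1.25},
    ]
    print(words_to_srt(sample, max_words=2))
```

Shorter cues (two or three words) give the karaoke-style effect; longer ones read more like ordinary captions.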

Languages

What languages are supported?

Standard tier supports 90+ languages including Spanish, Portuguese, Korean, Mandarin, Tagalog, Haitian Creole, French, German, Italian, Russian, Arabic, and more. Premium tier supports 100+ languages with auto-detection.

Can it auto-detect the language?

Yes — Premium tier auto-detects the spoken language. On Standard tier, specifying the language explicitly improves accuracy for non-English sermons.

Can it translate sermons?

Not directly. Transcription captures speech in its original language. For translation, pass the transcript to DeepL, ChatGPT, or Claude. Integrated multilingual output is planned for a future release.

Speed

How long does transcription take?

A 45-minute sermon completes in 3–5 minutes. A 90-minute service typically takes 6–10 minutes. We process audio at roughly 10× real-time speed.
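The quoted ranges are consistent with the 10× figure. A one-line estimate, assuming the speedup is constant and ignoring upload and queue time (the function name is illustrative):

```python
def estimated_processing_minutes(audio_minutes: float, speedup: float = 10.0) -> float:
    """Estimate processing time from audio length at a given multiple of real time."""
    return audio_minutes / speedup

if __name__ == "__main__":
    print(estimated_processing_minutes(45))  # 4.5
    print(estimated_processing_minutes(90))  # 9.0
```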

Can I transcribe live during the service?

Not yet. The current service is batch-only and processes recorded audio. For live captioning, consider YouTube auto-captions or Otter.ai's live stream feature, then replace those captions with a proper SRT after the event.

Use Cases

Is this for individual pastors or whole churches?

Both. Solo pastors building sermon archives, multisite networks needing standardized output, and seminaries archiving lectures all use the service. Pricing scales linearly with usage.

Can I use it for podcast transcription?

Absolutely. Sermon podcasters use it for show notes, chapter markers, episode SEO, and accessible transcripts. The cost is the same: about $0.27 per 45-minute episode.

What about archiving old sermon tapes or CDs?

Digitize the recordings to MP3 or WAV, then upload. At $0.27 per 45-minute sermon on the Standard tier, an archive of 150 sermons costs about $40.50, so a project of well over a hundred sermons stays under $50.

Privacy

Is my audio kept private?

Yes. Audio is processed once and deleted from our servers within 24 hours of completion. Transcripts are retained on your account until you delete them. We never share audio or transcripts with third parties.

Is my audio used to train AI models?

No. We operate under OpenAI and ElevenLabs enterprise API terms that explicitly exclude API inputs from model training. Your sermons are not used to train anyone's models.

Is the service GDPR / CCPA compliant?

Yes. Audio is processed in-region where possible, deleted within 24 hours, and never sold. We respond to data deletion requests within 30 days. The full privacy policy is on our /privacy page.

Copyright

Who owns the transcribed text?

You do. The transcript is a derivative of the original audio recording, which is owned by the speaker (pastor) and/or the church. Our terms of service do not claim any ownership of your transcripts.

Can I transcribe a sermon I didn't preach (e.g., a guest speaker)?

Legally, the speaker holds copyright on their spoken work. Transcribing for internal church archive purposes is widely accepted under fair use. For public republishing, get the speaker's written permission.

What about transcribing worship music or hymns?

Worship music typically requires CCLI or similar licensing for reproduction. Public-domain hymns can be freely transcribed. For commercial music, transcripts of lyrics may still fall under copyright.

Accessibility

Do you support ADA-compliant captions?

Yes. SRT and VTT output upload directly to YouTube, Vimeo, and HTML5 video players as closed captions, meeting WCAG 2.1 Level AA captioning requirements.

Are sermons accessible to deaf and hard-of-hearing members with this?

Yes — and this is one of the biggest reasons churches subscribe. Full transcripts serve members who prefer reading; closed captions on video serve those watching with sound off or with hearing loss.

Integration

Can I integrate this with my church website?

Yes. Most churches paste the transcript directly into their CMS as a blog post. We also expose webhooks for automated CMS publishing once you're on the Pro tier.

Do you have an API?

Yes — the Sermon Transcription API is the same API powering our web UI. See our API documentation page for endpoints, authentication, and code samples.

Can I batch upload many sermons?

Yes. The web UI supports drag-and-drop of multiple files. For large archive projects (hundreds of sermons), the API is the cleaner path.

Compare

How is this different from Otter.ai?

Otter.ai is built for meetings — speed and live captioning. Sermon Transcription is built for sermons — theological vocabulary, scripture references, batch-quality output, lower per-minute cost. For weekly sermon publishing, Sermon Transcription is 25× cheaper than Otter's effective per-minute price.

How is this different from Rev.com?

Rev.com is general-purpose transcription. Sermon Transcription is church-specific: it offers theological accuracy and sermon-optimized AI prompts, and it is 250× cheaper than Rev's human tier.

How is this different from Descript?

Descript is a transcription-aware video editor — best if you want to edit video and audio by editing text. Sermon Transcription is a pure transcription service: simpler, faster, and dramatically cheaper for archive workflows.

Still have questions?

Email hello@sermon-transcription.com. We read and respond to every message within one business day.

Or just try it free