Video to Text · MP4 · MOV · MKV · WebM · 10 min free

Transcribe Video to Text Online — Any Format

Upload an MP4, MOV, MKV, WebM, or AVI. We extract the audio, run it through the same 99.5%-accurate transcription engine that powers our sermon tool, and hand you back captions in SRT and VTT plus a clean text transcript. About 5 minutes for a 45-minute video.

Processing: about 5 min per 45-min video

Caption sync: .srt + .vtt

Formats: MP4 · MOV · MKV · WebM

What this is

Video-to-text transcription extracts the speech from your video file and turns it into a written transcript plus time-coded caption files (.srt, .vtt). Only the audio track is transcribed, not the picture, because speech-to-text models work on sound waves, not pixels.

This is the universal video page. For a deeper guide on the specific MP4 workflow see Transcribe MP4 to Text; for YouTube URLs, see YouTube Link to Transcript.

Step-by-step: video to text

  1. Open /transcribe

    Use the link in the top navigation. The same upload zone takes both audio and video, so there is no separate workflow.

  2. Drag your video file in

    MP4, MOV, MKV, WebM, or AVI. Free tier accepts up to 25MB; Pro accepts 500MB. The browser shows upload progress live.

  3. Server extracts the audio

    ffmpeg pulls the audio track out, downsamples to 16kHz mono, and discards the video. You're billed per audio minute, not video minute, so a 4K and 480p file with identical audio cost the same.

  4. Pick Standard or Premium

    Standard ($0.006/min) is fine for single-camera sermon or talking-head video. Premium ($0.02/min) labels who said what — pick it for panel discussions, interviews, or roundtables.

  5. Wait ~5 minutes for a 45-minute video

    Processing time scales with audio duration, not file size. A 45-minute 4K video and a 45-minute 480p video both finish in about 5 minutes.

  6. Download .srt for YouTube, .vtt for the web, .txt for blog

    Upload the .srt to YouTube Studio for instant synced captions. Drop the .vtt into your church website's <video> tag. Use the .txt for a sermon blog post or searchable archive entry.
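If you want to see what step 3 roughly looks like, you can reproduce the extraction locally. This is a minimal sketch, assuming ffmpeg is installed and on your PATH. The helper name `build_extract_command`, the WAV output, and the `pcm_s16le` codec are illustrative assumptions, not a description of the exact server-side pipeline.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def build_extract_command(video_path: str, out_path: Optional[str] = None) -> List[str]:
    """Build an ffmpeg argv that keeps only the audio as 16 kHz mono WAV.

    Hypothetical helper: the output container and codec are assumptions.
    """
    src = Path(video_path)
    dst = Path(out_path) if out_path else src.with_suffix(".wav")
    return [
        "ffmpeg",
        "-i", str(src),        # input video file
        "-vn",                 # discard the video stream
        "-ac", "1",            # downmix to one channel (mono)
        "-ar", "16000",        # resample to 16 kHz
        "-c:a", "pcm_s16le",   # 16-bit PCM audio (assumed target)
        str(dst),
    ]


if __name__ == "__main__":
    # Prints the command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(build_extract_command("sermon.mp4")))
```

Because the video stream is discarded before anything else happens, the resolution of the source file never reaches the transcription step.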

Video format compatibility

| Format | Free tier max | Pro max | Typical source |
| --- | --- | --- | --- |
| MP4 (H.264 + AAC) | 25 MB | 500 MB | YouTube Studio, OBS, Premiere |
| MOV (QuickTime) | 25 MB | 500 MB | iPhone, Mac, Final Cut Pro |
| MKV (Matroska) | 25 MB | 500 MB | OBS Studio, Plex |
| WebM | 25 MB | 500 MB | YouTube downloads, browser recordings |
| AVI | 25 MB | 500 MB | Older Windows video editors |
| FLV / WMV | 25 MB | 500 MB | Legacy formats; convert if you get errors |
| 3GP | 25 MB | 500 MB | Old phones; convert to MP4 first |
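You can check a file against these limits before you upload it. The sketch below is illustrative only: `check_upload`, `TIER_LIMITS_MB`, and the return labels are hypothetical names, and it assumes 1 MB means 1,000,000 bytes, which this page does not specify.

```python
from pathlib import PurePath

# Per-file limits from the compatibility table, in megabytes.
TIER_LIMITS_MB = {"free": 25, "pro": 500}
# Formats listed without a conversion note.
NATIVE = {".mp4", ".mov", ".mkv", ".webm", ".avi"}
# Formats the table flags as legacy or convert-first.
MAY_NEED_CONVERSION = {".flv", ".wmv", ".3gp"}


def check_upload(filename: str, size_bytes: int, tier: str = "free") -> str:
    """Classify a candidate upload as ok, may_need_conversion, too_large or unsupported."""
    ext = PurePath(filename).suffix.lower()
    if ext not in NATIVE | MAY_NEED_CONVERSION:
        return "unsupported"
    # Assumes decimal megabytes; the real cutoff may differ slightly.
    if size_bytes > TIER_LIMITS_MB[tier] * 1_000_000:
        return "too_large"
    return "may_need_conversion" if ext in MAY_NEED_CONVERSION else "ok"
```

For example, a 300 MB MP4 comes back as `too_large` on the free tier and `ok` on Pro.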

Video transcription tips

  • Video over 25MB on free tier? Extract audio to MP3 first with ffmpeg:
      ffmpeg -i video.mp4 -vn -ac 1 -b:a 64k audio.mp3
    That gets a 45-min sermon under 25MB. Upload the .mp3 instead.
  • No command line? Use VLC (free): Media → Convert/Save → add the video → choose the "Audio - MP3" profile → Start.
  • Want YouTube chapters? Use the .srt timestamps to mark natural section breaks in the sermon, then paste them into your YouTube description, one per line (e.g., "3:42 The big idea"). Start the list at 0:00 and include at least three timestamps; YouTube then detects them and creates chapter markers.
  • Burned-in vs sidecar captions? Most platforms (YouTube, Vimeo, Squarespace, WordPress with the right plugin) take sidecar .srt or .vtt files and overlay captions live. Only burn captions in (with HandBrake or DaVinci Resolve) if you're distributing the raw .mp4 standalone.
  • Want to clip a section? Trim the video first in iMovie, CapCut, or Premiere to just the section you want transcribed. You pay only for the audio minutes you actually upload — trim aggressively for a tighter, cheaper transcript.
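For the YouTube chapters tip, here is one way to turn .srt timestamps into description lines. It is a small sketch that assumes the standard SRT timestamp shape `HH:MM:SS,mmm`; the function names and the sample titles are made up for illustration.

```python
from typing import Iterable, Tuple


def srt_time_to_chapter(timestamp: str) -> str:
    """Convert an SRT timestamp such as '00:03:42,500' to 'M:SS' or 'H:MM:SS'."""
    clock = timestamp.split(",")[0]  # drop the milliseconds
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


def chapter_lines(marks: Iterable[Tuple[str, str]]) -> str:
    """Join (srt_timestamp, title) pairs into one chapter per line."""
    return "\n".join(f"{srt_time_to_chapter(ts)} {title}" for ts, title in marks)


if __name__ == "__main__":
    print(chapter_lines([
        ("00:00:00,000", "Welcome"),
        ("00:03:42,500", "The big idea"),
        ("01:02:05,000", "Closing prayer"),
    ]))
```

Paste the printed lines into the video description. You still pick the section breaks and titles yourself; the code only reformats the timestamps.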

The video-to-text workflow

.mp4 / .mov → ffmpeg (audio extracted) → AI (5 min / 45-min video) → .srt · .vtt · .txt

Video transcription pricing vs alternatives

| Service | Cost / 45-min video | Accuracy | Caption files | Free tier |
| --- | --- | --- | --- | --- |
| Sermon Transcription (Std) | $0.27 | 99.0–99.5% | .srt .vtt .txt .docx | 10 min free |
| Sermon Transcription (Premium) | $0.90 | 99.5%+ with diarization | .srt .vtt .txt .docx | 10 min free |
| YouTube auto-captions | Free | 75–88% | Built-in only | Unlimited |
| Rev AI | $11.25 | 90–95% | .srt .vtt | 5 hours free |
| Rev human captioning | $67.50 | 99%+ | .srt .vtt + burned-in | None |
| HappyScribe AI | ~$9.00 | 85–92% | .srt .vtt | None |

Pricing as of early 2026. Rev AI $0.25/min; Rev human $1.50/min. HappyScribe AI ~$0.20/min. YouTube auto-captions are free but limited to YouTube videos and rarely meet WCAG accessibility standards.
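The 45-minute figures are just the per-minute rate times the audio length. Here is a quick sketch for other durations, using the rates quoted on this page. The dictionary keys and the rounding to whole cents are assumptions, and actual invoices may round differently.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute rates quoted on this page, in USD.
RATES = {
    "standard": Decimal("0.006"),
    "premium": Decimal("0.02"),
    "rev_ai": Decimal("0.25"),
    "rev_human": Decimal("1.50"),
    "happyscribe_ai": Decimal("0.20"),
}


def cost(audio_minutes, service: str) -> Decimal:
    """Return rate x minutes, rounded to cents (rounding rule is an assumption)."""
    total = RATES[service] * Decimal(str(audio_minutes))
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for name in RATES:
        print(f"{name}: ${cost(45, name)}")
```

Decimal arithmetic is used so that small rates such as $0.006/min do not pick up floating-point noise.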

Video to text FAQ

Which video formats can I transcribe?

MP4 (H.264/H.265), MOV (QuickTime), MKV (Matroska), WebM, AVI, and FLV all upload natively. We strip the audio track using ffmpeg on our end and transcribe only the audio. If a format isn't accepted, run it through HandBrake (free) to MP4 first.

Do I need to extract audio before uploading?

No — we do that step server-side. You upload the video, and the audio track is extracted, downsampled to 16kHz mono, and fed to the transcription engine. The video is dropped (no video minutes billed; only audio minutes).

What's the maximum video file size?

25MB on the free tier, which is about 5 minutes of 720p MP4. A typical 45-minute sermon video at 720p is 200–500MB. To transcribe full-length video on the free tier, extract the audio to MP3 first (see the tips above), or upgrade to Pro for 500MB per file.
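To see why an audio-only MP3 fits where the video does not, you can estimate the file size from the bitrate. This is a rough sketch that assumes constant-bitrate encoding, 1 MB = 1,000,000 bytes, and the 64 kbps mono setting from the ffmpeg tip on this page. The function names are illustrative, and real files carry a little container overhead.

```python
def mp3_size_mb(minutes: float, bitrate_kbps: int = 64) -> float:
    """Approximate size in MB of a constant-bitrate MP3 of the given length."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


def max_minutes(limit_mb: float = 25, bitrate_kbps: int = 64) -> float:
    """Longest recording, in minutes, that stays under limit_mb at this bitrate."""
    return limit_mb * 1_000_000 * 8 / (bitrate_kbps * 1000 * 60)


if __name__ == "__main__":
    print(f"45 min at 64 kbps: {mp3_size_mb(45):.1f} MB")
    print(f"Longest file under 25 MB: {max_minutes():.0f} min")
```

At 64 kbps a 45-minute sermon comes to about 21.6 MB, and the 25MB limit is reached at roughly 52 minutes. For longer recordings, lower the bitrate or split the file.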

Will I get .srt and .vtt caption files back?

Yes. Every video transcription produces an SRT file (for YouTube, Vimeo, and video editors) and a WebVTT file (for HTML5 <video> on the web), plus plain text (.txt) and a Word doc (.docx) with timestamps every 30 seconds.
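You get both files, so you never need to convert between them. It still helps to know how close the two formats are: WebVTT starts with a `WEBVTT` header and uses a dot instead of a comma before the milliseconds. The sketch below shows that relationship for simple caption files. `srt_to_vtt` is a hypothetical helper, not the service's own converter, and it ignores styling and positioning.

```python
import re

# Matches the HH:MM:SS,mmm timestamps used on SRT timing lines.
_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SubRip text to WebVTT (header plus dot-separated milliseconds)."""
    lines = []
    for line in srt_text.replace("\r\n", "\n").strip().split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten: 00:00:01,000 -> 00:00:01.000
            line = _TIMING.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:04,000\nGood morning, church.\n"
    print(srt_to_vtt(sample))
```

The SRT cue numbers are kept because WebVTT accepts them as optional cue identifiers.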

How does video accuracy compare to audio?

Identical — because we transcribe only the audio track, video resolution has zero effect on accuracy. A 480p webcam recording with a good lavalier mic transcribes more accurately than a 4K cinema rig with a distant shotgun mic. Mic placement and signal quality are what matter.

Can I add the captions to my YouTube video?

Yes. Download the .srt, go to YouTube Studio → your video → Subtitles → Add language → English → Upload file → Select with timing → choose the .srt. Captions appear within minutes and remain editable in YouTube's caption editor.

Drop your video. Get captions back.

First 10 minutes free. SRT and VTT generated automatically. Drop into YouTube Studio for instant captions.

Upload video
