Free · SRT and VTT · Browser-only

Text to SRT Converter

Paste a plain sermon transcript, enter the total duration, and get a valid SRT or WebVTT caption file. Cues are evenly distributed across the runtime — perfect for YouTube, Vimeo, or your church streaming platform.

Skip the manual splitting

Upload your sermon audio — we generate timed SRT and VTT with word-level precision.

Start Free

How the conversion works

  1. Tokenize and chunk. The transcript is split on whitespace into individual words, then grouped into cues of the size you specify (6 to 10 words per cue is standard for sermon captions, and 8 is a common choice).
  2. Distribute time evenly. The total duration is divided by the number of cues, producing a uniform per-cue interval. Cues run back to back, each starting where the previous one ends, with timestamps in HH:MM:SS,mmm format for SRT or HH:MM:SS.mmm for WebVTT.
  3. Format and emit. SRT cues are numbered starting at 1, with a blank line between them. The WebVTT variant adds the required WEBVTT header. Both formats validate in YouTube Studio, Vimeo, and FFmpeg.
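
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the function names (format_timestamp, text_to_cues, to_srt, to_vtt) are hypothetical, and the sketch assumes the last cue ends exactly at the total duration.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for WebVTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def text_to_cues(transcript: str, duration: float, words_per_cue: int = 8):
    """Return (start, end, text) tuples spread evenly over the duration."""
    words = transcript.split()  # step 1: tokenize on whitespace
    chunks = [
        " ".join(words[i:i + words_per_cue])
        for i in range(0, len(words), words_per_cue)
    ]
    if not chunks:
        return []
    interval = duration / len(chunks)  # step 2: uniform per-cue interval
    return [(i * interval, (i + 1) * interval, text)
            for i, text in enumerate(chunks)]


def to_srt(cues) -> str:
    """Step 3 for SRT: cues numbered from 1, separated by a blank line."""
    blocks = [
        f"{n}\n{format_timestamp(s, ',')} --> {format_timestamp(e, ',')}\n{text}"
        for n, (s, e, text) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


def to_vtt(cues) -> str:
    """Step 3 for WebVTT: same cues behind the required WEBVTT header."""
    blocks = [
        f"{format_timestamp(s, '.')} --> {format_timestamp(e, '.')}\n{text}"
        for s, e, text in cues
    ]
    return "WEBVTT\n\n" + "\n\n".join(blocks) + "\n"
```

Calling to_srt(text_to_cues(transcript, 1800.0)) would produce an SRT body for a 30-minute recording. The sketch does not escape characters that WebVTT treats specially in cue text (such as & and <), so a production version would need that step.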

When even distribution is good enough

Word-perfect timing requires the original audio. Without it, no tool can know exactly when the preacher paused or sped up. But for many use cases (preview captions for promotional clips, an accessibility fallback when a sync file is lost, social-media reels where the audio is muted by default), uniformly distributed cues are usually close enough that viewers rarely notice the drift. Variations in pace tend to average out across a sermon. At 8 words per cue and a typical 135-WPM delivery, each cue lasts about 3.6 seconds, and each caption lands within about one second of the actual speech 80% of the time.
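
To make the arithmetic behind those figures concrete, here is a small sketch that estimates word count, cue count, and per-cue interval from a duration. The function name cue_stats is hypothetical, and the 30-minute runtime in the usage note is an assumed example. The 135 WPM and 8 words per cue defaults come from the paragraph above.

```python
import math


def cue_stats(duration_s: float, wpm: float = 135, words_per_cue: int = 8):
    """Estimate (words, cues, seconds_per_cue) for a talk of the given length."""
    words = round(duration_s / 60 * wpm)      # words spoken at a steady pace
    cues = math.ceil(words / words_per_cue)   # a short final cue still counts
    if cues == 0:
        return words, 0, 0.0
    return words, cues, duration_s / cues     # uniform interval per cue
```

For an assumed 30-minute sermon, cue_stats(1800) gives 4050 words, 507 cues, and roughly 3.55 seconds per cue. It says nothing about how far any single cue drifts from the audio, which depends on the speaker's actual pacing.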

SubRip (SRT) remains the most widely supported caption format. It works in most major video editors, YouTube Studio, Vimeo's caption uploader, and OBS Studio. WebVTT is the format browsers expect for HTML5 video text tracks, and it adds optional styling. For sermon use the two formats are nearly interchangeable, which is why we generate both with a single click. If your church streams to multiple platforms, keep an SRT for legacy systems and a VTT for your custom player.
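
To show how close the two formats are for plain captions, the sketch below converts a simple SRT file to WebVTT by adding the header and swapping the millisecond separator on timing lines. The name srt_to_vtt is hypothetical, and the sketch assumes well-formed SRT input with no styling tags. The SRT cue numbers are left in place, since WebVTT accepts them as optional cue identifiers.

```python
import re

# Matches the start of an SRT timing line, e.g. "00:00:05,000 --> 00:00:10,000".
TIMING = re.compile(
    r"^(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt: str) -> str:
    """Convert plain SRT text to WebVTT text."""
    out = ["WEBVTT", ""]  # required header followed by a blank line
    for line in srt.strip().splitlines():
        # Only timing lines match, so commas inside caption text are untouched.
        out.append(TIMING.sub(r"\1.\2 --> \3.\4", line))
    return "\n".join(out) + "\n"
```

Going the other way is similar: drop the header and swap the period back to a comma on timing lines, after removing any WebVTT-only features such as cue settings or styling.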

For audio-synced precision, use the full transcription pipeline, which produces word-level timestamps from your actual audio. This text-to-SRT tool is best understood as a fast caption scaffold, useful when you have the manuscript and the runtime but not the original recording.
