Rip DVD Subtitles to SRT
A decade or two ago I authored my own DVD for fun and
embedded subtitles in it. I now want to “rip” those
subtitles to the SRT format rather
than recreate them from scratch.
Note: This process relies on OCR and is not guaranteed to work well.
Using mencoder, the subtitles can be ripped to the
split .sub and .idx “VobSub”
format.
mkdir -p /tmp/subtitles
cd /tmp/subtitles
title=1
subtitle_id=0
mencoder \
  -dvd-device /home/my-user/my-dvd.iso \
  "dvd://${title}" \
  -nosound \
  -ovc copy -o /dev/null \
  -sid "${subtitle_id}" \
  -vobsubout my-subtitles
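The title and subtitle_id values above are specific to my DVD. If it is unclear which subtitle IDs a title carries, something like the following may help. This is a rough sketch, assuming mplayer is installed and that its -identify output contains lines of the form ID_SID_<n>_LANG=<code>; the list_sids helper is a name made up for illustration.

```shell
# Print "<sid> <language>" for each ID_SID_<n>_LANG=<code> line on stdin.
# (list_sids is a hypothetical helper, not part of mplayer.)
list_sids() {
    sed -n 's/^ID_SID_\([0-9][0-9]*\)_LANG=\(.*\)$/\1 \2/p'
}

iso=/home/my-user/my-dvd.iso
title=1

# Only query the disc image if mplayer and the image are actually present.
if command -v mplayer >/dev/null 2>&1 && [ -f "$iso" ]; then
    mplayer -dvd-device "$iso" "dvd://${title}" \
        -identify -frames 0 -vo null -ao null 2>/dev/null | list_sids
fi
```

The number in the first column of each printed line is the value to try for subtitle_id.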
Two files should now exist in the current directory:
my-subtitles.sub and my-subtitles.idx.
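A quick way to confirm that both files were written is a small check like the one below. It is only a sketch: check_vobsub is a hypothetical helper name, and it tests nothing beyond each file existing and being non-empty.

```shell
# Report whether both VobSub files for base name $1 exist and are non-empty.
# (check_vobsub is a hypothetical helper name.)
check_vobsub() {
    base=$1
    status=0
    for ext in sub idx; do
        if [ -s "${base}.${ext}" ]; then
            echo "ok: ${base}.${ext}"
        else
            echo "missing or empty: ${base}.${ext}" >&2
            status=1
        fi
    done
    return "$status"
}

check_vobsub my-subtitles || echo "check the mencoder output above" >&2
```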
Then vobsub2srt and tesseract can be
used to convert those subtitle files to an English-language
.srt subtitle file.
vobsub2srt uses tesseract, which applies OCR
(optical character recognition) to “read” the bitmap
subtitle images from the DVD and convert them into plain
text. As noted above, OCR is error-prone, so the result
will almost certainly be less accurate than desired.
This assumes the tesseract-data-eng package (or a similar
package that provides the desired language) is installed.
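Whether tesseract can actually see the language data can be checked before converting. The sketch below assumes the installed tesseract supports the --list-langs option; lang_listed is a hypothetical helper name.

```shell
# Succeed if the language code in $1 appears as a whole line on stdin.
# (lang_listed is a hypothetical helper name.)
lang_listed() {
    grep -qxF -- "$1"
}

lang=eng
if command -v tesseract >/dev/null 2>&1; then
    # Some tesseract releases print the language list on stderr, hence 2>&1.
    if tesseract --list-langs 2>&1 | lang_listed "$lang"; then
        echo "tesseract has '${lang}' data"
    else
        echo "tesseract has no '${lang}' data; install the language package first" >&2
    fi
fi
```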
vobsub2srt --tesseract-lang eng my-subtitles
Open the resulting my-subtitles.srt file to check
how accurate the recognized subtitle text is.
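For a first look without opening an editor, the cues can be counted and the first few printed. This sketch assumes the usual SRT layout, in which every cue has a timing line such as 00:00:01,000 --> 00:00:02,500; count_cues is a hypothetical helper name.

```shell
# Count cues by counting SRT timing lines ("HH:MM:SS,mmm --> HH:MM:SS,mmm").
# (count_cues is a hypothetical helper name.)
count_cues() {
    grep -c -E '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> ' "$1"
}

srt=my-subtitles.srt
if [ -f "$srt" ]; then
    echo "cues: $(count_cues "$srt")"
    # Show the first few cues for a quick read-through.
    head -n 12 "$srt"
fi
```

A cue count far below what the DVD actually contains is a hint that the rip or the OCR step dropped subtitles.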