Rip DVD Subtitles to SRT
A decade or two ago I authored my own DVD for fun and embedded
subtitles in it. I now want to “rip” those subtitles to the SRT
format rather than recreate them from scratch.
Note: This process is not guaranteed to work well.
Using mencoder, the subtitles can be ripped to the split .sub and .idx “VobSub” format.
mkdir -p /tmp/subtitles
cd /tmp/subtitles
title=1        # DVD title to read
subtitle_id=0  # subtitle stream to extract
mencoder \
  -dvd-device /home/my-user/my-dvd.iso \
  "dvd://${title}" \
  -nosound \
  -ovc copy -o /dev/null \
  -sid "${subtitle_id}" \
  -vobsubout my-subtitles
Two files should now exist in the current directory: my-subtitles.sub and my-subtitles.idx.
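A quick way to confirm that step worked is to test that both files exist and are not empty. This is only a minimal sketch; the helper name check_vobsub is made up for illustration and is not part of mencoder.

```shell
# Hypothetical helper: succeed only if both VobSub files exist and are non-empty.
# $1 is the base name passed to -vobsubout (without extension).
check_vobsub() {
    base="$1"
    for ext in sub idx; do
        if [ ! -s "${base}.${ext}" ]; then
            echo "missing or empty: ${base}.${ext}" >&2
            return 1
        fi
    done
    echo "ok: ${base}.sub and ${base}.idx"
}
```

Run from the output directory, `check_vobsub my-subtitles` should print the "ok" line. An empty .sub file usually means the chosen title or subtitle ID has no subtitle stream, so trying a different value for subtitle_id may help.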
Then vobsub2srt and tesseract can be used to convert those subtitle files to an English-language .srt subtitle file.
vobsub2srt uses tesseract, which uses OCR (optical character
recognition) to “read” the bitmap subtitle images from the DVD and
try to convert them into plain text. Note that, as stated above, the
OCR process is error-prone and will almost certainly not work as
well as desired.
This assumes the tesseract-data-eng package (or a similar package
that provides the desired language) is installed.
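One way to check that assumption is to ask tesseract which languages it can see. The sketch below assumes `tesseract --list-langs` prints one language code per line after a header line; the helper name has_lang is hypothetical.

```shell
# Hypothetical helper: succeed if the language code in $1 appears
# as a whole line in the text read from standard input.
has_lang() {
    grep -q -x -F "$1"
}
```

A possible use is `tesseract --list-langs 2>&1 | has_lang eng && echo "eng is available"`. The `2>&1` is there because some tesseract versions appear to print the list on standard error. With the language data in place, the conversion can be run: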
vobsub2srt --tesseract-lang eng my-subtitles
Open the resulting my-subtitles.srt file to check how accurate the
recognized subtitle text is.
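Besides reading the file by eye, a rough automated pass can help spot trouble. The sketch below counts cues by matching SRT timing lines and lists lines containing characters that rarely occur in dialogue, which are often a sign of an OCR misread. The function names and the choice of "suspicious" characters are assumptions, not anything vobsub2srt provides.

```shell
# Count subtitle cues by counting SRT timing lines
# of the form HH:MM:SS,mmm --> HH:MM:SS,mmm.
count_cues() {
    grep -c -E '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}' "$1"
}

# Print (with line numbers) lines containing characters that are
# unusual in spoken dialogue; the character set is a guess to adjust.
suspicious_lines() {
    grep -n -E '[|~^_{}]' "$1"
}
```

For example, `count_cues my-subtitles.srt` gives a number to compare against the expected amount of dialogue, and `suspicious_lines my-subtitles.srt` gives a short list of lines worth correcting by hand first.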