Posted by Jukka Aho on 06/15/06 01:53
Veli-Pekka Tätilä wrote:
> John Howells wrote:
>
>> Subtitles are encoded bitmaps, and do not exist on the DVD in a
>> textual form
> Ah, bad news but good to know. Isn't such a representation pretty
> inefficient, by the way, compared to Unicode text data?
Unicode-based textual subtitle data would make it necessary to include a
full Unicode font (however that is defined) in the firmware of each DVD
player. Not only that, but you would need to support combining
diacritics and other advanced, hard-to-program features. Bitmaps, on the
other hand, allow for flexible subtitling in Chinese, Arabic, Hindi,
Korean, etc. - even imaginary scripts such as Klingon.
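
To give a rough idea of why combining diacritics complicate things, here is a small Python sketch (my own illustration, nothing to do with how any DVD player works). The same visible letter can arrive as one code point or as a base letter plus a combining mark, and a text renderer has to cope with both:

```python
import unicodedata

precomposed = "\u00e4"   # "ä" as a single code point
decomposed = "a\u0308"   # "a" followed by COMBINING DIAERESIS

# The two strings look identical on screen but differ as code point sequences.
same_raw = precomposed == decomposed

# Normalizing to NFC composes the pair into the single code point.
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

print(same_raw, same_nfc, len(decomposed))
```

A bitmap subtitle sidesteps all of this, since the authoring system has already drawn the glyphs.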
> Well I guess that's not an issue on DVD disks, now that I think of it,
> especially if it is a black and white image i.e. a true bitmap.
DVD subtitling allows four colors - or, in practice, three colors, since
one of the four is reserved for marking the transparent areas through
which you can see the actual video. Usually white text with black
outlines is used, for the simple reason that this color scheme keeps
the text visible regardless of whether the video in the background is
dark or bright in a given scene.
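
As a minimal sketch of the idea (not the actual DVD subpicture format), the overlay can be pictured as a grid of palette indices 0-3 laid over the video frame. The names TRANSPARENT, PALETTE and composite are made up for this illustration, and treating index 0 as the see-through one is an assumption:

```python
# Assumption for this sketch: palette index 0 marks transparent pixels.
TRANSPARENT = 0

# Hypothetical palette: white text, black outline, grey for anti-aliasing.
PALETTE = {
    1: (255, 255, 255),
    2: (0, 0, 0),
    3: (128, 128, 128),
}

def composite(video, subtitle):
    """Overlay a 4-colour subtitle bitmap on a video frame of the same size.

    Both arguments are lists of rows; video pixels are RGB tuples and
    subtitle pixels are palette indices in the range 0-3.
    """
    out = []
    for video_row, sub_row in zip(video, subtitle):
        out.append([
            v if s == TRANSPARENT else PALETTE[s]
            for v, s in zip(video_row, sub_row)
        ])
    return out

frame = [[(10, 20, 30)] * 3]
overlay = [[0, 1, 2]]
print(composite(frame, overlay))
```

Where the overlay holds index 0 the video pixel shows through; everywhere else the palette colour replaces it.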
> I really lack the expertise to do this, but would domain specific
> optical character recognition work here? Are the fonts standard or do
> they vary in the sub-titles? I'm thinking that a specially tuned
> OCR program might be able to turn the bitmaps into text.
The fonts vary in their shape and design, but OCR applications actually
_do_ exist for this purpose. One of them is SubRip:
<http://zuggy.wz.cz/dvd.php>
<http://www.afterdawn.com/guides/archive/rip_subtitles_with_subrip.cfm>
> Either in real time if the machine can handle it or as a one-shot
> process writing out the cached data on disk for future reference.
SubRip is not a real-time tool, so it's a one-shot process if you use
that.
> OK now that the disk format is out of the way, how about the subtitle
> files with the sub extension? I do realize these are most often used
> with ripped DVDs but that's not what I'm thinking. I'd gladly use them
> with the original disks just to get at the text itself more easily. I
> took a look at a couple of files and they do seem like timestamped
> text. Any players which do the rendering accessibly using the OS's GUI
> controls and font rendering engine?
Practically all modern (software) video players support or can be made
to support *.sub and other subtitle formats. There is a DirectX filter
called "VobSub" which you can install on a Windows system to be able to
play back video files together with their subtitle files - even on
Windows Media Player. See the following link for more information:
<http://www.weethet.nl/english/video_vobsub_subtitles.php>
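
If the goal is just to get at the text, a textual *.sub file is easy to read with a few lines of script. The sketch below assumes the MicroDVD-style layout, where each line looks like {start_frame}{end_frame}text and "|" separates the lines of one subtitle. Your files may use a different *.sub variant, and the 25 fps default is only a guess that has to match the video. The function name parse_microdvd is made up for this example:

```python
import re

# One MicroDVD-style cue per line: {start_frame}{end_frame}text
LINE_RE = re.compile(r"^\{(\d+)\}\{(\d+)\}(.*)$")

def parse_microdvd(text, fps=25.0):
    """Return a list of (start_seconds, end_seconds, [lines]) tuples.

    Lines that do not match the assumed pattern are skipped.
    """
    cues = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        start = int(match.group(1))
        end = int(match.group(2))
        body = match.group(3)
        # "|" marks a line break inside a single subtitle.
        cues.append((start / fps, end / fps, body.split("|")))
    return cues

sample = "{25}{75}Hello|world\nnot a cue\n{100}{150}Bye"
for start, end, lines in parse_microdvd(sample):
    print(f"{start:.1f}-{end:.1f}: {' / '.join(lines)}")
```

The resulting plain text can then be handed to whatever reader or display you prefer.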
--
znark