


mediacaptain

At FX Palo Alto Laboratory, I was solely responsible for my own project, the "mediacaptain", which was approved immediately. It is a system that facilitates indexing and browsing of streaming media over the Web, using speech analysis, speech recognition, and links to textual and graphical content. I managed the project from the initial idea through development to evaluation.

It works like this: you upload your video, and it is automatically converted to all major streaming formats. A website is also created. You can either supply additional content to link to the video (such as graphics or diagrams) or let the system generate it automatically. In the latter case, text is extracted from the video via automatic speech recognition. If you have used the system to record TV programs, the provided closed captions or subtitles are used as additional textual content.

This data is then used to segment the video: if there is a long pause in the audio, or the text contains a paragraph break, it can be assumed that the video also changes scenes at that point. (Initial studies have shown this.) At each of these index points, a frame is extracted from the video. From these frames, a web page is automatically created that contains the video together with the text or graphics. When the user moves the mouse over a paragraph, the corresponding frame is displayed on top of the video window, giving a visual indication of where the video would start. This avoids time-consuming connecting to the server and seeking through the video, and it provides access to predetermined index points, supported by textual and visual content, all on one web page.
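The segmentation rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used in mediacaptain: the `Word` record, the `index_points` function, and the 2-second `min_pause` threshold are all assumptions made for the example, and the input is assumed to be time-stamped words from speech recognition or closed captions.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized or captioned word with its timing (hypothetical format)."""
    text: str
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    starts_paragraph: bool = False  # True if the text source marks a paragraph break here


def index_points(words, min_pause=2.0):
    """Return the timestamps (in seconds) at which a new segment is assumed to begin.

    A new segment starts at the first word, after any pause in the audio of at
    least `min_pause` seconds, and at any word flagged as starting a paragraph.
    The 2.0 s default is an arbitrary illustrative value.
    """
    if not words:
        return []
    points = [words[0].start]
    for prev, cur in zip(words, words[1:]):
        pause = cur.start - prev.end
        if pause >= min_pause or cur.starts_paragraph:
            points.append(cur.start)
    return points


# Example: a 3-second pause before "next" and a paragraph break at "chapter".
words = [
    Word("hello", 0.0, 0.5),
    Word("world", 0.6, 1.0),
    Word("next", 4.0, 4.5),
    Word("chapter", 4.6, 5.0, starts_paragraph=True),
]
points = index_points(words)
```

Each returned timestamp would then be the position at which a frame is extracted and linked to the corresponding paragraph on the generated page.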

The project page is at mediacaptain.com.




 