Achievements of Florian Floyd Mueller

mediacaptain

At FX Palo Alto Laboratory, I was responsible for the "mediacaptain", a system that facilitates indexing and browsing of streaming media over the Web. It combines speech analysis and speech recognition with links to textual and graphical content.

It works like this: You upload your video, and a website is created for it. You can either specify additional content that you would like to link (such as graphics or diagrams) or let the system do the linking automatically. In the automatic case, text is extracted from the video via automatic speech recognition. If you have used the system to record TV programs, the provided closed captions or subtitles are used as additional textual content.

This data is then used to segment the video: If there is a long pause in the audio, or the text contains a paragraph break, the video is assumed to change scenes at that point as well. (Initial studies support this assumption.) At each of these index points, a frame is extracted from the video. From these frames, a website is automatically created that contains the video as well as the text or graphics. When the user moves the mouse over a paragraph, the corresponding frame is displayed on top of the video window, giving the user a visual indication of where the video would start. This avoids the time-consuming process of connecting to the server and seeking through the video, and it provides access to predetermined index points, supported by textual and visual content, all on one webpage.
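The segmentation rule described above can be illustrated with a minimal sketch. This is not the mediacaptain code: the `Word` record, the `find_index_points` function, and the 2-second `min_pause` threshold are all assumptions made for illustration, and the input is assumed to be time-aligned words from speech recognition or closed captions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One time-aligned word (hypothetical record, not from the original system)."""
    text: str
    start: float                    # start time in seconds
    end: float                      # end time in seconds
    starts_paragraph: bool = False  # True if a paragraph break precedes this word


def find_index_points(words: List[Word], min_pause: float = 2.0) -> List[float]:
    """Return the start times (in seconds) of the detected segments.

    A new segment begins at a word if the silence before it lasts at least
    `min_pause` seconds, or if the word starts a new paragraph. The threshold
    value is an assumption; the source does not state one.
    """
    if not words:
        return []

    # The video always starts with a segment at the first word.
    points = [words[0].start]
    for previous, current in zip(words, words[1:]):
        pause = current.start - previous.end
        if pause >= min_pause or current.starts_paragraph:
            points.append(current.start)
    return points


# Example input: a long pause before "Next" and a paragraph break before "New".
example_words = [
    Word("Hello", 0.0, 0.4),
    Word("world", 0.5, 0.9),
    Word("Next", 3.5, 3.9),
    Word("topic", 4.0, 4.4),
    Word("New", 4.6, 4.9, starts_paragraph=True),
]
example_points = find_index_points(example_words)
```

Each returned time would be a candidate position for extracting a frame and for linking the matching paragraph on the generated page.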

The project page is on mediacaptain.com.

Research abstract

Increased bandwidth and improved streaming technology have made video on the Web the current “killer-app” of the dot-com world. However, users still face many problems: they have to find the right video and the right segment within the video. Locally stored files provide easy (but still not very sophisticated) access to individual points in the video through a seek slider. If the video is streamed over the Internet, this slider loses much of its appeal, because every jump to a new point in the video requires the video player to buffer, which causes a time lag. The mediacaptain is a system that addresses this issue by using supplementary material like text and graphics to provide indices. This time-aligned material helps users make an informed decision on whether they want to watch a video and, if so, which portions. The web-enabled mediacaptain prototype emerged from user surveys. It is demonstrated on several content types and represents an advanced experience with video on the Web.

mediacaptain – an interface for browsing streaming media (3 pages, 800 KB)

mediacaptain – a demo (2 pages, 500 KB)