Høgskolen i Gjøvik


MEDIA TECHNOLOGY 2012


VAMSIDHAR REDDY GADDAM

Title: Real time estimation of dense depth maps

Abstract:

Many automated vision systems could benefit from depth maps. Traditionally, depth maps have been extracted using a stereo camera approach. Depth map extraction in stereo systems can be broadly divided into two classes: window-based approaches and global optimisation approaches. The former suffer from inter-camera differences such as differing sensor response curves, exposure times and lens distortions. The latter, by contrast, provide robust depth information at the cost of computation time. Recent advances in GPU programming offer hope for real-time implementations of such approaches.
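As an illustration of the window-based class, the following is a minimal sketch of block matching with a sum-of-absolute-differences (SAD) cost. It is not the method of the thesis. It assumes rectified grayscale images given as lists of rows, and the function name `sad_disparity` and its parameters `window` and `max_disp` are hypothetical names chosen for this example.

```python
def sad_disparity(left, right, window=1, max_disp=4):
    """Estimate a disparity map from two rectified grayscale images.

    Images are lists of rows of numbers. For every interior pixel the
    disparity with the lowest sum of absolute differences over a square
    window is chosen. Border pixels are left at 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(window, height - window):
        for x in range(window, width - window):
            best_cost, best_d = None, 0
            # A point at column x in the left image is assumed to
            # appear at column x - d in the right image.
            for d in range(0, min(max_disp, x - window) + 1):
                cost = 0
                for dy in range(-window, window + 1):
                    for dx in range(-window, window + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity
```

Because the cost compares raw intensities, any difference in exposure or sensor response between the two cameras changes the cost directly, which is the weakness of window-based approaches noted above.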

Adding an extra camera makes the technique more robust, and a lot of research has been done on incorporating a third camera in different configurations. Most research treats the cameras as a collection of stereo pairs or as a special case of a multiple-camera system. However, the three-camera system itself has certain geometric constraints which can be exploited for depth estimation.

This thesis will address both of the problems presented above together by extending graph-cut techniques to include the three-camera geometry. Additionally, the energy function will be modified to include previously extracted depth information.
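For orientation, graph-cut stereo methods usually minimise an energy made up of a data term and a smoothness term. The form below is only a sketch of how a term for previously extracted depth could be added. The abstract does not give the actual formulation, so the third term, the weights $\lambda$ and $\mu$, and the symbol $f^{\text{prev}}$ are assumptions made for illustration.

```latex
% f assigns a depth label f_p to every pixel p; N is the set of neighbouring pixel pairs.
% D_p: data (matching) cost, V_{pq}: smoothness cost between neighbours.
% T_p: hypothetical penalty for deviating from the previously extracted depth f^{prev}_p.
E(f) = \sum_{p} D_p(f_p)
     + \lambda \sum_{(p,q) \in N} V_{pq}(f_p, f_q)
     + \mu \sum_{p} T_p\left(f_p, f^{\text{prev}}_p\right)
```

In such a formulation the three-camera geometry would most naturally enter through the data term $D_p$, for example by evaluating the matching cost against both of the other views, but this too is an assumption rather than something the abstract states.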


VLAD CAIA

Title: Speech recognition for people with disabilities

Abstract:

Interaction with the environment differs from person to person and can take various forms. Information is passed from a person to a computer-based system primarily by way of the senses, through auditory and visual communication. Human-computer interaction usually relies on input devices such as a mouse and a keyboard, with a computer screen, speakers or a printer for output. Human speech has always been an important part of our social life, having developed even earlier than writing. With increasing computing power and the enhanced scalability and mobility of devices, computers have become an important tool for our activities. Developers, engineers and programmers have always strived to create easier ways of communicating with computer-based systems.

Speech recognition by a computer is a complex problem because speech differs from person to person in accent, pronunciation, articulation, nasality, pitch, volume, speed and so on. On top of that, speech can be polluted by background noise and reverberation. These variables can easily affect the performance of an automatic speech recognition (ASR) system, and much effort has to be put into proper configuration and optimization. An important consideration is the group of users who cannot use traditional means of computer interaction (such as keyboard and mouse) and so have to rely on alternative technologies. ASR offers these disabled users a better way to interact with an information system and a better overall user experience without evident frustration. This poses another problem on top of the basic variables above. The degree and type of a user's disability place a heavy burden on a traditionally designed ASR system, as most of the recognition has to be done outside the rules of the existing system. This corresponds to a high degree of variability in the speech of disabled users, so an adaptation strategy has to be implemented. With disabled users in mind, the scalability of a complete ASR system is an important factor in their daily lives. Existing hardware capabilities make a mobile device incorporating ASR technology achievable.

JOSE MARIO PEREZ

Title: Feasibility of Face Detection and Face Tracking with combination of Augmented Reality and Color Science on Smartphones

Abstract:

The objective of this master's thesis is to determine the feasibility of using face detection and face tracking techniques in combination with augmented reality (AR) and color science technologies in a mobile setting.
A prototype has been created using an Android smartphone. The prototype tries to demonstrate whether this is possible by addressing the problem that women face when trying different makeup combinations, and how skin color, current light conditions and illumination may affect this process. The prototype shows the user how she looks with "virtual makeup".
Consider the following scenario: a blonde woman is trying a new red lipstick under white light in a store. When she gets home and applies the lipstick under yellow light, it may look different due to the different color of the light. The same red lipstick may also look different on an Afro-American woman.
The application uses the front-facing camera of the smartphone and the Viola-Jones feature detection algorithm to detect the lips and eyes in a face in real time. When either lips or eyes are detected, the user can select a lipstick color or a shade for the eyes. The application then calculates the correct hue to display. Once this calculation has been made, the lips and eyes are shown on the screen with the virtual makeup applied, with the aid of AR.
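The abstract does not say how the hue is calculated. As a rough illustration of the idea, the sketch below scales each RGB channel by the ratio between two illuminant white points, a simple diagonal (von Kries-style) adaptation that is swapped in here as an assumption. The function names `adapt_color` and `hue_degrees`, and the example white points, are hypothetical.

```python
import colorsys


def adapt_color(rgb, source_white, target_white):
    """Approximate how a color seen under source_white looks under target_white.

    All arguments are (r, g, b) tuples with components in [0, 1].
    Each channel is scaled by target/source and clamped to 1.0.
    This is a simplified model, not a calibrated color transform.
    """
    return tuple(
        min(1.0, c * t / s)
        for c, s, t in zip(rgb, source_white, target_white)
    )


def hue_degrees(rgb):
    """Return the HSV hue of an (r, g, b) color in degrees, in [0, 360)."""
    hue, _, _ = colorsys.rgb_to_hsv(*rgb)
    return hue * 360.0


# Hypothetical example values: a red lipstick chosen under white store
# lighting, re-rendered for yellowish light at home.
lipstick = (0.8, 0.1, 0.2)
store_light = (1.0, 1.0, 1.0)
home_light = (1.0, 0.9, 0.6)
at_home = adapt_color(lipstick, store_light, home_light)
```

Comparing `hue_degrees(lipstick)` with `hue_degrees(at_home)` shows how far the apparent hue shifts between the two lighting conditions in this simplified model.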

15.04.2012