Face recognition enhancement through the use of depth maps and deep learning
Thesis posted on 22.09.2017 by Yaser Saleh
Face recognition, although a popular area of research for over a decade, still has many open research challenges. These include the recognition of poorly illuminated faces, recognition under pose variations, and the difficulty of capturing sufficient training data to enable recognition under pose/viewpoint changes. The appearance of cheap and effective multimodal image capture hardware, such as the Microsoft Kinect device, has opened up new research possibilities. One opportunity is to use the depth maps generated by the Kinect as an additional data source to recognize human faces under low levels of scene illumination. Another is to build a 3D model from the depth maps and visible-spectrum (RGB) images and use it to generate new images that enhance face recognition accuracy by improving the training phase of a classification task.

With the goal of enhancing face recognition, this research first investigated how depth maps, which are not affected by illumination, can improve face recognition when used with algorithms traditionally applied to face recognition. To this end, a number of popular benchmark face recognition algorithms are tested. It is shown that algorithms based on LBP and Eigenfaces provide a high level of face recognition accuracy, owing to the significantly high resolution of the depth map images generated by the latest version of the Kinect device. To complement this work, a novel algorithm named the Dense Feature Detector is presented and shown to be effective in face recognition using depth map images, in particular under well-illuminated conditions. A second technique presented for enhancing face recognition reconstructs face images at different angles from a single frontal RGB image and the corresponding depth map captured by the Kinect, using a faster and effective 3D object reconstruction technique.
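As a rough illustration of why a texture descriptor such as LBP can be applied directly to depth maps, the sketch below computes the basic 3x3 LBP code for each interior pixel of a small depth grid, builds a normalised histogram, and compares two histograms with a chi-square distance. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the neighbour ordering, the `>=` threshold convention and the chi-square matching step are all choices made here for illustration.

```python
def lbp_codes(depth):
    """Return the basic 8-bit LBP code for every interior pixel of a 2D grid."""
    # Neighbour offsets, clockwise from the top-left pixel (an assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(depth), len(depth[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = depth[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as deep as the centre.
                if depth[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes):
    """Normalised 256-bin histogram of LBP codes (the face descriptor)."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist]


def chi_square(h1, h2):
    """Chi-square distance between two histograms; 0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
```

Because each code depends only on the ordering of depth values within a neighbourhood, the descriptor is unaffected by scene illumination as long as the depth sensor itself still returns valid readings.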
Using the Overfeat network, which is based on Convolutional Neural Networks, for feature extraction and an SVM for classification, it is shown that a technically unlimited number of views can be created from the proposed 3D model, and that these views contain the features the face would exhibit if actually captured at similar angles. These images can therefore be used as real training images, removing the need to capture many examples of a facial image from different viewpoints for the training of the image classifier. The proposed 3D model will thus save a significant amount of time and effort in capturing the training data that is essential for recognition of the human face under variations of pose/viewpoint. The thesis argues that the same approach can also be used as a novel approach to face recognition, one that promises significantly high levels of face recognition accuracy based on depth images. Finally, following the recent trend of replacing traditional face recognition algorithms with deep learning networks, the thesis investigates the use of four popular networks, VGG-16, VGG-19, VGG-S and GoogLeNet, in depth-map-based face recognition and proposes the effective use of Transfer Learning to enhance the performance of such Deep Learning networks.
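The idea of generating new viewpoints from one frontal capture can be sketched with a standard pinhole camera model: back-project each depth pixel to a 3D point, rotate the point cloud about a vertical axis, and project it back to the image plane. This is a generic illustration, not the reconstruction technique used in the thesis. The intrinsics `fx`, `fy`, `cx`, `cy`, the `pivot_z` parameter and all function names are hypothetical, and colour transfer, hole filling and occlusion handling are omitted.

```python
import math


def backproject(depth, fx, fy, cx, cy):
    """Turn a depth map into a list of (X, Y, Z) camera-space points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # treat non-positive depth as a missing reading
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def rotate_yaw(points, angle_deg, pivot_z):
    """Rotate points about a vertical axis through (0, *, pivot_z)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    rotated = []
    for x, y, z in points:
        dz = z - pivot_z  # move the pivot to the origin before rotating
        rotated.append((cos_a * x + sin_a * dz,
                        y,
                        -sin_a * x + cos_a * dz + pivot_z))
    return rotated


def project(points, fx, fy, cx, cy):
    """Project camera-space points to pixel coordinates (u, v)."""
    # Points at or behind the camera plane cannot be projected and are dropped.
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points if z > 0]
```

In use, `pivot_z` would typically be set near the mean depth of the face so that the head turns in place, and the same pipeline would be repeated for each desired angle to produce additional training views.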
- Computer Science