Machine learning-based human observer analysis of video sequences
2018-06-28T07:37:27Z (GMT) by
This research contributes to the field of video analysis by proposing novel approaches to automatically generating human observer performance patterns that can be used to advance modern video analytic and forensic algorithms. Eye tracking and eye movement analysis technologies are employed in medical research, psychology, cognitive science and advertising. The data collected on human eye movement by an eye tracker can be analyzed using machine learning and statistical learning approaches. The study therefore attempts to understand the visual attention patterns of people observing captured CCTV footage. It investigates whether the observer's eye gaze, which determines their behaviour, depends on the instructions given or on the knowledge gained from the surveillance task. It also examines whether the observer's attention on human objects is identified and tracked differently across different areas of the tracked object's body, and whether pattern analysis and machine learning can effectively replace the current conceptual and statistical approaches to the analysis of eye-tracking data captured within a CCTV surveillance task. A pilot study was conducted that took around 30 minutes per participant. It involved observing 13 different pre-recorded CCTV clips of public spaces. Participants were provided with a clear written description of the targets they should find in each video. The study included a total of 24 participants with varying levels of experience in analyzing CCTV video. A Tobii eye tracking system was employed to record the participants' eye movements. The data captured by the eye tracking sensor was analyzed using statistical analysis in SPSS and machine learning algorithms in WEKA.
The research concluded that differences in behavioural patterns exist, which could be used to classify the participants of the study if appropriate machine learning algorithms are employed. Prior research on video analytics was found to be limited to a few projects in which the human being observed was treated as a single object; hence the detailed analysis of human observer attention patterns based on human body part articulation had not been investigated. All previous attempts at analyzing human observer visual attention patterns in CCTV video analytics and forensics used either conceptual or statistical approaches. These methods were limited with regard to making predictions and detecting hidden patterns. A novel approach to articulating human objects to be identified and tracked in a visual surveillance task led to constrained results, which demanded the use of advanced machine learning algorithms for the classification of participants. The research conducted within the context of this thesis encountered several practical data collection and analysis challenges during formal CCTV operator based surveillance tasks. These made it difficult to obtain the appropriate cooperation from expert CCTV operators for data collection. Had expert operators been employed in the study rather than novice operators, a more discriminative and accurate classification would have been achieved. Machine learning approaches such as ensemble learning and tree based algorithms can be applied in cases where a more detailed analysis of human behaviour is needed. Traditional machine learning approaches are challenged by recent advances in convolutional neural networks and deep learning. Future research can therefore replace the traditional machine learning approaches employed in this study with convolutional neural networks.
The current research was limited to 13 different videos, each with a different description given to the participants for identifying and tracking different individuals. The research can be expanded to accommodate more complex demands that require changes to the analysis process.
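As an illustration of the body-part level attention analysis summarised in the abstract, the following is a minimal sketch of how fixation data might be turned into per-body-part features before being passed to a classifier. It is not the thesis's actual pipeline: the area-of-interest boxes, the `(x, y, duration_ms)` fixation format, and the function names are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical areas of interest (AOIs) on a tracked person, given as
# (x_min, y_min, x_max, y_max) boxes in screen pixels for a single frame.
AOIS = {
    "head": (100, 50, 140, 90),
    "torso": (90, 90, 150, 180),
    "legs": (95, 180, 145, 280),
}


def aoi_for_point(x, y, aois=AOIS):
    """Return the name of the first AOI containing (x, y), or None."""
    for name, (x0, y0, x1, y1) in aois.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


def dwell_time_features(fixations, aois=AOIS):
    """Sum fixation durations per AOI and normalise them to proportions.

    `fixations` is an iterable of (x, y, duration_ms) tuples (assumed format).
    Fixations that fall outside every AOI are ignored.
    """
    totals = defaultdict(float)
    for x, y, duration_ms in fixations:
        name = aoi_for_point(x, y, aois)
        if name is not None:
            totals[name] += duration_ms
    on_target = sum(totals.values())
    return {
        name: (totals[name] / on_target if on_target else 0.0)
        for name in aois
    }


# Illustrative fixations for one participant (made-up values).
fixations = [(120, 60, 300), (120, 120, 500), (10, 10, 400), (110, 200, 200)]
features = dwell_time_features(fixations)
```

A feature dictionary of this kind, computed per participant and per clip, is the sort of input that a tree based or ensemble classifier could use to separate participants by attention pattern.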