A multimodal data processing system for LiDAR-based human activity recognition

Roche, Jamie; De-Silva, Varuna; Hook, Joosep; Moencks, Mirco; Kondoz, Ahmet

journal contribution

posted on 2021-09-15, 09:25 authored by Jamie Roche, Varuna De-Silva, Joosep Hook, Mirco Moencks, Ahmet Kondoz

Increasingly, the task of detecting and recognizing human actions has been delegated to some form of neural network that processes camera or wearable sensor data. Because cameras are strongly affected by lighting and wearable sensor data is scant, neither modality alone can capture the data required to perform the task confidently. That being the case, range sensors, such as light detection and ranging (LiDAR), can complement the process to perceive the environment more robustly. Most recently, researchers have been exploring ways to apply convolutional neural networks to 3-D data. These methods typically rely on a single modality and cannot draw on information from complementary sensor streams to improve accuracy. This article proposes a framework to tackle human activity recognition by leveraging the benefits of sensor fusion and multimodal machine learning. Given both RGB and point cloud data, our method describes the activities being performed by subjects using a region-based convolutional neural network (R-CNN) and a 3-D modified Fisher vector network. Evaluation on a custom-captured multimodal dataset demonstrates that the model classifies human activity with high accuracy (90%). Furthermore, this framework can be used for sports analytics, understanding social behavior, surveillance, and, perhaps most notably, by autonomous vehicles (AVs) to inform data-driven decision-making policies in urban areas and indoor environments.

History

School

Loughborough University London

Published in

IEEE Transactions on Cybernetics

Volume

52

Issue

10

Pages

10027 - 10040

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Version

AM (Accepted Manuscript)

Rights holder

Publisher statement

© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Acceptance date

2021-05-23

Publication date

2021-06-24

Copyright date

2021

DOI

https://doi.org/10.1109/tcyb.2021.3085489

ISSN

2168-2267

eISSN

2168-2275

Publisher version

https://doi.org/10.1109/tcyb.2021.3085489

Language

en

Depositor

Dr Varuna De Silva. Deposit date: 9 September 2021

Keywords

convolutional neural network; faster RCNN; Fisher vector; human activity recognition (HAR); multimodal machine learning (ML); Artificial Intelligence and Image Processing
