Upper limb amputation can significantly affect a person's capabilities, with a dramatic impact on their quality of life. Upper limb amputees find it difficult to perform basic tasks such as holding objects, buttoning clothes and feeding themselves. The motivation of this research is to explore the possibility of giving upper limb amputees the capability to make hand gestures precisely, and thus to improve their quality of life. To this end, a transferable, sustainable, physiologically related, intuitive and non-invasive control system, based on the surface electromyogram (sEMG) and deep learning (DL) technology, is highly desirable for improving the usability of upper limb prostheses.
The research was organised into six strands:
Firstly, the related research in upper limb gesture recognition for prosthesis control was reviewed, including the research background, prosthetic devices, advanced approaches, and existing challenges.
Secondly, a specific one-dimensional convolutional neural network (1-D CNN) was investigated for gesture recognition on two sub-datasets from the Ninapro sEMG database. In this initial experiment, with raw sEMG signals as input, the model achieved gesture recognition accuracies of 51.89% on Ninapro database 2 (NinaproDB2) and 45.77% on Ninapro database 3 (NinaproDB3).
Thirdly, three data pre-processing approaches were applied to the raw sEMG signals to improve the quality of the input: general normalisation, window-based statistical feature extraction, and the recurrence plot (RP) transform. The pre-processed inputs were used and evaluated in subsequent experiments. Then, building on the experience and knowledge gained in the previous stage, four advanced DL models were developed: an improved 1-D CNN, a long short-term memory (LSTM) model, a basic hybrid model with one convolutional layer and one recurrent layer (1+1 C-RNN), and an advanced hybrid model with three convolutional layers and three recurrent layers (3+3 C-RNN). The models were evaluated on three different databases: a human activity recognition (HAR) dataset, NinaproDB2 and Ninapro database 5 (NinaproDB5), on which the 3+3 C-RNN achieved the best performance, with accuracies of 85.29%, 57.12%, and 75.92%, respectively. Replacing the raw sEMG input with pre-processed signals increased the performance of the models, especially when the generated statistical features were used. As a result, the accuracy of the 3+3 C-RNN increased to 63.74% on NinaproDB2 and 83.61% on NinaproDB5. In addition, different sliding window sizes and filter sizes were tested on the hybrid 3+3 C-RNN for hyperparameter tuning.
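As an illustration of the normalisation and window-based statistical feature extraction steps mentioned above, the following minimal sketch segments a multi-channel signal with a sliding window and computes a few per-channel statistics. The abstract does not list the features that were actually used, so the feature set here (mean absolute value, root mean square, variance and waveform length), the function names, the window and step sizes, and the synthetic 16-channel input are all assumptions made for illustration only.

```python
import numpy as np


def sliding_windows(signal, window, step):
    """Split a (samples, channels) array into overlapping windows.

    Returns an array of shape (n_windows, window, channels).
    """
    if signal.shape[0] < window:
        raise ValueError("signal is shorter than one window")
    n = (signal.shape[0] - window) // step + 1
    return np.stack([signal[i * step:i * step + window] for i in range(n)])


def statistical_features(windows):
    """Compute per-window, per-channel statistics (an assumed feature set).

    Returns an array of shape (n_windows, 4 * channels).
    """
    mav = np.mean(np.abs(windows), axis=1)                 # mean absolute value
    rms = np.sqrt(np.mean(windows ** 2, axis=1))           # root mean square
    var = np.var(windows, axis=1)                          # variance
    wl = np.sum(np.abs(np.diff(windows, axis=1)), axis=1)  # waveform length
    return np.concatenate([mav, rms, var, wl], axis=1)


# Synthetic stand-in for a raw recording: 1000 samples, 16 channels (assumed).
rng = np.random.default_rng(0)
emg = rng.standard_normal((1000, 16))

# General normalisation, here taken to mean a per-channel z-score (assumed).
emg = (emg - emg.mean(axis=0)) / emg.std(axis=0)

windows = sliding_windows(emg, window=200, step=100)
features = statistical_features(windows)
print(windows.shape, features.shape)
```

Each row of `features` then describes one window and could replace the raw samples as model input; the window and step sizes are the kind of hyperparameters that the sliding-window experiments would vary.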
In the fourth stage of the research, a novel attention-based bidirectional convolutional gated recurrent unit (Bi-ConvGRU) network was developed to recognise hand gestures from multi-channel sEMG signals. Inspired by the biomechanics of muscle group activation, this work introduced a novel application of a bidirectional sequential GRU (Bi-GRU) that focuses on the correlation of multi-channel muscle activation across the signals at both prior and posterior time steps. In addition, the Bi-ConvGRU model enhanced the intra-channel features of the signals, which were extracted by improved 1-D convolutional layers inserted before the Bi-GRU layers. Furthermore, an attention mechanism was employed after each Bi-GRU layer. The attention layer learns different intra-attention weights, enabling the model to focus on the vital parts of the signals and the corresponding dependencies among them. This approach helps to increase robustness to feature noise and consequently improves the recognition accuracy. The Bi-ConvGRU was evaluated on the benchmark NinaproDB5 dataset, which contains 18 hand postures from 10 healthy subjects. With statistical features as input, the model achieved an average accuracy of 88.7%, outperforming the state of the art and the earlier models developed in this research.
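To make the role of the attention layer more concrete, the sketch below shows a generic additive attention pooling over the time steps of a recurrent layer's output. It is not the exact layer used in the Bi-ConvGRU, whose details are not given in this abstract: the scoring function, the parameter names `w`, `b` and `v`, the dimensions, and the random stand-in for the Bi-GRU output are all assumptions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / np.sum(e)


def attention_pool(hidden, w, b, v):
    """Additive attention over time steps (a generic, assumed formulation).

    hidden: (time, units) outputs of a recurrent layer
    w: (units, att_dim), b: (att_dim,), v: (att_dim,) learnable parameters
    Returns the context vector (units,) and the attention weights (time,).
    """
    scores = np.tanh(hidden @ w + b) @ v  # one relevance score per time step
    weights = softmax(scores)             # non-negative, sum to 1
    context = weights @ hidden            # weighted sum of the hidden states
    return context, weights


rng = np.random.default_rng(0)
time_steps, units, att_dim = 20, 64, 32          # illustrative sizes
hidden = rng.standard_normal((time_steps, units))  # stand-in for Bi-GRU output
w = rng.standard_normal((units, att_dim)) * 0.1
b = np.zeros(att_dim)
v = rng.standard_normal(att_dim) * 0.1

context, weights = attention_pool(hidden, w, b, v)
print(context.shape, weights.shape)
```

In a trained network the parameters would be learned jointly with the rest of the model, so that time steps carrying more discriminative activation patterns receive larger weights and noisy steps contribute less to the pooled representation.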
The fifth research stage explored transferable approaches in the field. A transfer learning (TL) strategy was introduced to demonstrate that a baseline model pre-trained on data from 20 non-amputees can be refined with data from 2 amputees to build a classification model for amputees. This additional TL experiment was conducted on Ninapro database 7 (NinaproDB7). Using only limited amputee data, the model converged much faster than the non-transferred model and ultimately achieved an accuracy of 76.2%.
Finally, to alleviate the lack of amputee sEMG data in the TL experiment, several data augmentation approaches were employed to extend the existing dataset. As a novel approach in this field, a deep convolutional generative adversarial network (DCGAN) was developed to create new sEMG signals. The generated clones and the original data were evaluated and compared on NinaproDB5 in different combinations. The comparative results indicated that data augmentation approaches can be valuable when the raw data are limited.
Future work will focus on combining the DCGAN with the TL models. As described above, sEMG data have been generated from NinaproDB5 and evaluated on the Bi-ConvGRU model. However, the generated sEMG clones have not yet been used with the TL models. The intention is to pre-train a DL model on sufficient data (the original sEMG data plus the generated clones) and then to transfer the learned features to unseen subjects.