Enhancing automated bug report analysis through advanced neural language models
Recent advances in machine learning, in particular neural language models, have spurred significant growth in several software engineering research fields, and bug report analysis is one such example. Given the potentially large number of bug reports and their natural-language semantics, there is an increasing demand for approaches that can automatically understand, extract, and correlate information from the reports. This makes bug report analysis a natural fit for neural language models, which were designed precisely for such tasks. Thus, this thesis aims to identify and address the shortcomings and obstacles present in the existing literature on bug report analysis using neural language models.
This thesis begins with a systematic literature review focusing on bug report analysis using machine learning. We examined 1,825 papers from three repositories and identified 204 studies of high relevance for in-depth analysis. The review yielded key statistics and classifications related to this research area. More importantly, we derived seven insightful findings from the results. Finally, we outline a set of future research opportunities for scholars in this field. The review confirms that bug report analysis using neural language models requires more attention and has gaps that merit further research effort.
Following the research direction provided in the systematic literature review, we carried out an extensive exploratory study on bug reports of deep learning frameworks. This study examined the ten most popular open-source deep learning frameworks on GitHub, including TensorFlow, Keras, and PyTorch. From a total of 22,522 bug reports, we selected 664 that were representative of typical performance and accuracy issues. Based on our findings, we offer a set of actionable recommendations for researchers, maintainers, and those submitting bug reports.
Using the data sampled in the exploratory study as part of the evaluation dataset, we developed an end-to-end tool designed to automatically identify non-functional bug reports in deep learning frameworks. This method capitalizes on semantic knowledge learning, considers hierarchical structures, and employs diverse feature extraction techniques. The primary advantage of this approach is its ability to identify bug reports using an advanced neural language model, significantly reducing the need for intensive human analysis. Our approach outperforms nine other leading classifiers, demonstrating substantial improvements with strong statistical significance: specifically, an increase in AUC (Area Under the Curve) of up to 71%. Additionally, it achieved the top Scott-Knott ranking in four frameworks and the second-best in one.
During our research, we discovered that labeling datasets is both time-consuming and labor-intensive. To streamline this process, we developed a cross-project framework that automates and enhances the identification of bug reports from GitHub repositories through a synergy of human and machine efforts. We assessed MHNurf with a dataset encompassing 1,275,881 reports from over 127,000 software projects, comparing it against contemporary methodologies, basic benchmarks, and various adaptations. Remarkably, MHNurf achieved substantial efficiencies, reducing effort by up to 95.8% for readability and 196.0% for identifiability during human labeling tasks. This approach not only improved performance metrics, such as the F1-score, in bug report identification but also proved to be model-agnostic, enhancing performance across diverse neural language models. Additionally, a qualitative case study involving 10 participants further validated the effectiveness of MHNurf, with users reporting significant savings in time and cost.
Through the extensive research conducted in this thesis, we have identified and effectively addressed several critical knowledge gaps within the existing literature. Consequently, we have advanced the field of bug report analysis using neural language models by developing more precise, stable, and dependable analysis techniques.
History
School
- Science
Department
- Computer Science
Publisher
Loughborough University
Rights holder
© Guoming Long
Publication date
2024
Notes
A Doctoral Thesis. Submitted in partial fulfilment of the requirements for the award of the degree of Doctor of Philosophy of Loughborough University.
Language
- en
Supervisor(s)
Tao Chen ; Hui Fang ; Georgina Cosma
Qualification name
- PhD
Qualification level
- Doctoral