Generating synthetic replicates of complex high-dimensional biological cell clusters
Flow cytometry (FC) is crucial for CD34+ haematopoietic stem cell enumeration, facilitating life-saving regenerative transplants for the treatment of diseases such as leukaemia. However, clinical and research laboratories face challenges in validating the metrological and biological accuracy of FC data because measurements are relative and there is no definitive 'ground truth'. This uncertainty, exacerbated by differences between operators, instruments, sample preparation, and data analysis methods, undermines diagnostic confidence. This research aims to address that uncertainty by using synthetic cell cluster data to reduce operator-induced variability, and seeks to validate synthetic datasets as suitable substitutes for traditional FC datasets. Synthetic data offer analytical models whose properties are known by construction, giving accuracy, traceability, and reproducibility that current 'gold-standard' methods cannot match. The project uses synthetic cluster-generating benchmarking software of the kind usually employed to assess the performance of clustering algorithms. Through careful configuration via a cluster-based evaluation protocol, the generator is instead optimised to create high-dimensional synthetic CD34+ stem cell haematological models. A key component of the protocol is our ‘Rosetta-Routine’, a novel codebase that deciphers the statistical properties of real data and translates them into unique computational coefficients used to generate synthetic statistical replicates. Our methodology could enhance analytical confidence in all FC applications, including diagnostics: synthetic models may reduce analytical variability, thereby supporting better clinical decision making and ultimately improving patient care.
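The abstract does not specify how the ‘Rosetta-Routine’ derives its coefficients or which cluster generator is used. As a purely illustrative sketch of the general idea (summarise a real cluster statistically, then sample a synthetic replicate from that summary), the example below assumes a single cluster that is adequately described by its mean vector and covariance matrix. The function names `fit_cluster` and `generate_replicate`, the four-parameter stand-in data, and the Gaussian assumption are all hypothetical and are not taken from the authors' codebase.

```python
import numpy as np


def fit_cluster(real_events):
    """Summarise one cluster by its mean vector and covariance matrix.

    Assumption: rows are events (cells), columns are measured parameters.
    """
    real_events = np.asarray(real_events, dtype=float)
    mean = real_events.mean(axis=0)
    cov = np.cov(real_events, rowvar=False)
    return mean, cov


def generate_replicate(mean, cov, n_events, seed=None):
    """Draw a synthetic replicate from a multivariate normal distribution.

    Assumption: a Gaussian model is a reasonable stand-in for the cluster;
    real cytometry clusters may be skewed or heavy-tailed.
    """
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_events)


# Stand-in for a gated "real" cluster with four hypothetical parameters.
rng = np.random.default_rng(0)
true_mean = np.array([2.0, 5.0, 1.0, 3.5])
mixing = rng.normal(size=(4, 4))
true_cov = mixing @ mixing.T + 0.5 * np.eye(4)  # symmetric positive definite
real_cluster = rng.multivariate_normal(true_mean, true_cov, size=5000)

# Extract the statistical summary, then generate a seeded replicate.
mean, cov = fit_cluster(real_cluster)
synthetic_cluster = generate_replicate(mean, cov, n_events=5000, seed=1)
```

Because the replicate is drawn from explicitly stated parameters with a fixed seed, its 'ground truth' is known and the same dataset can be regenerated exactly, which is the property the abstract highlights as traceability and reproducibility.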
Presented at Future Investigators of Regenerative Medicine - FIRM 2024, Girona, Spain, 7-10 Oct 2024
School
- Mechanical, Electrical and Manufacturing Engineering