Flow cytometry is used in cell therapy manufacturing to measure critical quality attributes such as viability, identity, purity, and potency. Machine learning algorithms have been developed to improve the reliability of flow cytometry data analysis compared to manual gating. However, previous work to define the confidence in these black-box algorithms has been limited by the availability of suitable datasets.
Here, we present the use of synthetic flow cytometry datasets with controlled properties to evaluate clustering algorithms. We designed datasets with different numbers of cell populations, ranging from well-separated to merged clusters, as well as datasets containing rare cell populations. The synthetic datasets were run through three state-of-the-art software packages that employ different clustering algorithms: Flock, FlowSOM and SPADE3.
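As an illustration of what a synthetic dataset with controlled properties might look like, the following is a minimal sketch that draws events from Gaussian populations with chosen centres, spreads, and population fractions. It is not the generator used in this study: the function name `make_synthetic_events`, its parameters, and the choice of isotropic Gaussian populations are all assumptions made for the example.

```python
import numpy as np

def make_synthetic_events(n_events, centers, fractions, sd=1.0, seed=0):
    """Draw labelled events from isotropic Gaussian populations.

    Hypothetical helper, not the study's generator.
    centers   : (k, d) array-like, one centre per population
    fractions : length-k array-like of population fractions, summing to 1
    sd        : common standard deviation of every population (assumed)
    """
    rng = np.random.default_rng(seed)
    centers = np.asarray(centers, dtype=float)
    fractions = np.asarray(fractions, dtype=float)
    if centers.shape[0] != fractions.shape[0]:
        raise ValueError("need one fraction per centre")
    if not np.isclose(fractions.sum(), 1.0):
        raise ValueError("fractions must sum to 1")

    # Convert fractions to whole event counts; any rounding remainder
    # is assigned to the largest population so the total is exact.
    counts = np.rint(fractions * n_events).astype(int)
    counts[np.argmax(counts)] += n_events - counts.sum()

    # Ground-truth label for every event, then Gaussian noise around
    # the centre of the population each event belongs to.
    labels = np.repeat(np.arange(len(counts)), counts)
    noise = rng.normal(scale=sd, size=(n_events, centers.shape[1]))
    events = centers[labels] + noise
    return events, labels

# Two major populations plus a rare one at 0.1% of 100,000 events.
events, labels = make_synthetic_events(
    n_events=100_000,
    centers=[[0.0, 0.0], [6.0, 0.0], [3.0, 8.0]],
    fractions=[0.599, 0.4, 0.001],
)
```

Moving the centres closer together produces merged clusters, and lowering the smallest fraction produces rarer populations, while the ground-truth labels stay available for scoring a clustering output.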
Our results show that the accuracy and precision of the software output decrease as clusters move closer together (that is, as the separation index decreases). Initial runs on a rare cell dataset found that the density-based algorithm in Flock performed better than the others, with a lower limit of detection of 0.1% when using fixed processing variables.
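The abstract does not state how the separation index is computed. As a hedged sketch, the code below assumes one common definition (attributed to Qiu and Joe), applied to the projection of two populations onto the line joining their means: J = ((m2 - m1) - z(s1 + s2)) / ((m2 - m1) + z(s1 + s2)), where m and s are the projected means and standard deviations and z is the standard normal quantile for alpha = 0.05. The function name `separation_index` and the choice of projection direction are assumptions for illustration.

```python
import numpy as np

Z_975 = 1.959963984540054  # 97.5% standard normal quantile (alpha = 0.05)

def separation_index(a, b, z=Z_975):
    """Separation index of two event clouds along the line joining their means.

    Hypothetical helper. Values near 1 indicate well-separated populations;
    values at or below 0 indicate touching or overlapping populations.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    direction = b.mean(axis=0) - a.mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        return -1.0  # identical means: treated as fully merged
    unit = direction / norm

    # Project both populations onto the unit vector between the means.
    pa, pb = a @ unit, b @ unit
    gap = pb.mean() - pa.mean()  # positive by construction
    spread = z * (pa.std(ddof=1) + pb.std(ddof=1))
    return (gap - spread) / (gap + spread)

# Compare a well-separated pair with a merged pair (unit spread in both).
rng = np.random.default_rng(1)
base = rng.normal(size=(5000, 2))
far = rng.normal(size=(5000, 2)) + [10.0, 0.0]
near = rng.normal(size=(5000, 2)) + [2.0, 0.0]
j_far = separation_index(base, far)
j_near = separation_index(base, near)
```

Under this definition the index falls from positive to negative as the second population is moved towards the first, which is the regime in which the abstract reports reduced accuracy and precision.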
This study gives flow cytometry users a better understanding of the benefits and limitations of these algorithms and helps to inform decision making during in-process and final quality control in cell therapy manufacturing.
School: Mechanical, Electrical and Manufacturing Engineering