Data-driven point adjusted Jouyban-Acree-Artificial neural network hybrid model for predicting solubility of active pharmaceutical ingredients in binary solvent mixtures
Solubility data for active pharmaceutical ingredients (APIs) in binary solvent mixtures are crucial for optimizing solid-liquid separation processes, conducting early solvent screening, and ensuring safety. This study presents data-driven models that integrate Monte Carlo optimization algorithms, the Jouyban-Acree (JA) model, and artificial neural networks (ANN) to predict API solubility in binary solvent mixtures. A database of 71,888 data points was constructed, encompassing quantitative descriptors of the three-dimensional structures of the solute and solvent molecules, as well as the molecular interaction energies between the two solvents. A hybrid model, Jouyban-Acree-ANN (JAANN), was developed to predict solubility across a range of temperatures and solvent compositions. This model demonstrated robust predictive performance, with prediction errors generally below 10%. Additionally, we introduced a Point Adjusted JAANN (PA-JAANN) model that uses Monte Carlo simulations to refine solubility predictions by calibrating against a single experimental data point. This calibration significantly enhances accuracy, with an average error reduction of over 20% relative to the standard JAANN model. A comparative Direct Prediction-ANN (DP-ANN) model was also constructed; it provides rapid solubility predictions without any experimental data, though with limited robustness. The predictive abilities of the three models were validated experimentally using the mefenamic acid-2-butanol-heptane system. Together, the models serve different predictive needs, offering flexible and reliable solubility predictions for optimizing crystallization processes in pharmaceutical manufacturing.
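For readers unfamiliar with the JA model named above, the following is a minimal sketch of its commonly published form, ln x_m = f1 ln x_1 + f2 ln x_2 + (f1 f2 / T) Σ J_i (f1 − f2)^i. The abstract does not give the exact parameterization used in JAANN or PA-JAANN, so the function name, argument names, and numerical values here are illustrative assumptions rather than the authors' implementation.

```python
def jouyban_acree_ln_x(f1, T, ln_x1, ln_x2, J):
    """Standard Jouyban-Acree form (illustrative; not the paper's code).

    f1     : solute-free fraction of solvent 1 in the binary mixture (0..1)
    T      : absolute temperature in kelvin
    ln_x1  : ln of solute solubility in neat solvent 1 at T
    ln_x2  : ln of solute solubility in neat solvent 2 at T
    J      : sequence of model constants J_0, J_1, ... (hypothetical values)
    Returns ln of the solute solubility in the mixture at T.
    """
    f2 = 1.0 - f1
    # Excess term: polynomial in (f1 - f2) weighted by the J constants.
    excess = sum(J_i * (f1 - f2) ** i for i, J_i in enumerate(J))
    return f1 * ln_x1 + f2 * ln_x2 + f1 * f2 * excess / T


# Hypothetical inputs chosen only to exercise the formula.
ln_x_mix = jouyban_acree_ln_x(f1=0.5, T=298.15, ln_x1=-4.0, ln_x2=-6.0,
                              J=[298.15, 0.0, 0.0])
```

At either neat-solvent limit (f1 = 0 or f1 = 1) the excess term vanishes and the expression reduces to the neat-solvent solubility, which is why the model needs the two end-point solubilities as inputs in addition to the J constants.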
School
Aeronautical, Automotive, Chemical and Materials Engineering