A comparison of variate pre-selection methods for use in partial least squares regression: a case study on NIR spectroscopy applied to monitoring beer fermentation
posted on 2009-03-16, 15:09 authored by Georgina McLeod, Kirsty Clelland, Henri S. Tapp, E. Katherine Kemsley, Reginald H. Wilson, Graham H. Poulter, David Coombs, Christopher Hewitt
This work investigates four methods of selecting variates from near-infrared (NIR) spectra for use
in partial least squares (PLS) regression models to predict biomass and chemical changes during beer
fermentation. The fermentation parameters studied were ethanol concentration, specific gravity (SG), optical
density (OD) and dry cell weight (DCW). The four selection methods investigated were: Simple, where a
fingerprint region is chosen manually; CovProc, a covariance procedure where variates are introduced
based on the magnitude of the first PLS vector coefficients; CovProc-SavGo, a modification to CovProc
where the window size of a Savitzky-Golay filter applied to the spectra is also optimised; and Genetic
Algorithm (GA), where variates are selected based on the frequency of appearance in 8-variate multiple
linear regression models found from repeated execution of the GA routine. The analysis found that all four
methods produced good predictive models. The GA approach produced the lowest standard error in
prediction (SEP) based on leave-one-out cross validation (LOO-CV), although this advantage was not reflected in the standard error in validation (SEV) values, where all four models performed comparably. From
this work, we would recommend using the Simple approach if a suitable fingerprint region can be identified,
and using CovProc otherwise.
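As a rough illustration of the CovProc idea summarised above, the sketch below ranks variates by the magnitude of the first PLS weight vector and adds them one at a time, keeping the subset with the lowest leave-one-out error. It is not the authors' implementation. The function names are hypothetical, the model is restricted to a single PLS component for brevity, and the root-mean-square LOO-CV residual is used as a stand-in for SEP, which is an assumption about how that statistic is defined.

```python
import numpy as np

def first_pls_weights(X, y):
    """First PLS1 weight vector: proportional to the covariance of each variate with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)

def loo_sep_one_component(X, y):
    """Root-mean-square leave-one-out residual of a one-component PLS1 model."""
    n = len(y)
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xtr, ytr = X[keep], y[keep]
        x_mean, y_mean = Xtr.mean(axis=0), ytr.mean()
        w = first_pls_weights(Xtr, ytr)
        t = (Xtr - x_mean) @ w                    # scores of the training samples
        q = (t @ (ytr - y_mean)) / (t @ t)        # regression of y on the scores
        residuals[i] = y[i] - (y_mean + ((X[i] - x_mean) @ w) * q)
    return float(np.sqrt(np.mean(residuals ** 2)))

def covproc_like_selection(X, y, max_variates=20):
    """Add variates in order of decreasing |weight|; keep the subset with the lowest LOO error.

    Simplification: the ranking is computed once on all samples, outside the
    cross-validation loop.
    """
    order = np.argsort(-np.abs(first_pls_weights(X, y)))
    best_sep, best_k = np.inf, 0
    for k in range(1, min(max_variates, X.shape[1]) + 1):
        sep = loo_sep_one_component(X[:, order[:k]], y)
        if sep < best_sep:
            best_sep, best_k = sep, k
    return order[:best_k], best_sep
```

Here `X` would be a samples-by-wavelengths array of spectra and `y` a reference measurement such as ethanol concentration. The function returns the column indices of the selected variates together with the corresponding LOO error.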
History
School
Aeronautical, Automotive, Chemical and Materials Engineering
Department
Chemical Engineering
Citation
MCLEOD, G. ... et al., 2009. A comparison of variate pre-selection methods for use in partial least squares regression: a case study on NIR spectroscopy applied to monitoring beer fermentation. Journal of Food Engineering, 90 (2), pp. 300-307