Knowing the expected milk yield can help dairy farmers make better management decisions. The objective of this study was to build and compare predictive models for forecasting daily milk yield over a long period. A machine-learning pipeline was developed, comprising five baseline models and a novel stacking model, to predict milk yield on the CowNflow dataset using 414 Holstein cattle records collected from 1983 to 2019. Four feature selection methods were applied to identify the essential features that affect milk yield. The overall performance of the predictive models improved after proper feature selection, with R2 increasing to 0.811 and the root mean squared error (RMSE) decreasing to 3.627. The stacking model achieved the best performance, with an R2 of 0.85, a mean absolute error (MAE) of 2.537, and an RMSE of 3.236. This research provides benchmark results for milk yield prediction on the CowNflow dataset and identifies factors that are useful for long-term milk yield prediction.
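The abstract reports R2, MAE, and RMSE without stating their formulas. The sketch below shows the standard textbook definitions of these three metrics in plain Python, which is presumably what the reported figures correspond to; the function names and the small `observed`/`predicted` lists are hypothetical illustrations and are not taken from the CowNflow dataset or the study's code.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily milk yields, made up purely for illustration.
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R2   = {r2(observed, predicted):.3f}")
```

Under these definitions, MAE and RMSE are expressed in the same units as milk yield, so lower is better, while R2 is unitless and approaches 1 as predictions explain more of the variance in observed yield.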
Funding
Cattle Information Service (CIS)
National Bovine Data Centre (NBDC)
UK Engineering and Physical Sciences Research Council (EPSRC) grant (EP/Y00597X/1)
School
Science
Department
Computer Science
Published in
International Conference on Computer Science, Machine Learning and Big Data