A case-based reasoning methodology to formulating polyurethanes

2011-01-26T09:13:28Z (GMT) by Diana M. Segura-Velandia
Formulation of polyurethanes is a complex and poorly understood problem, as it has developed more as an art than a science. Only a few experts have mastered polyurethane (PU) formulation after years of experience, and the major raw material manufacturers largely hold such expertise. Understanding of PU formulation is at present insufficient for it to be developed from first principles. The first-principles approach requires time, a detailed understanding of the underlying principles that govern the formulation process (e.g. PU chemistry, kinetics), and a number of measurements of process conditions. Even in the simplest formulations, there are more than 20 variables, often interacting with each other in very intricate ways. In this doctoral thesis, the use of the Case-Based Reasoning (CBR) and Artificial Neural Network paradigms is proposed to support PU formulation tasks by providing a framework for the collection, structuring, and representation of real formulating knowledge. The framework is also aimed at facilitating the sharing and deployment of solutions in a consistent and referable way, when appropriate, for future problem solving. Two basic problems in the development of a CBR tool that uses past flexible PU foam formulation recipes, or cases, to solve new problems were studied. A PU case was divided into a problem description (i.e. the measured mechanical properties of the PU) and a solution description (i.e. the ingredients and their quantities needed to produce the PU). The problems investigated are the retrieval of former PU cases that are similar to a new problem description, and the adaptation of the retrieved case to meet the problem constraints. For retrieval, an alternative similarity measure was studied, based on the moment description of a case when it is represented as a two-dimensional image.
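The moment-based description of a case can be illustrated with a minimal sketch. It assumes that a case's problem description has already been laid out as a small 2D grid of normalised values, and that cases are matched by the Euclidean distance between their vectors of geometric moments. Neither the image layout nor this matching rule is specified in the abstract, and the case names and values below are hypothetical.

```python
from math import sqrt


def geometric_moment(image, p, q):
    """Geometric moment m_pq = sum over pixels of x**p * y**q * f(x, y)."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def moment_signature(image, max_order):
    """All geometric moments m_pq with p + q <= max_order, as a flat list."""
    return [
        geometric_moment(image, p, q)
        for p in range(max_order + 1)
        for q in range(max_order + 1 - p)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def retrieve(case_base, query_image, max_order=2):
    """Return the name of the stored case whose moment signature is closest
    to that of the query (an assumed matching rule, for illustration only)."""
    query_sig = moment_signature(query_image, max_order)
    return min(
        case_base,
        key=lambda name: euclidean(
            moment_signature(case_base[name], max_order), query_sig
        ),
    )


# Hypothetical case base: each case is a 2x2 "image" of normalised properties.
case_base = {
    "foam_A": [[0.2, 0.4], [0.6, 0.8]],
    "foam_B": [[0.9, 0.1], [0.3, 0.5]],
}
query = [[0.25, 0.4], [0.6, 0.75]]
best_match = retrieve(case_base, query)
```

Raising `max_order` adds higher-order moments to the signature, which is one way to let finer image detail influence the match.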
Retrieval using geometric, central and Legendre moments was also studied and compared with a standard nearest neighbour algorithm using nine different distance functions (e.g. Euclidean, Canberra and City Block, among others). It was concluded that when cases are represented as 2D images and matching is performed using moment functions, in a similar fashion to the approaches used in image analysis and pattern recognition, low-order geometric and Legendre moments, and central moments of any order, retrieve the same case as the Euclidean distance does when used in a nearest neighbour algorithm. This means that the Euclidean distance acts as a low-order moment function that represents gross-level case features. Higher-order (order > 3) geometric and Legendre moments, while enabling finer details of an image to be represented, had no standard distance function counterpart. For the adaptation of retrieved cases, a feed-forward back-propagation artificial neural network was proposed, both to reduce the adaptation knowledge acquisition effort that has prevented the building of complete CBR systems and to generate a mapping between changes in mechanical properties and changes in formulation ingredients. The proposed network was trained with the differences between problem descriptions (i.e. the mechanical properties of a pair of foams) as input patterns and the differences between solution descriptions (i.e. the formulation ingredients) as output patterns. A complete data set based on 34 initial formulations was used, and a network trained for 16950 epochs on 1102 training exemplars, produced from the case differences, gave only 4% error. However, further work with the data divided into a training set and a small validation set failed to generalise, returning a high percentage of errors. Further tests on different training/test splits of the data also failed to generalise. The conclusion reached is that the data as such has insufficient common structure to support any general conclusions.
Other evidence to suggest that the data does not contain generalisable structure includes the large number of hidden nodes necessary to achieve convergence on the complete data set.
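The construction of training exemplars from case differences can be sketched as follows. This is an illustration under stated assumptions: it forms every ordered pair of cases, whereas the abstract does not say which pairs produced its 1102 exemplars, and the property and ingredient values are hypothetical.

```python
from itertools import permutations


def difference_exemplars(cases):
    """Build (input, target) training pairs from case differences.

    Each case is (mechanical_properties, ingredient_quantities).
    For every ordered pair of distinct cases (a, b), the input pattern is
    properties(a) - properties(b) and the target pattern is
    ingredients(a) - ingredients(b).
    Assumption: all ordered pairs are used; the source does not state
    its pairing scheme.
    """
    exemplars = []
    for (props_a, recipe_a), (props_b, recipe_b) in permutations(cases, 2):
        delta_props = [a - b for a, b in zip(props_a, props_b)]
        delta_recipe = [a - b for a, b in zip(recipe_a, recipe_b)]
        exemplars.append((delta_props, delta_recipe))
    return exemplars


# Hypothetical cases: ([density, hardness], [polyol parts, water parts]).
cases = [
    ([30, 120], [100, 4]),
    ([35, 150], [100, 5]),
    ([28, 110], [90, 3]),
]
exemplars = difference_exemplars(cases)
```

Because each exemplar is built from two cases, exemplars in a training set and a validation set can share an underlying formulation unless the split is made at the level of cases before differencing.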