A model-based reinforcement learning (RL) approach is developed to identify reaction conditions that maximize several key performance indicators, such as yield and selectivity. The approach is validated on the synthesis of N-benzylidenebenzylamine in a tubular flow reactor. A mathematical model of the process serves as the environment in which a deep deterministic policy gradient (DDPG) agent is trained over a set of episodes to reach the best performance. The proposed method is benchmarked against gradient-free and gradient-based optimization techniques.
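To make the idea of a process model acting as the training environment concrete, the following is a minimal sketch and not the model used in this work. It assumes a hypothetical series reaction A -> B -> C in an ideal plug-flow reactor, with made-up Arrhenius parameters, a two-variable action (temperature and residence time), and an equally weighted reward on yield and selectivity. A simple random search stands in for a gradient-free baseline; the DDPG agent itself is not implemented here.

```python
import math
import random

R_GAS = 8.314  # universal gas constant, J/(mol K)


def rate_constant(pre_exp, activation_energy, temperature_k):
    """Arrhenius rate constant in 1/s."""
    return pre_exp * math.exp(-activation_energy / (R_GAS * temperature_k))


def reactor_outcome(temperature_k, residence_time_s):
    """Yield and selectivity of B for first-order A -> B -> C in plug flow.

    The kinetic parameters below are illustrative assumptions only.
    """
    k1 = rate_constant(1.0e6, 5.0e4, temperature_k)  # A -> B (hypothetical)
    k2 = rate_constant(1.0e9, 8.0e4, temperature_k)  # B -> C (hypothetical)
    tau = residence_time_s
    conversion = 1.0 - math.exp(-k1 * tau)
    if abs(k2 - k1) < 1e-12:
        # Limiting form of the analytical solution when k1 == k2
        yield_b = k1 * tau * math.exp(-k1 * tau)
    else:
        yield_b = k1 / (k2 - k1) * (math.exp(-k1 * tau) - math.exp(-k2 * tau))
    selectivity = yield_b / conversion if conversion > 0.0 else 0.0
    return yield_b, selectivity


def reward(action, w_yield=0.5, w_selectivity=0.5):
    """Scalar reward for an action (temperature in K, residence time in s)."""
    temperature_k, residence_time_s = action
    yield_b, selectivity = reactor_outcome(temperature_k, residence_time_s)
    return w_yield * yield_b + w_selectivity * selectivity


def random_search(n_samples=2000, seed=0):
    """Gradient-free baseline: keep the best of n uniformly sampled actions."""
    rng = random.Random(seed)
    best_action, best_reward = None, -math.inf
    for _ in range(n_samples):
        # Assumed operating window: 300-400 K, 10-600 s
        action = (rng.uniform(300.0, 400.0), rng.uniform(10.0, 600.0))
        r = reward(action)
        if r > best_reward:
            best_action, best_reward = action, r
    return best_action, best_reward
```

Calling `random_search()` returns the best sampled (temperature, residence time) pair together with its reward. An RL agent such as DDPG would replace the sampling loop with a learned policy that queries the same reward function.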
Funding
Takeda Pharmaceutical Company
History
School
Aeronautical, Automotive, Chemical and Materials Engineering
Department
Chemical Engineering
Source
14th European Congress of Chemical Engineering and 7th European Congress of Applied Biotechnology (ECCE&ECAB 2023)