Reinforcement learning strategy for the optimization of flow chemistry [Extended abstract]
A model-based reinforcement learning (RL) approach is developed to identify reaction conditions that maximize several key performance indicators, such as yield and selectivity. The synthesis of N-benzylidenebenzylamine in a tubular reactor (flow chemistry) is used to validate the proposed approach. A mathematical model of the process, which serves as the RL environment, was built to train a deep deterministic policy gradient (DDPG) agent toward the best performance over a set of training episodes. The proposed method was compared against benchmark optimization techniques, including gradient-free and gradient-based methods.
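To make the idea of a process model acting as the RL environment concrete, the following is a minimal sketch and not the authors' model. It assumes a toy isothermal plug-flow reactor with hypothetical series kinetics A -> B -> C (B being the desired product), made-up Arrhenius parameters, and an assumed reward that is a weighted sum of yield and selectivity. The class name, bounds, step sizes, and weights are all illustrative assumptions.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def arrhenius(pre_exp, act_energy, temp_k):
    """Rate constant k = A * exp(-Ea / (R T)), in 1/s."""
    return pre_exp * math.exp(-act_energy / (R_GAS * temp_k))


class ToyFlowReactorEnv:
    """Hypothetical environment: the agent nudges temperature and residence time."""

    T_BOUNDS = (300.0, 400.0)    # K, assumed operating window
    TAU_BOUNDS = (10.0, 600.0)   # s, assumed residence-time window
    T_STEP = 5.0                 # K changed by a full-scale action
    TAU_STEP = 20.0              # s changed by a full-scale action

    def __init__(self, horizon=20, selectivity_weight=0.5):
        self.horizon = horizon
        self.selectivity_weight = selectivity_weight
        self.reset()

    def reset(self):
        """Return to the (assumed) initial operating point."""
        self.temp_k = 320.0
        self.tau_s = 60.0
        self.steps = 0
        return self._observe()

    def kpis(self):
        """Yield of B and selectivity to B for first-order series kinetics."""
        k1 = arrhenius(1.0e6, 5.0e4, self.temp_k)  # A -> B, made-up parameters
        k2 = arrhenius(1.0e9, 8.0e4, self.temp_k)  # B -> C, made-up parameters
        tau = self.tau_s
        conversion = 1.0 - math.exp(-k1 * tau)
        if math.isclose(k1, k2):
            yield_b = k1 * tau * math.exp(-k1 * tau)
        else:
            yield_b = k1 / (k2 - k1) * (math.exp(-k1 * tau) - math.exp(-k2 * tau))
        # Clamp to 1.0 to guard against floating-point round-off.
        selectivity = min(1.0, yield_b / conversion) if conversion > 0.0 else 0.0
        return yield_b, selectivity

    def _observe(self):
        """Observation: normalized temperature, normalized residence time, KPIs."""
        t_lo, t_hi = self.T_BOUNDS
        tau_lo, tau_hi = self.TAU_BOUNDS
        yield_b, selectivity = self.kpis()
        return (
            (self.temp_k - t_lo) / (t_hi - t_lo),
            (self.tau_s - tau_lo) / (tau_hi - tau_lo),
            yield_b,
            selectivity,
        )

    def step(self, action):
        """Apply a continuous action (d_temp, d_tau), each clipped to [-1, 1]."""
        d_temp, d_tau = (max(-1.0, min(1.0, a)) for a in action)
        self.temp_k = min(max(self.temp_k + self.T_STEP * d_temp,
                              self.T_BOUNDS[0]), self.T_BOUNDS[1])
        self.tau_s = min(max(self.tau_s + self.TAU_STEP * d_tau,
                             self.TAU_BOUNDS[0]), self.TAU_BOUNDS[1])
        self.steps += 1
        yield_b, selectivity = self.kpis()
        reward = yield_b + self.selectivity_weight * selectivity
        done = self.steps >= self.horizon
        return self._observe(), reward, done
```

A continuous-action agent such as DDPG would interact with this kind of environment by calling `reset()` at the start of each training episode and `step(action)` until `done` is true. The same `kpis()` function could serve as the objective for the gradient-free or gradient-based benchmark optimizers.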
Takeda Pharmaceutical Company
- Aeronautical, Automotive, Chemical and Materials Engineering
- Chemical Engineering
Source: 14th European Congress of Chemical Engineering and 7th European Congress of Applied Biotechnology (ECCE&ECAB 2023)
- AM (Accepted Manuscript)
Rights holder: © The Authors