Reinforcement learning is a promising method for solving decision problems, and its potential for large-scale combinatorial optimization has been increasingly recognized in recent years. However, existing studies on reinforcement learning for cutting stock problems mostly rely on sequence-to-sequence or graph neural network approaches that make decisions from learned experience while neglecting the combinatorial structure of cutting stock problems. In this paper, we propose a novel reinforcement learning framework for cutting stock problems that integrates integer programming and monotone comparative statics to construct a Markov decision process with a high-quality action space. We start by constructing a new Markov decision process that exploits the diagonal structure of the integer programming model for combinatorial optimization problems, and then use column generation to obtain each action by combining multiple decision variables. Furthermore, we design a bipartite graph and a corresponding bipartite graph convolutional network to find solutions. The results show that the proposed reinforcement learning framework provides a high-quality action space, and that the designed bipartite graph convolutional network can effectively select the best actions from this action set.
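To make the action-generation step concrete, below is a minimal sketch of classic Gilmore-Gomory column generation for the one-dimensional cutting stock problem: the restricted master LP is solved, its dual prices drive an unbounded-knapsack pricing problem, and each priced-out cutting pattern is a candidate action that combines multiple decision variables. The instance data, function names, and the simple dynamic-programming pricing routine are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): Gilmore-Gomory column
# generation for cutting stock, where each generated pattern plays the
# role of one action in the abstract's high-quality action space.
import numpy as np
from scipy.optimize import linprog

def solve_master(patterns, demand):
    """Solve the restricted master LP; return (objective, dual prices)."""
    A = np.array(patterns, dtype=float).T          # rows: items, cols: patterns
    n = A.shape[1]
    res = linprog(c=np.ones(n), A_ub=-A, b_ub=-np.asarray(demand, float),
                  bounds=[(0, None)] * n, method="highs")
    # Duals of the original >= demand constraints (sign-flipped by scipy).
    return res.fun, -res.ineqlin.marginals

def price_pattern(duals, lengths, roll_len):
    """Unbounded knapsack: the most valuable feasible pattern under the duals."""
    best = np.zeros(roll_len + 1)
    choice = [None] * (roll_len + 1)
    for w in range(1, roll_len + 1):
        best[w], choice[w] = best[w - 1], choice[w - 1]
        for i, (l, pi) in enumerate(zip(lengths, duals)):
            if l <= w and best[w - l] + pi > best[w]:
                best[w], choice[w] = best[w - l] + pi, (i, w - l)
    pattern = [0] * len(lengths)                   # reconstruct one optimum
    w = roll_len
    while choice[w] is not None:
        i, w = choice[w]
        pattern[i] += 1
    return pattern, best[roll_len]

# Hypothetical toy instance: roll length 10, item lengths and demands.
lengths, demand, roll_len = [3, 4, 5], [30, 20, 10], 10
patterns = [[roll_len // l if j == i else 0 for j in range(len(lengths))]
            for i, l in enumerate(lengths)]        # trivial starting patterns
while True:
    obj, duals = solve_master(patterns, demand)
    pattern, value = price_pattern(duals, lengths, roll_len)
    if value <= 1 + 1e-9:                          # no improving column left
        break
    patterns.append(pattern)                       # new candidate action
print(f"LP bound: {obj:.2f} rolls, {len(patterns)} patterns generated")
```

In the paper's framework, a learned policy (the bipartite graph convolutional network) would select among such generated patterns rather than enumerating raw decision variables; the sketch above only shows how the candidate set itself can be built.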
Funding
Major Program of National Natural Science Foundation of China (grant no. 72192830)
National Natural Science Foundation of China (grant no. 72002028)
National Natural Science Foundation of China - 111 Project (grant no. B16009)
School
Loughborough Business School
Published in
IEEE Transactions on Automation Science and Engineering
Volume
22
Pages
12455-12469
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
This accepted manuscript has been made available under the Creative Commons Attribution licence (CC BY), in accordance with the IEEE JISC UK green open access agreement.