Deep Reinforcement Learning for Scheduling of a Steel Plant in the Electricity Spot Market

Margi Shah, Yue Zhou, Jianzhong Wu, Max Mowbray

Engineering ›› : 202512038 DOI: 10.1016/j.eng.2025.12.038
Research article

Abstract

The steel industry, characterized by its substantial energy consumption, is grappling with rising energy costs and the imperative to decarbonize. However, steel plant scheduling is challenged by the complexity and interdependency of its processes, together with various uncertainties. This study introduces a deep reinforcement learning (DRL) methodology specifically designed to optimize scheduling in the presence of the exogenous uncertainties introduced by electricity prices and on-site renewable generation. The scheduling problem is formulated as a partially observable Markov decision process (POMDP), which enables decision-making even though the state is not fully observable. An attention mechanism is used to abstract a representation of a window of observations on which decisions are conditioned. The control space is defined by domain knowledge-informed heuristic rules, and evolutionary search is used for policy optimization. The case study considers an electric arc furnace (EAF)-based steel plant with various problem sizes and processing times for steelmaking tasks. The performance of the proposed method is compared with a traditional mixed integer linear programming (MILP) approach and a policy gradient method, proximal policy optimization (PPO). The proposed method is evaluated under uncertainty conditions arising from market prices and on-site renewable energy sources. Case study results reveal that the proposed DRL strategy effectively integrates uncertainties into real-time decision-making, achieving a desirable performance level with minimal online computational cost.
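The abstract's idea of abstracting a window of observations with attention can be sketched as follows. This is a minimal, hypothetical illustration (the window length, feature size, and choice of the most recent observation as the query are assumptions, not the paper's architecture): scaled dot-product attention pools a history of, say, prices and renewable output into a single context vector on which a policy could condition its decisions.

```python
import numpy as np

def attention_pool(obs_window, Wq, Wk, Wv):
    """Scaled dot-product attention that pools a window of observations
    (shape: T x d_obs) into one representation vector.
    Assumption: the most recent observation supplies the query; every
    observation in the window supplies a key and a value."""
    q = obs_window[-1] @ Wq                  # query from latest observation
    K = obs_window @ Wk                      # keys for each step in the window
    V = obs_window @ Wv                      # values for each step in the window
    scores = K @ q / np.sqrt(K.shape[1])     # scaled similarity per step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over the window
    return weights @ V                       # attention-weighted context vector

rng = np.random.default_rng(0)
T, d_obs, d_model = 8, 5, 4                  # hypothetical window/feature sizes
window = rng.normal(size=(T, d_obs))         # e.g. price + renewable history
Wq, Wk, Wv = (rng.normal(size=(d_obs, d_model)) for _ in range(3))
context = attention_pool(window, Wq, Wk, Wv)
print(context.shape)                         # (4,)
```

In a full DRL agent the projection matrices would be learned jointly with the policy; here they are random placeholders to keep the sketch self-contained.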
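Evolutionary search over policy parameters, as named in the abstract, can be sketched as below. The objective, policy form, and hyperparameters are toy placeholders (not the paper's problem): a two-parameter sigmoid "policy" maps prices to a load level, and a simple Gaussian-perturbation search keeps the best-performing candidate each iteration.

```python
import numpy as np

def episode_return(theta, prices):
    """Toy stand-in for an episode rollout: a linear policy decides a load
    level per period; reward penalizes energy cost and deviation from a
    target duty cycle (hypothetical objective, for illustration only)."""
    load = 1.0 / (1.0 + np.exp(-(theta[0] * prices + theta[1])))  # in (0, 1)
    cost = np.sum(prices * load)
    shortfall = (load.sum() - 0.5 * len(prices)) ** 2  # hit ~50% duty cycle
    return -(cost + shortfall)

def evolutionary_search(prices, pop=50, iters=100, sigma=0.3, seed=0):
    """Greedy (1+lambda)-style search: sample perturbations, keep the best
    candidate if it improves the episodic return."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(iters):
        eps = rng.normal(size=(pop, 2))
        returns = np.array([episode_return(theta + sigma * e, prices)
                            for e in eps])
        cand = theta + sigma * eps[returns.argmax()]
        if episode_return(cand, prices) > episode_return(theta, prices):
            theta = cand
    return theta

rng = np.random.default_rng(1)
prices = rng.uniform(0.2, 1.0, size=24)      # hypothetical day-ahead prices
theta = evolutionary_search(prices)
```

Because only improving candidates are accepted, the return is non-decreasing over iterations; gradient-free updates of this kind are what make evolutionary search attractive when the reward signal passes through non-differentiable heuristic scheduling rules.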

Keywords

Demand response / Production process / Steel plant / Reinforcement learning / Optimization

Cite this article

Margi Shah, Yue Zhou, Jianzhong Wu, Max Mowbray. Deep Reinforcement Learning for Scheduling of a Steel Plant in the Electricity Spot Market. Engineering; 202512038. DOI: 10.1016/j.eng.2025.12.038


