
[1] GU Yanping, ZHANG Hao, XU Tao, et al. Temperature regulation of an optomechanical frame based on reinforcement learning active disturbance rejection control[J]. Journal of Southeast University (English Edition), 2026, 42(1): 112-120. [doi:10.3969/j.issn.1003-7985.2026.01.011]

Temperature regulation of an optomechanical frame based on reinforcement learning active disturbance rejection control
基于强化学习自抗扰控制的光学机架温度调节

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
42
Issue:
2026 1
Page:
112-120
Research Field:
Automation
Publishing date:
2026-03-20

Info

Title:
Temperature regulation of an optomechanical frame based on reinforcement learning active disturbance rejection control
基于强化学习自抗扰控制的光学机架温度调节
Author(s):
GU Yanping¹, ZHANG Hao¹, XU Tao¹, QIAN Bin²
1. Shanghai Institute of Satellite Engineering, Shanghai 201109, China
2. Shanghai Academy of Spaceflight Technology, Shanghai 201109, China
顾燕萍¹, 张好¹, 徐涛¹, 钱斌²
1. 上海卫星工程研究所, 上海 201109
2. 上海航天技术研究院, 上海 201109
Keywords:
optomechanical system; active disturbance rejection controller; Q-learning; high-precision temperature control
光机系统; 自抗扰控制; Q学习; 高精度温度控制
CLC number:
TP273.2
DOI:
10.3969/j.issn.1003-7985.2026.01.011
Abstract:
Spaceborne optomechanical systems face the dual challenges of extreme thermal disturbances and millikelvin-level temperature control precision during orbital operations, which demands robust control strategies. To address the performance limitations of conventional fixed-parameter active disturbance rejection control (ADRC) under complex operating conditions, this work proposes a Q-learning-enhanced adaptive ADRC framework. A thermal-transfer model incorporating multisource disturbances (solar radiation, structural conduction, and contact thermal resistance) is established and coupled with a reinforcement learning-driven parameter optimization mechanism. An ε-greedy policy dynamically adjusts the observer bandwidth (ωo ∈ [0.01, 0.2]) and the controller bandwidth (ωc ∈ [0.01, 0.1]) to enable real-time estimation and compensation of the total disturbance. Simulation results show improvements over fixed-parameter ADRC and a self-tuning internal model control proportional-integral (SIMC-PI) controller: settling time during setpoint responses is reduced by 31.3% and 15.4%, respectively; the integral absolute error (IAE) during setpoint step responses is 21.8% lower than that of fixed-parameter ADRC; and control precision under ±10 K periodic and step thermal disturbances is improved by 12.7% and 52.5% over the fixed-parameter ADRC and SIMC-PI controllers, respectively. Monte Carlo robustness tests show smaller fluctuation ranges of IAE, settling time, and overshoot under ±5% parameter perturbations. The method offers a new control paradigm for millikelvin-level thermal control of space optical payloads.
空间光机系统在轨运行过程中面临极端热扰动与毫开尔文级温度控制精度的双重挑战,对控制策略的鲁棒性提出了极高要求。针对传统固定参数自抗扰控制(ADRC)在复杂工况下性能受限的问题,本文提出一种基于Q学习的自适应ADRC框架。结合强化学习驱动的参数优化机制,建立了包含多源热扰动(太阳辐射、结构传导及接触热阻)的热传递模型。通过ε-贪婪策略实时调整观测器带宽(ωo∈[0.01,0.2])与控制器带宽(ωc∈[0.01,0.1]),实现对总扰动的动态估计与补偿。仿真结果表明,与固定参数ADRC及SIMC-PI控制相比,本方法在设定值响应过程中可分别缩短31.3%和15.4%的调节时间;在阶跃响应下,积分绝对误差(IAE)较固定参数ADRC降低21.8%;在±10 K周期扰动与阶跃扰动下,控制精度分别提高12.7%和52.5%。蒙特卡洛鲁棒性试验结果显示,在±5%参数摄动下,IAE、调节时间及超调量的波动范围显著减小。该方法为空间光学载荷的毫开尔文级高精度热控提供了一种新的控制范式。
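
The tuning loop described in the abstract (an ε-greedy Q-learning agent that picks the observer and controller bandwidths of a linear ADRC) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the first-order thermal node, its parameters, the heater limit, the bandwidth grid, the error-bucket state, and the reward (negative windowed IAE) are all hypothetical choices; only the two bandwidth ranges come from the abstract.

```python
import itertools
import random

# Candidate bandwidths inside the ranges quoted in the abstract; grid spacing is assumed.
OMEGA_O = [0.01, 0.05, 0.10, 0.20]  # observer bandwidth, rad/s
OMEGA_C = [0.01, 0.04, 0.07, 0.10]  # controller bandwidth, rad/s
ACTIONS = list(itertools.product(OMEGA_O, OMEGA_C))
N_STATES = 4


def error_bucket(err, edges=(0.01, 0.1, 1.0)):
    """Map |tracking error| in K to a state index 0..3 (bucket edges are assumed)."""
    return sum(abs(err) > edge for edge in edges)


def epsilon_greedy(q_row, eps, rng):
    """Explore with probability eps, otherwise take the highest-valued action."""
    if rng.random() < eps:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """One-step tabular Q-learning update."""
    q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])


def run_episode(q, eps, rng, steps=2000, hold=50, dt=1.0):
    """Regulate a toy thermal node with linear ADRC, retuning every `hold` steps."""
    cap, res, t_env, ref = 100.0, 1.0, 290.0, 293.0  # J/K, K/W, K, K (hypothetical)
    b0 = 1.0 / cap                                   # nominal input gain
    temp, z1, z2 = t_env, t_env, 0.0                 # plant state and ESO states
    s = error_bucket(ref - temp)
    a = epsilon_greedy(q[s], eps, rng)
    window_cost = iae = 0.0
    for k in range(steps):
        wo, wc = ACTIONS[a]
        # Control law: proportional action on the estimate minus the disturbance estimate,
        # clamped to an assumed 0..50 W heater range.
        u = min(max((wc * (ref - z1) - z2) / b0, 0.0), 50.0)
        # Second-order linear extended state observer, forward-Euler step.
        e = temp - z1
        z1 += dt * (z2 + b0 * u + 2.0 * wo * e)
        z2 += dt * (wo ** 2 * e)
        # Plant step with a 0.5 W step disturbance in the second half of the run.
        dist = 0.5 if k >= steps // 2 else 0.0
        temp += dt * ((t_env - temp) / res + u + dist) / cap
        err = abs(ref - temp)
        window_cost += err * dt
        iae += err * dt
        if (k + 1) % hold == 0:
            # Reward the held bandwidth pair with the negative IAE of its window.
            s_next = error_bucket(ref - temp)
            q_update(q, s, a, -window_cost, s_next)
            s, a = s_next, epsilon_greedy(q[s_next], eps, rng)
            window_cost = 0.0
    return iae


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for episode in range(20):
        iae = run_episode(q_table, eps=0.2, rng=rng)
    print("IAE of last episode (K*s):", iae)
```

Holding each bandwidth pair for a fixed window before updating the table is one simple way to give the agent a meaningful reward signal; the paper may use a different state, reward, or update schedule.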

Memo

Received: 2025-04-27; Revised: 2025-10-17.
Biography: GU Yanping (born 1986), female, Ph.D., senior research fellow, gyp0523@163.com.
Foundation items: The National Key R&D Program of China (No.2022YFB3902902), the National Natural Science Foundation of China (No.52276003).
Citation: GU Yanping, ZHANG Hao, XU Tao, et al. Temperature regulation of an optomechanical frame based on reinforcement learning active disturbance rejection control[J]. Journal of Southeast University (English Edition), 2026, 42(1): 112-120. DOI: 10.3969/j.issn.1003-7985.2026.01.011.
Last Update: 2026-03-20