Pre-training with asynchronous supervised learning for reinforcement learning based autonomous driving Research Articles
Yunpeng Wang, Kunxian Zheng, Daxin Tian, Xuting Duan, Jianshan Zhou, ypwang@buaa.edu.cn, zhengkunxian@buaa.edu.cn, dtian@buaa.edu.cn, duanxuting@buaa.edu.cn
Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5, Pages 615-766 doi: 10.1631/FITEE.1900637
Keywords: Self-driving; Autonomous vehicles; Reinforcement learning; Supervised learning
MDLB: a metadata dynamic load balancing mechanism based on reinforcement learning Research Articles
Zhao-qi Wu, Jin Wei, Fan Zhang, Wei Guo, Guang-wei Xie, 17034203@qq.com
Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 7, Pages 963-1118 doi: 10.1631/FITEE.1900121
Keywords: Object-oriented storage system; Metadata; Dynamic load balancing; Reinforcement learning; Q_learning
Decentralized multi-agent reinforcement learning with networked agents: recent advances Review Article
Kaiqing Zhang, Zhuoran Yang, Tamer Başar, kzhang66@illinois.edu, zy6@princeton.edu, basar1@illinois.edu
Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 6, Pages 802-814 doi: 10.1631/FITEE.1900661
Keywords: Reinforcement learning; Multi-agent systems; Networked systems; Consensus optimization; Distributed optimization; Game theory
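Actor–Critic Reinforcement Learning and Application in Developing Computer-Vision-Based Interface Tracking
Oguzhan Dogru, Kirubakaran Velswamy, Biao Huang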
Engineering 2021, Volume 7, Issue 9, Pages 1248-1261 doi: 10.1016/j.eng.2021.04.027
This paper synchronizes control theory with computer vision by formalizing object tracking as a sequential decision-making process. A reinforcement learning (RL) agent successfully tracks an interface between two liquids, which is often a critical variable to track in many chemical, petrochemical, metallurgical, and oil industries. This method utilizes fewer than 100 images for creating an environment, from which the agent generates its own data without the need for expert knowledge. Unlike supervised learning (SL) methods that rely on a huge number of parameters, this approach requires far fewer parameters, which naturally reduces its maintenance cost. Besides its frugal nature, the agent is robust to environmental uncertainties such as occlusion, intensity changes, and excessive noise. From a closed-loop control context, an interface location-based deviation is chosen as the optimization goal during training. The methodology showcases RL for real-time object-tracking applications in the oil sands industry. Along with a presentation of the interface tracking problem, this paper provides a detailed review of one of the most effective RL methodologies: actor–critic policy.
Keywords: Interface tracking; Object tracking; Occlusion; Reinforcement learning; Uniform manifold approximation and projection
A home energy management approach using decoupling value and policy in reinforcement learning
熊珞琳,唐漾,刘臣胜,毛帅,孟科,董朝阳,钱锋
Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 9, Pages 1261-1272 doi: 10.1631/FITEE.2200667
Keywords: Home energy system; Electric vehicle; Reinforcement learning; Generalization
Embedding expert demonstrations into clustering buffer for effective deep reinforcement learning Research Article
Shihmin WANG, Binqi ZHAO, Zhengfeng ZHANG, Junping ZHANG, Jian PU
Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 11, Pages 1541-1556 doi: 10.1631/FITEE.2300084
Keywords: Reinforcement learning; Sample efficiency; Sampling process; Clustering methods; Autonomous driving
A self-supervised method for treatment recommendation in sepsis Research Articles
Sihan Zhu, Jian Pu, jianpu@fudan.edu.cn
Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 7, Pages 926-939 doi: 10.1631/FITEE.2000127
Keywords: Treatment recommendation; Sepsis; Self-supervised learning; Reinforcement learning; Electronic medical records
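Toward Human-in-the-loop AI: Enhancing Deep Reinforcement Learning Via Real-time Human Guidance for Autonomous Driving
Jingda Wu, Zhiyu Huang, Zhongxu Hu, Chen Lv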
Engineering 2023, Volume 21, Issue 2, Pages 75-91 doi: 10.1016/j.eng.2022.05.017
Due to its limited intelligence and abilities, machine learning is currently unable to handle various situations and thus cannot completely replace humans in real-world applications. Because humans exhibit robustness and adaptability in complex scenarios, it is crucial to introduce humans into the training loop of artificial intelligence (AI), leveraging human intelligence to further advance machine learning algorithms. In this study, a real-time human-guidance-based (Hug)-deep reinforcement learning (DRL) method is developed for policy training in an end-to-end autonomous driving case. With our newly designed mechanism for control transfer between humans and automation, humans are able to intervene and correct the agent's unreasonable actions in real time when necessary during the model training process. Based on this human-in-the-loop guidance mechanism, an improved actor-critic architecture with modified policy and value networks is developed. The fast convergence of the proposed Hug-DRL allows real-time human guidance actions to be fused into the agent's training loop, further improving the efficiency and performance of DRL. The developed method is validated by human-in-the-loop experiments with 40 subjects and compared with other state-of-the-art learning approaches. The results suggest that the proposed method can effectively enhance the training efficiency and performance of the DRL algorithm under human guidance without imposing specific requirements on participants' expertise or experience.
Keywords: Human-in-the-loop AI; Deep reinforcement learning; Human guidance; Autonomous driving
Cooperative channel assignment for VANETs based on multiagent reinforcement learning Research Articles
Yun-peng Wang, Kun-xian Zheng, Da-xin Tian, Xu-ting Duan, Jian-shan Zhou, ypwang@buaa.edu.cn, zhengkunxian@buaa.edu.cn, dtian@buaa.edu.cn, duanxuting@buaa.edu.cn
Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 7, Pages 1047-1058 doi: 10.1631/FITEE.1900308
Keywords: Vehicular ad-hoc networks; Reinforcement learning; Dynamic channel assignment; Multichannel
Stochastic pedestrian avoidance for autonomous vehicles using hybrid reinforcement learning Research Article
Huiqian LI, Jin HUANG, Zhong CAO, Diange YANG, Zhihua ZHONG, lihq20@mails.tsinghua.edu.cn, huangjin@tsinghua.edu.cn, caoc15@mails.tsinghua.edu.cn, ydg@tsinghua.edu.cn
Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 1, Pages 131-140 doi: 10.1631/FITEE.2200128
Keywords: Pedestrian; Hybrid reinforcement learning; Autonomous vehicles; Decision-making
Coach-assisted multi-agent reinforcement learning framework for unexpected crashed agents Research Article
Jian ZHAO, Youpeng ZHAO, Weixun WANG, Mingyu YANG, Xunhan HU, Wengang ZHOU, Jianye HAO, Houqiang LI
Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 7, Pages 1032-1042 doi: 10.1631/FITEE.2100594
Keywords: Multi-agent system; Reinforcement learning; Unexpected crashed agents
Behavioral control task supervisor with memory based on reinforcement learning for human–multi-robot coordination systems Research Article
Jie HUANG, Zhibin MO, Zhenyi ZHANG, Yutao CHEN, yutao.chen@fzu.edu.cn
Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 8, Pages 1174-1188 doi: 10.1631/FITEE.2100280
Keywords: Human–multi-robot coordination systems; Null-space-based behavioral control; Task supervisor; Reinforcement learning; Knowledge base
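Optimal Bidding and Operation of a Power Plant with Solvent-Based Carbon Capture under a CO2 Allowance Market: A Solution with a Reinforcement Learning-Based Sarsa Temporal-Difference Algorithm
Ziang Li, Zhengtao Ding, Meihong Wang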
Engineering 2017, Volume 3, Issue 2, Pages 257-265 doi: 10.1016/J.ENG.2017.02.014
In this paper, a reinforcement learning (RL)-based Sarsa temporal-difference (TD) algorithm is applied to search for a unified bidding and operation strategy for a coal-fired power plant with monoethanolamine (MEA)-based post-combustion carbon capture under different carbon dioxide (CO2) allowance market conditions. The objective of the decision maker for the power plant is to maximize the discounted cumulative profit during the power plant lifetime. Two constraints are considered for the objective formulation. Firstly, the tradeoff between the energy-intensive carbon capture and the electricity generation should be made under presumed fixed fuel consumption. Secondly, the CO2 allowances purchased from the CO2 allowance market should be approximately equal to the quantity of CO2 emission from power generation. Three case studies are demonstrated thereafter. In the first case, we show the convergence of the Sarsa TD algorithm and find a deterministic optimal bidding and operation strategy. In the second case, compared with the independently designed operation and bidding strategies discussed in most of the relevant literature, the Sarsa TD-based unified bidding and operation strategy with time-varying flexible market-oriented CO2 capture levels is demonstrated to help the power plant decision maker gain a higher discounted cumulative profit. In the third case, a competitor operating another power plant identical to the preceding plant is considered under the same CO2 allowance market. The competitor also has carbon capture facilities but applies a different strategy to earn profits. The discounted cumulative profits of the two power plants are then compared, thus exhibiting the competitiveness of the power plant that is using the unified bidding and operation strategy explored by the Sarsa TD algorithm.
Keywords: Power plants; Post-combustion carbon capture; Chemical absorption; CO2 allowance market; Optimal decision-making; Reinforcement learning
Multi-agent deep reinforcement learning for end–edge orchestrated resource allocation in industrial wireless networks Research Article
Xiaoyu LIU, Chi XU, Haibin YU, Peng ZENG, liuxiaoyu1@sia.cn, xuchi@sia.cn, yhb@sia.cn, zp@sia.cn
Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 1, Pages 47-60 doi: 10.1631/FITEE.2100331
Keywords: Multi-agent deep reinforcement learning; End–edge orchestrated; Industrial wireless networks; Delay; Energy consumption
Proximal policy optimization with an integral compensator for quadrotor control Research
Huan Hu, Qing-ling Wang, qlwang@seu.edu.cn
Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 5, Pages 649-808 doi: 10.1631/FITEE.1900641
Keywords: Reinforcement learning; Proximal policy optimization; Quadrotor control; Neural networks
Actor–Critic Reinforcement Learning and Application in Developing Computer-Vision-Based Interface Tracking
Oguzhan Dogru, Kirubakaran Velswamy, Biao Huang
Journal Article
Toward Human-in-the-loop AI: Enhancing Deep Reinforcement Learning Via Real-time Human Guidance for Autonomous Driving
Jingda Wu, Zhiyu Huang, Zhongxu Hu, Chen Lv
Journal Article
Optimal Bidding and Operation of a Power Plant with Solvent-Based Carbon Capture under a CO2 Allowance Market: A Solution with a Reinforcement Learning-Based Sarsa Temporal-Difference Algorithm
Ziang Li,Zhengtao Ding,Meihong Wang
Journal Article