Inverse Reinforcement Learning with Hybrid-weight Trust-region Optimization and Curriculum Learning for Autonomous Maneuvering


Despite significant advances, collision-free navigation in autonomous driving remains challenging because the navigation module must balance learning and planning to control the vehicle efficiently and effectively. We propose a novel framework, inverse reinforcement learning with hybrid-weight trust-region optimization and curriculum learning (IRL-HC), for autonomous maneuvering. Our method incorporates both expert demonstrations (from real driving) and domain knowledge (hard constraints such as collision avoidance and goal reaching, encoded in reward functions) to learn an effective control policy. The hybrid-weight trust-region optimization determines the difficulty of the task curriculum for fast incremental curriculum learning and improves the efficiency of inverse reinforcement learning through hybrid weight tuning of different sets of hyperparameters. IRL-HC is also compatible with domain-dependent techniques such as learn-from-accident, which can further boost performance. Overall, IRL-HC reduces the number of collisions by up to 48%, increases training efficiency by 2.8x, and enables the vehicle to drive 10x farther compared with other methods.

International Conference on Intelligent Robots and Systems, 2022 (under review)