Version 1
Received: 1 September 2024 / Approved: 2 September 2024 / Online: 3 September 2024 (11:50:48 CEST)
How to cite:
Wrede, K.; Zarnack, S.; Lange, R.; Donath, O.; Wohlfahrt, T.; Feldmann, U. Curriculum Design and Sim2Real Transfer for Reinforcement Learning in Robotic Dual-Arm Assembly. Preprints 2024, 2024090214. https://doi.org/10.20944/preprints202409.0214.v1
APA Style
Wrede, K., Zarnack, S., Lange, R., Donath, O., Wohlfahrt, T., & Feldmann, U. (2024). Curriculum Design and Sim2Real Transfer for Reinforcement Learning in Robotic Dual-Arm Assembly. Preprints. https://doi.org/10.20944/preprints202409.0214.v1
Chicago/Turabian Style
Wrede, K., Tommy Wohlfahrt, and Ute Feldmann. 2024. "Curriculum Design and Sim2Real Transfer for Reinforcement Learning in Robotic Dual-Arm Assembly." Preprints. https://doi.org/10.20944/preprints202409.0214.v1
Abstract
Robotic systems are crucial in modern manufacturing. Complex assembly tasks require the collaboration of multiple robots, whose orchestration is challenging due to tight tolerances and precision requirements. In this work, we set up two Franka Panda robots to perform a peg-in-hole insertion task. We structure the control system hierarchically: a central policy trained with reinforcement learning plans the robots' trajectories based on feedback, and a low-level impedance controller on each robot executes them. To improve training convergence, we use reverse curriculum learning combined with domain randomization, varying the initial configurations of the task. After training, we test the system in simulation and study the impact of curriculum parameters on emerging process characteristics such as process time and variance. Finally, we transfer the trained model to a real-world setup and compare the results with simulation as well as with classical path planning and control approaches.
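As a rough illustration of the reverse curriculum idea mentioned in the abstract, the sketch below starts episodes close to the goal (peg almost inserted) and moves the initial configuration farther away once the policy succeeds often enough. It is not the authors' implementation: the class name `ReverseCurriculum`, all parameter names, and all numeric values are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ReverseCurriculum:
    # All values below are hypothetical placeholders, not taken from the paper.
    max_distance: float = 0.10       # farthest allowed start distance from the goal pose (m)
    step: float = 0.01               # growth of the start distance per promotion (m)
    success_threshold: float = 0.8   # success rate needed to increase difficulty
    noise: float = 0.002             # bound of the randomized lateral offset (m)
    distance: float = 0.01           # current upper bound of the start distance (m)

    def sample_start(self, rng: random.Random) -> tuple[float, float]:
        """Sample an initial configuration for one episode.

        Returns the start distance along the insertion axis and a randomized
        lateral offset (a simple stand-in for domain randomization).
        """
        d = rng.uniform(0.0, self.distance)
        lateral = rng.uniform(-self.noise, self.noise)
        return d, lateral

    def update(self, success_rate: float) -> float:
        """Move the start region farther from the goal if the policy is good enough."""
        if success_rate >= self.success_threshold:
            self.distance = min(self.max_distance, self.distance + self.step)
        return self.distance
```

In a training loop, `sample_start` would be called at every environment reset and `update` after each evaluation window; the threshold and step size correspond to the kind of curriculum parameters whose influence on process characteristics the abstract refers to.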
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.