contributor author | Gupta, Shreyash | |
contributor author | Tripathy, Niladri S. | |
contributor author | Shah, Suril V. | |
date accessioned | 2025-08-20T09:26:55Z | |
date available | 2025-08-20T09:26:55Z | |
date copyright | 2025-03-28 | |
date issued | 2025 | |
identifier issn | 0022-0434 | |
identifier other | ds_147_04_041013.pdf | |
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4308294 | |
description abstract | Multirobot systems (MRS) consist of multiple autonomous robots that collaborate to perform tasks more efficiently than single-robot systems. These systems enhance flexibility, enabling applications in areas such as environmental monitoring, search and rescue, and agricultural automation, while addressing challenges related to coordination, communication, and task assignment. Model predictive control (MPC) stands out as a promising controller for multirobot control due to its preview capability and effective constraint handling. However, MPC's performance heavily relies on the chosen length of the prediction horizon. Extending the prediction horizon significantly raises computation costs, making its tuning time-consuming and task-specific. To address this challenge, we introduce a framework that uses a collective reinforcement learning (RL) strategy to generate the prediction horizon dynamically based on the states of the robots. We propose that the prediction horizon of any robot in the MRS depends on the states of all the robots. Additionally, we propose a versatile on-demand collision avoidance (VODCA) strategy to enable on-the-fly collision avoidance for multiple robots operating under varying prediction horizons. This approach establishes a better tradeoff between performance and computation cost, allowing an adaptable prediction horizon for each robot at every time step. Numerical studies are performed to investigate the scalability of the proposed framework, the stiffness of the learned RL policy, and its comparison with fixed-horizon and existing variable-horizon MPC methods. The framework is also implemented on multiple TurtleBot3 Waffle Pi robots for various multirobot tasks. | |
publisher | The American Society of Mechanical Engineers (ASME) | |
title | Reinforcement Learning-Based Variable Horizon Model Predictive Control of Multirobot Systems in Dynamic Environments | |
type | Journal Paper | |
journal volume | 147 | |
journal issue | 4 | |
journal title | Journal of Dynamic Systems, Measurement, and Control | |
identifier doi | 10.1115/1.4068043 | |
journal firstpage | 41013-1 | |
journal lastpage | 41013-15 | |
page | 15 | |
tree | Journal of Dynamic Systems, Measurement, and Control; 2025; volume 147; issue 4 | |
contenttype | Fulltext | |