Exploiting Environment Configurability in Reinforcement Learning

Exploiting Environment Configurability in Reinforcement Learning
Author :
Publisher : IOS Press
Total Pages : 377
Release :
ISBN-13 : 9781643683638
ISBN-10 : 1643683632
Rating : 4/5 (32 Downloads)

Book Synopsis Exploiting Environment Configurability in Reinforcement Learning by : A.M. Metelli

Download or read book Exploiting Environment Configurability in Reinforcement Learning written by A.M. Metelli and published by IOS Press. This book was released on 2022-12-07 with total page 377 pages. Available in PDF, EPUB and Kindle. Book excerpt: In recent decades, Reinforcement Learning (RL) has emerged as an effective approach to address complex control tasks. In a Markov Decision Process (MDP), the framework typically used, the environment is assumed to be a fixed entity that cannot be altered externally. There are, however, several real-world scenarios in which the environment can be modified to a limited extent. This book, Exploiting Environment Configurability in Reinforcement Learning, aims to formalize and study diverse aspects of environment configuration. In a traditional MDP, the agent perceives the state of the environment and performs actions. As a consequence, the environment transitions to a new state and generates a reward signal. The goal of the agent consists of learning a policy, i.e., a prescription of actions that maximize the long-term reward. Although environment configuration arises quite often in real applications, the topic is very little explored in the literature. The contributions in the book are theoretical, algorithmic, and experimental and can be broadly subdivided into three parts. The first part introduces the novel formalism of Configurable Markov Decision Processes (Conf-MDPs) to model the configuration opportunities offered by the environment. The second part of the book focuses on the cooperative Conf-MDP setting and investigates the problem of finding an agent policy and an environment configuration that jointly optimize the long-term reward. The third part addresses two specific applications of the Conf-MDP framework: policy space identification and control frequency adaptation. The book will be of interest to all those using RL as part of their work.


Exploiting Environment Configurability in Reinforcement Learning Related Books