Introduction to MDP Modeling and Interaction via RDDL and pyRDDLGym

AAAI-2024 Vancouver


RDDL (pronounced "riddle") stands for the Relational Dynamic Influence Diagram Language. It is the domain modeling language used for the Probabilistic Planning and Reinforcement Learning track of the International Planning Competitions, held at the International Conference on Automated Planning and Scheduling (ICAPS) in 2011, 2014, 2018, and most recently in 2023. RDDL was designed to efficiently represent real-world stochastic planning problems, specifically Markov Decision Processes (MDPs), with a focus on factored MDPs characterized by highly structured transition and reward functions.

This tutorial aims to provide a basic understanding of RDDL, including recent language enhancements and features, through a practical example. We will introduce a problem based on a real-world scenario and incrementally build up its representation in RDDL, starting from the raw mathematical equations and ending with a full RDDL description. We will also introduce "pyRDDLGym," a new Python framework for generating Gym environments from RDDL descriptions. This facilitates interaction with Reinforcement Learning (RL) agents via the standard Gym interface and enables planning agents to work directly with the model.

In a series of exercises, we will explore the capabilities of pyRDDLGym, including generation of Dynamic Bayesian Networks (DBNs) and eXtended Algebraic Decision Diagram (XADD)-based conditional probability functions, as well as both generic and custom visualization options. We will also generate a functional environment for the example problem. To close the loop from a mathematical representation to a fully operational policy, we will use the built-in model-based anytime backpropagation planner, "JaxPlan," to obtain solutions for the example problem, bridging the gap between a theoretical description and a practical working policy.
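To give a flavor of the language before the tutorial, the sketch below shows the basic shape of an RDDL domain: typed objects, parameterized variables (pvariables), conditional probability functions (cpfs) for the next-state transition, and a reward expression. The reservoir scenario and all names in it (`level`, `release`, `MAX_CAP`, and so on) are illustrative assumptions, not the example problem used in the tutorial.

```rddl
// Hypothetical single-type reservoir domain, for illustration only.
domain simple_reservoir {
    types { reservoir : object; };

    pvariables {
        MAX_CAP(reservoir)   : { non-fluent,    real, default = 100.0 };
        RAIN_MEAN(reservoir) : { non-fluent,    real, default = 5.0 };
        level(reservoir)     : { state-fluent,  real, default = 50.0 };
        release(reservoir)   : { action-fluent, real, default = 0.0 };
    };

    cpfs {
        // Stochastic inflow minus controlled release, clipped to [0, capacity].
        level'(?r) = max[0.0, min[MAX_CAP(?r),
                        level(?r) + Normal(RAIN_MEAN(?r), 2.0) - release(?r)]];
    };

    // Penalize deviation from half capacity, summed over all reservoirs.
    reward = -sum_{?r : reservoir} abs[level(?r) - MAX_CAP(?r) / 2];
}
```

Given such a domain file plus an instance file declaring concrete objects and initial state, pyRDDLGym can construct a Gym-style environment from it, after which the usual reset/step interaction loop applies.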



Time Topic
8:30-9:30am Introduction to stochastic planning problem classes and languages
9:30-10:00am RDDL overview
10:00-10:30am Introduction to pyRDDLGym
10:30-11:00am Break
11:00am-12:30pm Hands-on and interactive exercises




Scott Sanner

Scott Sanner is an Associate Professor at the University of Toronto specializing in AI topics such as sequential decision-making, recommender systems, and machine/deep learning applications. He is an Associate Editor for three journals and a recipient of four paper awards and a Google Faculty Research Award.

Ayal Taitler

Ayal Taitler is a Postdoctoral Fellow at the University of Toronto, working with Prof. Scott Sanner. His research interests lie at the intersection of reinforcement learning, automated planning, and control theory, and their application to robotics and intelligent transportation. He has over 10 years of experience in software engineering and AI research.