Event

Optimising the Beer Distribution Game: a Reinforcement Learning and Monte Carlo Tree Search Approach.

Thu 9 Dec 21

14:00 - 15:00
Online

Zoom
Event speaker

Felipe Maldonado
Event type

Lectures, talks and seminars
ED-3S
Event organiser

Mathematics, Statistics and Actuarial Science, School of
Contact details

Osama Mahmoud o.mahmoud@essex.ac.uk

These Departmental Seminars are for everyone in Maths. We encourage anyone interested in the subject in general, or in the particular subject of the seminar, to come along. It's a great opportunity to meet people in the Maths Department and join in with our community.

The Beer Distribution Game (BDG) was originally introduced by Professor Jay Wright Forrester at MIT in 1960. Ever since, it has served as a model for multi-echelon supply chains, where it has mainly been studied using classical Operational Research techniques.

Some early attempts to apply AI methodologies consisted of tabular Q-learning approaches in simplified settings (Chaharsooghi et al. 2008; Mortazavi et al. 2015), but these reached their limits in more realistic environments due to the large state spaces and the resulting "Curse of Dimensionality". One idea to break this curse is to introduce function approximation for the action value Q of any given action in any given state, instead of storing the value for each state-action pair in a huge table. This was done, for example, by Oroojlooyjadid et al. (2017) and Geevers et al. (2020), who introduced Deep Neural Networks for this task.
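To make the contrast between a Q-table and function approximation concrete, here is a minimal sketch. It is not the method of any of the cited papers: the four-actor setup, the discretisation into 21 inventory levels, the five order quantities, the hand-picked features and the linear (rather than deep) approximator are all illustrative assumptions.

```python
# Hypothetical toy setting (all numbers are assumptions for illustration).
N_ACTORS = 4   # echelons in the supply chain
LEVELS = 21    # discretised inventory levels 0..20 per actor
ACTIONS = 5    # possible order quantities 0..4


def tabular_table_size(n_actors=N_ACTORS, levels=LEVELS, actions=ACTIONS):
    """Number of Q-values a table must store for the joint inventory state."""
    return (levels ** n_actors) * actions


def features(state, action):
    """Assumed feature vector: bias, scaled inventories, scaled action."""
    return [1.0] + [s / (LEVELS - 1) for s in state] + [action / (ACTIONS - 1)]


def q_value(weights, state, action):
    """Linear approximation of Q(state, action)."""
    return sum(w * f for w, f in zip(weights, features(state, action)))


def q_update(weights, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step on the linear approximator."""
    best_next = max(q_value(weights, next_state, a) for a in range(ACTIONS))
    td_error = reward + gamma * best_next - q_value(weights, state, action)
    phi = features(state, action)
    return [w + alpha * td_error * f for w, f in zip(weights, phi)]


if __name__ == "__main__":
    print("table entries:", tabular_table_size())
    print("approximator weights:", len(features((0, 0, 0, 0), 0)))
```

Under these assumptions the table needs close to a million entries and grows exponentially with the number of actors, while the approximator keeps a handful of weights. A deep network replaces the linear map, but the update follows the same temporal-difference idea.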

In our research we focus on another approach: (traditional) Reinforcement Learning and Monte Carlo tree search (MCTS) algorithms. While the most famous application of these methods is building intelligent agents for board games like Chess or Go (Silver, Schrittwieser, et al. 2017), they have also strongly impacted other domains that can be modelled as trees of sequential decisions (Browne et al. 2012).
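For readers unfamiliar with MCTS, the sketch below shows the selection and backpropagation steps of the widely used UCT variant, in which each child is scored by its mean reward plus an exploration bonus. It is a generic illustration, not the speaker's implementation: the `Node` class, the function names and the exploration constant are assumptions, and the expansion and simulation steps are omitted.

```python
import math


class Node:
    """A search-tree node; `action` is the decision that led to it."""

    def __init__(self, action=None, parent=None):
        self.action = action
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0


def uct_score(child, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # unvisited actions are tried first
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(node):
    """Pick the child with the highest UCT score."""
    return max(node.children, key=lambda ch: uct_score(ch, node.visits))


def backpropagate(node, reward):
    """Add a simulation result to every node on the path back to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent
```

In an inventory setting the reward would typically be a negative cost, rescaled so that the exploration bonus stays on a comparable scale.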

In a recent study, Preil and Krapp (2021) applied MCTS to inventory management for the first time and found that it outperformed the AI-based approaches explored previously. In their research, the authors consider an adapted version of the BDG in which the states are fully observable and all decisions are made centrally. In this talk I will present a model for the classical BDG environment with imperfect information. This setting models a multi-echelon supply chain where actors make decentralised decisions about their specific order quantities without being able to observe the inventory levels of the other actors.
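The decentralised, imperfect-information setting can be illustrated with a small simulation in which each actor sees only its own inventory, backlog, incoming shipments and incoming order. This is a simplified sketch rather than the model presented in the talk: the cost values, the two-period shipping delay, the absence of an order-information delay, the initial stock levels and the base-stock ordering rule are all assumptions.

```python
from collections import deque

HOLDING_COST = 0.5  # per unit per period (assumed)
BACKLOG_COST = 1.0  # per unit per period (assumed)
SHIP_DELAY = 2      # periods between shipping and arrival (assumed)


class Actor:
    """One echelon: retailer, wholesaler, distributor or factory."""

    def __init__(self, inventory=12):
        self.inventory = inventory
        self.backlog = 0
        # Shipments already on their way to this actor (assumed initial flow).
        self.pipeline = deque([4] * SHIP_DELAY)

    def observe(self, incoming_order):
        """Local observation only; other actors' inventories stay hidden."""
        return (self.inventory, self.backlog, sum(self.pipeline), incoming_order)


def base_stock_policy(obs, target=20):
    """Order up to a target inventory position (a simple stand-in policy)."""
    inventory, backlog, in_transit, _ = obs
    return max(0, target - (inventory - backlog + in_transit))


def step(chain, customer_demand, policy=base_stock_policy):
    """Advance one period; chain[0] is the retailer, chain[-1] the factory."""
    cost = 0.0
    order = customer_demand
    for i, actor in enumerate(chain):
        actor.inventory += actor.pipeline.popleft()  # receive arriving shipment
        owed = order + actor.backlog
        shipped = min(owed, actor.inventory)
        actor.inventory -= shipped
        actor.backlog = owed - shipped
        if i > 0:
            chain[i - 1].pipeline.append(shipped)  # ship to downstream actor
        cost += HOLDING_COST * actor.inventory + BACKLOG_COST * actor.backlog
        order = policy(actor.observe(order))  # demand faced by the next actor
    chain[-1].pipeline.append(order)  # factory production, unlimited materials
    return cost


if __name__ == "__main__":
    chain = [Actor() for _ in range(4)]
    for period in range(5):
        print(period, step(chain, customer_demand=4))
```

A learning agent or an MCTS planner would replace `base_stock_policy`, receiving the same local observation tuple and returning an order quantity.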

Speaker

Felipe Maldonado, University of Essex

How to attend

If you are not a member of the Department of Mathematical Sciences at the University of Essex, you can register your interest in attending the seminar and request the Zoom meeting password by emailing Dr Osama Mahmoud (o.mahmoud@essex.ac.uk).