This repository contains the code and resources for my Bachelor Thesis: Non-Stationarity in the Train-Scheduling-Problem: Leveraging Effects of Curriculum Learning
- This thesis is the basis of the following paper: Mitigating the Stability-Plasticity Dilemma in Adaptive Train Scheduling with Curriculum-Driven Continual DQN Expansion (Code Here!)
- The simulator used is Flatland-RL by AICrowd
Trains are a long-established means of transportation and supply chain management, and they remain of the utmost importance for the global and local transport of goods and people to this day. Given this significance, this bachelor thesis delves into the intricacies of train scheduling: the problem of controlling multiple trains and adapting to unforeseen disruptions.
While Operations Research methodologies currently dominate train scheduling, their limitations in adaptability and computational efficiency motivate the exploration of alternative approaches.
This thesis investigates the application of Multi-Agent Reinforcement Learning to the train scheduling problem and highlights the advantages of designing training curricula derived from a deconstruction of that problem.
By utilizing this custom curriculum, we were able to improve the mean done rate (the proportion of trains reaching their destination) of a DDDQN algorithm by about 160% compared to using no curriculum.
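
As a rough illustration of the idea (not the thesis code: the stage parameters, the `Stage` and `advance` names, and the 90% advancement threshold below are all hypothetical), such a curriculum can be expressed as a list of progressively harder Flatland configurations together with a rule for moving to the next stage:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    """One curriculum stage: a progressively harder Flatland configuration."""
    width: int
    height: int
    n_agents: int
    n_cities: int

# Hypothetical stages: start with a single train on a small map and
# scale up towards the full multi-agent scheduling problem.
CURRICULUM = [
    Stage(width=25, height=25, n_agents=1, n_cities=2),
    Stage(width=30, height=30, n_agents=2, n_cities=2),
    Stage(width=35, height=35, n_agents=4, n_cities=3),
    Stage(width=40, height=40, n_agents=8, n_cities=4),
]

def advance(stage_idx: int, mean_done_rate: float, threshold: float = 0.9) -> int:
    """Move to the next stage once the agent solves the current stage
    reliably enough (the threshold is an illustrative choice)."""
    if mean_done_rate >= threshold and stage_idx < len(CURRICULUM) - 1:
        return stage_idx + 1
    return stage_idx
```

Every stage transition changes the environment distribution the agent trains on, which is exactly the non-stationarity addressed next.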
We further explore adaptations to the DDDQN that address the non-stationarity introduced by the changing environments of the curriculum, in order to better leverage its positive effects.
The adaptation that improved the mean done rate the most, when evaluated in an environment that was not part of the training data, was the use of rational Padé activation units (a type of learnable activation function), which increased the mean done rate by roughly 232%; elastic weight consolidation also yielded an improvement of 195%. Both show that we can leverage the effects of a curriculum by using adaptations to non-stationarity that are commonly employed in continual/lifelong reinforcement learning settings.
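
For concreteness, below is a minimal PyTorch sketch of these two adaptations. It is not the thesis implementation: the names `PadeActivation` and `ewc_penalty`, the polynomial degrees (m=5, n=4), the initialization, and the penalty strength `lam` are illustrative assumptions. The rational activation follows the "safe" Padé form P(x) / (1 + |Q(x)|), so the denominator can never reach zero, and the EWC term penalizes moving parameters that were important for earlier curriculum stages:

```python
import torch
import torch.nn as nn

class PadeActivation(nn.Module):
    """Rational (Padé) activation f(x) = P(x) / Q(x) with learnable
    coefficients. "Safe" variant: Q(x) = 1 + |b_1 x + ... + b_n x^n|,
    so the denominator stays strictly positive."""

    def __init__(self, m: int = 5, n: int = 4):
        super().__init__()
        self.num = nn.Parameter(0.1 * torch.randn(m + 1))  # a_0 .. a_m
        self.den = nn.Parameter(0.1 * torch.randn(n))      # b_1 .. b_n

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Horner evaluation of P(x) = a_0 + a_1 x + ... + a_m x^m
        p = torch.zeros_like(x)
        for a in reversed(self.num):
            p = p * x + a
        # Horner evaluation of b_1 x + ... + b_n x^n
        q = torch.zeros_like(x)
        for b in reversed(self.den):
            q = (q + b) * x
        return p / (1.0 + q.abs())


def ewc_penalty(model: nn.Module,
                fisher: dict[str, torch.Tensor],
                old_params: dict[str, torch.Tensor],
                lam: float = 1000.0) -> torch.Tensor:
    """Elastic weight consolidation: a quadratic penalty, weighted by the
    diagonal Fisher information, that keeps parameters important for
    earlier curriculum stages close to their previous values."""
    loss = torch.tensor(0.0)
    for name, param in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return (lam / 2.0) * loss
```

In a setup like the one described above, `PadeActivation` would replace the fixed activations (e.g. ReLU) in the DDDQN's layers, and the EWC term would be added to the TD loss whenever the curriculum switches stages.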
The insights gained contribute to making RL more applicable to logistics and supply chain management tasks, enhancing their efficiency and adaptability; however, since the results display high variance, further investigation is needed.