'''Abstract:''' Railway companies face the following (simplified) problem: a fleet of trains must be routed and scheduled such that each train reaches its destination within a given time limit, while deadlocks and congestion are avoided, even when train failures temporarily block tracks.
The goal of this project/thesis is to solve the problem with either operations research techniques or reinforcement learning.

'''Your interests/skills:''' Route planning, railway networks, graphs, reinforcement learning (optional)

Train scheduling is the task of deciding when and how to route trains, such that each train reaches its destination in acceptable time. Trains may experience failures on their way, temporarily blocking some tracks, which makes it necessary to reroute the other trains.
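
To make the rerouting idea concrete, here is a minimal sketch that does not use Flatland at all. It assumes the network is a plain adjacency dictionary and that a failed train blocks a whole node; the function name `shortest_path`, the `blocked` parameter and the toy network are hypothetical and chosen only for illustration.

```python
from collections import deque


def shortest_path(graph, start, goal, blocked=frozenset()):
    """Breadth-first shortest path that never enters a node in `blocked`."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal unreachable while the blockage lasts


# Hypothetical toy network: two routes from A to E, via D or via B and C.
graph = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B", "E"],
    "D": ["A", "E"],
    "E": ["C", "D"],
}

planned = shortest_path(graph, "A", "E")                   # normal operation
rerouted = shortest_path(graph, "A", "E", blocked={"D"})   # a failed train blocks D
```

In this toy network the planned route runs via D, and the rerouted one takes the longer way via B and C once D is blocked.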

In this project/thesis you will use Flatland, a railway simulation environment developed by AIcrowd in cooperation with the railway agencies of Switzerland, Germany and France, which facilitates the development and evaluation of train scheduling algorithms on many different problem instances with varying sizes.

There are two ways to tackle train scheduling:
 1. Using Operations Research (OR): An algorithm globally optimizes the scheduling and (re-)routing of the trains, e.g. by solving an optimization problem, or heuristically based on shortest path computations.
 2. Using Reinforcement Learning (RL): Each train is an individual agent, and a scheduling and (re-)routing policy is learned from experience during many runs of a simulation.

The first part of this project/thesis is to develop a strong baseline using OR, similar to the winning approaches from the Flatland 2019 and 2020 competitions, and evaluate it using the Flatland environment. These winning approaches are not very complex: they are based on shortest path computations, combined with prioritization strategies and rerouting in case of train failures. Multiple prioritization heuristics can be implemented and compared, and the impact of each of the baseline's components on the overall performance can be evaluated. This will give many useful insights into the problem.
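
The combination of shortest paths and prioritization can be sketched as follows. This is not the winners' code and does not use the Flatland API; it is a simplified illustration under several assumptions: the network is an adjacency dictionary, every move takes one time step, all trains start at time 0 on distinct nodes, and a train leaves the network on arrival. The names `plan_train`, `prioritized_plan`, `reserved` and `horizon` are hypothetical. Trains are planned one after another in priority order, and each planned train reserves its (node, time) cells so that later trains must route around them.

```python
from collections import deque


def plan_train(graph, start, goal, reserved, horizon=50):
    """Breadth-first search over (node, time) states.

    Returns one node per time step, or None if no conflict-free path
    exists within `horizon` steps.
    """
    queue = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while queue:
        node, t, path = queue.popleft()
        if node == goal:
            return path
        if t == horizon:
            continue
        for nxt in [node] + graph[node]:  # staying on `node` models waiting
            if (nxt, t + 1) in seen or (nxt, t + 1) in reserved:
                continue  # already explored, or cell taken by another train
            other = reserved.get((nxt, t))
            if nxt != node and other is not None and other == reserved.get((node, t + 1)):
                continue  # head-on swap with a higher-priority train
            seen.add((nxt, t + 1))
            queue.append((nxt, t + 1, path + [nxt]))
    return None


def prioritized_plan(graph, trains, order, horizon=50):
    """Plan trains in the given priority order, reserving cells as we go."""
    reserved = {}  # (node, time) -> id of the train occupying it
    plans = {}
    for tid in order:
        start, goal = trains[tid]
        path = plan_train(graph, start, goal, reserved, horizon)
        plans[tid] = path  # a train without a path is simply left unplanned here
        if path is not None:
            for t, node in enumerate(path):
                reserved[(node, t)] = tid
    return plans


# Hypothetical single-track line A-B-C-D-E with a siding S next to B.
graph = {
    "A": ["B"], "B": ["A", "C", "S"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D"], "S": ["B"],
}
trains = {0: ("A", "E"), 1: ("E", "A")}

east_first = prioritized_plan(graph, trains, order=[0, 1])
west_first = prioritized_plan(graph, trains, order=[1, 0])
```

In this toy instance the priority order decides whether a solution is found at all: with train 0 first, train 1 gets no path, while with train 1 first, train 0 waits in the siding S and both trains arrive. A prioritization heuristic would replace the hand-written `order` lists, for example by sorting trains by shortest path length or by remaining slack.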

From here on, there are (at least) two possible ways to continue:
 1. Improve the OR approach: From your experience with part 1, develop ideas for improving the OR algorithm. Implement and evaluate the most promising idea (or multiple ideas).
 2. Develop an RL approach: Train an agent that is capable of scheduling and routing the trains, either with local perception (the agent observes only the part of the environment and the trains around it) or global perception (the agent observes the entire network and all other trains). Evaluate the RL approach and compare it to the baseline: what are the advantages and disadvantages of each?
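
The difference between local and global perception can be illustrated with a small sketch. It assumes the network is encoded as a 2D grid of integers and is not Flatland's observation API; the functions `global_observation` and `local_observation`, the `radius` and `pad` parameters, and the cell encoding are hypothetical.

```python
def global_observation(grid):
    """The agent sees a copy of the entire grid."""
    return [row[:] for row in grid]


def local_observation(grid, position, radius=1, pad=-1):
    """The agent sees a square window around its position.

    Cells outside the grid are filled with `pad`.
    """
    r0, c0 = position
    rows, cols = len(grid), len(grid[0])
    window = []
    for r in range(r0 - radius, r0 + radius + 1):
        row = []
        for c in range(c0 - radius, c0 + radius + 1):
            inside = 0 <= r < rows and 0 <= c < cols
            row.append(grid[r][c] if inside else pad)
        window.append(row)
    return window


# Hypothetical encoding: 0 = empty, 1 = track, 2 = this train, 3 = other train.
grid = [
    [0, 1, 0],
    [0, 2, 0],
    [0, 1, 3],
]

full_view = global_observation(grid)
corner_view = local_observation(grid, (0, 0), radius=1)
```

A local window keeps the input size independent of the network size, at the cost of hiding distant trains; a global view has the opposite trade-off.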

Useful material:
 * [[https://www.aicrowd.com/challenges/flatland-3|Flatland competition on AIcrowd]], including a guide on how to make a submission in 10 minutes.
 * [[https://slideslive.com/38942744/challenge-design-results|Introduction to Flatland at NeurIPS 2020]] (11-minute presentation)
 * [[https://arxiv.org/pdf/2012.05893.pdf|The arXiv 2020 paper]] introducing the Flatland challenge.
 * [[https://www.aicrowd.com/blogs/flatland-mugurel|Interview with the 2019 winner]]
 * [[https://slideslive.com/38942745/2020-flatland-challenge|Presentation by the 2020 winners]] (5 minutes). All NeurIPS 2020 talks can also be found on the competition website.