S2AN: Synthetic Score-based Attention Network. S2AN is an efficient network that learns agent priorities with high completeness for large-scale MAPF. More details can be found in our AAMAS 2024 paper [1].
# 1. Create a conda env with Python 3.8, torch 1.13 (do not use torch 2.0!), gym, ray, and tensorboard.
conda create -n s2an python=3.8
conda install pytorch==1.13.0
pip install gym tensorboard
pip install -U "ray[default]"
# 2. Test the gym_sipps environment.
python test_env.py
# You should see "Test SIPPS env through." printed at the end.
# If the provided solver.cpython-38-x86_64-linux-gnu.so does not work for you (i.e., the test above fails),
# you can build it from source as follows.
# 3. (Optional) Build the low-level solver from source.
# 3.1 install pybind11
conda install -c conda-forge pybind11
# 3.2 install eigen3
sudo apt install libeigen3-dev
# 3.3 build the target with cmake.
mkdir -p MAPF-LNS2/build
cd MAPF-LNS2/build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j6
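
If the build succeeds, the resulting pybind11 module should be importable from Python before rerunning test_env.py. A minimal sketch, assuming the built .so is on your PYTHONPATH or next to the script; the module name "solver" is an assumption taken from the provided solver.cpython-38-x86_64-linux-gnu.so filename:

```python
# Quick import check for the low-level solver module; "solver" is assumed
# from the provided solver.cpython-38-x86_64-linux-gnu.so filename.
import solver

print("solver module loaded from:", solver.__file__)
```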
- For training or testing:
# Training. You can configure the parameters in alg_parameters.py.
python train_ray.py
# Testing. --obs: obstacle rate. -o: output file name. Use your own model with -m /path/to/your/model.pth.
python test.py --obs 20 -o results/s2an/20obs.csv
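
To evaluate several obstacle rates in one run, test.py can be wrapped in a small driver. A sketch using only the flags documented above; the rates and output paths are illustrative choices:

```python
# Batch-run test.py over several obstacle rates via the documented flags.
# The rates and output paths below are illustrative assumptions.
import subprocess

for obs in (10, 20, 30):
    subprocess.run(
        ["python", "test.py", "--obs", str(obs),
         "-o", f"results/s2an/{obs}obs.csv"],
        check=True,  # abort the sweep if any run fails
    )
```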
- File structure:
  - benchmark: testing benchmarks generated by the function generate_benchmark in map_generator.py.
  - lowlevel_solvers: a Python wrapper for the SIPPS planner.
  - MAPF-LNS2: borrowed C++ implementation of the SIPPS planner.
  - nets: the main network of S2AN.
  - PPO_preTrained: pretrained model.
  - utils: visualization and useful torch-related functions.
  - gym_sipps.py: RL environment based on SIPPS (see the usage sketch after this list).
  - learning_sipps.py: the implemented RL algorithms (PPO and REINFORCE).
  - map_generator.py: generates the benchmark.
  - rollout_buffer.py: PPO-related rollout buffer.
  - runner.py: Ray functions.
  - test_env.py: tests the installation of MAPF-LNS2.
  - train_ray.py: training script.
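
Since gym_sipps.py wraps the SIPPS planner as a gym-style RL environment, interaction presumably follows the usual reset/step loop. A hypothetical sketch; the class name SippsEnv, its constructor arguments, and the action semantics are illustrative assumptions, not the repo's actual API:

```python
# Hypothetical gym-style loop; SippsEnv, its arguments, and the action
# semantics are assumptions for illustration, not the repo's actual API.
from gym_sipps import SippsEnv

env = SippsEnv(num_agents=32, map_size=32, obstacle_rate=0.2)
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # e.g., pick the next agent to prioritize
    obs, reward, done, info = env.step(action)
```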
- The low-level single-agent path planning algorithm SIPPS was borrowed from https://github.com/Jiaoyang-Li/MAPF-LNS2/tree/init-LNS. It is released under the USC – Research License; see that repo for more details.
- The transformer-based network builds on https://github.com/wouterkool/attention-learn-to-route (a generic sketch of attention-based priority decoding follows this list).
- The RL algorithms build on https://github.com/nikhilbarhate99/PPO-PyTorch and https://github.com/marmotlab/SCRIMP.
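
For orientation only: attention-based priority learning of this kind typically scores the not-yet-ranked agents against a context embedding and selects the next agent from a masked distribution. A generic PyTorch sketch of such a decoding step (not the S2AN implementation; all names and shapes are illustrative):

```python
# Generic masked attention scoring over agents; illustrative only,
# not the actual S2AN network.
import torch

def next_agent(agent_emb: torch.Tensor, context: torch.Tensor,
               ranked: torch.Tensor) -> int:
    """Score unranked agents against the context; return the greedy pick."""
    scores = agent_emb @ context / agent_emb.shape[-1] ** 0.5  # (n_agents,)
    scores = scores.masked_fill(ranked, float("-inf"))  # hide ranked agents
    return int(torch.softmax(scores, dim=-1).argmax().item())

agent_emb = torch.randn(8, 16)   # per-agent embeddings
context = torch.randn(16)        # decoder context embedding
ranked = torch.zeros(8, dtype=torch.bool)
order = []
for _ in range(8):
    i = next_agent(agent_emb, context, ranked)
    ranked[i] = True
    order.append(i)
print("decoded priority order:", order)
```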
[1] Yibin Yang, Mingfeng Fan, Chengyang He, Jianqiang Wang, Heye Huang, and Guillaume Sartoretti. 2024. Attention-based Priority Learning for Limited Time Multi-Agent Path Finding. In Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024), Auckland, New Zealand, May 6 – 10, 2024, IFAAMAS, 9 pages.