# ON_OFF_DRL

Code for the paper *On-Policy vs. Off-Policy Deep Reinforcement Learning for Resource Allocation in Open Radio Access Network*.

- It has been tested on Windows 10 with Python 3.8.3.

- To run the code, you mainly need (a quick environment check follows this list):
  - `pip install torch`
  - `pip install gurobipy`
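
As a quick sanity check that both dependencies installed correctly, you can run a snippet like the following (both version calls are standard APIs of the two packages):

```python
# Verify that the two main dependencies are importable and report versions.
import torch
import gurobipy

print("PyTorch version:", torch.__version__)
print("Gurobi version: %d.%d.%d" % gurobipy.gurobi.version())
print("CUDA available:", torch.cuda.is_available())
```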
- To avoid the long training time, you can go directly to the `tests` folder and run (a plotting sketch follows this list):
  - `reward_plot.py` to get the step and episode reward figures (first two figures)
  - `stability_plot.py` to get the ACER and PPO reward figures for different NN architectures
  - `plot.py` to get the energy and energy-per-latency figures (last two figures)
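
The exact format of the saved reward files is defined by the training scripts; purely as an illustration of what these plotting scripts do, here is a minimal sketch assuming one run's episode rewards are stored as a NumPy array (the file name `ACER_files/episode_rewards_64.npy` is hypothetical):

```python
# Minimal plotting sketch -- the file name and format are assumptions,
# not the repository's actual layout; see reward_plot.py for the real code.
import numpy as np
import matplotlib.pyplot as plt

rewards = np.load("ACER_files/episode_rewards_64.npy")  # hypothetical file

plt.plot(rewards, label="ACER (64 units)")
plt.xlabel("Episode")
plt.ylabel("Reward")
plt.legend()
plt.savefig("episode_reward.png")
```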
- To start training from scratch, you need to generate the reward files and the trained model weights by running (a driver sketch follows this list):
  - `acer_32.py`, then
  - `acer_64.py`, then
  - `acer_256.py`, then
  - `ppo_32.py`, then
  - `ppo_64.py`, then
  - `ppo_256.py`, then
  - `reward_plot.py` and `stability_plot.py`
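
If you want to run the whole training sequence unattended, a small driver like the one below works; the driver itself is a convenience sketch and not part of the repository:

```python
# Run the six training scripts in order, then the two plotting scripts.
# Convenience sketch only -- not part of the repository.
import subprocess
import sys

scripts = [
    "acer_32.py", "acer_64.py", "acer_256.py",
    "ppo_32.py", "ppo_64.py", "ppo_256.py",
    "reward_plot.py", "stability_plot.py",
]

for script in scripts:
    print(f"Running {script} ...")
    subprocess.run([sys.executable, script], check=True)  # stop on first failure
```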

As a result, you will create the folders `PPO_files`, `PPO_pretrained`, `ACER_files`, and `ACER_pretrained`, which contain the reward files (`*_files`) and the trained model weights (`*_pretrained`), respectively.

- `learn_acer.py` and `learn_ppo.py` load the trained models to test their energy and latency performance (a loading sketch follows this list).
- Run `main.py` to perform these tests and plot the energy and energy-per-latency figures.
- `opt.py` and `greedy.py` implement the optimal MIP solution and the greedy solution, respectively.
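
For reference, loading pretrained PyTorch weights for evaluation typically looks like the sketch below; the file name `PPO_pretrained/ppo_64.pth` and the `PolicyNet` class are placeholders, since the actual network architectures live in the training scripts:

```python
# Sketch of loading saved weights for evaluation -- the file name and the
# network class are placeholders; see learn_ppo.py / learn_acer.py for the
# architectures actually used in this repository.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):  # placeholder architecture
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)

model = PolicyNet(obs_dim=10, n_actions=4)
model.load_state_dict(torch.load("PPO_pretrained/ppo_64.pth"))  # hypothetical path
model.eval()  # disable training-only behavior (dropout, batch norm) for testing
```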

## References