
Using Burlap RL library templates for a more modern experience with burlap

ML Java helper templates for use with BURLAP

Implements templates for using the Java-based reinforcement learning algorithms provided in the BURLAP library from Brown University.

Setup and run

  • This project uses Java and Gradle. Make sure you have a recent Java JDK installed (JDK 15 or higher recommended)
  • Install Gradle
  • Run the Gradle build: ./gradlew build

Run the demos

  • ./gradlew helloGridWorld Opens the BURLAP GridWorld hello world explorer, keys:
    • A - West, D - East, W - Up, S - Down
  • ./gradlew blockDudeViewer Runs BURLAP's BlockDude, keys:
    • A - West, D - East, W - jump up
    • S - pick up, X - put down
  • ./gradlew demoExperiment Runs the complete demo experiments in

Import into your IDE

  • IntelliJ - import as a new Gradle project and select the root directory of this project
  • Eclipse - (not tested)

Create and run your experiments

A sample experiment has been provided in Edit this file to set up various experiment sizes. Current examples:

  1. Set up Large & Small GridWorldExperiments
  2. Set up the Level1 & Level2 BlockDude experiments

Three MDP solver algorithms are also provided:

  1. Value Iteration experiments (use the VISettings class to set hyperparameters)
  2. Policy Iteration experiments (use the PISettings class to set hyperparameters)
  3. Q-Learning experiments (use the QSettings class to set hyperparameters)
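
To make the per-iteration "delta" that these solvers report more concrete, here is a minimal sketch of value iteration in plain Java. It does not use BURLAP or this project's VISettings class. The four-state chain MDP, the class name ValueIterationSketch, and the parameter names gamma, maxDelta and maxIterations are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example, not part of this project or of BURLAP.
// A chain of 4 states; state 3 is the terminal goal, every move costs -1.
class ValueIterationSketch {
    static final int NUM_STATES = 4;
    static final int GOAL = 3;

    // Deterministic transitions: action 0 moves left, action 1 moves right.
    // Moving into a wall leaves the agent in place.
    static int next(int state, int action) {
        int target = (action == 0) ? state - 1 : state + 1;
        return Math.max(0, Math.min(NUM_STATES - 1, target));
    }

    // Runs value iteration until the largest value change in a sweep
    // ("delta") drops below maxDelta, or maxIterations sweeps have run.
    // The delta of every sweep is appended to deltas.
    static double[] solve(double gamma, double maxDelta, int maxIterations,
                          List<Double> deltas) {
        double[] v = new double[NUM_STATES]; // the goal keeps value 0
        for (int iter = 0; iter < maxIterations; iter++) {
            double delta = 0.0;
            for (int s = 0; s < NUM_STATES; s++) {
                if (s == GOAL) {
                    continue;
                }
                double best = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < 2; a++) {
                    // Bellman backup: step reward plus discounted next value.
                    best = Math.max(best, -1.0 + gamma * v[next(s, a)]);
                }
                delta = Math.max(delta, Math.abs(best - v[s]));
                v[s] = best;
            }
            deltas.add(delta);
            if (delta < maxDelta) {
                break;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        List<Double> deltas = new ArrayList<>();
        double[] v = solve(0.99, 1e-6, 100, deltas);
        for (int i = 0; i < deltas.size(); i++) {
            System.out.println("iter=" + i + " delta=" + deltas.get(i));
        }
        System.out.println(java.util.Arrays.toString(v));
    }
}
```

In this sketch, gamma (discount factor), maxDelta (convergence threshold) and maxIterations stand in for the kind of hyperparameters a settings class would hold. The delta printed per sweep shrinks until the values stop changing.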

To run your experiments, execute the main() of the class from your IDE.

Experiment output

A CSV writer is attached to each experiment. The output filename of each experiment is controlled by a "shortName", which is configured as part of your experiment type settings (PISettings, VISettings or QSettings). This short name provides a filename prefix for each of the experiment runs.

Example output path: output/smprob-24105858/blockdude/

Metrics captured: each experiment type can capture metrics collected during the iterations of the experiments. Here is a sample of the metrics collected:

  1. "iter" - iteration id
  2. "delta" - delta value found at each iteration
  3. "wallclock" - wallclock time spent in each iteration, in milliseconds for VI/PI but in nanoseconds for QLearning
  4. "evals" - the number of VI evals done within a single policy step for PI
  5. "numSteps" - for QLearning, number of steps during last episode of learning
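
Because "wallclock" is in milliseconds for VI/PI and in nanoseconds for QLearning, the values need to be brought to one unit before runs are compared. The following is a minimal sketch of that conversion in plain Java. The class and method names are hypothetical, and the CSV layout (comma-separated, with a header row naming the columns) is an assumption, not a description of the files this project actually writes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of this project.
class WallclockNormalizer {

    // Converts a raw "wallclock" value to milliseconds.
    // QLearning values are assumed to be nanoseconds, VI/PI values milliseconds.
    static double toMillis(double raw, boolean isQLearning) {
        return isQLearning ? raw / 1_000_000.0 : raw;
    }

    // Trims whitespace and removes double quotes around a CSV cell.
    static String clean(String cell) {
        return cell.trim().replace("\"", "");
    }

    // Reads the "wallclock" column from CSV text (assumed layout: header row,
    // comma-separated cells) and returns every value in milliseconds.
    static List<Double> wallclockMillis(String csv, boolean isQLearning) {
        String[] lines = csv.strip().split("\\R");
        String[] header = lines[0].split(",");
        int col = -1;
        for (int i = 0; i < header.length; i++) {
            if (clean(header[i]).equals("wallclock")) {
                col = i;
            }
        }
        if (col < 0) {
            throw new IllegalArgumentException("no wallclock column in header");
        }
        List<Double> millis = new ArrayList<>();
        for (int r = 1; r < lines.length; r++) {
            if (lines[r].isBlank()) {
                continue;
            }
            String[] cells = lines[r].split(",");
            millis.add(toMillis(Double.parseDouble(clean(cells[col])), isQLearning));
        }
        return millis;
    }

    public static void main(String[] args) {
        String qCsv = "iter,delta,wallclock\n0,1.0,2500000\n1,0.5,1000000\n";
        System.out.println(wallclockMillis(qCsv, true));
    }
}
```

With this conversion, a QLearning value of 2500000 nanoseconds becomes 2.5 milliseconds, which is directly comparable to the VI/PI figures.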

