ML Java helper templates for use with BURLAP

Implements templates for using the Java-based reinforcement learning algorithms provided in the BURLAP library from Brown University.

Setup and run

This project uses Java and Gradle. Make sure you have a recent Java JDK installed (JDK 15 or higher is recommended).
Install Gradle:
- Option 1: follow the recommended install instructions at gradle.org
- Option 2: use asdf
Run the Gradle build: ./gradlew build

Run the demos

Run ./gradlew helloGridWorld to open the BURLAP GridWorld hello world explorer. Keys:
- A - West, D - East, W - Up, S - Down
Run ./gradlew blockDudeViewer to start BURLAP's BlockDude. Keys:
- a - West, d - East, w - jump up
- s - pickup, x - putdown
Run ./gradlew demoExperiment to run the complete demo experiments in RunExperiments.java

Import into your IDE

IntelliJ - import as a new Gradle project and select the root directory of this project
Eclipse - (not tested)

Create and run your experiments.

A sample experiment is provided in RunExperiments.java. Edit this file to set up experiments of various sizes. Current examples:

Set up Large & Small GridWorldExperiments
Set up the Level1 & Level2 BlockDude experiments

Also, three MDP solver algorithms are provided:

Value Iteration experiments (use the VISettings class to set hyperparameters)
Policy Iteration experiments (use the PISettings class to set hyperparameters)
Q-Learning experiments (use the QSettings class to set hyperparameters)
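
As a rough illustration of what a value iteration run and its hyperparameters involve, the following standalone Java sketch runs value iteration on a toy five-cell corridor and records the largest value change per sweep. It does not use BURLAP or this project's VISettings class; the class name ValueIterationSketch, the corridor problem, and the parameter names gamma, maxDelta, and maxIterations are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not part of this project and not BURLAP's planner.
final class ValueIterationSketch {

    /**
     * Runs value iteration on a 1-D corridor with numStates cells.
     * Actions: step left or step right (the left wall keeps the agent in place).
     * Every step has reward -1; the right-most cell is terminal with value 0.
     *
     * @param gamma         discount factor (assumed hyperparameter name)
     * @param maxDelta      stop once the largest value change in a sweep is below this
     * @param maxIterations hard cap on the number of sweeps
     * @param values        value table, updated in place; length must be numStates
     * @return the delta (largest value change) recorded for each sweep
     */
    static List<Double> run(int numStates, double gamma, double maxDelta,
                            int maxIterations, double[] values) {
        List<Double> deltas = new ArrayList<>();
        for (int iter = 0; iter < maxIterations; iter++) {
            double delta = 0.0;
            for (int s = 0; s < numStates - 1; s++) { // skip the terminal state
                int left = Math.max(0, s - 1);
                int right = s + 1;
                // Bellman backup: best of the two deterministic actions.
                double best = Math.max(-1.0 + gamma * values[left],
                                       -1.0 + gamma * values[right]);
                delta = Math.max(delta, Math.abs(best - values[s]));
                values[s] = best; // in-place update
            }
            deltas.add(delta);
            if (delta < maxDelta) {
                break; // converged
            }
        }
        return deltas;
    }

    public static void main(String[] args) {
        double[] values = new double[5];
        List<Double> deltas = run(5, 0.99, 0.001, 100, values);
        for (int i = 0; i < deltas.size(); i++) {
            System.out.printf("iter=%d delta=%.6f%n", i, deltas.get(i));
        }
    }
}
```

A smaller maxDelta or a gamma closer to 1 generally means more sweeps before the loop stops, which is the kind of trade-off the settings classes let you explore.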

To run your experiments, execute the main() method of the RunExperiments class from your IDE.

Experiment output

A CSV writer is attached to each experiment. The output filename of each experiment is controlled by a "shortName", which is configured as part of your experiment type settings (PISettings, VISettings, or QSettings). This short name provides a filename prefix for each of the experiment runs.

Example file output: output/smprob-24105858/blockdude/

Metrics captured: each experiment type can capture metrics collected during the iterations of the experiments. Here is a sample of the metrics collected:

"iter" - iteration id
"delta" - delta value found at each iteration
"wallclock" - wallclock time spent in each iteration, in milliseconds for VI/PI but nanoseconds for QLearning
"evals" - the number of VI evals done within a single policy step for PI
"numSteps" - for QLearning, number of steps during last episode of learning

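Because the "wallclock" metric uses different units depending on the solver, it may help to normalize it before comparing runs. The following is a minimal sketch, assuming the units described above (milliseconds for VI/PI, nanoseconds for QLearning); the WallclockUnits class and the Solver enum are hypothetical names, not part of this project.

```java
// Illustrative only: hypothetical helper, not part of this project.
final class WallclockUnits {

    /** Solver labels used only by this sketch. */
    enum Solver { VI, PI, Q_LEARNING }

    /**
     * Converts a raw "wallclock" value to milliseconds.
     * Assumes VI/PI values are already in milliseconds and
     * Q-learning values are in nanoseconds, as the metric list states.
     */
    static double toMillis(Solver solver, long wallclock) {
        if (solver == Solver.Q_LEARNING) {
            return wallclock / 1_000_000.0; // nanoseconds -> milliseconds
        }
        return (double) wallclock; // already milliseconds
    }

    public static void main(String[] args) {
        System.out.println(toMillis(Solver.VI, 12));
        System.out.println(toMillis(Solver.Q_LEARNING, 2_500_000));
    }
}
```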