Releases · Unity-Technologies/ml-agents

13 Apr 22:38

awjuliani

0.3.1

59e6dd2

ML-Agents Beta 0.3.1 Pre-release

Pre-release

Features

We have upgraded our Docker container, which now supports Brains that use camera-based Visual Observations.

Documentation

We have added a partial Chinese translation of our documentation. It is available here.

Fixes & Performance Improvements

Missing component reference in BananaRL environment.
Neural Network for multiple visual observations was not properly generated.
Value-estimate bootstrapping on episode time-out used an incorrect observation as input.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3.1, as well as to the following community contributors:

@sterlingcrispin, @andersonaddo, @palomagr, @imankgoyal, @luchris429.

21 Mar 21:13

awjuliani

0.3.0b

72835e8

ML-Agents Beta 0.3.0b Pre-release

Pre-release

Fixes

Fixes internal brain for Banana Imitation.
Fixes Discrete Control training for Imitation Learning.
Fixes Visual Observations in internal brain with non-square inputs.

16 Mar 22:06

vincentpierre

0.3.0a

7271ca5

ML-Agents Beta 0.3.0a Pre-release

Pre-release

Fixes

Added the missing Ray Perception components to the agents in the BananaImitation scene.

15 Mar 00:54

vincentpierre

0.3.0

bbbe2e7

ML-Agents Beta 0.3.0 Pre-release

Pre-release

Environments

To learn more about new and improved environments, see our Example Environments page.

New

Soccer Twos - Multi-agent competitive and cooperative environment where behavior emerges from the reward function. Used to demonstrate multi-brain training.
Banana Collectors - Multi-agent resource collection environment where competitive or cooperative behavior comes about dynamically based on available resources. Used to demonstrate Imitation Learning.
Hallway - Single agent environment in which an agent must explore a room, remember the object within the room, and use that information to navigate to the correct goal. Used to demonstrate LSTM models.
Bouncer - Single agent environment provided as an example of our new On-Demand Decision-Making feature. In this environment, an agent can apply force to itself in order to bounce around a platform, and attempt to collide with floating bananas.

Improved

All environments have been visually refreshed with a consistent color palette and design language.
Revamped GridWorld to only use visual observations and a 5x5 grid by default.
Revamped Tennis to use continuous actions.
Revamped Push Block to use local perception.
Revamped Wall Jump to use local perception.
Added Hard version of 3DBall which doesn’t contain velocity information in observations.

New Features

[Unity] On Demand Decision Making - It is now possible to have agents only request decisions from their brains when necessary, using RequestDecision() and RequestAction(). For more information, see here.
[Unity] Added vector-observation stacking - The past n vector observations for each agent can now be stored and used as input to a Brain for decision making.
[Python] Added Behavioral Cloning (Imitation Learning) algorithm - Train a neural network to imitate either player behavior or a hand-coded game bot using behavioral cloning. For more info, see here.
[Python] Support for training multiple brains simultaneously - Two or more different brains can now be trained simultaneously using the provided PPO algorithm.
[Python] Added LSTM models - We now support training and embedding recurrent neural networks using the PPO algorithm. This allows for learning temporal dependencies between observations.
[Unity] [Python] Added Docker Image for RL-training - We now provide a Docker image which allows users to train their brains in an isolated environment without the need to install Python, TensorFlow, and other dependencies. For more information, see here.
[Python] Ability to provide a random seed to the training process and environment - Allows for reproducible experimentation. For more information, see here. (Note: Unity Physics is non-deterministic, so fully reproducible experiments are currently not possible when using physics-based interactions.)
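
The vector-observation stacking described above can be illustrated with a minimal Python sketch. This is not the ML-Agents implementation: the `ObservationStacker` class, its method names, the zero-padding at episode start, and the oldest-first ordering are all assumptions made for illustration.

```python
from collections import deque


class ObservationStacker:
    """Hypothetical helper: keeps the last `num_stacked` vector observations."""

    def __init__(self, obs_size, num_stacked):
        self.obs_size = obs_size
        self.num_stacked = num_stacked
        self.reset()

    def reset(self):
        # Assumption: the history is zero-padded at the start of an episode.
        self._history = deque(
            ([0.0] * self.obs_size for _ in range(self.num_stacked)),
            maxlen=self.num_stacked,
        )

    def push(self, observation):
        """Add the newest observation and return the stacked input vector."""
        if len(observation) != self.obs_size:
            raise ValueError("observation has the wrong size")
        # Appending to a full deque drops the oldest entry automatically.
        self._history.append(list(observation))
        # Flatten the history, oldest observation first, newest last.
        return [value for obs in self._history for value in obs]


stacker = ObservationStacker(obs_size=2, num_stacked=3)
stacker.push([1.0, 2.0])
stacked = stacker.push([3.0, 4.0])
print(stacked)
```

With `num_stacked=3`, the input passed to the decision-making model is three times the size of a single observation, which lets a feed-forward network see short-term history without a recurrent layer.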

Changes

[Unity] Memory size has been removed as a user-facing brain parameter. It is now defined when creating models from unitytrainers.
[Unity] [Python] The API, as well as the general semantics used throughout ML-Agents, have changed. See here for information on these changes, and how to easily adjust current projects to be compatible with them.
[Python] Training hyperparameters are now defined in a .yaml file instead of via command line arguments.
[Python] Training now takes place via learn.py, which launches trainers for multiple brains.
[Python] Python 2 is no longer supported.

Documentation

Documentation has been significantly re-written to include many new sections, in addition to updated tutorials and guides. Check it out here.

Fixes & Performance Improvements

[Unity] Improved memory management - Reduced garbage collection memory usage by up to 5x when using External Brain.
[Unity] Time.captureFramerate is now set by default to help sync Update and FixedUpdate frequencies.
[Unity] Added tooltips to relevant inspector objects.
[Unity] It is now possible to instantiate and destroy GameObjects which are Agents.
[Unity] Improved visual observation inference time by 3x.
[Unity] Tooltips added to Unity Inspector for ML-Agents variables and functions.
[Unity] [Python] Epsilon is now a built-in part of the PPO graph. It is no longer necessary to specify it additionally in “Graph Placeholders” from Unity.
[Python] Changed value bootstrapping in PPO algorithm to properly calculate returns on episode time-out.
[Python] The neural network graph is now automatically saved as a .bytes file when training is interrupted.
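
The value-bootstrapping fix above concerns how returns are computed when an episode is cut off by a time-out rather than reaching a true terminal state. The following is a generic sketch of that idea, not the project's code: the function name `discounted_returns` and the `bootstrap_value` parameter are hypothetical. The assumption is that, on time-out, the value function's estimate for the final observation stands in for the rewards that were never collected.

```python
def discounted_returns(rewards, gamma, bootstrap_value=0.0):
    """Compute discounted returns for one trajectory.

    bootstrap_value is the estimated value of the state reached after the
    last reward: 0.0 for a true terminal state, or the value estimate for
    the final observation when the episode was cut off by a time-out.
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each step can reuse the return of the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# True terminal state: nothing is added beyond the collected rewards.
print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
# Time-out: the value estimate (here 4.0) accounts for the cut-off future.
print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5, bootstrap_value=4.0))
```

Passing the value estimate of the wrong observation as `bootstrap_value` would bias every return in the trajectory, which is why the choice of input observation matters.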

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3, as well as:

@asolano, @LionH, @MarcoMeter, @srcnalt, @wouterhardeman, @60days, @floAr, @Coac, @Zamaroht, @slightperturbation

15 Feb 13:04

awjuliani

0.2.1d

ef61887

ML-Agents Beta 0.2.1d Pre-release

Pre-release

Fixes

Fixes bug where visual observations could not be used with PPO.

05 Feb 19:14

awjuliani

0.2.1c

af19862

ML-Agents Beta 0.2.1c Pre-release

Pre-release

Fixes

Require TensorFlow 1.4 to prevent incompatibilities between models built using TensorFlow 1.5 and current TensorFlowSharp bindings.

19 Jan 21:35

awjuliani

0.2.1b

ecdbbd7

ML-Agents Beta 0.2.1b Pre-release

Pre-release

Fixes & Performance Improvements

[Python] Fixes a bug that prevented the creation of network graphs which did not contain visual observations.

19 Jan 01:02

awjuliani

0.2.1a

ecdbbd7

ML-Agents Beta 0.2.1a Pre-release

Pre-release

Features

[Python] Adds support for training brains with multiple visual observations using PPO. Thanks to @asolano for contributing this!

05 Dec 23:29

awjuliani

0.2.0

0077f17

ML-Agents Beta 0.2 Pre-release

Pre-release

Environments

Four new example environments added (learn more):
- Crawler
- Reacher
- Wall Area
- Push Area
Environments no longer use normalized state values due to optional auto-normalizing done in PPO.

Features

Communication API updated. Make sure both the Unity project files and the Python API are at the most recent version.

Python

PPO now optionally auto-normalizes states using running-average and running-variance (with --normalize flag).
unityagents package now includes Curriculum Learning support (learn more).
Absolute path to training environments can now be used when running UnityEnvironment().
The Environment now logs errors and exceptions on the Unity side into the unity-environment.log file.
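
The running-average and running-variance normalization mentioned above can be sketched as follows. This is an illustrative version using Welford's online algorithm, which may differ from the update rule PPO uses in ML-Agents; the `RunningNormalizer` class, its method names, and the `epsilon` parameter are hypothetical.

```python
import math


class RunningNormalizer:
    """Hypothetical helper: tracks a per-dimension running mean and variance."""

    def __init__(self, size, epsilon=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations
        self.epsilon = epsilon  # avoids division by zero

    def update(self, state):
        """Fold one state vector into the running statistics (Welford)."""
        self.count += 1
        for i, x in enumerate(state):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, state):
        """Return the state shifted and scaled by the running statistics."""
        if self.count == 0:
            # No statistics yet: pass the state through unchanged.
            return list(state)
        return [
            (x - self.mean[i]) / math.sqrt(self.m2[i] / self.count + self.epsilon)
            for i, x in enumerate(state)
        ]


normalizer = RunningNormalizer(size=1)
for value in ([1.0], [3.0]):
    normalizer.update(value)
print(normalizer.normalize([3.0]))
```

Because the statistics are learned during training, environments can emit raw state values and leave scaling to the trainer, which matches the note above that environments no longer use normalized state values.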

Unity

New, more flexible Monitor which allows for displaying arbitrary information (learn more).
Broadcast support for internal, heuristic, and player brains, which allows all relevant agent information to be sent to the Python side for supervised/imitation learning (learn more).

Bug Fixes & Performance Improvements

Python

Communication code now supports arbitrarily large observation cameras and states.

Unity

Cumulative reward now accurately tracks reward.
AcademyReset() now called before agent reset.
isInference is now correctly set when running in Editor.
Frame-rate is unlocked by default when isInference is false.

Assets 2

25 Sep 20:42

awjuliani

0.1.2

368ed29

ML-Agents Beta 0.1.2 Pre-release

Pre-release

Features & Additions

Unity

Added Basic Environment for testing discrete state environments

Python

Reconfigured PPO model generation to support:
- Discrete control w/ discrete-state input
- Continuous Control w/ visual and discrete-state input
- Combined visual/state inputs for CC and DC
- Color (3-channel) observations

General

Added pre-configured AWS AMI for cloud-training
Moved wiki to docs directory for better community collaboration

Bug Fixes

Unity

Provides message for state size mismatch
Defaults to continuous state space for new brains
