Releases: Unity-Technologies/ml-agents

ML-Agents Beta 0.3.1

13 Apr 22:38 · 59e6dd2 · Pre-release

Features

  • We have upgraded our Docker container, which now supports Brains that use camera-based Visual Observations.

Documentation

  • We have added a partial Chinese translation of our documentation. It is available here.

Fixes & Performance Improvements

  • Fixed a missing component reference in the BananaRL environment.
  • Fixed the neural network for multiple visual observations, which was not generated properly.
  • Fixed episode time-out value estimate bootstrapping, which used an incorrect observation as input.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3.1, as well as to the following community contributors:

@sterlingcrispin, @andersonaddo, @palomagr, @imankgoyal, @luchris429.

ML-Agents Beta 0.3.0b

21 Mar 21:13 · 72835e8 · Pre-release

Fixes

  • Fixes the internal brain for Banana Imitation.
  • Fixes Discrete Control training for Imitation Learning.
  • Fixes Visual Observations in the internal brain with non-square inputs.

ML-Agents Beta 0.3.0a

16 Mar 22:06 · 7271ca5 · Pre-release

Fixes

Added the missing Ray Perception components to the agents in the BananaImitation scene.

ML-Agents Beta 0.3.0

15 Mar 00:54 · bbbe2e7 · Pre-release

Environments

To learn more about new and improved environments, see our Example Environments page.

New

  • Soccer Twos - Multi-agent competitive and cooperative environment where behavior emerges from the reward function. Used to demonstrate multi-brain training.

  • Banana Collectors - Multi-agent resource collection environment where competitive or cooperative behavior comes about dynamically based on available resources. Used to demonstrate Imitation Learning.

  • Hallway - Single agent environment in which an agent must explore a room, remember the object within the room, and use that information to navigate to the correct goal. Used to demonstrate LSTM models.

  • Bouncer - Single agent environment provided as an example of our new On-Demand Decision-Making feature. In this environment, an agent can apply force to itself in order to bounce around a platform, and attempt to collide with floating bananas.

Improved

  • All environments have been visually refreshed with a consistent color palette and design language.
  • Revamped GridWorld to only use visual observations and a 5x5 grid by default.
  • Revamped Tennis to use continuous actions.
  • Revamped Push Block to use local perception.
  • Revamped Wall Jump to use local perception.
  • Added Hard version of 3DBall which doesn’t contain velocity information in observations.

New Features

  • [Unity] On-Demand Decision Making - It is now possible to have agents request decisions from their brains only when necessary, using RequestDecision() and RequestAction(). For more information, see here.
  • [Unity] Added vector-observation stacking - The past n vector observations for each agent can now be stored and used as input to a Brain for decision making.
  • [Python] Added Behavioral Cloning (Imitation Learning) algorithm - Train a neural network to imitate either player behavior or a hand-coded game bot using behavioral cloning. For more info, see here.
  • [Python] Support for training multiple brains simultaneously - Two or more different brains can now be trained simultaneously using the provided PPO algorithm.
  • [Python] Added LSTM models - We now support training and embedding recurrent neural networks using the PPO algorithm. This allows for learning temporal dependencies between observations.
  • [Unity] [Python] Added Docker Image for RL-training - We now provide a Docker image which allows users to train their brains in an isolated environment without the need to install Python, TensorFlow, and other dependencies. For more information, see here.
  • [Python] Ability to provide a random seed to the training process and environment - Allows for reproducible experimentation; a minimal usage sketch follows this list. For more information, see here. (Note: Unity Physics is non-deterministic, so fully reproducible experiments are currently not possible when using physics-based interactions.)
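
As a quick illustration of seeded runs, here is a minimal sketch against the 0.3-era unityagents package. The seed keyword and the "3DBall" build name are assumptions for illustration, not a verbatim API reference.

```python
# Minimal sketch (assumptions noted): seeding both the Python side and the
# environment for reproducible experimentation with the v0.3 unityagents API.
import numpy as np
from unityagents import UnityEnvironment

np.random.seed(42)                                   # seed the Python side
env = UnityEnvironment(file_name="3DBall", seed=42)  # `seed` keyword assumed; "3DBall" is a placeholder build

info = env.reset(train_mode=True)  # per-brain info, keyed by brain name
env.close()
```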

Changes

  • [Unity] Memory size has been removed as a user-facing brain parameter. It is now defined when creating models from unitytrainers.
  • [Unity] [Python] The API, as well as the general semantics used throughout ML-Agents, has changed. See here for information on these changes and how to adjust existing projects to be compatible with them; a short sketch of the new Python-side semantics follows this list.
  • [Python] Training hyperparameters are now defined in a .yaml file instead of via command line arguments.
  • [Python] Training now takes place via learn.py, which launches trainers for multiple brains.
  • [Python] Python 2 is no longer supported.
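
To make the API change above concrete: the Python side returns per-brain information keyed by brain name, which is also what allows several brains to be trained at once. The snippet below is a rough sketch only; names such as the "SoccerTwos" build and the brain_info.agents attribute are assumptions and may differ slightly from the shipped unityagents API.

```python
# Rough sketch of the 0.3-era multi-brain interaction loop (unityagents).
# Names marked as assumed are illustrative, not a verbatim API reference.
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="SoccerTwos")  # "SoccerTwos" build name assumed

# reset() returns a dictionary of per-brain info keyed by brain name, so one
# environment can feed several brains (and therefore several trainers) at once.
all_info = env.reset(train_mode=True)
for brain_name, brain_info in all_info.items():
    print(brain_name, "controls", len(brain_info.agents), "agents")  # `agents` attribute assumed

env.close()
```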

Documentation

Documentation has been significantly re-written to include many new sections, in addition to updated tutorials and guides. Check it out here.

Fixes & Performance Improvements

  • [Unity] Improved memory management - Reduced garbage collection memory usage by up to 5x when using External Brain.
  • [Unity] Time.captureFramerate is now set by default to help sync Update and FixedUpdate frequencies.
  • [Unity] Added tooltips to relevant inspector objects.
  • [Unity] It is now possible to instantiate and destroy GameObjects which are Agents.
  • [Unity] Improved visual observation inference time by 3x.
  • [Unity] Tooltips added to Unity Inspector for ML-Agents variables and functions.
  • [Unity] [Python] Epsilon is now a built-in part of the PPO graph. It is no longer necessary to specify it separately in “Graph Placeholders” in Unity.
  • [Python] Changed value bootstrapping in PPO algorithm to properly calculate returns on episode time-out.
  • [Python] The neural network graph is now automatically saved as a .bytes file when training is interrupted.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3, as well as:

@asolano, @LionH, @MarcoMeter, @srcnalt, @wouterhardeman, @60days, @floAr, @Coac, @Zamaroht, @slightperturbation

ML-Agents Beta 0.2.1d

15 Feb 13:04 · ef61887 · Pre-release

Fixes

  • Fixes bug where visual observations could not be used with PPO.

ML-Agents Beta 0.2.1c

05 Feb 19:14 · Pre-release

Fixes

  • Require TensorFlow 1.4 to prevent incompatibilities between models built using TensorFlow 1.5 and current TensorFlowSharp bindings.

ML-Agents Beta 0.2.1b

19 Jan 21:35 · Pre-release

Fixes & Performance Improvements

  • [Python] Fixes a bug that prevented the creation of network graphs which did not contain visual observations.

ML-Agents Beta 0.2.1a

19 Jan 01:02 · Pre-release

Features

  • [Python] Adds support for training brains with multiple visual observations using PPO. Thanks to @asolano for contributing this!

ML-Agents Beta 0.2

05 Dec 23:29 · Pre-release

Environments

  • Four new example environments added (learn more):

    • Crawler
    • Reacher
    • Wall Area
    • Push Area
  • Environments no longer use normalized state values, since PPO can now optionally auto-normalize states.

Features

The Communication API has been updated. Be sure both the Unity project files and the Python API are at the most current version.

Python

  • PPO now optionally auto-normalizes states using a running average and running variance (enabled with the --normalize flag).
  • unityagents package now includes Curriculum Learning support (learn more).
  • An absolute path to a training environment can now be used when constructing UnityEnvironment() (see the sketch after this list).
  • The Environment now logs errors and exceptions on the Unity side into the unity-environment.log file.
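
For example, loading a build by absolute path might look like the sketch below; the path is a placeholder and brain_names is an assumed attribute.

```python
# Sketch only: constructing UnityEnvironment from an absolute path (v0.2).
# The path is a placeholder; replace it with the location of your own build.
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="/home/user/builds/3DBall.x86_64")
print(env.brain_names)  # `brain_names` attribute assumed for illustration
env.close()
```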

Unity

  • A new, more flexible Monitor allows for displaying arbitrary information (learn more).
  • Broadcast support for internal, heuristic, and player brains, which allows all relevant agent information to be sent to the Python side for supervised/imitation learning (learn more).

Bug Fixes & Performance Improvements

Python

  • Communication code now supports arbitrarily large observation cameras and states.

Unity

  • Cumulative reward now accurately tracks reward.
  • AcademyReset() is now called before agent reset.
  • isInference is now correctly set when running in the Editor.
  • Frame-rate is unlocked by default when isInference is false.

ML-Agents Beta 0.1.2

25 Sep 20:42 · Pre-release

Features & Additions

Unity

  • Added the Basic environment for testing discrete-state environments.

Python

  • Reconfigured PPO model generation to support:
    • Discrete control with discrete-state input
    • Continuous control with visual and discrete-state input
    • Combined visual/state inputs for both continuous and discrete control
    • Color (3-channel) observations

General

  • Added a pre-configured AWS AMI for cloud training
  • Moved the wiki to the docs directory for better community collaboration

Bug Fixes

Unity

  • Provides an error message on state size mismatch
  • New brains now default to a continuous state space