
RL example: Platoon #1955

Merged 35 commits on Apr 14, 2023
Changes from 21 commits

Commits
3fa2f81
Add driving-smarts-2023
Adaickalavan Apr 10, 2023
83a9a0f
Add scenario attribute.
Adaickalavan Apr 10, 2023
0369996
Merge branch 'master' into driving-smarts-2023
Adaickalavan Apr 10, 2023
4df4c4e
Add changelog.
Adaickalavan Apr 10, 2023
dc748ce
Add platoon rl example.
Adaickalavan Apr 10, 2023
736d749
Add docs.
Adaickalavan Apr 10, 2023
5ac01c2
Merge branch 'master' into platoon-example
Adaickalavan Apr 11, 2023
64240dc
Move ObjDict.
Adaickalavan Apr 11, 2023
e5ce9c1
Add docs.
Adaickalavan Apr 11, 2023
e8f51fa
Add docs.
Adaickalavan Apr 11, 2023
c2914b3
Add docs.
Adaickalavan Apr 11, 2023
5c0eb0b
Add docs.
Adaickalavan Apr 11, 2023
852c58e
Update url.
Adaickalavan Apr 11, 2023
959b5c9
Break up benchmark 2023 into 3 parts.
Adaickalavan Apr 11, 2023
8361f27
Merge branch 'master' into driving-smarts-2023
Adaickalavan Apr 11, 2023
38d5c34
Merge branch 'driving-smarts-2023' into platoon-example
Adaickalavan Apr 11, 2023
c14b2c4
Add changelog.
Adaickalavan Apr 11, 2023
f912a0f
Merge branch 'master' into platoon-example
Adaickalavan Apr 11, 2023
1d78861
Improve docstring.
Adaickalavan Apr 11, 2023
d18231a
Rectify environment locator string.
Adaickalavan Apr 11, 2023
ce3685a
Update docs.
Adaickalavan Apr 12, 2023
011edb8
Add image.
Adaickalavan Apr 12, 2023
c15c3c6
Merge branch 'master' into platoon-example
Adaickalavan Apr 12, 2023
1dfdd61
Disable sumo gui and change leader id into regexp.
Adaickalavan Apr 12, 2023
7e3edbb
Update docs/benchmarks/driving_smarts_2023_3.rst
Adaickalavan Apr 12, 2023
827890d
Address reviews.
Adaickalavan Apr 12, 2023
ae6d763
Merge branch 'platoon-example' of https://github.com/huawei-noah/SMAR…
Adaickalavan Apr 12, 2023
fd08d60
Replaced incorrect image.
Adaickalavan Apr 12, 2023
df9be51
Start envision before running script.
Adaickalavan Apr 12, 2023
2f507f6
Include zip file in package.
Adaickalavan Apr 13, 2023
85276ba
Add vehicle following scenario on merge exit map.
Adaickalavan Apr 13, 2023
298fad8
Remove extra unwanted file.
Adaickalavan Apr 13, 2023
bdade29
Add mapspec.
Adaickalavan Apr 13, 2023
259f3b6
Fix pytype.
Adaickalavan Apr 14, 2023
49146ca
Edit changelog to include addition of new scenario.
Adaickalavan Apr 14, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,8 @@ Copy and pasting the git commit messages is __NOT__ enough.
- Added a new entry tactic, `IdEntryTactic`, which provides the scenario the ability to select a specific actor for an agent to take over.
- Registered a new `chase-via-points-agent-v0` agent in agent zoo, which can effectively chase via points across different road sections by using the waypoints.
- Added new driving-smarts-v2023 benchmark consisting of new (i) driving-smarts-v2023 env and (ii) platoon-v0 env.
- Added baseline example, consisting of training, inference, and zoo agent registration, for the platooning task in Driving SMARTS 2023.3 benchmark.
- Documented the challenge objective, desired inference code structure, and use of baseline example, for Driving SMARTS 2023.3 benchmark, i.e., platooning task.
### Changed
- The trap manager, `TrapManager`, is now a subclass of `ActorCaptureManager`.
- Considering lane-change time ranges between 3s and 6s, assuming a speed of 13.89m/s, the via sensor lane acquisition range was increased from 40m to 80m, for better driving ability.
7 changes: 4 additions & 3 deletions docs/benchmarks/driving_smarts_2022.rst
@@ -62,10 +62,11 @@ This benchmark allows ego agents to use any one of the following action spaces.
+ :attr:`~smarts.core.controllers.ActionSpaceType.TargetPose`
+ :attr:`~smarts.core.controllers.ActionSpaceType.RelativeTargetPose`

Trained agents
--------------
Zoo agents
----------

See the list of :ref:`available zoo agents <available_zoo_agents>` which are compatible with this benchmark. A compatible zoo agent can be run as follows.
See the list of :ref:`available zoo agents <available_zoo_agents>` which are compatible with this benchmark.
A compatible zoo agent can be evaluated in this benchmark as follows.

.. code-block:: bash

255 changes: 255 additions & 0 deletions docs/benchmarks/driving_smarts_2023_3.rst
@@ -0,0 +1,255 @@
.. _driving_smarts_2023_3:

Driving SMARTS 2023.3
=====================

Objective
---------

The objective is to develop a single-ego policy capable of controlling a single ego to perform the platooning task in the
``platoon-v0`` environment. Refer to :func:`~smarts.env.gymnasium.platoon_env.platoon_env` for environment details.

.. important::

In a scenario with multiple egos, a single-ego policy is copied and pasted into every ego. Each ego is stepped
independently by calling their respective :attr:`~smarts.core.agent.Agent.act` function. In short, multiple
egos are executed in a distributed manner. The single-ego policy should be capable of accounting for and
interacting with other egos, if any are present.

Each ego must track and follow its assigned leader (i.e., lead vehicle), either in single file or in platoon
fashion. The name of the lead vehicle to follow is given to the ego through its
:attr:`~smarts.core.agent_interface.ActorsAliveDoneCriteria.actors_of_interest` attribute.

The episode ends for an ego when its assigned leader reaches the leader's destination. Egos do not have prior
knowledge of the leader's destination.

Any method, such as reinforcement learning, offline reinforcement learning, behaviour cloning, generative models,
or predictive models, may be used to develop the policy.

Several scenarios are provided for training; their names and tasks are listed below.
The desired task execution is illustrated in a GIF of a trained baseline agent.

+ straight_2lane_agents_1
A single ego must follow a specified leader, with no background traffic.

.. image:: /_static/driving_smarts_2023/platoon_straight_2lane_agents_1.gif

Observation space
-----------------

At each time step, the underlying environment returns a formatted :class:`~smarts.core.observations.Observation`
as the observation, using the
:attr:`~smarts.env.utils.observation_conversion.ObservationOptions.multi_agent` option. See
:class:`~smarts.env.utils.observation_conversion.ObservationSpacesFormatter` for
a sample formatted observation data structure.
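Since each ego is stepped independently with a copy of the same single-ego policy, consuming the multi-agent formatted observation can be sketched as follows. This is an illustrative helper, not part of the SMARTS API; the dict-of-observations shape follows the ``multi_agent`` option described above.

```python
def act_all(policy_per_agent, obs):
    """Map each ego's formatted observation to an action.

    `obs` is assumed to be a dict keyed by agent id, one entry per ego,
    as produced under the multi_agent observation option. Each ego's
    policy is queried independently.
    """
    return {
        agent_id: policy_per_agent[agent_id].act(agent_obs)
        for agent_id, agent_obs in obs.items()
    }
```

This mirrors the distributed stepping described in the note above: one ``act`` call per ego, with no shared state between egos unless the policy adds it.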

Action space
------------

Action space for each ego is :attr:`~smarts.core.controllers.ActionSpaceType.Continuous`.
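As a hedged illustration, assuming the continuous action is a ``(throttle, brake, steering)`` triple with throttle and brake in ``[0, 1]`` and steering in ``[-1, 1]`` (ranges are assumptions, not taken from this document), a policy might clamp its raw network output into valid ranges like so:

```python
def clamp_action(throttle, brake, steering):
    """Clamp a proposed (throttle, brake, steering) action into assumed
    valid ranges before handing it to the environment."""
    clip = lambda x, lo, hi: max(lo, min(hi, x))
    return (
        clip(throttle, 0.0, 1.0),  # throttle in [0, 1]
        clip(brake, 0.0, 1.0),     # brake in [0, 1]
        clip(steering, -1.0, 1.0), # steering in [-1, 1]
    )
```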

Code structure
--------------

Users are free to use any training method and any folder structure for training the policy.

Only the inference code is required for evaluation. It must therefore follow the folder
structure and file contents specified below. The files and folders below
must be present with identical names; the user may optionally add any
additional files.

.. code-block:: text

inference
├── contrib_policy
│   ├── __init__.py
│   ├── policy.py
│   .
│   .
│   .
├── __init__.py
├── MANIFEST.in
├── setup.cfg
└── setup.py

1. inference/contrib_policy/__init__.py
+ Keep this file unchanged.
+ It is an empty file.

2. inference/contrib_policy/policy.py
+ Must contain a ``class Policy(Agent)`` class which inherits from :class:`~smarts.core.agent.Agent`.

3. inference/__init__.py
+ Must contain the following template code.
+ The template code registers the user's policy in SMARTS agent zoo.

.. code-block:: python

from contrib_policy.policy import Policy

from smarts.core.agent_interface import AgentInterface
from smarts.core.controllers import ActionSpaceType
from smarts.zoo.agent_spec import AgentSpec
from smarts.zoo.registry import register


def entry_point(**kwargs):
interface = AgentInterface(
action=ActionSpaceType.<...>,
drivable_area_grid_map=<...>,
lane_positions=<...>,
lidar_point_cloud=<...>,
occupancy_grid_map=<...>,
road_waypoints=<...>,
signals=<...>,
top_down_rgb=<...>,
)

agent_params = {
"<...>": <...>,
"<...>": <...>,
}

return AgentSpec(
interface=interface,
agent_builder=Policy,
agent_params=agent_params,
)

register(locator="contrib-agent-v0", entry_point=entry_point)

+ User may fill in the ``<...>`` spaces in the template.
+ User may specify the ego's interface by configuring any field of :class:`~smarts.core.agent_interface.AgentInterface`, except

+ :attr:`~smarts.core.agent_interface.AgentInterface.accelerometer`,
+ :attr:`~smarts.core.agent_interface.AgentInterface.done_criteria`,
+ :attr:`~smarts.core.agent_interface.AgentInterface.max_episode_steps`,
+ :attr:`~smarts.core.agent_interface.AgentInterface.neighborhood_vehicle_states`, and
+ :attr:`~smarts.core.agent_interface.AgentInterface.waypoint_paths`.

4. inference/MANIFEST.in
+ Contains any file paths to be included in the package.

5. inference/setup.cfg
+ Must contain the following template code.
+ The template code helps build the user policy into a Python package.

.. code-block:: cfg

[metadata]
name = <...>
version = 0.1.0
url = https://github.com/huawei-noah/SMARTS
description = SMARTS zoo agent.
long_description = <...>. See [SMARTS](https://github.com/huawei-noah/SMARTS).
long_description_content_type=text/markdown
classifiers=
Programming Language :: Python
Programming Language :: Python :: 3 :: Only
Programming Language :: Python :: 3.8

[options]
packages = find:
include_package_data = True
zip_safe = True
python_requires = == 3.8.*
install_requires =
<...>==<...>
<...>==<...>

+ User may fill in the ``<...>`` spaces in the template.
+ User should provide a name for their policy and describe it in the ``name`` and ``long_description`` fields, respectively.
+ Do **not** add SMARTS package as a dependency in the ``install_requires`` section.

6. inference/setup.py
+ Keep this file and its default contents unchanged.
+ Its default contents are shown below.

.. code-block:: python

from setuptools import setup

if __name__ == "__main__":
setup()
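A minimal, self-contained sketch of the ``class Policy(Agent)`` required by item 2 above is shown below. The ``Agent`` stub stands in for :class:`~smarts.core.agent.Agent` so the sketch runs on its own; the fixed return value and the ``num_stack`` buffer are illustrative placeholders, not the baseline's actual logic.

```python
# Stand-in base class so this sketch is self-contained; real inference
# code imports Agent from smarts.core.agent instead.
class Agent:
    def act(self, obs):
        raise NotImplementedError


class Policy(Agent):
    """Hypothetical policy returning a fixed (throttle, brake, steering) action."""

    def __init__(self, num_stack=3):
        # Rolling buffer of recent observations, mirroring the example's
        # num_stack agent parameter.
        self._num_stack = num_stack
        self._frames = []

    def act(self, obs):
        # Keep only the most recent num_stack observations.
        self._frames = (self._frames + [obs])[-self._num_stack:]
        # A real policy would run its trained model here.
        return (0.5, 0.0, 0.0)
```

The ``agent_builder=Policy`` entry in the ``AgentSpec`` template above is what instantiates this class, passing ``agent_params`` as keyword arguments.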

Example
-------

Example training and inference code is provided for this benchmark.
See the :examples:`rl/platoon` example. The example uses the PPO algorithm from the
`Stable Baselines3 <https://github.com/DLR-RM/stable-baselines3>`_ reinforcement learning library.
Instructions for training and evaluating the example are as follows.

Train
^^^^^
+ Setup

.. code-block:: bash

# In terminal-A
$ cd <path>/SMARTS/examples/rl/platoon
$ python3.8 -m venv ./.venv
$ source ./.venv/bin/activate
$ pip install --upgrade pip wheel
$ pip install -e ./../../../.[camera_obs,argoverse]
$ pip install -e ./inference/

+ Train locally without visualization

.. code-block:: bash

# In terminal-A
$ python3.8 train/run.py

+ Train locally with visualization

.. code-block:: bash

# In terminal-A
$ python3.8 train/run.py --head

.. code-block:: bash

# In a different terminal-B
$ scl envision start
# Open http://localhost:8081/

+ Trained models are saved by default inside the ``<path>/SMARTS/examples/rl/platoon/train/logs/`` folder.

Docker
^^^^^^
+ Train inside docker

.. code-block:: bash

$ cd <path>/SMARTS
$ docker build --file=./examples/rl/platoon/train/Dockerfile --network=host --tag=platoon .
$ docker run --rm -it --network=host --gpus=all platoon
(container) $ cd /SMARTS/examples/rl/platoon
Collaborator: Is it worth it to instead set ``WORKDIR /SMARTS/examples/rl/platoon`` at the end of the image?

Member Author: Currently the Dockerfile sets ``WORKDIR /SMARTS``, which I try to keep consistent across the various Dockerfiles. This helps when we have examples in different paths, as the Dockerfiles can be built similarly and we only need to alter the command issued. I simply reiterated the full path here (``$ cd /SMARTS/examples/rl/platoon``) to avoid any confusion for users.

(container) $ python3.8 train/run.py

Evaluate
^^^^^^^^
+ Choose a desired saved model from the previous training step, rename it as ``saved_model.zip``, and move it to ``<path>/SMARTS/examples/rl/platoon/inference/contrib_policy/saved_model.zip``.
+ Evaluate locally

.. code-block:: bash

$ cd <path>/SMARTS
$ python3.8 -m venv ./.venv
$ source ./.venv/bin/activate
$ pip install --upgrade pip wheel
$ pip install -e .[camera_obs,argoverse]
$ scl zoo install examples/rl/platoon/inference
$ scl benchmark run driving_smarts_2023_3 examples.rl.platoon.inference:contrib-agent-v0 --auto-install

Zoo agents
----------

A compatible zoo agent can be evaluated in this benchmark as follows.

.. code-block:: bash

$ cd <path>/SMARTS
$ scl zoo install <agent path>
$ scl benchmark run driving_smarts_2023_3==0.0 <agent_locator> --auto-install
6 changes: 6 additions & 0 deletions docs/examples/platoon.rst
@@ -0,0 +1,6 @@
.. _platoon:

Platoon
=======

This example was developed in conjunction with the :ref:`Driving SMARTS 2023.3 <driving_smarts_2023_3>` benchmark; refer to it for details.
3 changes: 2 additions & 1 deletion docs/examples/rl_model.rst
@@ -7,4 +7,5 @@ RL Model
:maxdepth: 1

intersection.md
racing.md
racing.md
platoon.rst
1 change: 1 addition & 0 deletions docs/index.rst
@@ -56,6 +56,7 @@ If you use SMARTS in your research, please cite the `paper <https://arxiv.org/ab
benchmarks/benchmark.rst
benchmarks/agent_zoo.rst
benchmarks/driving_smarts_2022.rst
benchmarks/driving_smarts_2023_3.rst

.. toctree::
:hidden:
Empty file added examples/rl/__init__.py
Empty file.
Empty file added examples/rl/platoon/__init__.py
Empty file.
1 change: 1 addition & 0 deletions examples/rl/platoon/inference/MANIFEST.in
@@ -0,0 +1 @@
include */*.pth
Empty file.
38 changes: 38 additions & 0 deletions examples/rl/platoon/inference/__init__.py
@@ -0,0 +1,38 @@
from contrib_policy.policy import Policy

from smarts.core.agent_interface import RGB, AgentInterface
from smarts.core.controllers import ActionSpaceType
from smarts.zoo.agent_spec import AgentSpec
from smarts.zoo.registry import register


def entry_point(**kwargs):
interface = AgentInterface(
action=ActionSpaceType.Continuous,
drivable_area_grid_map=False,
lane_positions=True,
lidar_point_cloud=False,
occupancy_grid_map=False,
road_waypoints=False,
signals=False,
top_down_rgb=RGB(
width=128,
height=128,
resolution=80 / 128, # m/pixels
),
)

agent_params = {
"top_down_rgb": interface.top_down_rgb,
"action_space_type": interface.action,
"num_stack": 3, # Number of frames to stack as input to policy network.
Collaborator: Is it worth it to integrate the frame stack as part of the agent interface? This seems to be a common case.

Member Author: If we were to include frame stacking inside ``AgentInterface``, we would then have to provide a standard way of stacking the observations, which may or may not fulfill users' requirements. Users might only need selected (and/or processed) observations to be stacked, whereas our standard frame stacking would stack entire observations. Considering the endless possible user requirements, it is better to leave this to the user.

}

return AgentSpec(
interface=interface,
agent_builder=Policy,
agent_params=agent_params,
)


register(locator="contrib-agent-v0", entry_point=entry_point)
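The review exchange above leaves frame stacking to the user's policy. A minimal, self-contained sketch of such a helper is shown below; it is hypothetical (not part of SMARTS), and pads the stack by repeating the first observation, which is one common convention among several.

```python
from collections import deque


class FrameStack:
    """Keep the last `num_stack` observations, repeating the first frame
    until the stack is full. Hypothetical helper, not part of SMARTS."""

    def __init__(self, num_stack=3):
        self._num_stack = num_stack
        self._frames = deque(maxlen=num_stack)

    def reset(self, first_obs):
        # Pad the stack with copies of the first observation of an episode.
        self._frames.clear()
        for _ in range(self._num_stack):
            self._frames.append(first_obs)
        return list(self._frames)

    def step(self, obs):
        # Append the newest observation; the oldest is dropped automatically.
        self._frames.append(obs)
        return list(self._frames)
```

Inside a policy, ``reset`` would be called on the first observation of each episode and ``step`` on every subsequent one, with the returned list fed to the policy network.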
Empty file.