
RL example: Platoon #1955

Merged 35 commits on Apr 14, 2023
Changes from 21 commits

Commits
3fa2f81
Add driving-smarts-2023
Adaickalavan Apr 10, 2023
83a9a0f
Add scenario attribute.
Adaickalavan Apr 10, 2023
0369996
Merge branch 'master' into driving-smarts-2023
Adaickalavan Apr 10, 2023
4df4c4e
Add changelog.
Adaickalavan Apr 10, 2023
dc748ce
Add platoon rl example.
Adaickalavan Apr 10, 2023
736d749
Add docs.
Adaickalavan Apr 10, 2023
5ac01c2
Merge branch 'master' into platoon-example
Adaickalavan Apr 11, 2023
64240dc
Move ObjDict.
Adaickalavan Apr 11, 2023
e5ce9c1
Add docs.
Adaickalavan Apr 11, 2023
e8f51fa
Add docs.
Adaickalavan Apr 11, 2023
c2914b3
Add docs.
Adaickalavan Apr 11, 2023
5c0eb0b
Add docs.
Adaickalavan Apr 11, 2023
852c58e
Update url.
Adaickalavan Apr 11, 2023
959b5c9
Break up benchmark 2023 into 3 parts.
Adaickalavan Apr 11, 2023
8361f27
Merge branch 'master' into driving-smarts-2023
Adaickalavan Apr 11, 2023
38d5c34
Merge branch 'driving-smarts-2023' into platoon-example
Adaickalavan Apr 11, 2023
c14b2c4
Add changelog.
Adaickalavan Apr 11, 2023
f912a0f
Merge branch 'master' into platoon-example
Adaickalavan Apr 11, 2023
1d78861
Improve docstring.
Adaickalavan Apr 11, 2023
d18231a
Rectify environment locator string.
Adaickalavan Apr 11, 2023
ce3685a
Update docs.
Adaickalavan Apr 12, 2023
011edb8
Add image.
Adaickalavan Apr 12, 2023
c15c3c6
Merge branch 'master' into platoon-example
Adaickalavan Apr 12, 2023
1dfdd61
Disable sumo gui and change leader id into regexp.
Adaickalavan Apr 12, 2023
7e3edbb
Update docs/benchmarks/driving_smarts_2023_3.rst
Adaickalavan Apr 12, 2023
827890d
Address reviews.
Adaickalavan Apr 12, 2023
ae6d763
Merge branch 'platoon-example' of https://github.com/huawei-noah/SMAR…
Adaickalavan Apr 12, 2023
fd08d60
Replaced incorrect image.
Adaickalavan Apr 12, 2023
df9be51
Start envision before running script.
Adaickalavan Apr 12, 2023
2f507f6
Include zip file in package.
Adaickalavan Apr 13, 2023
85276ba
Add vehicle following scenario on merge exit map.
Adaickalavan Apr 13, 2023
298fad8
Remove extra unwanted file.
Adaickalavan Apr 13, 2023
bdade29
Add mapspec.
Adaickalavan Apr 13, 2023
259f3b6
Fix pytype.
Adaickalavan Apr 14, 2023
49146ca
Edit changelog to include addition of new scenario.
Adaickalavan Apr 14, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,8 @@ Copy and pasting the git commit messages is __NOT__ enough.
- Added a new entry tactic, `IdEntryTactic`, which provides the scenario the ability to select a specific actor for an agent to take over.
- Registered a new `chase-via-points-agent-v0` agent in agent zoo, which can effectively chase via points across different road sections by using the waypoints.
- Added new driving-smarts-v2023 benchmark consisting of new (i) driving-smarts-v2023 env and (ii) platoon-v0 env.
- Added baseline example, consisting of training, inference, and zoo agent registration, for the platooning task in Driving SMARTS 2023.3 benchmark.
- Documented the challenge objective, desired inference code structure, and use of baseline example, for Driving SMARTS 2023.3 benchmark, i.e., platooning task.
### Changed
- The trap manager, `TrapManager`, is now a subclass of `ActorCaptureManager`.
- Considering lane-change time ranges between 3s and 6s, assuming a speed of 13.89m/s, the via sensor lane acquisition range was increased from 40m to 80m, for better driving ability.
7 changes: 4 additions & 3 deletions docs/benchmarks/driving_smarts_2022.rst
@@ -62,10 +62,11 @@ This benchmark allows ego agents to use any one of the following action spaces.
+ :attr:`~smarts.core.controllers.ActionSpaceType.TargetPose`
+ :attr:`~smarts.core.controllers.ActionSpaceType.RelativeTargetPose`

Trained agents
--------------
Zoo agents
----------

See the list of :ref:`available zoo agents <available_zoo_agents>` which are compatible with this benchmark. A compatible zoo agent can be run as follows.
See the list of :ref:`available zoo agents <available_zoo_agents>` which are compatible with this benchmark.
A compatible zoo agent can be evaluated in this benchmark as follows.

.. code-block:: bash

255 changes: 255 additions & 0 deletions docs/benchmarks/driving_smarts_2023_3.rst
@@ -0,0 +1,255 @@
.. _driving_smarts_2023_3:

Driving SMARTS 2023.3
=====================

Objective
---------

The objective is to develop a single-ego policy capable of controlling a single ego to perform the platooning task in the
``platoon-v0`` environment. Refer to :func:`~smarts.env.gymnasium.platoon_env.platoon_env` for environment details.

.. important::

In a scenario with multiple egos, a single-ego policy is copied and pasted into every ego. Each ego is stepped
independently by calling their respective :attr:`~smarts.core.agent.Agent.act` function. In short, multiple
egos are executed in a distributed manner. The single-ego policy should be capable of accounting for and
interacting with other egos, if any are present.

Each ego must track and follow its assigned leader (i.e., lead vehicle), either in single file or in platoon
fashion. The name of the lead vehicle to follow is given to the ego through its
:attr:`~smarts.core.agent_interface.ActorsAliveDoneCriteria.actors_of_interest` attribute.

The episode ends for an ego when its assigned leader reaches the leader's destination. Egos do not have prior
knowledge of the leader's destination.

Any method, such as reinforcement learning, offline reinforcement learning, behaviour cloning, generative models,
or predictive models, may be used to develop the policy.

Several scenarios are provided for training; their names and tasks are listed below.
The desired task execution is illustrated in a GIF of a trained baseline agent.

+ straight_2lane_agents_1
A single ego must follow a specified leader, with no background traffic.

.. image:: /_static/driving_smarts_2023/platoon_straight_2lane_agents_1.gif

Observation space
-----------------

At each time step, the underlying environment returns a formatted :class:`~smarts.core.observations.Observation`
as the observation, using the
:attr:`~smarts.env.utils.observation_conversion.ObservationOptions.multi_agent` option. See
:class:`~smarts.env.utils.observation_conversion.ObservationSpacesFormatter` for
a sample formatted observation data structure.
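Since each ego is stepped independently with a copy of the same single-ego policy, consuming the multi-agent formatted observation can be sketched as follows. This is an illustrative helper, not part of the SMARTS API; the dict-of-observations shape follows the ``multi_agent`` option described above.

```python
def act_all(policy_per_agent, obs):
    """Map each ego's formatted observation to an action.

    `obs` is assumed to be a dict keyed by agent id, one entry per ego,
    as produced under the multi_agent observation option. Each ego's
    policy is queried independently.
    """
    return {
        agent_id: policy_per_agent[agent_id].act(agent_obs)
        for agent_id, agent_obs in obs.items()
    }
```

This mirrors the distributed stepping described in the note above: one ``act`` call per ego, with no shared state between egos unless the policy adds it.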

Action space
------------

Action space for each ego is :attr:`~smarts.core.controllers.ActionSpaceType.Continuous`.
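As a hedged illustration, assuming the continuous action is a ``(throttle, brake, steering)`` triple with throttle and brake in ``[0, 1]`` and steering in ``[-1, 1]`` (ranges are assumptions, not taken from this document), a policy might clamp its raw network output into valid ranges like so:

```python
def clamp_action(throttle, brake, steering):
    """Clamp a proposed (throttle, brake, steering) action into assumed
    valid ranges before handing it to the environment."""
    clip = lambda x, lo, hi: max(lo, min(hi, x))
    return (
        clip(throttle, 0.0, 1.0),  # throttle in [0, 1]
        clip(brake, 0.0, 1.0),     # brake in [0, 1]
        clip(steering, -1.0, 1.0), # steering in [-1, 1]
    )
```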

Code structure
--------------

Users are free to use any training method and any folder structure for training the policy.

Only the inference code is required for evaluation. It must therefore follow the folder
structure and file contents specified below. The files and folders below
must be present with identical names; the user may optionally add any
additional files.

.. code-block:: text

inference
├── contrib_policy
│   ├── __init__.py
│   ├── policy.py
│   .
│   .
│   .
├── __init__.py
├── MANIFEST.in
├── setup.cfg
└── setup.py

1. inference/contrib_policy/__init__.py
+ Keep this file unchanged.
+ It is an empty file.

2. inference/contrib_policy/policy.py
+ Must contain a ``class Policy(Agent)`` class which inherits from :class:`~smarts.core.agent.Agent`.

3. inference/__init__.py
+ Must contain the following template code.
+ The template code registers the user's policy in SMARTS agent zoo.

.. code-block:: python

from contrib_policy.policy import Policy

from smarts.core.agent_interface import AgentInterface
from smarts.core.controllers import ActionSpaceType
from smarts.zoo.agent_spec import AgentSpec
from smarts.zoo.registry import register


def entry_point(**kwargs):
interface = AgentInterface(
action=ActionSpaceType.<...>,
drivable_area_grid_map=<...>,
lane_positions=<...>,
lidar_point_cloud=<...>,
occupancy_grid_map=<...>,
road_waypoints=<...>,
signals=<...>,
top_down_rgb=<...>,
)

agent_params = {
"<...>": <...>,
"<...>": <...>,
}

return AgentSpec(
interface=interface,
agent_builder=Policy,
agent_params=agent_params,
)

register(locator="contrib-agent-v0", entry_point=entry_point)

+ User may fill in the ``<...>`` spaces in the template.
+ User may specify the ego's interface by configuring any field of :class:`~smarts.core.agent_interface.AgentInterface`, except

+ :attr:`~smarts.core.agent_interface.AgentInterface.accelerometer`,
+ :attr:`~smarts.core.agent_interface.AgentInterface.done_criteria`,
+ :attr:`~smarts.core.agent_interface.AgentInterface.max_episode_steps`,
+ :attr:`~smarts.core.agent_interface.AgentInterface.neighborhood_vehicle_states`, and
+ :attr:`~smarts.core.agent_interface.AgentInterface.waypoint_paths`.

4. inference/MANIFEST.in
+ Contains any file paths to be included in the package.

5. inference/setup.cfg
+ Must contain the following template code.
+ The template code helps build the user policy into a Python package.

.. code-block:: cfg

[metadata]
name = <...>
version = 0.1.0
url = https://github.com/huawei-noah/SMARTS
description = SMARTS zoo agent.
long_description = <...>. See [SMARTS](https://github.com/huawei-noah/SMARTS).
long_description_content_type=text/markdown
classifiers=
Programming Language :: Python
Programming Language :: Python :: 3 :: Only
Programming Language :: Python :: 3.8

[options]
packages = find:
include_package_data = True
zip_safe = True
python_requires = == 3.8.*
install_requires =
<...>==<...>
<...>==<...>

+ User may fill in the ``<...>`` spaces in the template.
+ User should provide a name for their policy and describe it in the ``name`` and ``long_description`` fields, respectively.
+ Do **not** add SMARTS package as a dependency in the ``install_requires`` section.

6. inference/setup.py
+ Keep this file and its default contents unchanged.
+ Its default contents are shown below.

.. code-block:: python

from setuptools import setup

if __name__ == "__main__":
setup()
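A minimal, self-contained sketch of the ``class Policy(Agent)`` required by item 2 above is shown below. The ``Agent`` stub stands in for :class:`~smarts.core.agent.Agent` so the sketch runs on its own; the fixed return value and the ``num_stack`` buffer are illustrative placeholders, not the baseline's actual logic.

```python
# Stand-in base class so this sketch is self-contained; real inference
# code imports Agent from smarts.core.agent instead.
class Agent:
    def act(self, obs):
        raise NotImplementedError


class Policy(Agent):
    """Hypothetical policy returning a fixed (throttle, brake, steering) action."""

    def __init__(self, num_stack=3):
        # Rolling buffer of recent observations, mirroring the example's
        # num_stack agent parameter.
        self._num_stack = num_stack
        self._frames = []

    def act(self, obs):
        # Keep only the most recent num_stack observations.
        self._frames = (self._frames + [obs])[-self._num_stack:]
        # A real policy would run its trained model here.
        return (0.5, 0.0, 0.0)
```

The ``agent_builder=Policy`` entry in the ``AgentSpec`` template above is what instantiates this class, passing ``agent_params`` as keyword arguments.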

Example
-------

Example training and inference code is provided for this benchmark.
See the :examples:`rl/platoon` example. The example uses the PPO algorithm from the
`Stable Baselines3 <https://github.com/DLR-RM/stable-baselines3>`_ reinforcement learning library.
Instructions for training and evaluating the example are as follows.

Train
^^^^^
+ Setup

.. code-block:: bash

# In terminal-A
$ cd <path>/SMARTS/examples/rl/platoon
$ python3.8 -m venv ./.venv
$ source ./.venv/bin/activate
$ pip install --upgrade pip wheel
$ pip install -e ./../../../.[camera_obs,argoverse]
$ pip install -e ./inference/

+ Train locally without visualization

.. code-block:: bash

# In terminal-A
$ python3.8 train/run.py

+ Train locally with visualization

.. code-block:: bash

# In terminal-A
$ python3.8 train/run.py --head

.. code-block:: bash

# In a different terminal-B
$ scl envision start
# Open http://localhost:8081/

+ Trained models are saved by default inside the ``<path>/SMARTS/examples/rl/platoon/train/logs/`` folder.

Docker
^^^^^^
+ Train inside docker

.. code-block:: bash

$ cd <path>/SMARTS
$ docker build --file=./examples/rl/platoon/train/Dockerfile --network=host --tag=platoon .
$ docker run --rm -it --network=host --gpus=all platoon
(container) $ cd /SMARTS/examples/rl/platoon
Collaborator: Is it worth it to instead set ``WORKDIR /SMARTS/examples/rl/platoon`` at the end of the image?

Member Author: Currently the Dockerfile sets ``WORKDIR /SMARTS``, which I try to keep consistent across the various Dockerfiles. This helps when we have examples in different paths, as the Dockerfiles can be built similarly and we only need to alter the command issued. I simply reiterated the full path here (``$ cd /SMARTS/examples/rl/platoon``) to avoid any confusion for users.

(container) $ python3.8 train/run.py

Evaluate
^^^^^^^^
+ Choose a desired saved model from the previous training step, rename it as ``saved_model.zip``, and move it to ``<path>/SMARTS/examples/rl/platoon/inference/contrib_policy/saved_model.zip``.
+ Evaluate locally

.. code-block:: bash

$ cd <path>/SMARTS
$ python3.8 -m venv ./.venv
$ source ./.venv/bin/activate
$ pip install --upgrade pip wheel
$ pip install -e .[camera_obs,argoverse]
$ scl zoo install examples/rl/platoon/inference
$ scl benchmark run driving_smarts_2023_3 examples.rl.platoon.inference:contrib-agent-v0 --auto-install

Zoo agents
----------

A compatible zoo agent can be evaluated in this benchmark as follows.

.. code-block:: bash

$ cd <path>/SMARTS
$ scl zoo install <agent path>
$ scl benchmark run driving_smarts_2023_3==0.0 <agent_locator> --auto-install
6 changes: 6 additions & 0 deletions docs/examples/platoon.rst
@@ -0,0 +1,6 @@
.. _platoon:

Platoon
=======

This example was developed in conjunction with the :ref:`Driving SMARTS 2023.3 <driving_smarts_2023_3>` benchmark; refer to it for details.
3 changes: 2 additions & 1 deletion docs/examples/rl_model.rst
@@ -7,4 +7,5 @@ RL Model
:maxdepth: 1

intersection.md
racing.md
racing.md
platoon.rst
1 change: 1 addition & 0 deletions docs/index.rst
@@ -56,6 +56,7 @@ If you use SMARTS in your research, please cite the `paper <https://arxiv.org/ab
benchmarks/benchmark.rst
benchmarks/agent_zoo.rst
benchmarks/driving_smarts_2022.rst
benchmarks/driving_smarts_2023_3.rst

.. toctree::
:hidden:
Empty file added examples/rl/__init__.py
Empty file.
Empty file added examples/rl/platoon/__init__.py
Empty file.
1 change: 1 addition & 0 deletions examples/rl/platoon/inference/MANIFEST.in
@@ -0,0 +1 @@
include */*.pth
Empty file.
38 changes: 38 additions & 0 deletions examples/rl/platoon/inference/__init__.py
@@ -0,0 +1,38 @@
from contrib_policy.policy import Policy

from smarts.core.agent_interface import RGB, AgentInterface
from smarts.core.controllers import ActionSpaceType
from smarts.zoo.agent_spec import AgentSpec
from smarts.zoo.registry import register


def entry_point(**kwargs):
interface = AgentInterface(
action=ActionSpaceType.Continuous,
drivable_area_grid_map=False,
lane_positions=True,
lidar_point_cloud=False,
occupancy_grid_map=False,
road_waypoints=False,
signals=False,
top_down_rgb=RGB(
width=128,
height=128,
resolution=80 / 128, # m/pixels
),
)

agent_params = {
"top_down_rgb": interface.top_down_rgb,
"action_space_type": interface.action,
"num_stack": 3, # Number of frames to stack as input to policy network.
Collaborator: Is it worth it to integrate the frame stack as part of the agent interface? This seems to be a common case.

Member Author: If we were to include frame stacking inside ``AgentInterface``, we would then have to provide a standard way of stacking the observations, which may or may not fulfill users' requirements. Users might only need selected (and/or processed) observations to be stacked, whereas our standard frame stacking would stack entire observations. Considering the endless possible user requirements, it is better to leave this to the user.

}

return AgentSpec(
interface=interface,
agent_builder=Policy,
agent_params=agent_params,
)


register(locator="contrib-agent-v0", entry_point=entry_point)
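The review exchange above leaves frame stacking to the user's policy. A minimal, self-contained sketch of such a helper is shown below; it is hypothetical (not part of SMARTS), and pads the stack by repeating the first observation, which is one common convention among several.

```python
from collections import deque


class FrameStack:
    """Keep the last `num_stack` observations, repeating the first frame
    until the stack is full. Hypothetical helper, not part of SMARTS."""

    def __init__(self, num_stack=3):
        self._num_stack = num_stack
        self._frames = deque(maxlen=num_stack)

    def reset(self, first_obs):
        # Pad the stack with copies of the first observation of an episode.
        self._frames.clear()
        for _ in range(self._num_stack):
            self._frames.append(first_obs)
        return list(self._frames)

    def step(self, obs):
        # Append the newest observation; the oldest is dropped automatically.
        self._frames.append(obs)
        return list(self._frames)
```

Inside a policy, ``reset`` would be called on the first observation of each episode and ``step`` on every subsequent one, with the returned list fed to the policy network.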
Empty file.