[gym_jiminy/rllib] Full refactoring to support ray-rllib 2.38. #832

Merged on Nov 18, 2024 (16 commits)

Commits
0987e55
[python/viewer] Use firefox instead of chromimum for offscreen render…
duburcqa Nov 15, 2024
194e108
[gym_jiminy/common] Do not permanently alter original simulation opti…
duburcqa Nov 9, 2024
dda233f
[gym_jiminy/common] Disallow switch between evaluation and training m…
duburcqa Nov 9, 2024
89e8fd2
[gym_jiminy/common] Rewrite binary log file automatically when callin…
duburcqa Nov 9, 2024
7e60bd0
[gym_jiminy/common] Fix replay if no simulation is running.
duburcqa Nov 9, 2024
f2a74d6
[gym_jiminy/common] Add previous action as input argument for evaluat…
duburcqa Nov 9, 2024
3cdc4b7
[gym_jiminy/common] Automatic environment pipeline update.
duburcqa Nov 13, 2024
cb7a18d
[gym_jiminy/common] Fix composed reward computation.
duburcqa Nov 17, 2024
5dd635d
[gym_jiminy/common] Use metaclass instead of inheritence for abstract…
duburcqa Nov 17, 2024
728b96a
[gym_jiminy/common] Enable typing of the obs and action spaces for 'g…
duburcqa Nov 17, 2024
18aee6e
[gym_jiminy/common] Fix nested gym space helpers.
duburcqa Nov 17, 2024
352ea40
[gym_jiminy/common] Update documentation.
duburcqa Nov 17, 2024
1df6734
[gym_jiminy/toolbox] Add support of arbitrarily nested task-settable …
duburcqa Nov 17, 2024
fc3e1ce
[gym_jiminy/envs] Add mirror mat to obs/action spaces.
duburcqa Nov 14, 2024
4436b29
[gym_jiminy/rllib] Full refactoring to support ray-rllib 2.38.
duburcqa Nov 17, 2024
73fc83e
[misc] Move to macos-14 on Github Action (forcing panda3d tinydisplay…
duburcqa Nov 15, 2024
2 changes: 1 addition & 1 deletion .github/workflows/linux.yml
@@ -132,7 +132,7 @@ jobs:
"${PYTHON_EXECUTABLE}" -m unittest discover -s "${RootDir}/python/gym_jiminy/unit_py" -v

- name: Run examples for gym jiminy add-on modules
-if: matrix.BUILD_TYPE != 'Debug' && matrix.PYTHON_VERSION != '3.12'
+if: matrix.BUILD_TYPE != 'Debug'
run: |
cd "${RootDir}/python/gym_jiminy/examples/rllib"
"${PYTHON_EXECUTABLE}" acrobot_ppo.py
23 changes: 18 additions & 5 deletions .github/workflows/macos.yml
@@ -15,8 +15,7 @@ jobs:

strategy:
matrix:
-# FIXME: Panda3d software rendering is partially broken on Apple Silicon
-OS: ['macos-13'] # 'macos-13': Intel (x86), 'macos-14+': Apple Silicon (arm64)
+OS: ['macos-14'] # 'macos-13': Intel (x86), 'macos-14+': Apple Silicon (arm64)
PYTHON_VERSION: ['3.10', '3.11', '3.12'] # `setup-python` does not support Python<3.10 on Apple Silicon
BUILD_TYPE: ['Release']
include:
@@ -128,8 +127,10 @@ jobs:
# (see https://github.com/python/mypy/issues/17396)
"${PYTHON_EXECUTABLE}" -m pip install "numpy<2.0"
stubgen -p jiminy_py -o ${RootDir}/build/pypi/jiminy_py/src
-# FIXME: Python 3.10 on Intel x86-64 crashes when generating stubs without any backtrace...
-if [[ ("${{ matrix.OS }}" != 'macos-13') || ("${{ matrix.PYTHON_VERSION }}" != '3.10') ]] ; then
+# FIXME: Python 3.10 crashes when generating stubs without any backtrace...
+if [[ "${{ matrix.PYTHON_VERSION }}" != '3.10' ]] ; then
+# lldb --batch -o "settings set target.process.stop-on-exec false" \
+# -o "break set -n main" -o "run" -k "bt" -k "quit" -- \
"${PYTHON_EXECUTABLE}" "${RootDir}/build_tools/stubgen.py" \
-o ${RootDir}/build/stubs --ignore-invalid=all jiminy_py
cp ${RootDir}/build/stubs/jiminy_py-stubs/core/__init__.pyi \
@@ -183,12 +184,20 @@ jobs:
ctest --output-on-failure --test-dir "${RootDir}/build/core/unit"

cd "${RootDir}/python/jiminy_py/unit_py"
+# FIXME: Panda3d software rendering is partially broken on Apple Silicon
+if [[ "${{ matrix.OS }}" != 'macos-13' ]] ; then
+export JIMINY_PANDA3D_FORCE_TINYDISPLAY=
+fi
"${PYTHON_EXECUTABLE}" -m unittest discover -v

- name: Run unit tests for gym jiminy base module
run: |
export LD_LIBRARY_PATH="${InstallDir}/lib/:/usr/local/lib"

+# FIXME: Panda3d software rendering is partially broken on Apple Silicon
+if [[ "${{ matrix.OS }}" != 'macos-13' ]] ; then
+export JIMINY_PANDA3D_FORCE_TINYDISPLAY=
+fi
# FIXME: Disabling `test_pipeline_control.py` on MacOS because `test_pid_standing` is
# failing for 'panda3d-sync' backend due to meshes still loading at screenshot time.
if [[ "${{ matrix.BUILD_TYPE }}" == 'Debug' ]] ; then
@@ -197,11 +206,15 @@ jobs:
"${PYTHON_EXECUTABLE}" -m unittest discover -s "${RootDir}/python/gym_jiminy/unit_py" -v

- name: Run examples for gym jiminy add-on modules
-if: matrix.BUILD_TYPE != 'Debug' && matrix.PYTHON_VERSION != '3.12'
+if: matrix.BUILD_TYPE != 'Debug'
run: |
export LD_LIBRARY_PATH="${InstallDir}/lib/:/usr/local/lib"

cd "${RootDir}/python/gym_jiminy/examples/rllib"
+# FIXME: Panda3d software rendering is partially broken on Apple Silicon
+if [[ "${{ matrix.OS }}" != 'macos-13' ]] ; then
+export JIMINY_PANDA3D_FORCE_TINYDISPLAY=
+fi
"${PYTHON_EXECUTABLE}" acrobot_ppo.py

#########################################################################################
2 changes: 1 addition & 1 deletion .github/workflows/manylinux.yml
@@ -87,7 +87,7 @@ jobs:
-DBoost_NO_SYSTEM_PATHS=TRUE -DBoost_NO_BOOST_CMAKE=TRUE \
-DBoost_USE_STATIC_LIBS=ON -DPYTHON_EXECUTABLE="${PYTHON_EXECUTABLE}" \
-DBUILD_TESTING=ON -DBUILD_EXAMPLES=ON -DBUILD_PYTHON_INTERFACE=ON \
--DINSTALL_GYM_JIMINY=${{ (matrix.PYTHON_VERSION == 'cp312' && 'OFF') || 'ON' }} \
+-DINSTALL_GYM_JIMINY=${{ (matrix.PYTHON_VERSION == 'cp313' && 'OFF') || 'ON' }} \
-DCMAKE_CXX_FLAGS="${CXX_FLAGS}" -DCMAKE_BUILD_TYPE="$BUILD_TYPE"
make -j4

13 changes: 6 additions & 7 deletions .github/workflows/ubuntu.yml
@@ -150,21 +150,20 @@ jobs:
fi
"${PYTHON_EXECUTABLE}" -m unittest discover -s "${RootDir}/python/gym_jiminy/unit_py" -v

-# - name: Run examples for gym jiminy add-on modules
-# if: matrix.BUILD_TYPE != 'Debug'
-# run: |
-# cd "${RootDir}/python/gym_jiminy/examples/rllib"
-# "${PYTHON_EXECUTABLE}" acrobot_ppo.py
+- name: Run examples for gym jiminy add-on modules
+if: matrix.BUILD_TYPE != 'Debug'
+run: |
+cd "${RootDir}/python/gym_jiminy/examples/rllib"
+"${PYTHON_EXECUTABLE}" acrobot_ppo.py

#####################################################################################

- name: Python linter
if: matrix.OS == 'ubuntu-24.04' && matrix.BUILD_TYPE != 'Debug' && matrix.COMPILER == 'gcc'
run: |
-# FIXME: Add back "rllib"
cd "${RootDir}/python/jiminy_py/"
pylint --rcfile="${RootDir}/.pylintrc" "src/"
-for name in "common" "toolbox" ; do
+for name in "common" "toolbox" "rllib" ; do
cd "${RootDir}/python/gym_jiminy/$name"
pylint --rcfile="${RootDir}/.pylintrc" "gym_jiminy/"
done
5 changes: 1 addition & 4 deletions .github/workflows/win.yml
@@ -201,13 +201,10 @@ jobs:
python -m unittest discover -s "$RootDir/python/gym_jiminy/unit_py" -v

- name: Run examples for gym jiminy add-on modules
-if: matrix.BUILD_TYPE != 'Debug' && matrix.PYTHON_VERSION != '3.11' && matrix.PYTHON_VERSION != '3.12'
+if: matrix.BUILD_TYPE != 'Debug'
run: |
$RootDir = "${env:GITHUB_WORKSPACE}/workspace" -replace '\\', '/'

-# FIXME: Python 3.11 was not supported by ray on Windows until very recently.
-# It has been fixed on master but not on the latest available release (2.93).
-# See: https://github.com/ray-project/ray/pull/42097
Set-Location -Path "$RootDir/python/gym_jiminy/examples/rllib"
python acrobot_ppo.py

1 change: 1 addition & 0 deletions .pylintrc
@@ -72,6 +72,7 @@ disable =
abstract-method,
protected-access,
useless-parent-delegation,
+unbalanced-tuple-unpacking,
use-dict-literal,
unspecified-encoding,
undefined-loop-variable,
4 changes: 2 additions & 2 deletions python/gym_jiminy/common/gym_jiminy/common/bases/blocks.py
@@ -7,7 +7,7 @@
- the base controller block
- the base observer block
"""
-from abc import abstractmethod, ABC
+from abc import abstractmethod, ABCMeta
from typing import Any, Union, Generic, TypeVar, cast

import gymnasium as gym
@@ -26,7 +26,7 @@
BlockState = TypeVar('BlockState', bound=Union[DataNested, None])


-class InterfaceBlock(ABC, Generic[BlockState, BaseObs, BaseAct]):
+class InterfaceBlock(Generic[BlockState, BaseObs, BaseAct], metaclass=ABCMeta):
"""Base class for blocks used for pipeline control design. Blocks can be
either observers and controllers.

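The change above swaps `ABC` inheritance for `metaclass=ABCMeta` on a generic base class. As a minimal sketch of what that means in plain Python, both spellings enforce abstract methods in the same way, and the visible structural difference is that `ABC` no longer appears in the MRO. The names `ViaInheritance`, `ViaMetaclass`, `Concrete` and `can_instantiate` are hypothetical and not part of gym_jiminy.

```python
from abc import ABC, ABCMeta, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")


class ViaInheritance(ABC, Generic[T]):
    """Old spelling: abstractness comes from inheriting `ABC`."""

    @abstractmethod
    def compute(self) -> T:
        ...


class ViaMetaclass(Generic[T], metaclass=ABCMeta):
    """New spelling: same metaclass, but `ABC` is not a base class."""

    @abstractmethod
    def compute(self) -> T:
        ...


class Concrete(ViaMetaclass[int]):
    """Implements the abstract method, so it can be instantiated."""

    def compute(self) -> int:
        return 42


def can_instantiate(cls: type) -> bool:
    """Return True if `cls()` succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Both abstract classes refuse instantiation, while `Concrete` works as usual. Inspecting `ViaInheritance.__mro__` and `ViaMetaclass.__mro__` shows that only the former contains `ABC`.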
10 changes: 5 additions & 5 deletions python/gym_jiminy/common/gym_jiminy/common/bases/compositions.py
@@ -6,7 +6,7 @@
This modular approach allows for standardization of usual metrics. Overall, it
greatly reduces code duplication and bugs.
"""
-from abc import ABC, abstractmethod
+from abc import abstractmethod, ABCMeta
from enum import IntEnum
from typing import Tuple, Sequence, Callable, Union, Optional, Generic, TypeVar

@@ -23,7 +23,7 @@
ArrayLikeOrScalar = Union[ArrayOrScalar, Sequence[Union[Number, np.number]]]


-class AbstractReward(ABC):
+class AbstractReward(metaclass=ABCMeta):
"""Abstract class from which all reward component must derived.

This goal of the agent is to maximize the expectation of the cumulative sum
@@ -32,7 +32,7 @@ class AbstractReward(ABC):
indefinite (aka. objective).

Defining cost is allowed by not recommended. Although it encourages the
-agent to achieve the task at hands as quickly as possible if success is the
+agent to achieve the task at hand as quickly as possible if success is the
only termination condition, it has the side-effect to give the opportunity
to the agent to maximize the return by killing itself whenever this is an
option, which is rarely the desired behavior. No restriction is enforced as
@@ -400,7 +400,7 @@ class EpisodeState(IntEnum):
"""


-class AbstractTerminationCondition(ABC):
+class AbstractTerminationCondition(metaclass=ABCMeta):
"""Abstract class from which all termination conditions must derived.

Request the ongoing episode to stop immediately as soon as a termination
@@ -470,7 +470,7 @@ def name(self) -> str:

@abstractmethod
def compute(self, info: InfoType) -> bool:
-"""Evaluate the termination condition at hands.
+"""Evaluate the termination condition at hand.

:param info: Dictionary of extra information for monitoring. It will be
updated in-place for storing terminated and truncated
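The `AbstractReward` docstring in this file warns that a cost (negative reward) can let the agent raise its return by ending the episode early. A small worked example makes the arithmetic concrete. It assumes a constant per-step reward and a discount factor of 0.99; both values and the helper `discounted_return` are illustrative assumptions, not anything defined by gym_jiminy.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of gamma**t * rewards[t], accumulated from the last step backward."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Constant cost of -1 per step: stopping after 10 steps beats surviving 100.
cost_short = discounted_return([-1.0] * 10)
cost_long = discounted_return([-1.0] * 100)

# Constant reward of +1 per step: surviving longer is now the better option.
reward_short = discounted_return([1.0] * 10)
reward_long = discounted_return([1.0] * 100)
```

With a cost, `cost_short` is greater than `cost_long`, so early termination is rewarded. With a positive reward the ordering flips, which is why the docstring recommends rewards over costs.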
47 changes: 37 additions & 10 deletions python/gym_jiminy/common/gym_jiminy/common/bases/interfaces.py
@@ -2,10 +2,10 @@
specifically design for Jiminy engine, and defined as mixin classes. Any
observer/controller block must inherit and implement those interfaces.
"""
-from abc import abstractmethod, ABC
+from abc import abstractmethod, ABCMeta
from collections import OrderedDict
from typing import (
-Dict, Any, Tuple, TypeVar, Generic, TypedDict, no_type_check,
+Dict, Any, Tuple, TypeVar, Generic, TypedDict, Optional, no_type_check,
TYPE_CHECKING)

import numpy as np
@@ -53,11 +53,11 @@ class EngineObsType(TypedDict):
"""


-class InterfaceObserver(ABC, Generic[Obs, BaseObs]):
+class InterfaceObserver(Generic[Obs, BaseObs], metaclass=ABCMeta):
"""Observer interface for both observers and environments.
"""
observe_dt: float = -1
-observation_space: gym.Space # [Obs]
+observation_space: gym.Space[Obs]
observation: Obs

def __init__(self, *args: Any, **kwargs: Any) -> None:
@@ -96,11 +96,11 @@ def refresh_observation(self, measurement: BaseObs) -> None:
"""


-class InterfaceController(ABC, Generic[Act, BaseAct]):
+class InterfaceController(Generic[Act, BaseAct], metaclass=ABCMeta):
"""Controller interface for both controllers and environments.
"""
control_dt: float = -1
-action_space: gym.Space # [Act]
+action_space: gym.Space[Act]

def __init__(self, *args: Any, **kwargs: Any) -> None:
"""Initialize the controller interface.
@@ -164,9 +164,9 @@ def compute_reward(self,
return 0.0


-# Note that `InterfaceJiminyEnv` must inherit from `InterfaceObserver`
-# before `InterfaceController` to initialize the action space before the
-# observation space since the action itself may be part of the observation.
+# Note that `InterfaceJiminyEnv` must inherit from `InterfaceObserver` before
+# `InterfaceController` to initialize the action space before the observation
+# space since the action itself may be part of the observation.
# Similarly, `gym.Env` must be last to make sure all the other initialization
# methods are called first.
class InterfaceJiminyEnv(
@@ -183,6 +183,11 @@ class InterfaceJiminyEnv(
['rgb_array'] + (['human'] if is_display_available() else []))
}

+# FIXME: Re-definition in derived class to stop mypy from complaining about
+# incompatible types between the multiple base classes.
+action_space: gym.Space[Act]
+observation_space: gym.Space[Obs]

simulator: Simulator
robot: jiminy.Robot
stepper_state: jiminy.StepperState
@@ -341,7 +346,7 @@ def _controller_handle(self,
self.__is_observation_refreshed = False

def stop(self) -> None:
-"""Stop the episode immediately without waiting for a termination or
+"""Stop the episode immediately, without waiting for a termination or
truncation condition to be satisfied.

.. note::
@@ -351,9 +356,31 @@ def stop(self) -> None:
data will not be available during replay using object-oriented
method `replay`. Helper method `play_logs_data` must be preferred
to replay an episode that cannot be stopped at the time being.
+
+.. warning::
+This method is never called internally by the engine.
"""
self.simulator.stop()

+@abstractmethod
+def update_pipeline(self, derived: Optional["InterfaceJiminyEnv"]) -> None:
+"""Dynamically update which blocks are declared as part of the
+environment pipeline.
+
+Internally, this method first unregisters all blocks of the old
+pipeline, then registers all blocks of the new pipeline, and finally
+notifies the base environment that the top-most block of the pipeline
+has changed and must be updated accordingly.
+
+.. warning::
+This method is not supposed to be called manually nor overloaded.
+
+:param derived: Either the top-most block of the pipeline or None.
+If None, unregister all blocks of the old pipeline. If
+not None, first unregister all blocks of the old
+pipeline, then register all blocks of the new pipeline.
+"""

@abstractmethod
def has_terminated(self, info: InfoType) -> Tuple[bool, bool]:
"""Determine whether the episode is over, because a terminal state of
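The comment on `InterfaceJiminyEnv` in this file ties the order of the base classes to the order in which the action and observation spaces get initialized. The sketch below illustrates the mechanism with plain Python classes. It assumes each interface calls `super().__init__()` before setting up its own space, which is an assumption about the implementation that this diff does not show. `FakeGymEnv`, `Observer`, `Controller` and `Env` are hypothetical stand-ins, not gym_jiminy classes.

```python
class FakeGymEnv:
    """Stand-in for `gym.Env`: last in the MRO, so its body runs first."""

    def __init__(self) -> None:
        self.init_order = ["gym.Env"]


class Observer:
    """Stand-in for the observer interface."""

    def __init__(self) -> None:
        # Delegate first: everything later in the MRO is ready before us
        super().__init__()
        self.init_order.append("observation_space")


class Controller:
    """Stand-in for the controller interface."""

    def __init__(self) -> None:
        # Delegate first, then record our own initialization
        super().__init__()
        self.init_order.append("action_space")


class Env(Observer, Controller, FakeGymEnv):
    """Observer is listed before Controller, as in the comment above."""


env = Env()
```

Because every `__init__` delegates before doing its own work, the bodies run in reverse MRO order: `env.init_order` ends up as `gym.Env`, then `action_space`, then `observation_space`. Listing `Controller` before `Observer` would swap the last two entries.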