ToriLLE comes with OpenAI Gym environment and several pre-defined tasks, which can be used as a (almost) drop-in replacements to e.g. Atari environments.
Register these environments by importing torille.envs
.
Limitations:
seed()
function is not implemented (can't change seed of Toribash)render()
function is not implemented (only one type of state available)
Additional functions/modifications for all environments:
set_game_draw(draw)
: Enables/Disables rendering of the game according to boolean parameter.settings
variable: This isToribashState
object used to set Toribash's settings on each reset.
These are tasks where only one character exists (observations/actions only include one player).
Player 2 is set to be immobile and engagement distance is set high to avoid contact between players.
States: 1D vector of player 1 body part positions w.r.t player 1's groin. The z
coordinate
is replaced with absoluate z
to inform agent how far from the ground it is. (gym.spaces.box.Box
).
Actions: Joint states for player 1 (gym.spaces.multi_discrete.MultiDiscrete
).
Reward is specified by the task (see below).
Settings:
Setting | Value |
---|---|
matchframes | 1000 |
turnframes | 5 |
engagement_distance | 1500 |
Reward function: Positive reward for head body-part moving away from the center. See torille.envs.solo_envs.reward_run_away
.
Reward function: Positive reward for damaging the player itself (not the opponent). See torille.envs.solo_envs.reward_self_destruct
.
Reward function: Negative reward for damaging the player itself (not the opponent). See torille.envs.solo_envs.reward_stay_safe
.
Tasks where only one character is controlled by agent, but observations for both characters are provided.
Player 2 is set to be immobile or random, depending on the task.
States: 1D vector of player 1 and player 2 body part positions, w.r.t player 1's groin.
The z
coordinate of player 1's groin is replaced with absolute z
coordinate. (gym.spaces.box.Box
)
Actions: Joint states for player 1 (gym.spaces.multi_discrete.MultiDiscrete
)
Reward is specified by the task (see below).
Settings:
Setting | Value |
---|---|
matchframes | 1000 |
turnframes | 5 |
Reward function: Positive reward for damaging immobile opponent. See torille.envs.uke_envs.reward_destroy_uke
.
Reward function: Positive reward for damaging immobile opponent and negative reward for receiving damage, summed together.
See torille.envs.uke_envs.reward_destroy_uke_with_penalty
.
Reward function: Positive reward for damaging opponent and negative reward for receiving damage, summed together.
Opponent takes random actions each turn. See torille.envs.uke_envs.reward_destroy_uke_with_penalty
.
Tasks where both characters are controlled by agent and observations are provided for both characters.
States: 1D vector of player 1 and player 2 body part positions, w.r.t player 1's and player 2's groin (i.e. 2x longer vector than in solo/uke environments).
The z
coordinate of player's groins is replaced with the absolute z
coordinate, so agents know their location w.r.t floor. (gym.spaces.box.Box
)
Actions: Joint states for player 1 and player 2 (gym.spaces.multi_discrete.MultiDiscrete
)
Reward is specified by the task (see below).
Settings:
Setting | Value |
---|---|
matchframes | 1000 |
turnframes | 5 |
Reward function: Score from the point of view of player 1, in terms of injury:
Positive reward if opponent received damage,
negative if player 1 received damage (summed together).
See torille.envs.duo_envs.reward_injury_player1_pov
.
Reward function: Score from the point of view of player 1, in terms of winning:
+1 reward if player 1 won the game,
-1 reward if player 2 won the game and
0 reward if game was tie.
See torille.envs.duo_envs.reward_win_player1_pov
.
Reward function: Positive reward relative to inverse of distance between two players (distance of center-of-masses).
Negative reward if either of players takes damage. These are summed together for final reward.
See torille.envs.duo_envs.reward_cuddles
.
Setting | Value |
---|---|
turnframes | 2 |