gym-po-taxi

Partially-observable taxi environment, with internal vectorization

Links to look at for our own implementation

Gymnax: Classic control, bsuite, MinAtar, FourRooms, MetaMaze, PointRobot, Bandits in JAX. Supports Podracer architecture

The most interesting environments are probably MemoryChain, FourRooms, MetaMaze, and PointRobot

ROOMS and C-ROOMS: for reference. Design notes:

Velocity-based vs. position-only dynamics
Layouts fixed ahead of time. Random agent spawn. Goal is fixed, user-set, or random
Discrete actions (4 cardinal or 8 compass directions) vs. continuous (2D)
- Two forms of action failure: a 0.2 chance of taking a random action (discrete) or flipping signs (continuous), and Gaussian movement noise with a standard deviation of 0.2
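
A minimal sketch of the two failure modes noted above, assuming NumPy. The function names, the ordering of the moves, and the choice to flip both components of a continuous action together are assumptions made for illustration, not decisions taken from the notes.

```python
import numpy as np

# Four cardinal moves as (dx, dy); the ordering is an arbitrary choice for this sketch.
CARDINAL_MOVES = np.array([[0, 1], [1, 0], [0, -1], [-1, 0]])


def noisy_discrete_action(action, rng, fail_prob=0.2, num_actions=4):
    """With probability fail_prob, replace the action with a uniformly random one."""
    if rng.random() < fail_prob:
        return int(rng.integers(num_actions))
    return int(action)


def noisy_continuous_action(action, rng, fail_prob=0.2, noise_std=0.2):
    """Flip the sign of the 2D action with probability fail_prob, then add Gaussian noise.

    Assumption: both components are flipped together, and the sign flip and
    the Gaussian noise are applied in the same step.
    """
    action = np.asarray(action, dtype=float)
    if rng.random() < fail_prob:
        action = -action
    return action + rng.normal(0.0, noise_std, size=action.shape)


rng = np.random.default_rng(0)
move = CARDINAL_MOVES[noisy_discrete_action(1, rng)]
push = noisy_continuous_action([1.0, 0.0], rng)
```

Passing the random generator in explicitly keeps the noise reproducible per environment, which would also matter for a vectorized version.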
What to do for walls?
- Discrete case is easy. Don't move.
- Continuous case could be the same. Alternatively, draw the vector, stop right at wall.
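
The two wall-handling options above could look roughly like this. It is a sketch under assumptions: the layout is a 2D NumPy array with 1 for wall cells, each cell is a unit square, positions are (row, col), and anything off the grid counts as a wall. The continuous version approximates "stop right at the wall" by marching along the movement vector in small sub-steps rather than computing the exact intersection.

```python
import numpy as np


def step_discrete(grid, pos, move):
    """Return the new (row, col); stay put if the target cell is a wall or off-grid."""
    r, c = pos[0] + move[0], pos[1] + move[1]
    in_bounds = 0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]
    if not in_bounds or grid[r, c] == 1:
        return pos
    return (r, c)


def is_wall(grid, point):
    """True if a continuous point lies in a wall cell or outside the grid."""
    r, c = int(np.floor(point[0])), int(np.floor(point[1]))
    if not (0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]):
        return True
    return bool(grid[r, c] == 1)


def step_continuous(grid, pos, delta, num_substeps=100):
    """March along the movement vector and keep the last free point before a wall."""
    pos = np.asarray(pos, dtype=float)
    delta = np.asarray(delta, dtype=float)
    last_free = pos
    for i in range(1, num_substeps + 1):
        candidate = pos + delta * (i / num_substeps)
        if is_wall(grid, candidate):
            break
        last_free = candidate
    return last_free
```

With this approach the agent ends up within one sub-step of the wall, so `num_substeps` trades accuracy against cost. Setting the continuous case to "don't move" instead would only need a single `is_wall` check on the end point.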
Observation?
- Non-continuous:
  - Fully observable: grid discrete state. Goal state if random?
  - Partially observable: 4D Hansen (adjacent), 8D Hansen, nxn grid
- Continuous:
  - Fully observable: (x, y) coordinate, plus (dx, dy) if velocity-based. Goal state if random?
  - Partially observable:
    - (x,y) w/o velocity, (x,y) downsampled to grid
    - 4/8D Hansen (0/1 walls in range 1M), 4/8D walls (distance of closest wall)
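
For the grid case, the partially observable options above might be computed as in the sketch below. It reads the 4D/8D "Hansen" observation as one binary wall indicator per adjacent cell, which is an interpretation of the notes. The grid encoding (1 for wall), the neighbour ordering, and treating off-grid cells as walls are all assumptions.

```python
import numpy as np

# Offsets as (d_row, d_col): the first four are cardinal neighbours, the rest diagonals.
NEIGHBORS_4 = [(-1, 0), (0, 1), (1, 0), (0, -1)]
NEIGHBORS_8 = NEIGHBORS_4 + [(-1, 1), (1, 1), (1, -1), (-1, -1)]


def adjacent_wall_observation(grid, pos, offsets=NEIGHBORS_4):
    """Binary vector with 1 where the neighbouring cell is a wall or off-grid."""
    obs = np.zeros(len(offsets), dtype=np.int8)
    for i, (dr, dc) in enumerate(offsets):
        r, c = pos[0] + dr, pos[1] + dc
        off_grid = not (0 <= r < grid.shape[0] and 0 <= c < grid.shape[1])
        obs[i] = 1 if off_grid or grid[r, c] == 1 else 0
    return obs


def local_window_observation(grid, pos, n=3):
    """n x n window centred on the agent (n odd), padding the border with walls."""
    half = n // 2
    padded = np.pad(grid, half, constant_values=1)
    r, c = pos[0] + half, pos[1] + half
    return padded[r - half:r + half + 1, c - half:c + half + 1]
```

Passing `NEIGHBORS_8` as `offsets` gives the 8D variant. The distance-based continuous observations would need ray casting instead and are not covered here.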

Pocman/Pacman: Fully/partially-observable pocman from POMCP

Battleship: Partially observable battleship

Rocksample: Also has battleship

Isaacverse: GPU physics control

Mo-Gym: Multi-objective. Fancy FourRooms, Reacher with more objectives

gym-sokoban: pixel-based though...

CARL: Context-adaptive RL, reconfigure envs (Mario, Brax, control)

highway-env: Must infer behaviors of others

Other
