Background:
I am building a custom Gymnasium environment for the Battle City game and training an agent with Stable Baselines3's Proximal Policy Optimization (PPO) implementation. To bootstrap the learning process, I have written a rule-based bot that reliably wins the game.
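For reference, the bot exposes a single-method interface along these lines (the class and method names here are illustrative, not actual library API):

```python
import numpy as np

class RuleBasedBot:
    """Hypothetical interface for my scripted bot: maps an observation
    to a discrete action via hand-written rules."""

    def act(self, obs: np.ndarray) -> int:
        # Hand-coded heuristics (aiming, dodging, etc.) live here;
        # a fixed action stands in for them in this sketch.
        return 0
```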
Issue:
The PPO agent learns poorly. I tried exposing the bot's suggested action to the agent and adding a reward bonus whenever the agent mimics that suggestion, but the results have been suboptimal. My initial impression was that this library could learn by imitating my custom bot directly; however, after reviewing the examples, it appears that demonstrations are expected to come from another trained model rather than from arbitrary code. A sketch of my current reward-shaping setup follows.
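Here is a stripped-down version of that setup (names and the bonus value are illustrative; in my actual environment the bot's suggestion is folded into the observation vector, but for brevity this sketch surfaces it via `info`):

```python
import gymnasium as gym

class BotHintWrapper(gym.Wrapper):
    """Hypothetical wrapper: expose the bot's suggested action and add a
    small shaping bonus whenever the agent mimics it."""

    def __init__(self, env, bot, match_bonus=0.1):
        super().__init__(env)
        self.bot = bot  # rule-based bot with the .act(obs) interface above
        self.match_bonus = match_bonus
        self._bot_action = None

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._bot_action = self.bot.act(obs)
        info["bot_action"] = self._bot_action
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if action == self._bot_action:  # shaping bonus for mimicry
            reward += self.match_bonus
        self._bot_action = self.bot.act(obs)  # suggestion for the next step
        info["bot_action"] = self._bot_action
        return obs, reward, terminated, truncated, info
```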
Inquiry:
I seek clarification on the library's capabilities in this context:
Does the current implementation only support learning from another model, rather than a custom rule-based bot?
If so, I would like to propose support for it as a feature request: the ability to use actions from a custom rule-based bot for imitation learning would be a valuable addition to this library.
Alternative Request:
If my interpretation is incorrect and the library does support learning from a bot's actions, I would greatly appreciate a simple example of how to train the PPO agent from my bot's actions within this framework. Below is a sketch of the kind of workflow I am hoping for.
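For concreteness, this is roughly what I imagined, adapted from the documented behavioral-cloning quickstart. I am assuming here that `rollout.rollout` accepts an arbitrary callable in place of a trained policy (recent versions appear to type it as `(obs, state, episode_start) -> (acts, state)`, though this may differ across versions), and the environment id and bot class are my own, not part of the library:

```python
import gymnasium as gym
import numpy as np
from stable_baselines3.common.vec_env import DummyVecEnv
from imitation.algorithms import bc
from imitation.data import rollout

rng = np.random.default_rng(0)
venv = DummyVecEnv([lambda: gym.make("BattleCity-v0")])  # hypothetical env id
bot = RuleBasedBot()  # the rule-based bot sketched above

def bot_policy(obs, state, episode_starts):
    # Query the bot for each parallel environment in the batch.
    acts = np.array([bot.act(o) for o in obs])
    return acts, None

# Roll the bot out in the environment to collect demonstrations.
trajectories = rollout.rollout(
    bot_policy,
    venv,
    rollout.make_sample_until(min_episodes=50),
    rng=rng,
)
transitions = rollout.flatten_trajectories(trajectories)

# Behavioral cloning on the bot's demonstrations; the resulting
# policy could then be used to warm-start PPO.
bc_trainer = bc.BC(
    observation_space=venv.observation_space,
    action_space=venv.action_space,
    demonstrations=transitions,
    rng=rng,
)
bc_trainer.train(n_epochs=10)
```

If something along these lines is already supported, a pointer to the relevant documentation would be enough.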
Looking forward to your response and guidance on this matter.