
[Feature]: "Weights & Biases" integration #172

Closed
1 task done
manjavacas opened this issue Mar 3, 2022 · 15 comments · Fixed by #306
Labels
DRL (Deep Reinforcement Learning issues) · enhancement (New feature or request) · Feature request · help wanted (Extra attention is needed) · log (New information that should be recorded)

Comments

@manjavacas
Member

Feature 🚀

The integration of the "Weights & Biases" (wandb) library in Sinergym.

Motivation

This library may be especially interesting, as it combines the functionality of both TensorBoard and MLflow.

As explained by @jajimer:

  • It is a standalone package (no Tensorflow or Pytorch installation is needed).
  • It looks like it can be easily integrated with any script or framework. See this.
  • Integrated with SB3! (see the sketch below)
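
For the SB3 point, wandb also ships its own callback. A minimal sketch of that built-in integration (the project name is a placeholder, and this assumes wandb's documented wandb.integration.sb3 module):

import gym
import wandb

import sinergym  # registers the Eplus-* environments
from stable_baselines3 import PPO
from wandb.integration.sb3 import WandbCallback

# Start a run; sync_tensorboard mirrors SB3's TensorBoard scalars to wandb.
run = wandb.init(project="project_name", sync_tensorboard=True)

env = gym.make('Eplus-5Zone-hot-continuous-v1')
model = PPO("MlpPolicy", env, tensorboard_log=f"runs/{run.id}", verbose=1)
model.learn(total_timesteps=10_000, callback=WandbCallback(verbose=2))
run.finish()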

Solution

Study in depth how wandb could be integrated, and consider whether it could completely replace TensorBoard and MLflow.

Checklist

  • I have checked that there is no similar issue in the repo (required)

📝 Please don't forget to include additional labels besides Feature request if necessary.

@manjavacas added the enhancement, help wanted, log, DRL, and Feature request labels on Mar 3, 2022
@kad99kev

kad99kev commented Nov 9, 2022

Hi,

Firstly, thank you for your work on this library.

I have a potential solution for getting wandb working with Sinergym and SB3. There isn't much that needs to change from the example given for MLflow here.

I have added a minimal working example.

import sys
import gym
import wandb
import numpy as np

from typing import Any, Dict, Tuple, Union

import sinergym
from sinergym.utils.wrappers import LoggerWrapper, NormalizeObservation
from sinergym.utils.constants import RANGES_5ZONE
from sinergym.utils.callbacks import LoggerCallback

from stable_baselines3 import PPO
from stable_baselines3.common.logger import HumanOutputFormat, Logger, KVWriter

# The only addition required.
class WandBOutputFormat(KVWriter):
    """
    Dumps key/value pairs onto WandB.
    """

    def write(
        self,
        key_values: Dict[str, Any],
        key_excluded: Dict[str, Union[str, Tuple[str, ...]]],
        step: int = 0,
    ) -> None:
        # Walk the logged key/value pairs alongside their exclusion tags.
        for (key, value), (_, excluded) in zip(
            sorted(key_values.items()), sorted(key_excluded.items())
        ):
            # Skip keys explicitly excluded from the wandb output format.
            if excluded is not None and "wandb" in excluded:
                continue

            # Log only numeric scalars; strings are skipped.
            if isinstance(value, np.ScalarType):
                if not isinstance(value, str):
                    wandb.log({key: value}, step=step)

# Init wandb here.
wandb.init(
    project="project_name",
    entity="entity_name",
    name="random_agent_" + wandb.util.generate_id(),
)

env = gym.make('Eplus-5Zone-hot-continuous-v1')
env = NormalizeObservation(env, ranges=RANGES_5ZONE)
env = LoggerWrapper(env)

loggers = Logger(
    folder=None,
    output_formats=[HumanOutputFormat(sys.stdout, max_length=90), WandBOutputFormat()],
)

model = PPO("MlpPolicy", env, verbose=2)
# Set custom logger
model.set_logger(loggers)
model.learn(total_timesteps=10_000, log_interval=1, callback=LoggerCallback(sinergym_logger=True))
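
One small addition that may also be worth making (standard wandb API): explicitly closing the run once training ends, so any buffered metrics are flushed.

# Close the wandb run after training so buffered metrics are flushed.
wandb.finish()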

Please do let me know if this helps with this issue. Let me know if I should make a PR or if I could help in another way. Thanks!

@AlejandroCN7
Member

Hi @kad99kev!
I've been testing the code you gave us and it works! The problem is that our LoggerCallback doesn't work with wandb; I don't know if you got this error:

[2022-11-16 09:22:44,069] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Updating idf ExternalInterface object if it is not present...
[2022-11-16 09:22:44,069] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Updating idf Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2022-11-16 09:22:44,071] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Updating idf OutPut:Variable and variables XML tree model for BVCTB connection.
[2022-11-16 09:22:44,072] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2022-11-16 09:22:44,072] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Setting up action definition in building model if exists...
/usr/local/lib/python3.10/dist-packages/gym/spaces/box.py:73: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(
Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
[2022-11-16 09:22:44,097] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2022-11-16 09:22:44,106] EPLUS_ENV_5Zone-hot-continuous-v1_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/scripts/Eplus-env-5Zone-hot-continuous-v1-res1/Eplus-env-sub_run1
Traceback (most recent call last):
  File "/workspaces/sinergym/prueba.py", line 66, in <module>
    model.learn(
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/ppo/ppo.py", line 310, in learn
    return super().learn(
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/on_policy_algorithm.py", line 265, in learn
    self.logger.dump(step=self.num_timesteps)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/logger.py", line 486, in dump
    _format.write(self.name_to_value, self.name_to_excluded, step)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/logger.py", line 183, in write
    raise ValueError(
ValueError: Key '   Cooling_Setpoint_RL' truncated to '   Cooling_Setpoint_RL' that already exists. Consider increasing `max_length`.

Then I removed the callback, and currently we can only log these parameters:
[screenshot: Captura desde 2022-11-16 09-49-47]

Do you know why our LoggerCallback is not compatible with wandb? Or did the code you posted work for you?
If you know how to solve it, we can see how to integrate it in a PR made by you :)

@kad99kev

kad99kev commented Nov 16, 2022

Hi @AlejandroCN7,

I got that error as well. It happens because long key names are truncated when the keys are created, which often results in duplicate keys; hence the error. To solve it, I changed the value to max_length=90.
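
As a toy illustration of the collision (assuming SB3's HumanOutputFormat truncation rule, string[: max_length - 3] + "..."; the key names below are made up):

# Hypothetical keys sharing a long common prefix.
def truncate(s: str, max_length: int) -> str:
    return s[: max_length - 3] + "..." if len(s) > max_length else s

keys = [
    "observation/Zone_Thermostat_Setpoint_Cooling_RL",
    "observation/Zone_Thermostat_Setpoint_Heating_RL",
]

for max_length in (36, 90):
    print(max_length, [truncate(k, max_length) for k in keys])
# With max_length=36 both keys truncate to the same string, which is what
# raises SB3's "Key ... already exists" ValueError; max_length=90 keeps
# them distinct.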

I believe adding this would help solve the error:

loggers = Logger(
    folder=None,
    output_formats=[HumanOutputFormat(sys.stdout, max_length=90), WandBOutputFormat()],
)

Let me know if you had already added this. I would be happy to contribute a PR; let me know what API structure you have in mind and I can work on it. Thanks!

Here is the link to my WandB project - https://wandb.ai/kad99kev/sinergym-tests?workspace=user-kad99kev

@AlejandroCN7
Member

Hi @kad99kev

Yes, I had added max_length as you mentioned, but I still get the same error. I've tried increasing the value further, but I still get the same result.

In any case, looking at your project, I can see that it works! It's great because it doesn't seem to need any big changes to work, thank you very much!

Let's see if I can make it work for me too and plan how we can integrate it into the tool (maybe substituting TensorBoard+MLflow with wandb in DRL_battery.py). Once again, thank you very much for everything! :)

@kad99kev

kad99kev commented Nov 16, 2022

Hi @AlejandroCN7,

That is indeed weird. I removed the HumanOutputFormat completely and it still worked. I only got that error the first time I ran it. I will look into that sometime. I will see if I can get the wandb example to work with DRL_battery.py and then get back to you.
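
That configuration would be something like the following (a sketch, reusing the Logger and WandBOutputFormat from the earlier example):

# Same custom logger as before, but with WandBOutputFormat only.
loggers = Logger(folder=None, output_formats=[WandBOutputFormat()])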

Thank you!

@kad99kev

kad99kev commented Dec 1, 2022

Hi @AlejandroCN7,

Sorry for the delay in getting back to you on this. I have some code ready; I just need to run a few more experiments with different algorithms to make sure it is working correctly. When it is done, should I make a PR, or would you like me to share the code another way?

Link to WandB project - https://wandb.ai/kad99kev/sinergym_tests?workspace=user-kad99kev

Thanks!

@AlejandroCN7
Member

AlejandroCN7 commented Dec 1, 2022

Hi @kad99kev

From what I can see, it looks very good. Thank you very much! If you agree, please finish testing everything with the algorithms and open a PR directly; we'll check it out from there ;)

@kad99kev

kad99kev commented Dec 8, 2022

Hi @AlejandroCN7,

I have tried running experiments with the PPO, A2C, DDPG, SAC, and TD3 algorithms on the Eplus-5Zone-hot-continuous-v1 environment. Of these, however, only PPO and A2C finish running completely; DDPG, SAC, and TD3 don't complete even after hours of execution. Should I make a PR anyway? I would appreciate it if you could have a look as well. Thank you!

@AlejandroCN7
Member

Hi @kad99kev,
I think we can make a pull request explaining what you describe. Also, I think it is better to add a new script with your wandb integration instead of directly replacing the one we have with MLflow, and then try to improve it in the future. What do you think?
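
One possible shape for such a script (a rough sketch only; the script name, flags, and defaults below are hypothetical, not Sinergym's actual CLI):

# Hypothetical DRL_battery_wandb.py entry point; flag names are made up.
import argparse

import wandb

parser = argparse.ArgumentParser()
parser.add_argument("--environment", default="Eplus-5Zone-hot-continuous-v1")
parser.add_argument("--project", default="sinergym")
parser.add_argument("--entity", default=None)
args = parser.parse_args()

run = wandb.init(project=args.project, entity=args.entity,
                 name="PPO_" + wandb.util.generate_id())
# ... build the wrapped env and the WandBOutputFormat logger from above,
# then train the agent and close the run:
run.finish()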

kad99kev mentioned this issue Dec 17, 2022
@kad99kev

Hi @AlejandroCN7, I did create a new script and integrated WandB there. I will make a PR now. It is a bit hacky, as it is just a proof of concept that WandB works with Sinergym. Let me know what I could do to improve it further.

@AlejandroCN7
Member

Hi @kad99kev, I have added a comment in the PR code. If you need help, let me know :)
Thank you so much

@kad99kev

Hi @AlejandroCN7, I don't see the comment. Could you please send the link to the comment?

@AlejandroCN7
Member

Hi @kad99kev,
I have commented directly on the PR to make it more visible.

@kad99kev

Hi @AlejandroCN7, thank you! I can see the comments now. I have replied to them.
