Implement #1389 or similar functionality #1939

somedadaism · 2021-12-27T22:09:03Z

🚀 Feature Request

I would like to ask if it is possible to implement the functionality asked for in #1389

Motivation

I believe the feature is extremely useful for ML applications. To be able to have list composition in the defaults list the way we already have dict composition would allow to be less redundant in the configuration by reusing config files more.

Pitch

I know that #1389 had been closed as invalid but it seems that in omegaconf 2.1 the necessary prerequisites maybe have been implemented since the issue had been closed.

OmegaConf 2.1 will support more powerful custom resolvers. It's possible that they will enable generating as_list automatically from dict. (cc @odelalleau).

Not in my current implementation because resolvers systematically wrap their output in a ValueNode. However I suspect that if we used _node_wrap() instead then it should be possible. Not going to change it now though -- I'd rather keep it for later.

I think this should be possible now, with oc.dict.keys, see https://omegaconf.readthedocs.io/en/2.1_branch/custom_resolvers.html#oc-dict-keys-value

This could maybe be used to allow for list composition in the defaults list in the way originally asked for instead of using os.dict.values in a hacky fashion, like proposed e.g. by Max Ehrlich here:
https://stackoverflow.com/questions/64802586/how-to-gather-config-files-in-a-list-with-hydra-fb

Are you willing to open a pull request?

Maybe if I can get sufficient guidance I would try.

The text was updated successfully, but these errors were encountered:

Jasha10 · 2021-12-29T23:13:39Z

Hi @somedadaism,
Thank you for your interest in Hydra and for taking the time to open this issue!

This could maybe be used to allow for list composition in the defaults list in the way originally asked for...

The way originally asked for proposes that, with the following config, the db group would produce a ListConfig object containing each of the mysql and postgresql input configs:

defaults:
  - base_config
  - db: 
    - mysql
    - postgresql

This will not be possible because there is already a different behavior implemented for nested lists within a defaults list (such as the nested list - db: [mysql, postgresql]): the mysql and postgresql configs are merged, and the result of the merge is placed in the db package. See the Hydra docs on Selecting multiple configs from a Config Group.

Admittedly I have not yet had a chance to look closely at later discussion in comments on the linked issue #1389.

somedadaism · 2021-12-30T08:48:24Z

Hi @Jasha10 thanks for your quick reply.
I would like to clarify the functionality I search for. Please consider first the following (simplified) situation:

.
├── hydra_config
│   ├── callbacks
│   │   ├── my_callback_1.yaml
│   │   └── my_callback_2.yaml
│   └── config.yaml
└── main.py

With:

# main.py
import hydra

@hydra.main(config_path="hydra_config", config_name="config_new.yaml")
def main(config):
    print(config)

if __name__ == "__main__":
    main()

# config.yaml
defaults:
  - callbacks:
    - my_callback_1
    - my_callback_2

What currently works is:

# my_callback_1.yaml
my_callback_1:
  my_param1: 1

# my_callback_2.yaml
my_callback_2:
  my_param2: 2

resulting in:

{'callbacks': {'my_callback_1': {'my_param1': 1}, 'my_callback_2': {'my_param2': 2}}}

What I want to create as output is:

{'callbacks':  [{'my_param1': 1}, {'my_param2': 2}]}

Now I understand that:

# my_callback_1.yaml
my_param1: 1

# my_callback_2.yaml
my_param2: 2

Can not work as it (correctly!) results in:

{'callbacks': {'my_param1': 1, 'my_param2': 2}}

which is what I believe you also pointed out.
What is not understandable to me is that for an example:

# my_callback_1.yaml
- my_param1: 1

# my_callback_2.yaml
- my_param2: 2

will result in:

{'callbacks': [{'my_param2': 2}]}

and not in

{'callbacks':  [{'my_param1': 1}, {'my_param2': 2}]}

I believe the ability to merge configs into lists is very important to avoid repeating oneself for ML applications of hydra, but there seems to be no way to merge them.
(Imagine using the same 5 callbacks all the time in a list but being forced to type them out repeatedly for different experimental model runs - if you change one thing you have to change 5 config files if you are lucky enough to remember them all.).

omry · 2021-12-30T16:59:10Z

Take a look at oc.dict.values, it may be what you are looking for.
Basically, you can let Hydra compose a dict, and have a key that dynamically transform it to a list.

somedadaism · 2022-01-01T10:11:43Z

Hello @omry thank you for creating hydra and for taking a look. :)
I am very aware of oc.dict.values actually the answer provided by you was also proposed on StackOverflow and was linked by me in the original question.
Sadly it is specifically what I don't want to do.
My problems with this approach are the following:

It's not straight forward to read - it just feels hacky.
The approach forces us to create a dictionary in first place before obtaining a list. This means the dictionary will also be part of the configuration. If now I just want to use hydra.utils.instantiate on my entire config it will create all objects created vie _target_ twice, once for the dict and then for the list. For some objects, e.g. some of my code for W&B this is a problem.
I know multiple ways to hack around this but I would rather invest time to get a cleaner solution in hydra itself.
I feel the current defaults feel inconsistent. When selecting multiple configs we replace the respective line in the defaults with the content of the referenced file and then merge if necessary. If this would apply for configs with lists too in the situation:

# my_callback_1.yaml
- my_param1: 1

# my_callback_2.yaml
- my_param2: 2

Then:

# config.yaml
defaults:
  - callbacks:
    - my_callback_1
    - my_callback_2

Would be text substituted to:

callbacks:
  - my_param1: 1
  - my_param2: 2

Just as in the situation:

# my_callback_1.yaml
my_callback_1:
  my_param1: 1

# my_callback_2.yaml
my_callback_2:
  my_param2: 2

The defaults list in:

# config.yaml
defaults:
  - callbacks:
    - my_callback_1
    - my_callback_2

are text substituted to:

callbacks:
  my_callback_1:
    my_param1: 1
  my_callback_2:
    my_param2: 2

I know what I ask for is not how it works but I believe it would be the more natural and consistent way.
I have discussed on this with multiple of my coworkers and the ability to merge lists in the defaults list in a more natural way is missed by us very much. Is there really no less hacky way than oc.dict.values?
How about introducing a new keyword like we already have override, e.g.:

# my_callback_1.yaml
my_param1: 1

# my_callback_2.yaml
my_param2: 2

Then:

# config.yaml
defaults:
  - list_merge callbacks:  # list_merge or similar could be a keyword like override
    - my_callback_1
    - my_callback_2

Would be text substituted to:

callbacks:
  - my_param1: 1
  - my_param2: 2

somedadaism · 2022-01-05T13:42:17Z

@omry @Jasha10 Any updates on this?

Jasha10 · 2022-01-05T18:31:20Z

The current behavior is probably due to the implementation of OmegaConf.merge:

>>> from omegaconf import OmegaConf
>>> OmegaConf.merge([1,2], [3,4])
[3, 4]
>>> OmegaConf.merge({"a": [1,2], "b": [3,4]}, {"b": [5,6], "c": [7,8]})
{'a': [1, 2], 'b': [5, 6], 'c': [7, 8]}

mappings on the left are updated by mappings on the right; sequences on the left are overwritten by sequences on the right.

I agree with you that, in the specific case of "Selecting multiple configs from a Config Group" in the defaults list, it would be more useful to concatenate ListConfigs instead of overwriting them (as you have outlined above). The proposed change would technically be breaking, but I doubt anyone is relying on the old behavior.

This being said, I'm not sure that special-casing the "selecting multiple configs from a config group" behavior is the most general solution to the use-case you've highlighted. It still feels like a hack that my_callback_1.yaml has to be written with a leading dash:

# my_callback_1.yaml
- my_param1: 1

Painfully, we are prevented from using a defaults list in my_callback_1.yaml if the file is written as a yaml sequence (as above).

You are trying to achieve the following result:
{'callbacks': [{'my_param1': 1}, {'my_param2': 2}]}.
Inspired by #1547, I propose the following:

# config.yaml
defaults:
  - append callbacks: my_callback_1
  - append callbacks: my_callback_2

Here an "append" keyword would signify that the callbacks package should be a ListConfig, and that the selected my_callback_{1,2} config should be appended to that list. These configs my_callback_1.yaml and my_callback_2.yaml could then be written as yaml mappings, possibly having defaults lists of their own:

# my_callback_1.yaml
defaults: ...  # possible defaults list
my_param1: 1  # no leading dash

Edit: I should mention that I don't personally have the bandwidth to work on implementing such a feature for the next Hydra release.

somedadaism · 2022-01-07T21:25:38Z

I think the proposed syntax is very good.
I would take a look at contributing but only have limited time.
Where could I get hints on how to achieve this?

Jasha10 · 2022-01-11T07:02:42Z

Unfortunately, I fear that the implementation of this feature might not be straight-forward; it will require adding a new keyword to the ConfigRepository.Keywords class and modifying the logic in defaults_list.py.

Jasha10 · 2022-01-11T07:06:49Z

Cross-link to #1547 which is related.

tangbinh · 2022-02-09T17:37:29Z

@Jasha10 @jieru-hu I agree with @somedadaism that this feature is very much needed for ML applications. We often pass a list of callbacks to a trainer, and being able to compose such a list via the command line is necessary. Given the huge number of combinations one can derive from a list (e.g. for 5 callbacks, there are 32 combinations), it's impossible to have a YAML for each combination.

@Jasha10 If it's not easy to implement this feature, is there an acceptable workaround in the meantime? For example, I would like to specify a default list of callbacks in a primary config and couldn't figure out a way to do it yet.

# cb1.yaml
_target_: torch.nn.Identity
arg1: 1

# cb2.yaml
_target_: torch.nn.Identity
arg2: 2

# config.yaml
defaults:
  - callbacks:
    # what to put here?

The desired output looks like

callbacks:
- _target_: torch.nn.Identity
  arg1: 1
- _target_: torch.nn.Identity
  arg2: 2

Jasha10 · 2022-02-10T19:32:10Z

@tangbinh, see Omry's comment above.

Basically, you should use the defaults list to compose a dictionary whose values are the callbacks you want. You then can use the oc.dict.values resolver to get a list of callbacks.

$ tree
├── callbacks
│   ├── cb1.yaml
│   └── cb2.yaml
├── config.yaml
└── my_app.py

# config.yaml
defaults:
  - callbacks@_callback_dict.cb1: cb1
  - callbacks@_callback_dict.cb2: cb2
  - _self_

callbacks: ${oc.dict.values:_callback_dict}

# my_app.py
import hydra
from omegaconf import OmegaConf

@hydra.main(config_path=".", config_name="config")
def app(cfg):
    OmegaConf.resolve(cfg)
    del cfg._callback_dict
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    app()

$ python my_app.py
callbacks:
- _target_: torch.nn.Identity
  arg1: 1
- _target_: torch.nn.Identity
  arg2: 2

tangbinh · 2022-02-11T01:07:21Z

@Jasha10 Thank you for working out the example. It's certainly helpful, although I think we all agree that invoking @_callback_dict and ${oc.dict.values:_callback_dict} reduces readability significantly. Can you also work out how I can append another callback to the list via the command line?

If possible, would we be able to prioritize this list composition feature a bit? I believe ML users would run into this issue quite often.

Jasha10 · 2022-02-12T19:23:31Z

If possible, would we be able to prioritize this list composition feature a bit?

Thanks for the nudge :)
I agree completely that improved support for list composition should be a high-priority feature. For the record, I suspect that improvements here will require a significant amount of effort.

Can you also work out how I can append another callback to the list via the command line?

Here's the trick:

$ python my_app.py +callbacks@_callback_dict.cb3=cb3
callbacks:
- _target_: torch.nn.Identity
  arg1: 1
- _target_: torch.nn.Identity
  arg1: 2
- _target_: torch.nn.Identity
  arg1: 3

The plus symbol is for appending to the defaults list.
You can also remove items from the defaults list using the tilde symbol:

$ python my_app.py '~callbacks@_callback_dict.cb1'
callbacks:
- _target_: torch.nn.Identity
  arg1: 2

Yevgnen · 2022-02-17T09:02:26Z

@Jasha10

# config.yaml
defaults:
  - callbacks@_callback_dict.cb1: cb1
  - callbacks@_callback_dict.cb2: cb2
  - _self_

callbacks: ${oc.dict.values:_callback_dict}

Hi, in this example, is there a way to avoid writing callbacks@_callback_dict.cb{x}: cb{x} for every {x} so that we can include all items under callback/ at once?

Jasha10 · 2022-02-17T19:09:19Z

Hi @Yevgnen,
No, there is currently no way to do that.

jzazo · 2022-02-17T21:43:55Z

Thanks for bringing this discussion back, it has been quite useful. I like the append approach for concatenating items into a list, and I think it would work perfectly for ml needs.

I want to point out that if this new functionality is added, it may also be natural to add extend for lists, rather than merge, possibly in omegaconf and hydra? Not sure how this would turn up, just wanted to ask if it would make sense. Something like

defaults:
  - extend callbacks: callbacks_list1
  - extend callbacks: callbacks_list2

Not sure this is great, I personally feel the append mechanism would be more useful, but this may generalize some use cases.

ssnl · 2022-05-23T23:54:40Z

The workaround here doesn't work nicely with structured config. Unless the config also have a _callback_dict entry, OmegaConf will complain in merging. :(

Summary: Currently, flsim cannot handle a list of configs Eg: a json config like the following is not supported. ``` { "trainer": { "clients": [ {"_base_": "base_client", "optimizer": {"lr": 0.1}}, {"_base_": "base_client", "optimizer": {"lr": 0.2}}, ] } ``` The one hiccup in supporting this is hydra still doesn't support overriding/appending lists when values are configs. (see facebookresearch/hydra#1939 (comment)) In order to overcome the above, we treat the list as a dictionary with the key as list index. The above config in yaml format will look as follows: ``` trainer: ... clients: '0': _target_: flsim.clients.base_client.Client _recursive_: false epochs: 1 optimizer: _target_: ??? _recursive_: false lr: 0.1 momentum: 0.0 weight_decay: 0.0 lr_scheduler: _target_: ??? _recursive_: false base_lr: 0.001 max_clip_norm_normalized: null only_federated_params: true random_seed: null shuffle_batch_order: false store_models_and_optimizers: false track_multiple_selection: false '1': _target_: flsim.clients.base_client.Client _recursive_: false epochs: 1 optimizer: _target_: ??? _recursive_: false lr: 0.1 momentum: 0.0 weight_decay: 0.0 lr_scheduler: _target_: ??? _recursive_: false base_lr: 0.001 max_clip_norm_normalized: null only_federated_params: true random_seed: null shuffle_batch_order: false store_models_and_optimizers: false track_multiple_selection: false ... ``` Reviewed By: Anonymani Differential Revision: D37638999 fbshipit-source-id: 5444da742742d4cc976875a6dc055a0c71c186e4

TesnimHadhri · 2022-07-28T14:39:41Z

The workaround here doesn't work nicely with structured config. Unless the config also have a _callback_dict entry, OmegaConf will complain in merging. :(

Do you mind explaining the workaround you used with structured config?

ssnl · 2022-07-28T14:45:52Z

@TesnimHadhri I eventually did not use it, but I remember having a _callback_dict field with dictionary type helped. but YMMV

TesnimHadhri · 2022-07-28T14:49:11Z

@TesnimHadhri I eventually did not use it, but I remember having a _callback_dict field with dictionary type helped. but YMMV

I see, thanks for the quick answer!

libokj · 2023-03-28T15:39:06Z

@Jasha10 Hi, is it possible for you to provide a brief update on the status of the implementation of this feature? Should we look into it as you commented earlier?

Unfortunately, I fear that the implementation of this feature might not be straight-forward; it will require adding a new keyword to the ConfigRepository.Keywords class and modifying the logic in defaults_list.py.

anton-bushuiev · 2023-08-07T21:53:02Z

Note for the ones still seeking for a better solution: Lightning-Hydra-Template implements another workaround which is however still based on a dictionary with redundant keys. The difference between the solution suggested by @Jasha10 above is that the dict is parsed internally by a custom instantiator rather than relying on oc.dict.values which increases readability significantly but loses succinctness.

libokj · 2023-08-30T01:51:28Z

Note for the ones still seeking for a better solution: Lightning-Hydra-Template implements another workaround which is however still based on a dictionary with redundant keys. The difference between the solution suggested by @Jasha10 above is that the dict is parsed internally by a custom instantiator rather than relying on oc.dict.values which increases readability significantly but loses succinctness.

That's the workaround I am resorting to as well. Still hope this feature can be implemented.

jkarns275 · 2024-04-26T18:48:00Z

Bumping this because I've ran into this on two completely unrelated projects now :(

alex-hh · 2024-08-26T18:25:26Z

I also would find this extremely useful: anywhere a functional style is used it is useful to specify an ordered list of items.

As well as callbacks, pipelines of preprocessing transformations seem like a natural use case.

Do the current workarounds preserve the order of the composed items? This is critical in such cases but I'm unclear whether the fact that dictionaries are involved in intermediate steps means that the order is not preserved

somedadaism added the enhancement Enhanvement request label Dec 27, 2021

somedadaism changed the title ~~Implement #1389~~ Implement #1389 or similar functionality Dec 30, 2021

Jasha10 added this to the Hydra 1.3.0 milestone Jan 13, 2022

Jasha10 mentioned this issue Feb 12, 2022

[Feature Request] Concatenate Multiple Config List elements #2027

Open

Jasha10 added the list composition label Feb 12, 2022

ssnl mentioned this issue May 24, 2022

oc.dict.values/keys support relative keys omry/omegaconf#942

Closed

Jasha10 mentioned this issue Jun 21, 2022

[Feature Request] allow repeating elements in list #2267

Open

Jasha10 mentioned this issue Aug 10, 2022

[Feature Request] Composing multiple default packages into a list #2335

Closed

Jasha10 mentioned this issue Aug 25, 2022

[Bug] Parsing list of configurations with a structured config #1389

Closed

2 tasks

Jasha10 mentioned this issue Sep 2, 2022

[Feature Request] Append to a list from command line #1547

Open

Jasha10 mentioned this issue Oct 18, 2022

[Feature Request] Non-mutually exclusive config groups #499

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement #1389 or similar functionality #1939

Implement #1389 or similar functionality #1939

somedadaism commented Dec 27, 2021 •

edited

Loading

Jasha10 commented Dec 29, 2021 •

edited

Loading

somedadaism commented Dec 30, 2021 •

edited

Loading

omry commented Dec 30, 2021

somedadaism commented Jan 1, 2022 •

edited

Loading

somedadaism commented Jan 5, 2022

Jasha10 commented Jan 5, 2022 •

edited

Loading

somedadaism commented Jan 7, 2022

Jasha10 commented Jan 11, 2022

Jasha10 commented Jan 11, 2022

tangbinh commented Feb 9, 2022

Jasha10 commented Feb 10, 2022 •

edited

Loading

tangbinh commented Feb 11, 2022

Jasha10 commented Feb 12, 2022 •

edited

Loading

Yevgnen commented Feb 17, 2022

Jasha10 commented Feb 17, 2022

jzazo commented Feb 17, 2022

ssnl commented May 23, 2022

TesnimHadhri commented Jul 28, 2022

ssnl commented Jul 28, 2022

TesnimHadhri commented Jul 28, 2022

libokj commented Mar 28, 2023

anton-bushuiev commented Aug 7, 2023

libokj commented Aug 30, 2023

jkarns275 commented Apr 26, 2024

alex-hh commented Aug 26, 2024

Implement #1389 or similar functionality #1939

Implement #1389 or similar functionality #1939

Comments

somedadaism commented Dec 27, 2021 • edited Loading

🚀 Feature Request

Motivation

Pitch

Jasha10 commented Dec 29, 2021 • edited Loading

somedadaism commented Dec 30, 2021 • edited Loading

omry commented Dec 30, 2021

somedadaism commented Jan 1, 2022 • edited Loading

somedadaism commented Jan 5, 2022

Jasha10 commented Jan 5, 2022 • edited Loading

somedadaism commented Jan 7, 2022

Jasha10 commented Jan 11, 2022

Jasha10 commented Jan 11, 2022

tangbinh commented Feb 9, 2022

Jasha10 commented Feb 10, 2022 • edited Loading

tangbinh commented Feb 11, 2022

Jasha10 commented Feb 12, 2022 • edited Loading

Yevgnen commented Feb 17, 2022

Jasha10 commented Feb 17, 2022

jzazo commented Feb 17, 2022

ssnl commented May 23, 2022

TesnimHadhri commented Jul 28, 2022

ssnl commented Jul 28, 2022

TesnimHadhri commented Jul 28, 2022

libokj commented Mar 28, 2023

anton-bushuiev commented Aug 7, 2023

libokj commented Aug 30, 2023

jkarns275 commented Apr 26, 2024

alex-hh commented Aug 26, 2024

somedadaism commented Dec 27, 2021 •

edited

Loading

Jasha10 commented Dec 29, 2021 •

edited

Loading

somedadaism commented Dec 30, 2021 •

edited

Loading

somedadaism commented Jan 1, 2022 •

edited

Loading

Jasha10 commented Jan 5, 2022 •

edited

Loading

Jasha10 commented Feb 10, 2022 •

edited

Loading

Jasha10 commented Feb 12, 2022 •

edited

Loading