At line 222 of the RoboCasa branch of robomimic/utils/train_utils.py, the dataset kwargs are deep-copied when each dataset is created. Since the language embedding model is one of the dataset kwargs, the model gets copied as well. This caused me to hit a CUDA out-of-memory error when training on a large number of dataset files: for example, with 90 LIBERO datasets there end up being 90 copies of the language embedding model in CUDA memory.
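To illustrate the mechanism in isolation (a minimal standalone sketch, not code from the repo; the small Linear module is just a hypothetical stand-in for the encoder, and it needs a CUDA device to run):

import copy

import torch
import torch.nn as nn

# Hypothetical stand-in for the language embedding model held in ds_kwargs.
lang_encoder = nn.Linear(4096, 4096).cuda()
ds_kwargs = {"lang_encoder": lang_encoder, "hdf5_path": ["a.hdf5", "b.hdf5"]}

before = torch.cuda.memory_allocated()
ds_kwargs_copy = copy.deepcopy(ds_kwargs)  # deep-copies the CUDA weights too
after = torch.cuda.memory_allocated()

# The copy is a distinct module, so its parameters occupy their own CUDA memory;
# doing this once per dataset file multiplies that footprint by the dataset count.
assert ds_kwargs_copy["lang_encoder"] is not lang_encoder
print(f"extra CUDA memory from one deepcopy: {after - before} bytes")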
I made a quick modification that fixed this problem:
for i in range(len(ds_weights)):
    ds_kwargs_copy = deepcopy(ds_kwargs)
    # Re-point lang_encoder at the original model so the deep-copied one is
    # freed immediately and we do not run out of CUDA memory.
    if "lang_encoder" in ds_kwargs:
        ds_kwargs_copy["lang_encoder"] = ds_kwargs["lang_encoder"]
    keys = ["hdf5_path", "filter_by_attribute"]
    for k in keys:
        ds_kwargs_copy[k] = ds_kwargs[k][i]
    ds_kwargs_copy["dataset_lang"] = ds_langs[i]
    ds_list.append(ds_class(**ds_kwargs_copy))
Should I make this a PR? It might be more efficient to pop lang_encoder from the kwargs before the deepcopy so it is never copied for each dataset (even though with the fix above the extra copy is freed immediately); a sketch of that variant is below.
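Roughly what that pop-based variant would look like, using the same names as the snippet above (a sketch only, I have not tested it against the branch):

from copy import deepcopy  # already imported in train_utils.py

# Take the shared encoder out of the kwargs once, so deepcopy never touches it.
lang_encoder = ds_kwargs.pop("lang_encoder", None)

for i in range(len(ds_weights)):
    ds_kwargs_copy = deepcopy(ds_kwargs)  # no CUDA-resident model left to duplicate
    if lang_encoder is not None:
        # Every dataset shares the same encoder instance.
        ds_kwargs_copy["lang_encoder"] = lang_encoder
    for k in ["hdf5_path", "filter_by_attribute"]:
        ds_kwargs_copy[k] = ds_kwargs[k][i]
    ds_kwargs_copy["dataset_lang"] = ds_langs[i]
    ds_list.append(ds_class(**ds_kwargs_copy))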