Cache Data to Decrease RAM #52

Open
getshaun24 opened this issue Oct 31, 2022 · 1 comment
@getshaun24

Hi @arieling,

Great job with this repository, it is awesome.

In the README you state that:

"One can cache the data returned from prepare_data function to disk but it will increase training time due to I/O burden."

How would I implement this?

Thank You!

@Yi-Lynn commented Feb 4, 2023

I think caching the data returned from prepare_data() to disk refers to saving the huge extracted pixel representations to files on your disk in some specified format (e.g. .npy files). Later on you can load these files and read the data back for training the MLPs.
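A minimal sketch of what that could look like, assuming prepare_data() returns a feature array X and a label array y (the wrapper name prepare_data_cached and the cache_path argument are just for illustration, not part of the repo):

```python
import os
import numpy as np

def prepare_data_cached(cache_path, *args, **kwargs):
    """Return cached pixel features if they exist on disk; otherwise
    call the repo's prepare_data() once and cache its output."""
    if os.path.exists(cache_path):
        cached = np.load(cache_path)
        return cached["X"], cached["y"]
    X, y = prepare_data(*args, **kwargs)  # assumed to return (features, labels)
    np.savez(cache_path, X=np.asarray(X), y=np.asarray(y))
    return X, y
```

If RAM is the concern, saving X as a standalone .npy file and loading it with np.load(path, mmap_mode="r") would let training page the features in from disk lazily instead of holding the whole array in memory, at the cost of the extra I/O the README mentions.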

But I have another issue: the extracted pixel representations are so huge that the process running the script gets killed before prepare_data() even returns the data. For example, the process is killed when it tries to run the following two lines to flatten the pixel representations:
```python
X = X.transpose(1, 0, 2, 3).reshape(d, -1).transpose(1, 0)
y = y.flatten()
```

My solution is to write another prepare_data() function that processes one image at a time instead of all labelled training images at once. Then, during training of the pixel classifier, at each epoch I create a dataloader for one image and iterate through all the training images (a rough sketch of what I mean is below). But the results I get this way are not quite the same as the results reported in the paper. Has anyone encountered the same problem, or does anyone know how to fix it?
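For reference, here is a minimal sketch of the per-image scheme I described. It assumes a hypothetical helper prepare_data_single(image_idx) that returns the pixel features and labels for one image as NumPy arrays, plus an already-built MLP classifier and optimizer; the names and shapes are illustrative, not the repo's actual API:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

def train_per_image(classifier, optimizer, num_images, num_epochs, device="cuda"):
    """Train the pixel classifier one image at a time to limit RAM usage."""
    criterion = torch.nn.CrossEntropyLoss()
    for epoch in range(num_epochs):
        for image_idx in range(num_images):
            # Hypothetical helper: pixel features/labels for a single image.
            X, y = prepare_data_single(image_idx)  # X: (H*W, d), y: (H*W,)
            dataset = TensorDataset(torch.from_numpy(X).float(),
                                    torch.from_numpy(y).long())
            loader = DataLoader(dataset, batch_size=64, shuffle=True)
            for features, labels in loader:
                features, labels = features.to(device), labels.to(device)
                optimizer.zero_grad()
                loss = criterion(classifier(features), labels)
                loss.backward()
                optimizer.step()
```

One possible source of the discrepancy: with this scheme each mini-batch is drawn from a single image, whereas flattening everything first shuffles pixels across all training images, so the gradient statistics differ. That alone might explain results that don't quite match the paper.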
