Not so much an issue as an FYI. I was getting OOM errors yesterday when trying to train a LoRA on my 3090 (24 GB VRAM) with 64 GB of DDR5.
One issue was that the models which convert the dataset to weights (I believe there are two) were set to load local files only, and that needed to be changed (set to False). I also had to comment out a line near the beginning that was trying to set the local rank to 0.
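For anyone hitting the same startup error, here's a pure-Python sketch of why a local-only load fails on a fresh machine. The `load_pretrained` function and cache-dir layout are hypothetical stand-ins for how Hugging Face's `from_pretrained(..., local_files_only=...)` behaves, not the repo's actual code:

```python
import os
import tempfile

def load_pretrained(model_id: str, cache_dir: str, local_files_only: bool = False):
    # Loader checks the local cache first, mimicking from_pretrained behavior.
    local_path = os.path.join(cache_dir, model_id.replace("/", "--"))
    if os.path.isdir(local_path):
        return f"loaded {model_id} from cache"
    if local_files_only:
        # The failure mode: nothing cached yet and downloads are disallowed.
        raise FileNotFoundError(
            f"{model_id} is not in the local cache and local_files_only=True"
        )
    return f"downloaded {model_id}"

# With local_files_only=False, the loader falls back to downloading:
empty_cache = tempfile.mkdtemp()
result = load_pretrained("hypothetical/text-encoder", empty_cache)
```

So flipping the flag to False just lets the first run fetch the weights instead of erroring out.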
So then the training would try to run, but I'd run out of VRAM. I did some digging and realized those two models are only used to preprocess the dataset, so they don't need to be loaded during training.
My solution was to write my own preprocessing script, modify the ACE dataset class to load the data from the cache, and modify the main training script to skip those two extra models. It took a while to get everything right, but training is working great now!
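The cache-then-train split looks roughly like this. This is a minimal sketch, assuming features can be precomputed once offline; the class name, file layout, and the `"tokens"` feature are hypothetical placeholders, not the actual ACE dataset code:

```python
import os
import pickle
import tempfile

def preprocess_to_cache(songs, cache_dir):
    """Run the heavy preprocessing models once and save their outputs to disk."""
    os.makedirs(cache_dir, exist_ok=True)
    for i, song in enumerate(songs):
        # Stand-in for the real model outputs (encoded audio/text features).
        features = {"tokens": song.upper()}
        with open(os.path.join(cache_dir, f"{i:06d}.pkl"), "wb") as f:
            pickle.dump(features, f)

class CachedDataset:
    """Serves precomputed features, so no preprocessing models are needed at train time."""
    def __init__(self, cache_dir):
        self.paths = sorted(
            os.path.join(cache_dir, p) for p in os.listdir(cache_dir)
        )
    def __len__(self):
        return len(self.paths)
    def __getitem__(self, idx):
        with open(self.paths[idx], "rb") as f:
            return pickle.load(f)

cache_dir = tempfile.mkdtemp()
preprocess_to_cache(["song_a", "song_b"], cache_dir)
dataset = CachedDataset(cache_dir)
```

The point is that the training process only ever opens the cache files, so the two preprocessing models never touch VRAM during training.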
In hindsight, I probably could have just added better memory management to the main training script to make sure the preprocessing models are unloaded from memory before training begins.
I ran it overnight and I'm at about step 40k on a 70-song dataset, and it's learning it really well with the default LoRA settings. Not sure how it generalizes yet, but I'll test it out when it's done cooking.
Oh, and for some reason I had to run with precision = "32-full".
Thanks for all the work y'all have done, this is awesome and it's exactly what I was looking for at exactly the right time!