Handle more than 2**31 parameters on GPU models #599

benfred · 2022-07-31T00:31:45Z

Previously we couldn't learn models with more than 2**31 parameters in
one of the matrices (~8GB of total space). We were using int32 to
represent the rows/cols, and multiplying them together would overflow.

Fix by using size_t. I have manually verified that I can train a model
with over 25M users and 128 factors with this change (both bpr & als).
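
To make the overflow concrete, here is a minimal sketch of the arithmetic, not the actual code from this PR. The helper names `overflows_int32` and `flat_index` are hypothetical, the row-major layout is an assumption, and the sketch assumes a platform where size_t is 64 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Assumption: size_t is 64 bits wide, as on typical 64-bit GPU hosts.
static_assert(sizeof(std::size_t) >= 8, "sketch assumes a 64-bit size_t");

// Hypothetical helper: true when rows * cols does not fit in an int32.
// The product is computed in 64 bits so the check itself cannot overflow.
bool overflows_int32(std::int64_t rows, std::int64_t cols) {
    return rows * cols > std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: flat offset of element (row, col) in a row-major
// matrix with `cols` columns. Using size_t for every operand keeps the
// multiplication in 64 bits.
std::size_t flat_index(std::size_t row, std::size_t cols, std::size_t col) {
    assert(col < cols);
    return row * cols + col;
}
```

With 25M users and 128 factors the matrix has 3,200,000,000 entries, which exceeds the int32 maximum of 2,147,483,647, so `overflows_int32(25000000, 128)` is true while `flat_index` still addresses the last element correctly.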

benfred added 2 commits July 30, 2022 17:29

Merge branch 'main' into large_gpu

4c11dcf

benfred added the bug label Jul 31, 2022

benfred merged commit 2c3695d into main Jul 31, 2022

benfred deleted the large_gpu branch July 31, 2022 01:43
