- Re-Cythonized Cython files to fix compilation errors with newer compilers.
- Fixed np.object usage in tests.
- Set the LIGHTFM_NO_CFLAGS environment variable when building LightFM to prevent it from setting the -ffast-math or -march=native compiler flags.
- predict now returns float32 predictions.
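  A small sketch of calling predict on a fitted model; the toy interaction matrix below is invented for illustration:

  ```python
  import numpy as np
  import scipy.sparse as sp
  from lightfm import LightFM

  rng = np.random.RandomState(0)
  interactions = sp.coo_matrix(rng.binomial(1, 0.3, size=(10, 15)))

  model = LightFM(loss="warp")
  model.fit(interactions, epochs=5)

  # Score items 0..4 for user 3; user_ids and item_ids are paired elementwise.
  scores = model.predict(np.repeat(3, 5), np.arange(5))
  print(scores.dtype)  # float32
  ```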
- Added a check that there is no overlap between test and train in predict_ranks (thanks to @artdgn).
- Added dataset builder functionality.
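  A minimal sketch of the dataset builder, using the lightfm.data.Dataset API; the string identifiers below are made up:

  ```python
  from lightfm import LightFM
  from lightfm.data import Dataset

  # Hypothetical raw data: (user, item) pairs with arbitrary string ids.
  raw_interactions = [("alice", "item_1"), ("alice", "item_2"), ("bob", "item_2")]

  dataset = Dataset()
  # Build internal integer id mappings for all users and items.
  dataset.fit(users=(u for u, _ in raw_interactions),
              items=(i for _, i in raw_interactions))

  # Construct the sparse interactions (and weights) matrices.
  interactions, weights = dataset.build_interactions(raw_interactions)

  model = LightFM(loss="warp")
  model.fit(interactions, epochs=5)
  ```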
- Fixed error message when item features have the wrong dimensions.
- predict now checks its inputs for overflow.
- WARP fitting is now numerically stable when there are very few items to draw negative samples from (< max_sampled).
- Added additional input checks for non-finite values (NaNs, infinities) in features.
- Added additional input checks for non-finite values (NaNs, infinities) in interactions.
- Added a cross-validation module with dataset splitting utilities.
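  A short sketch of the splitting utilities; the random toy matrix is invented for illustration:

  ```python
  import numpy as np
  import scipy.sparse as sp
  from lightfm.cross_validation import random_train_test_split

  # Toy interactions: 10 users x 20 items with a handful of positives.
  rng = np.random.RandomState(42)
  interactions = sp.coo_matrix(rng.binomial(1, 0.2, size=(10, 20)))

  # Split the interactions into disjoint train and test matrices.
  train, test = random_train_test_split(interactions,
                                         test_percentage=0.2,
                                         random_state=rng)
  ```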
- LightFM model now raises a ValueError (instead of assertion) when the number of supplied features exceeds the number of estimated feature embeddings.
- Warn and delete the downloaded file when the Movielens download is corrupted. This happens in the wild and confuses users terribly.
- Added get_{user/item}_representations functions to facilitate extracting the latent representations from the model.
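  A sketch of pulling the latent factors out of a fitted model; the toy data below is invented:

  ```python
  import numpy as np
  import scipy.sparse as sp
  from lightfm import LightFM

  rng = np.random.RandomState(0)
  interactions = sp.coo_matrix(rng.binomial(1, 0.3, size=(20, 50)))

  model = LightFM(no_components=16, loss="bpr")
  model.fit(interactions, epochs=5)

  # Each call returns a (biases, embeddings) tuple.
  user_biases, user_embeddings = model.get_user_representations()
  item_biases, item_embeddings = model.get_item_representations()
  print(item_embeddings.shape)  # (50, 16)
  ```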
- recall_at_k and precision_at_k now work correctly at k=1 (thanks to Zank Bennett).
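  A brief example of evaluating at k=1; the toy data and split are invented for illustration:

  ```python
  import numpy as np
  import scipy.sparse as sp
  from lightfm import LightFM
  from lightfm.cross_validation import random_train_test_split
  from lightfm.evaluation import precision_at_k, recall_at_k

  rng = np.random.RandomState(0)
  interactions = sp.coo_matrix(rng.binomial(1, 0.2, size=(25, 40)))
  train, test = random_train_test_split(interactions, random_state=rng)

  model = LightFM(loss="warp")
  model.fit(train, epochs=10)

  # Both metrics return one score per user; average them for a summary number.
  print(precision_at_k(model, test, train_interactions=train, k=1).mean())
  print(recall_at_k(model, test, train_interactions=train, k=1).mean())
  ```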
- Moved Movielens data to data release to prevent grouplens server flakiness from affecting users.
- Fixed a segfault when trying to predict from a model that has not been fitted.
- Ranks are now computed pessimistically: when two items are tied, the positive item is assumed to have higher rank. This will lead to zero precision scores for models that predict all zeros, for example.
- The model will raise a ValueError if, during fitting, any of the parameters become non-finite (NaN or +/- infinity).
- Added mid-epoch regularization when a lot of regularization is used. This reduces the likelihood of numerical instability at high regularization rates.
- Negative samples in BPR are now drawn from the empirical distribution of positive items. This improves accuracy slightly on the Movielens 100k dataset.
- Fixed incorrect calculation of the BPR loss (thanks to @TimonVS for reporting this).
- Added a recall@k evaluation function.
- Added a scipy >= 0.17.0 dependency to setup.py.
- Fixed segfaults when duplicate entries are present in input COO matrices (thanks to Florian Wilhelm for the bug report).
- Fixed gradient accumulation in adagrad (the feature value is now correctly used when accumulating the gradient). Thanks to Benjamin Wilson for the bug report.
- All interaction values greater than 0.0 are now treated as positives for ranking losses.
- Added a max_sampled hyperparameter for WARP losses. This allows trading off accuracy for WARP training time: a smaller value will mean less negative sampling and faster training when the model is near the optimum.
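  For example, max_sampled can be lowered to cap the number of negatives sampled per WARP update (the value here is illustrative, not a recommendation):

  ```python
  from lightfm import LightFM

  # Draw at most 5 negative candidates per positive example during WARP fitting.
  model = LightFM(loss="warp", max_sampled=5)
  ```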
- Added a sample_weight argument to the fit and fit_partial functions. A higher weight increases the size of the SGD step taken for that interaction.
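  A sketch of passing per-interaction weights; the weight matrix must be a COO matrix with the same shape and nonzero pattern as the interactions (toy data invented here):

  ```python
  import numpy as np
  import scipy.sparse as sp
  from lightfm import LightFM

  # Three interactions between two users and three items.
  rows = np.array([0, 0, 1])
  cols = np.array([0, 2, 1])
  interactions = sp.coo_matrix((np.ones(3), (rows, cols)), shape=(2, 3))

  # Give the second interaction twice the influence on the SGD updates.
  weights = sp.coo_matrix((np.array([1.0, 2.0, 1.0]), (rows, cols)), shape=(2, 3))

  model = LightFM(loss="bpr")
  model.fit(interactions, sample_weight=weights, epochs=5)
  ```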
- Added an evaluation module for more efficient evaluation of learning-to-rank models.
- Added a random_state keyword argument to LightFM to allow repeatable model runs.
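  For example, fixing the seed should make repeated single-threaded runs reproducible (the seed value is arbitrary):

  ```python
  from lightfm import LightFM

  # Models built with the same random_state and fitted on the same data
  # with num_threads=1 should produce identical embeddings.
  model = LightFM(loss="warp", random_state=42)
  ```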
- By default, an OpenMP-less version will be built on OSX. This allows much easier installation at the expense of performance.
- The default value of the max_sampled argument is now 10. This represents a decent default value that allows fast training.
- Fixed scipy missing from requirements in setup.py.
- Removed the dependency on glibc by including a translation of the musl rand_r implementation.
- Fixed a bug where item momentum would be incorrectly used in adadelta training for user features (thanks to Jong Wook Kim @jongwook for the bug report).
- User and item features are now floats (instead of ints), allowing fractional feature weights to be used when fitting models.
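  A sketch of supplying fractional feature weights as a sparse float matrix; the feature layout is made up for illustration:

  ```python
  import numpy as np
  import scipy.sparse as sp
  from lightfm import LightFM

  rng = np.random.RandomState(1)
  interactions = sp.coo_matrix(rng.binomial(1, 0.3, size=(5, 8)))

  # One row per item, one column per feature, with fractional weights.
  item_features = sp.csr_matrix(np.tile([[1.0, 0.5, 0.0],
                                         [0.0, 0.25, 0.75]], (4, 1)))  # 8 items x 3 features

  model = LightFM(loss="warp")
  model.fit(interactions, item_features=item_features, epochs=5)
  ```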
- When installing into an Anaconda distribution, drop the -march=native compiler flag due to assembler issues.
- When installing on OSX, search the MacPorts and Homebrew install locations for gcc version 5.x.
- When installing on OSX, search the MacPorts install location for gcc.
- Input matrices are now automatically converted to the correct dtype if necessary.