This repository contains the implementation of SEPAL, the method introduced in the paper:
Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning
Félix Lefebvre and Gaël Varoquaux
NeurIPS 2025
PDF: https://arxiv.org/pdf/2507.00965v2
- Scales to knowledge graphs with millions of entities
- Robust to highly skewed degree distributions
- Produces embeddings for downstream regression and classification tasks
Method details and ablations are in the paper.
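To illustrate how entity embeddings can feed a downstream regression or classification task, here is a minimal sketch that joins embeddings to table rows by entity name. It does not use the SEPAL code in this repository: the function `build_feature_matrix`, the toy embedding values, the row layout, and the zero-vector fallback for unknown entities are all assumptions made for the example.

```python
def build_feature_matrix(rows, entity_key, embeddings, dim):
    """Map each table row to the embedding of the entity it refers to.

    rows: list of dicts, one per table row (hypothetical layout).
    entity_key: name of the column holding the entity identifier.
    embeddings: dict mapping entity identifier -> list of floats.
    dim: embedding dimension, used for the fallback vector.
    """
    features = []
    for row in rows:
        vector = embeddings.get(row[entity_key])
        if vector is None:
            # Assumption: entities without an embedding get a zero vector.
            vector = [0.0] * dim
        features.append(list(vector))
    return features


# Toy values for illustration only, not actual SEPAL output.
toy_embeddings = {"entity_a": [0.1, 0.9], "entity_b": [0.2, 0.8]}
toy_rows = [
    {"entity": "entity_a", "target": 1.0},
    {"entity": "entity_b", "target": 0.0},
    {"entity": "entity_c", "target": 1.0},  # no embedding available
]

X = build_feature_matrix(toy_rows, "entity", toy_embeddings, dim=2)
y = [row["target"] for row in toy_rows]
```

The resulting `X` and `y` can then be passed to any standard estimator, for example a scikit-learn regressor or classifier.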
- Mini YAGO3 tutorial: examples/mini_yago3_embeddings.ipynb
- Downstream tables and Mini YAGO3: https://huggingface.co/datasets/inria-soda/sepal-datasets
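The dataset repository listed above can be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper name `download_sepal_datasets` is hypothetical, and the file layout inside the repository is not described here, so inspect the returned directory before relying on specific paths.

```python
REPO_ID = "inria-soda/sepal-datasets"


def download_sepal_datasets(local_dir=None, allow_patterns=None):
    """Download the dataset repository and return the local directory path.

    local_dir: optional target directory (defaults to the Hugging Face cache).
    allow_patterns: optional glob pattern(s) to restrict which files are fetched.
    """
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=allow_patterns,
    )
```

Calling `download_sepal_datasets()` with no arguments fetches the whole repository, which may be large; passing `allow_patterns` limits the download to matching files.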
If you use SEPAL, please cite:
@inproceedings{lefebvre2025scalable,
title={Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning},
author={Lefebvre, Félix and Varoquaux, Gaël},
booktitle={Advances in Neural Information Processing Systems},
volume={38},
year={2025}
}