Poincaré Embeddings for Learning Hierarchical Representations of Modules in Common Python Libraries

This repo examines how well Poincaré embeddings encode the similarity between Python modules (e.g. those found in common libraries such as NumPy, SciPy, and scikit-learn). Similarity is measured by the shortest-path distance in a tree built from the hierarchy of these modules, which is crawled from Python documentation. Modules that share a name, e.g. “scipy.stats._continuous_distns.lomax_gen.fit” and “sklearn.linear_model.base.LinearModel.fit”, are kept as distinct nodes in the tree. During training (multithreaded asynchronous SGD), edges connecting modules with the same name can optionally be added, on the assumption that a shared name signals similar functionality (e.g. estimating the fit of data to a model or distribution). To evaluate the learned representation, the Pearson correlation between embedding distance and shortest-path distance in the original tree is computed over pairs of nodes that belong to the same library (e.g. NumPy).
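The evaluation described above can be illustrated with a minimal, standard-library-only sketch. It is not the repository's code: the helper names (build_tree, tree_distance, poincare_distance, pearson), the toy module names, and the hand-picked 2D embedding coordinates are all assumptions made for illustration. The sketch builds a tree from dotted module names, computes shortest-path distances with breadth-first search, computes the Poincaré ball distance arcosh(1 + 2||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2))) from Nickel and Kiela (2017), and correlates the two over all node pairs.

```python
import math
from collections import deque
from itertools import combinations


def build_tree(names):
    """Undirected adjacency map linking each dotted name to its parent prefix."""
    adj = {}
    for name in names:
        parts = name.split(".")
        for i in range(1, len(parts) + 1):
            node = ".".join(parts[:i])
            adj.setdefault(node, set())
            if i > 1:
                parent = ".".join(parts[:i - 1])
                adj[node].add(parent)
                adj[parent].add(node)
    return adj


def tree_distance(adj, src, dst):
    """Shortest-path length (number of edges) between two nodes, via BFS."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            return seen[cur]
        for nxt in adj[cur]:
            if nxt not in seen:
                seen[nxt] = seen[cur] + 1
                queue.append(nxt)
    return math.inf  # nodes are not connected


def _sq_norm(x):
    return sum(c * c for c in x)


def poincare_distance(u, v):
    """Distance in the Poincaré ball; both points must have norm < 1."""
    diff = _sq_norm([a - b for a, b in zip(u, v)])
    return math.acosh(1 + 2 * diff / ((1 - _sq_norm(u)) * (1 - _sq_norm(v))))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy hierarchy under a single library (names chosen for illustration).
adj = build_tree(["numpy.linalg.norm", "numpy.linalg.inv", "numpy.fft.fft"])

# Hypothetical embeddings, hand-picked rather than learned.
emb = {
    "numpy": (0.0, 0.0),
    "numpy.linalg": (0.5, 0.0),
    "numpy.fft": (-0.5, 0.0),
    "numpy.linalg.norm": (0.8, 0.2),
    "numpy.linalg.inv": (0.8, -0.2),
    "numpy.fft.fft": (-0.85, 0.0),
}

pairs = list(combinations(sorted(emb), 2))
tree_d = [tree_distance(adj, a, b) for a, b in pairs]
emb_d = [poincare_distance(emb[a], emb[b]) for a, b in pairs]
r = pearson(tree_d, emb_d)
print(f"Pearson correlation over {len(pairs)} pairs: {r:.3f}")
```

A correlation close to 1 would indicate that embedding distance tracks tree distance well. The optional same-name edges mentioned above affect only training, so the sketch measures tree distance on the original hierarchy.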

Adapted from the PyTorch implementation of Poincaré Embeddings for Learning Hierarchical Representations by Facebook AI Research.

Dependencies

Python 3 with NumPy
PyTorch
Scikit-Learn
NLTK (to generate the WordNet data)

References

If you find this code useful for your research, please cite the following paper in your publication:

@incollection{nickel2017poincare,
  title = {Poincar\'{e} Embeddings for Learning Hierarchical Representations},
  author = {Nickel, Maximilian and Kiela, Douwe},
  booktitle = {Advances in Neural Information Processing Systems 30},
  editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
  pages = {6341--6350},
  year = {2017},
  publisher = {Curran Associates, Inc.},
  url = {http://papers.nips.cc/paper/7213-poincare-embeddings-for-learning-hierarchical-representations.pdf}
}

License

This code is licensed under CC-BY-NC 4.0.


thaonguyen19/poincare_embeddings
