TsneBH.jl

The purpose of this module is to implement the t-SNE dimensionality reduction technique developed by Laurens van der Maaten. The technique is a stochastic algorithm that reduces the number of dimensions of the data while trying to preserve, in the reduced space, the relationships among points in the original space, especially those between nearest neighbors.

t-SNE is itself an extension of the SNE technique: it uses a Student t-distribution in the embedded space instead of a Gaussian, and it introduces a new way to compute the gradient.
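For reference, the change of kernel can be written as follows, using the notation of the 2008 paper listed under References (the y_i are the embedded points; these symbols come from the paper, not from this module's code):

```latex
% SNE: Gaussian kernel in the embedded space
q_{j|i} = \frac{\exp\left(-\lVert y_i - y_j \rVert^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert y_i - y_k \rVert^2\right)}

% t-SNE: Student t kernel with one degree of freedom
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}
```

The heavier tails of the t kernel let moderately distant points sit farther apart in the embedding than a Gaussian would allow.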

Essentially, using t-SNE means solving an optimization problem whose objective function is the KL divergence between the distribution of pairwise similarities among points in the original space (loosely speaking) and the corresponding distribution in the reduced space. We minimize this cost to make the two distributions as similar as possible. The optimization is done through gradient descent.
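The optimization described above can be sketched in a few lines of plain Julia. This is a minimal illustration, not the module's implementation: the function names, the one-point-per-row layout of Y, and the plain (momentum-free) update are all assumptions, and the gradient is the standard expression from the 2008 paper.

```julia
# Sketch only: hypothetical names, not TsneBH.jl's internal API.

# Low-dimensional similarities Q from an n×d embedding Y (one point per row).
# Also returns the unnormalized Student-t kernel W, which the gradient reuses.
function lowdim_affinities(Y::Matrix{Float64})
    n = size(Y, 1)
    W = zeros(n, n)
    for i in 1:n, j in 1:n
        i == j && continue
        W[i, j] = 1 / (1 + sum(abs2, Y[i, :] .- Y[j, :]))
    end
    return W ./ sum(W), W
end

# KL(P || Q); entries with P == 0 contribute nothing.
function kl_divergence(P::Matrix{Float64}, Q::Matrix{Float64})
    cost = 0.0
    for idx in eachindex(P)
        if P[idx] > 0
            cost += P[idx] * log(P[idx] / Q[idx])
        end
    end
    return cost
end

# Gradient of the KL cost with respect to each embedded point:
# dC/dy_i = 4 * sum_j (p_ij - q_ij) * (1 + |y_i - y_j|^2)^-1 * (y_i - y_j)
function kl_gradient(P::Matrix{Float64}, Y::Matrix{Float64})
    Q, W = lowdim_affinities(Y)
    n = size(Y, 1)
    G = zeros(size(Y))
    for i in 1:n, j in 1:n
        i == j && continue
        G[i, :] .+= 4 * (P[i, j] - Q[i, j]) * W[i, j] .* (Y[i, :] .- Y[j, :])
    end
    return G
end

# One plain gradient-descent update (no momentum, no early exaggeration).
descent_step(P, Y; lr = 0.01) = Y .- lr .* kl_gradient(P, Y)
```

Here P is assumed to be a symmetric matrix of high-dimensional similarities that sums to 1, with zeros on the diagonal. Repeating descent_step lowers the cost as long as the learning rate is small enough.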

Trees extension

An evolution of t-SNE accelerates the computations by means of two tree-based algorithms: vantage-point trees and Barnes-Hut. The first is a clever way to partition the space of points so that the nearest neighbors of a given point can be retrieved quickly. The second, built on quadtrees, also partitions the space of points, but with the purpose of speeding up the computation of the interactions among them (in our case, the gradient).
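The idea behind the Barnes-Hut speed-up can be illustrated without a full quadtree: a group of points that is far away and small is replaced by its center of mass. The sketch below uses a single flat "cell" and the classical Barnes-Hut test (cell width divided by distance, compared with a threshold theta). The struct, the function names, and the exact form of the test are assumptions for illustration; the code in trees.jl may differ.

```julia
# Sketch only: one flat cell instead of a real quadtree; all names are hypothetical.
struct Cell
    center::Vector{Float64}   # center of mass of the points in the cell
    npoints::Int              # number of points the cell summarizes
    width::Float64            # longest side of the cell's bounding box
end

# Build a cell from an n×d matrix of points (one point per row).
function make_cell(points::Matrix{Float64})
    center = vec(sum(points, dims = 1)) ./ size(points, 1)
    width = maximum(maximum(points, dims = 1) .- minimum(points, dims = 1))
    return Cell(center, size(points, 1), width)
end

# Classical Barnes-Hut test: the cell may be summarized when it looks small from y.
can_summarize(y, cell::Cell, theta) =
    cell.width / sqrt(sum(abs2, y .- cell.center)) < theta

# Exact sum of the Student-t kernel between y and every point in the cell.
exact_sum(y, points) =
    sum(1 / (1 + sum(abs2, y .- points[j, :])) for j in 1:size(points, 1))

# Approximation: treat all points of the cell as if they sat at its center.
approx_sum(y, cell::Cell) = cell.npoints / (1 + sum(abs2, y .- cell.center))
```

A smaller theta means fewer cells are summarized, so the result is more accurate but slower; theta = 0 falls back to the exact computation.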

Both are implemented in the trees.jl file, but as of now the Barnes-Hut functions are not stable and may raise an overflow error.

Main function documentation

tsne(X::Matrix{Float64}, emb_size::Int64, T::Int64;
     lr::Float64 = 1., perp::Float64 = 30., tol::Float64 = 1e-5,
     max_iter::Int = 50, momentum::Float64 = 0.01,
     pca::Bool = true, pca_dim::Int = 50, exag_fact::Float64 = 4.,
     use_trees::Bool = false, # the Barnes-Hut algorithm is currently unstable: there is a problem with the recursion
     theta_bh::Float64 = 0.2, use_seed::Bool = false, verbose::Bool = true)
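The README does not describe the individual arguments. Assuming perp is the usual t-SNE perplexity (an assumption based on the name and the default of 30), the sketch below shows what that quantity is and how a per-point Gaussian precision is typically found by bisection to match it. All function names here are hypothetical and are not part of TsneBH.jl.

```julia
# Sketch only: the standard perplexity calibration from the t-SNE paper,
# written with hypothetical names.

# Conditional probabilities p_{j|i} for one point, given its squared distances
# d2 to the other points and a precision beta = 1 / (2 * sigma^2).
function conditional_probs(d2::Vector{Float64}, beta::Float64)
    w = exp.(-beta .* (d2 .- minimum(d2)))   # shift by the minimum for numerical stability
    return w ./ sum(w)
end

# Perplexity = 2^H, where H is the Shannon entropy in bits.
function perplexity(p::Vector{Float64})
    H = -sum(x * log2(x) for x in p if x > 0)
    return 2.0^H
end

# Bisection on beta until the perplexity matches the target.
function find_beta(d2::Vector{Float64}, target::Float64; tol = 1e-5, max_iter = 200)
    lo, hi = 0.0, Inf
    beta = 1.0
    for _ in 1:max_iter
        current = perplexity(conditional_probs(d2, beta))
        abs(current - target) < tol && break
        if current > target            # distribution too flat: narrow the Gaussian
            lo = beta
            beta = isinf(hi) ? 2 * beta : (lo + hi) / 2
        else                           # distribution too peaked: widen the Gaussian
            hi = beta
            beta = (lo + hi) / 2
        end
    end
    return beta
end
```

The perplexity can be read as an effective number of neighbors, so a target of 30 only makes sense when each point has more than 30 other points to choose from.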

Quick Run

Clone the repo, then cd into it. After instantiating the packages, you can run a simple example:

julia --project=. ./examples/tsne_run.jl

Alternatively, start a REPL with julia --project=. and run:

using TsneBH
tsne(...) # follow the documentation above

References

L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.
L.J.P. van der Maaten. Accelerating t-SNE using Tree-Based Algorithms. Journal of Machine Learning Research 15(Oct):3221-3245, 2014.
lvdmaaten.github.io/tsne

PietroDomi/TsneBH.jl
