This application is a tool for visualizing and manipulating word embeddings. A word embedding is a collection of points in a high-dimensional space, generated by a computer, where each point represents the meaning of a unique word.
Word embeddings are generated by neural networks, so they are by nature something of a black box and hard to visualize. Researchers are actively studying how word embeddings are structured and how they can be improved. Our web app seeks to streamline some of this study.
To accomplish this, we've preloaded our app with groupings of similar words. The user can choose one of these word lists, and our app will calculate the best view to represent those specific words. This is accomplished using a technique called Principal Component Analysis (PCA). PCA takes information in a high-dimensional space and condenses it into a friendlier space of 2 or 3 dimensions. As with any simplification, we lose some detail while condensing the data, but among all linear projections down to that number of dimensions, PCA keeps the largest possible share of the variation in the data (seen in the percentage above) while making it easier for a human to visualize.
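For the curious, here is a minimal sketch of the PCA step in plain Python. It is an illustration under our own assumptions, not the code this app runs: the function names (`center`, `covariance`, `top_eigenpair`, `pca`) and the tiny four-dimensional toy vectors are made up for the example, and the principal axes are found with simple power iteration rather than a linear algebra library.

```python
import math
import random


def center(points):
    """Subtract the per-dimension mean so the cloud of points sits on the origin."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    return [[p[j] - means[j] for j in range(d)] for p in points]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered points."""
    n, d = len(centered), len(centered[0])
    return [[sum(p[i] * p[j] for p in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=500, seed=0):
    """Power iteration: approximate the largest eigenvalue and its unit eigenvector."""
    d = len(matrix)
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # nothing left to find (matrix is all zeros)
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue that goes with v.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v


def pca(points, k=2):
    """Project points onto their top-k principal axes.

    Returns (projected points, fraction of total variance kept by each axis).
    """
    centered = center(points)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, explained = [], []
    for _ in range(k):
        eigenvalue, v = top_eigenpair(cov)
        components.append(v)
        explained.append(eigenvalue / total_variance)
        # Deflation: remove the axis just found so the next pass finds the next one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(p[j] * c[j] for j in range(d)) for c in components]
                 for p in centered]
    return projected, explained


# Toy 4-dimensional "embeddings" with made-up numbers, for illustration only.
toy_embeddings = {
    "king":  [0.9, 0.8, 0.1, 0.3],
    "queen": [0.9, 0.7, 0.9, 0.3],
    "man":   [0.1, 0.9, 0.1, 0.2],
    "woman": [0.1, 0.8, 0.9, 0.2],
    "apple": [0.0, 0.1, 0.5, 0.9],
}
words = list(toy_embeddings)
projected, explained = pca([toy_embeddings[w] for w in words], k=2)

for word, (x, y) in zip(words, projected):
    print(f"{word:>6}: ({x:+.3f}, {y:+.3f})")
print(f"variance kept in 2D: {sum(explained):.1%}")
```

The values in `explained` are the fractions of the total variance captured by each axis, and their sum plays the same role as the percentage shown above. A real project would more likely lean on a library such as NumPy or scikit-learn than on hand-written power iteration.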
Now that we've squashed our word points into a space that we can understand, we can use our human brains to find patterns, reverse engineer the information in the embeddings, and, with some effort, develop algorithms to improve the next generation of natural language processing. To make this app as useful as possible, we tried to keep the user experience friendly. Our goal was to make this app responsive and intuitive to use, and we are really happy with how it turned out.