Vector Representations of Words
In this tutorial, we look at the word2vec model of Mikolov et al., which learns vector representations of words, called "word embeddings".
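Before diving into the model, it helps to pin down what an embedding is concretely: a table mapping each vocabulary word to a dense vector, with word similarity measured in vector space. Here is a minimal sketch using a toy vocabulary and random vectors (in word2vec the vectors are learned, not random; all names and sizes here are illustrative):

```python
import numpy as np

# Toy vocabulary and embedding size, chosen only for illustration.
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
embedding_dim = 8

# The embedding table: one dense vector per word.
embeddings = {word: rng.normal(size=embedding_dim) for word in vocab}

def cosine_similarity(u, v):
    """A standard measure of closeness between two embedding vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

vec = embeddings["cat"]   # an "embedding lookup"
print(vec.shape)          # (8,)
# A vector is maximally similar to itself:
print(round(cosine_similarity(vec, vec), 3))  # 1.0
```

After training, vectors for words that appear in similar contexts end up close together under this similarity measure; that property is what the rest of the tutorial builds toward.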
This tutorial is meant to highlight the interesting, substantive parts of building a word2vec model in TensorFlow.
We start by giving the motivation for why we would want to represent words as vectors.
We look at the intuition behind the model and how it is trained (with a splash of math for good measure).
We also show a simple implementation of the model in TensorFlow.
Finally, we look at ways to make the naive version scale better.
We walk through the code later in the tutorial, but if you’d prefer to dive straight in, feel free to look at the minimalistic implementation in tensorflow/examples/tutorials/word2vec/word2vec_basic.py. This basic example contains the code needed to download some data, train on it briefly, and visualize the result. Once you are comfortable reading and running the basic version, you can graduate to tensorflow/models/embedding/word2vec.py, a more serious implementation that showcases more advanced TensorFlow principles, such as how to efficiently use threads to move data into a text model and how to checkpoint during training.
But first, let’s look at why we would want to learn word embeddings in the first place. Feel free to skip this section if you’re an embedding pro and you’d just like to get your hands dirty with the details.