Visualizing Game of Thrones with BERT
This past weekend, while watching Game of Thrones at dinner, I had a thought!
How does BERT understand Game of Thrones?
The thought was to visualize all the text of the GOT books in 3D space with the mighty BERT.
How can we achieve this?
- First, we will extract the BERT embeddings for each word across all GOT books.
- Then, we will reduce the dimensionality of the BERT embeddings so we can visualize them in 3D.
- Finally, we will create a web application to visualize them in the browser.
So, let’s get started.
1. Extracting BERT Embeddings for Game of Thrones Books
Extracting BERT embeddings for your custom data can be intimidating at first — but not anymore.
Let’s use this package for our data —
We will extract 5 Game of Thrones books using
Next, we will clean the content of the books and store it as a list of sentences.
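As a rough idea of what this cleaning step can look like, here is a minimal sketch using only the standard library. The function name `clean_book` and the `min_words` threshold are my own illustrative choices, and the sentence splitting is deliberately naive, so treat it as a starting point rather than the exact cleaning used for the books.

```python
import re


def clean_book(raw_text, min_words=3):
    """Split raw book text into a list of cleaned sentences."""
    # Collapse line breaks and repeated whitespace into single spaces.
    text = re.sub(r"\s+", " ", raw_text).strip()
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    # Abbreviations such as "Mr." will be split incorrectly by this rule.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Drop very short fragments (stray headings, one-word lines, etc.).
    return [s.strip() for s in sentences if len(s.split()) >= min_words]


sample = "Winter is coming.\nThe north   remembers!\n\nHodor. A Lannister always pays his debts."
print(clean_book(sample))
```

For real books, a proper sentence tokenizer would handle abbreviations and dialogue better than this regular expression.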
Once we have a clean list of sentences for each book, we can extract BERT embeddings using the code below:
These embeddings come from a pre-trained BERT model. You can also fine-tune BERT on the GOT texts before extracting the embeddings.
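If you want to see roughly what happens under the hood, here is a minimal sketch of the extraction. It assumes the Hugging Face `transformers` library with PyTorch and the `bert-base-uncased` checkpoint, which is not necessarily the package or model used above. The names `embed_sentences` and `average_by_token` are hypothetical helpers of mine.

```python
from collections import defaultdict


def embed_sentences(sentences, model_name="bert-base-uncased"):
    """Return (token, vector) pairs for every token in `sentences`.

    Assumption: Hugging Face `transformers` + PyTorch are installed.
    """
    import torch  # imported lazily so the helper below works without it
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    pairs = []
    with torch.no_grad():
        for sentence in sentences:
            inputs = tokenizer(sentence, return_tensors="pt",
                               truncation=True, max_length=512)
            # Last hidden layer: one 768-number vector per token.
            hidden = model(**inputs).last_hidden_state[0]
            ids = inputs["input_ids"][0].tolist()
            tokens = tokenizer.convert_ids_to_tokens(ids)
            for token, vector in zip(tokens, hidden):
                if token not in ("[CLS]", "[SEP]"):  # skip special tokens
                    pairs.append((token, vector.tolist()))
    return pairs


def average_by_token(pairs):
    """Average all contextual vectors that share the same token string."""
    sums, counts = {}, defaultdict(int)
    for token, vector in pairs:
        if token in sums:
            sums[token] = [a + b for a, b in zip(sums[token], vector)]
        else:
            sums[token] = list(vector)
        counts[token] += 1
    return {t: [x / counts[t] for x in v] for t, v in sums.items()}
```

BERT gives a different vector for every occurrence of a token, so averaging per token is one simple way to end up with a single point per word. Keeping every occurrence as its own point is an equally valid design choice.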
2. Dimensionality Reduction: BERT Embeddings
BERT embeddings are 768-dimensional vectors, i.e. we have 768 numbers to represent each word or token found in the books.
We will reduce the dimensionality of these embeddings from 768 to 3, so that we can visualize the tokens/words in 3 dimensions, using the code below:
Let’s apply the above function to our embeddings
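One common way to do this kind of reduction is principal component analysis (PCA). Here is a minimal NumPy sketch, assuming the embeddings are stacked in an array with one row per token. `reduce_to_3d` is a hypothetical name, and PCA is only an illustrative choice; methods such as t-SNE or UMAP are used in the same spot.

```python
import numpy as np


def reduce_to_3d(embeddings):
    """Project (n_tokens, n_dims) embeddings onto their top 3 principal components."""
    X = np.asarray(embeddings, dtype=float)
    # PCA requires centered data: subtract the mean of each dimension.
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    # Keep the coordinates along the first 3 directions.
    return X_centered @ vt[:3].T


# Example: 100 random 768-dimensional vectors standing in for real embeddings.
fake_embeddings = np.random.default_rng(0).normal(size=(100, 768))
points_3d = reduce_to_3d(fake_embeddings)
print(points_3d.shape)  # (100, 3)
```

PCA is fast and deterministic, which makes it a convenient first pass. Non-linear methods often give visually tighter clusters, at the cost of longer run times.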
🥳 Voila! Now we have a 3-dimensional projection of each word in all the GOT books.
Extraction of BERT embeddings and dimensionality reduction can be a time-consuming process. You can download Game of Thrones BERT Embeddings from here: Download
3. Building a Web App to Visualize in the Browser
This is the final part of this project. We will build a front end to visualize these embeddings in 3 dimensions in pure Python.
To do this, we will use Dash.
You can install Dash with pip:
pip install dash
If you need an end-to-end article on how to build web apps in pure Python with Dash, let me know in the comments.
A Dash application consists of 3 parts:
1. Dependencies and app instantiation
This section imports the dependent packages and starts a Dash app.
2. Layout
It lets you define how your web application looks: widgets, sliders, dropdowns, etc. and their alignment.
3. Callbacks
It lets you add interactivity to your charts, visuals, or buttons.
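To make the three parts concrete, here is a minimal sketch of such an app. It assumes Dash 2.x or later with Plotly installed, and that the reduced embeddings are available as a list of dicts with `word`, `x`, `y`, and `z` keys. That data shape, the search box, and the names `filter_points` and `build_app` are my own assumptions for illustration.

```python
def filter_points(points, query):
    """Keep points whose word contains `query` (case-insensitive); all if empty."""
    query = (query or "").strip().lower()
    if not query:
        return list(points)
    return [p for p in points if query in p["word"].lower()]


def build_app(points):
    """Build a Dash app for `points`, a list of dicts with word, x, y, z keys."""
    # 1. Dependencies and app instantiation
    import plotly.graph_objects as go
    from dash import Dash, Input, Output, dcc, html

    app = Dash(__name__)

    # 2. Layout: a search box above a 3D scatter plot.
    app.layout = html.Div([
        dcc.Input(id="search", type="text", placeholder="Filter words..."),
        dcc.Graph(id="scatter", style={"height": "90vh"}),
    ])

    # 3. Callbacks: redraw the plot whenever the search text changes.
    @app.callback(Output("scatter", "figure"), Input("search", "value"))
    def update_figure(query):
        shown = filter_points(points, query)
        return go.Figure(go.Scatter3d(
            x=[p["x"] for p in shown],
            y=[p["y"] for p in shown],
            z=[p["z"] for p in shown],
            mode="markers",
            text=[p["word"] for p in shown],  # shown on hover
            hoverinfo="text",
            marker={"size": 3},
        ))

    return app
```

To start the server from a script, load your points and call `build_app(points).run(debug=True)` at the end of the file (older Dash releases use `run_server` instead of `run`). Keeping `filter_points` outside the callback makes the filtering logic easy to test without starting a server.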
Let’s write all 3 parts of the Dash app in a single app.py file and run it in your terminal:
>> python app.py
🥳 Hooray, you’re done!
Now you can explore your GOT characters in 3D.
But, what did I find out from this experiment?
- Did characters, food items, places, and things form separate clusters?
- Which characters were in close proximity?
I will keep that for next time.