pancake
[sth] vtr
/tool
Why "Pancake"?
Do not think of pancake as a delicious dessert.
Pancake is also a verb.
To pancake means to flatten something.
This tool pancakes dimensions from many down to one, making the distances between word embeddings clearer and easier to read.
Models:
Word Embedding
learns to represent words in a 300-dimensional vector space that encodes semantic similarity
Logistic Regression Classifier
a classifier that takes the word embeddings as input to accomplish Tribe's classification purposes (see the sketch below)
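A minimal sketch of how the two models might fit together, assuming a gensim + scikit-learn stack; the toy corpus, the labels, and the embedding-averaging step are illustrative assumptions, not the tool's actual pipeline:

    # Minimal sketch of the two models (assumed stack: gensim + scikit-learn).
    # The toy corpus, labels and the averaging step are illustrative only.
    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.linear_model import LogisticRegression

    corpus = [["dove", "soap", "skin"], ["dove", "bird", "flew"]]  # tokenized posts
    labels = [1, 0]  # 1 = the post talks about the brand Dove

    # Word embedding: 300-dimensional vectors encoding semantic similarity
    w2v = Word2Vec(corpus, vector_size=300, min_count=1)

    # Represent each post as the mean of its word vectors, then classify
    X = np.array([np.mean([w2v.wv[w] for w in post], axis=0) for post in corpus])
    clf = LogisticRegression().fit(X, labels)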
Structural components:
[Key-words]   the ten most influential words, ordered by descending influence
On the left side, the user finds a list of the ten most influential words for the binary classification task, according to the logistic regression classifier. The font size is directly proportional to this influence: the higher the value, the larger the font.
The purpose of this list is to highlight the words that most affect the correctness of the classification, so that the user knows which regions of the plot deserve attention.
At the bottom of the list there is a search bar for looking up a specific word that is not among the most influential ones.
For example, the word 'Dove' should have a high influence value if we are classifying posts to determine whether a piece of text is talking about the brand Dove or not.
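One plausible way to rank words by influence (an assumption, not necessarily the tool's exact method) is to score each vocabulary word by how strongly its embedding pushes the classifier's decision:

    # Hypothetical influence score: dot product between a word's embedding and
    # the logistic regression coefficient vector (w2v and clf as sketched above).
    influence = {
        word: float(np.dot(w2v.wv[word], clf.coef_[0]))
        for word in w2v.wv.index_to_key
    }
    top_ten = sorted(influence, key=lambda w: abs(influence[w]), reverse=True)[:10]
    print(top_ten)  # e.g. 'dove' should rank high for the Dove-brand task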
[word-list]  list of all words that have a corresponding word embedding
A list of all the words for which word embeddings have been trained is presented on the right. The words are ordered alphabetically to make searching for a specific word easier. By hovering over a word in the list, the user sees the corresponding dot highlighted in the plot.
For example, if you are looking for the position of a problematic word, you can find it easily in the list and see its position on the scatter plot in relation to other words.
Slider bar   /selection setter
A slider bar appears on the left each time a key-word is selected.
The slider is a selector through which it is possible to enlarge or shrink the circle drawn around the selected word, or to move along the line plot at the bottom.
Starting from 1, which corresponds to the selected word itself, the value decreases all the way down to 0 (see the filtering sketch below).
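A small sketch of the slider semantics, under the assumption that the slider value acts as a minimum cosine-similarity threshold (the function name is hypothetical):

    # Keep only the words whose cosine similarity to the selected key-word is
    # at least the slider value (1.0 corresponds to the selected word itself).
    def words_within(selected, threshold, wv):
        return [w for w in wv.index_to_key if wv.similarity(selected, w) >= threshold]

    # words_within("dove", 0.6, w2v.wv) -> words inside the circle / line-plot range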
Scatter plot   Data-visualization #1
The scatter plot is the first visualization the user encounters during data exploration. Words are positioned on the x and y axes according to the transformation produced by t-SNE (the model used to reduce the dimensionality of the embeddings; a sketch follows below). A circle selector is drawn depending on the position of the slider handle introduced above, allowing the user to focus only on the words within the circle's boundaries.
Limit: the visualization space cannot reproduce all of the embedding dimensions
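A minimal sketch of the reduction behind the scatter-plot coordinates, using scikit-learn's t-SNE (the perplexity value is only suitable for the toy vocabulary sketched earlier):

    # Reduce the 300-d embeddings to 2-d positions for the scatter plot.
    import numpy as np
    from sklearn.manifold import TSNE

    vocab = list(w2v.wv.index_to_key)
    vectors = np.array([w2v.wv[w] for w in vocab])
    coords = TSNE(n_components=2, perplexity=2).fit_transform(vectors)
    # coords[i] is the (x, y) position of vocab[i] on the scatter plot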
Line plot   Data-visualization #2
Here, words are plotted on a single axis. Starting from the selected word, which by definition has a similarity value of 1, all the other words are positioned on the line plot according to their cosine similarity with respect to the selected key-word. Each similarity is computed in the non-reduced 300 dimensions, which means that no approximation takes place. The rectangles that represent the words are coloured with the same gradient used in the scatter plot; in this case the colour does not carry a particular meaning, but it shows that a data point in the scatter plot and in the line plot is the same word.
Solution: turn the analysis to the relationships between words in their actual dimensionality, using their pairwise cosine similarity (a sketch follows below)
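A sketch of the line-plot positions, computed with cosine similarity in the full 300 dimensions (the function name is hypothetical):

    # Position of every word on the line plot: its cosine similarity to the
    # selected word, computed in the full 300 dimensions (no approximation).
    def line_plot_positions(selected, wv):
        return {w: float(wv.similarity(selected, w)) for w in wv.index_to_key}

    # line_plot_positions("dove", w2v.wv)["dove"] == 1.0 (up to float rounding)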
Visual Components:
Blue  color gradient
What we see in 2 dimensions might not be true in 300 dimensions. A blue color gradient is used to mitigate this problem. The color is defined by the following function:
color_i = delta_i = |cos_sim(selected_word in 300d, word_i in 300d) - cos_sim(selected_word in 2d, word_i in 2d)|

This means that if a dot's color is dark, its delta value is small: what the user sees in 2 dimensions also holds in 300 dimensions. If the dot's color is lighter, its delta value is large: what the user sees in 2 dimensions is only true in 2 dimensions, an artifact of the dimensionality reduction, so the user should be careful not to jump to conclusions too quickly with that specific data point.
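A sketch of the delta computation, implementing the function above with the objects from the earlier sketches (wv, vocab and coords; the function name is hypothetical):

    # Blue-gradient value for word_i: compare the similarity to the selected
    # word in the full 300 dimensions with the same similarity in 2-d t-SNE space.
    from sklearn.metrics.pairwise import cosine_similarity

    def delta(selected, word, wv, vocab, coords):
        sim_300d = wv.similarity(selected, word)
        sim_2d = cosine_similarity(coords[vocab.index(selected)].reshape(1, -1),
                                   coords[vocab.index(word)].reshape(1, -1))[0, 0]
        return abs(float(sim_300d) - float(sim_2d))  # small -> dark, trustworthy dot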
Pink  color gradient
The visualization problem of our model is that it works in a large number of dimensions. We therefore use the scatter plot to let the user interact with the data and stay aware of the entire model's results. But that was not enough to go deeper in the analysis, so we decided to flatten all dimensions down to one, computing the similarity between words by means of cosine similarity in their actual dimensionality.
The tick values on the line plot are shown according to the slider. The value 1, on the left, means maximum similarity and corresponds to the selected word itself. Moving away from it, the similarity value becomes smaller: the further a word is plotted, the more it differs from the selected one. Pink is the color assigned to the similarity value, and the line plot is colored with a pink gradient to help the user immediately grasp the similarity score being set. Dark pink corresponds to the maximum level of similarity.
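A possible colour mapping, assuming matplotlib's 'RdPu' colormap (an illustrative choice, not necessarily the tool's actual palette), whose dark end lands on similarity 1:

    # Map a similarity score in [0, 1] to a pink shade; dark pink at similarity 1.
    import matplotlib.pyplot as plt

    def pink_for(similarity):
        r, g, b, _ = plt.get_cmap("RdPu")(similarity)
        return (r, g, b)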
 
How to use it:
After you have trained your model:
1. launch the script to organize your data into files readable by the tool;
2. put the files in the data folder;
3. save and run the code to see your model's results.