In this blog post, we explore two sets of emotion combinations using word2vec: one posited by Robert Plutchik in 1980, and a popular media chart featured on vox.com that uses characters from Inside Out. We limit the scope to dyads, i.e. combinations of two basic emotions that make up a more complex emotion.
Just as blue and red give purple, joy and surprise give delight.
Above: The table has been extracted from the Wikipedia page.
Above: The wheel of emotions contains a superset of the previous table, where similar emotions in the wheel are near to each other. Also extracted from the Wikipedia page.
Above: Popular media ‘best guess’ at emotion dyads based on characters from Inside Out. From Vox.com
Above: To give some context, the movie depicts how a child develops mixed emotions as she enters adolescence. Here, a mix of joy and sadness is depicted.
word2vec is a great candidate model for exploring the additive nature of emotions. As illustrated in previous posts, distributed representation models such as word2vec can solve the following analogies of varying complexity by suggesting the missing word.
Man : King :: Woman : Queen
Lady Gaga: America :: Ayumi Hamasaki : Japan
Equivalently, the above word-pair analogies can be expressed as equations:
King – Man + Woman = Queen
Lady Gaga – America + Japan = Ayumi Hamasaki
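To make the arithmetic concrete, here is a minimal sketch using hand-made 2-d toy vectors (invented for illustration, not real word2vec embeddings). The helper names `cosine` and `solve_analogy` are hypothetical. The analogy a : b :: c : ? is solved by computing b – a + c and picking the nearest remaining word by cosine similarity.

```python
import math

# Toy 2-d vectors, invented for illustration only:
# axis 0 loosely stands for "royalty", axis 1 for "gender".
toy_vectors = {
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(a, b, c, vectors):
    """Solve a : b :: c : ? by finding the word nearest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the input words so they cannot be returned as the answer.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

answer = solve_analogy("man", "king", "woman", toy_vectors)
print(answer)
```

With a real gensim model, the comparable query would be `model.most_similar(positive=['king', 'woman'], negative=['man'])`.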
In the following sections, we use the additive properties of word vectors to find the word on the right-hand side of equations such as:
Joy + Surprise = Delight
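```python
from gensim.models.keyedvectors import KeyedVectors

# Load Google's pre-trained Word2Vec model.
model = KeyedVectors.load_word2vec_format(
    'I:/Downloads/GoogleNews-vectors-negative300.bin/GoogleNews-vectors-negative300.bin',
    binary=True)

# Words closest to the combination of two basic emotions (top 10 by default).
word1 = 'anger'
word2 = 'trust'
print(model.most_similar(positive=[word1, word2]))

# Cosine similarity between word2vec's suggestion and Plutchik's suggestion.
# KeyedVectors exposes similarity() directly, so no .wv attribute is needed.
word_w2v = 'resentment'
plutchik = 'dominance'
print(model.similarity(word_w2v, plutchik))
```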
Using the above code, we do three things for each emotion pair. First, we extract the word that word2vec considers most similar to the sum of the two emotions. Second, we compute the cosine similarity between that suggested word and the human suggestion; this measure ranges from -1 (vectors pointing in opposite directions) to 1 (vectors pointing in the same direction, i.e. near-identical usage). Lastly, we check whether the human-suggested emotion is within the top 10 words suggested by word2vec.
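The three steps can be sketched end to end on toy data. The vectors below are hypothetical 3-d values invented for illustration, and `rank_by_sum` and `evaluate_dyad` are hypothetical helper names. Note one simplification: this sketch sums raw vectors, whereas gensim's `most_similar`, to my understanding, averages unit-normalized vectors, so rankings on real data may differ slightly.

```python
import math

# Hypothetical 3-d toy vectors, invented for illustration only.
toy_vectors = {
    "joy":       [1.0, 0.2, 0.0],
    "surprise":  [0.1, 1.0, 0.0],
    "delight":   [0.9, 0.8, 0.1],
    "amazement": [0.3, 1.0, 0.2],
    "sorrow":    [-1.0, 0.0, 0.3],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_sum(word1, word2, vectors, topn=10):
    """Rank all other words by cosine similarity to word1 + word2."""
    combined = [a + b for a, b in zip(vectors[word1], vectors[word2])]
    scored = [(w, cosine(combined, vec))
              for w, vec in vectors.items() if w not in (word1, word2)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

def evaluate_dyad(word1, word2, human_word, vectors, topn=10):
    """Run the three checks for one dyad against a human suggestion."""
    ranked = rank_by_sum(word1, word2, vectors, topn)
    suggestion = ranked[0][0]  # step 1: the model's top suggestion
    return {
        "suggestion": suggestion,
        # step 2: agreement between model suggestion and human suggestion
        "similarity": cosine(vectors[suggestion], vectors[human_word]),
        # step 3: is the human suggestion among the top-n candidates?
        "in_topn": human_word in [w for w, _ in ranked],
    }

result = evaluate_dyad("joy", "surprise", "delight", toy_vectors)
print(result)
```

Running the same function over every dyad would produce one row per pair, which is the shape of the results reported in each study.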
The rest of the post is organized into two studies, each testing the agreement between word2vec and a specific set of suggestions. In each study, we first present the results and then discuss them.
Study 1 – Plutchik
Study 1 Results
Study 1 Discussion
At this point, it is important to note that the pre-trained word2vec model was trained on the Google News dataset, which contains about 100 billion words. Being a news-centered dataset, the corpus is not expected to produce a model that makes subtle distinctions between emotions.
Encouragingly, word2vec exhibits general agreement with Plutchik’s suggested emotions, with positive similarity scores across all emotion pairs.
We observe that word2vec suggests the same word for several of the emotion pairs. For example, sorrow is suggested for pairs 9, 11, 16, 17, and 18: whenever sadness is added to something, sorrow is suggested. This highlights the limited distinction between emotions in a model trained on a news dataset.
Pair 13, the sum of surprise and sadness, is an interesting one. Plutchik suggested disapproval whilst word2vec suggested disappointment, which I personally prefer, though this is only the opinion of a layperson in psychology.
Pair 7 is another interesting one. word2vec seems to treat fear as negating trust and suggests distrust, whilst Plutchik suggests that submission is the culmination of fear and trust. Both seem to make sense, which shows that words can be meaningfully combined in more than one way. The combination could also be modified by the lexical structure of the sentence.
Pair 21 receives an n/a because morbidness is not in the word2vec model's vocabulary.
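One way to handle such out-of-vocabulary words without the script failing is to check membership before computing a score. The sketch below uses a plain dictionary of hypothetical 2-d toy vectors as a stand-in for the model; `similarity_or_none` and `format_score` are hypothetical helper names, not part of any library.

```python
import math

# Hypothetical toy vectors standing in for the model's vocabulary.
toy_vectors = {
    "shame":   [0.2, 0.9],
    "remorse": [0.3, 0.8],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_or_none(word_a, word_b, vectors):
    """Return the cosine similarity, or None if either word is missing."""
    if word_a not in vectors or word_b not in vectors:
        return None
    return cosine(vectors[word_a], vectors[word_b])

def format_score(score):
    """Render a score for a results table, using n/a for missing words."""
    return "n/a" if score is None else f"{score:.2f}"

known = similarity_or_none("shame", "remorse", toy_vectors)
missing = similarity_or_none("shame", "morbidness", toy_vectors)
print(format_score(known), format_score(missing))
```

With a gensim `KeyedVectors` object, the membership test `word in model` should play the same role as the dictionary lookup here.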
Study 2 – Inside Out
Study 2 Results
Study 2 Discussion
The findings are very similar to those of Study 1: we observe general agreement between the human suggestions and word2vec, and repetition of suggestions across emotion dyads (e.g. sorrow and frustration).
This concludes a short post illustrating another use of the versatile distributed representations of words. It also highlights the importance of training word vectors on a relevant corpus if they are to be used in a specialized domain.