
How can I group multi-word terms when creating a Python wordcloud?

I am trying to create a wordcloud in Python from a list of ingredients, some of which have more than one word in their name. I would like the wordcloud to treat each of those names as a single element, but I don't know how to achieve that. For example:

import wordcloud as w
import numpy as np
import matplotlib.pyplot as plt

ingredients = ['cabernet sauvignon', 'apple', 'black pepper',
             'rice', 'smoked salmon',
             'dried tomato', 'butter', 'mushroom', 'goat cheese']
frequencies = [55, 83, 33, 42, 19, 23, 5, 69, 1]

# Wordcloud asks for a string, and I have tried separating the terms with ',' and '~'

text = ''
for i, word in enumerate(ingredients):
    text = text + frequencies[i] * (word + ',') 

wordcloud = w.WordCloud(collocations = False).generate(text)

plt.imshow(wordcloud, interpolation = 'bilinear')
plt.axis("off")
plt.show()

The resulting wordcloud is shown below. However, I would like multi-word terms such as "cabernet sauvignon" to appear as a single element rather than as separate words.

https://i.sstatic.net/yuHns.png

Juan Topo asked Oct 19 '25 00:10



1 Answer

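Create a dict in the form {phrase: count, ...} and pass it to generate_from_frequencies. That method takes the keys as they are instead of tokenizing a text string, so multi-word phrases stay intact: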

d = dict(zip(ingredients, frequencies))
wordcloud = w.WordCloud(collocations=False).generate_from_frequencies(d)
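
To illustrate why the dict keeps phrases together while the string approach from the question does not, here is a minimal stdlib-only sketch. The regular expression `\w+` is an assumption standing in for text tokenization in general; it is not claimed to be the exact pattern wordcloud uses internally.

```python
import re
from collections import Counter

ingredients = ['cabernet sauvignon', 'apple', 'black pepper',
               'rice', 'smoked salmon',
               'dried tomato', 'butter', 'mushroom', 'goat cheese']
frequencies = [55, 83, 33, 42, 19, 23, 5, 69, 1]

# Approach from the question: repeat each term and join into one long string.
text = ''.join(count * (word + ',')
               for word, count in zip(ingredients, frequencies))

# Assumed stand-in tokenizer: any run of word characters is one token,
# so the space inside a phrase splits it into separate words.
tokens = Counter(re.findall(r"\w+", text))

# Approach from the answer: map each full phrase to its count, no tokenizing.
phrase_counts = dict(zip(ingredients, frequencies))

print('cabernet sauvignon' in tokens)        # phrase is lost after tokenizing
print(phrase_counts['cabernet sauvignon'])   # phrase survives as a dict key
```

The `phrase_counts` dict here is the same as `d` above, so it can be passed directly to `generate_from_frequencies`.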
tobias_k answered Oct 20 '25 13:10
