I need to define synonyms for about 100 words of my choice. For testing I am adding the entries manually:
t = {}
t.update({'Strong': ['Strong', 'Able', 'Active', 'Big', 'Energy', 'Firm',
                     'Force', 'Heavy', 'Robust', 'Secure', 'Solid',
                     'Stable', 'Steady', 'Tough', 'Vigor', 'Might',
                     'Rugged', 'Sound']})
t.update({'Fast': ['Fast', 'Agile', 'Brisk', 'Hot', 'Quick', 'Rapid',
                   'Swift', 'Accel', 'Active', 'Dash', 'Flash', 'Fly',
                   'Race', 'Snap', 'Wing', 'Streak', 'Time', 'Chop',
                   'Jiffy', 'Split', 'Bat', 'Crazy', 'Double', 'Scream',
                   'Sonic', 'Super', 'Ball', 'Speed']})
So I am creating an empty dictionary, then taking words like "Strong" and "Fast" and mapping them to lists of synonyms (which I need to be able to choose myself).
Since I only need about 100 different word mappings, is this a reasonable approach? Or is there a better way to implement this?
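For what it's worth, the plain-dict approach can be wrapped in a small lookup helper so that missing words don't raise a KeyError. This is just a sketch; `get_synonyms` is a hypothetical name, not anything from a library:

```python
# A plain dict of word -> list of synonyms, as in the question,
# plus a small helper for safe lookups (hypothetical name).
synonyms = {
    'Strong': ['Strong', 'Able', 'Robust', 'Solid'],
    'Fast': ['Fast', 'Quick', 'Rapid', 'Swift'],
}

def get_synonyms(word):
    """Return the synonym list for word, or an empty list if unknown."""
    return synonyms.get(word, [])

print(get_synonyms('Fast'))     # ['Fast', 'Quick', 'Rapid', 'Swift']
print(get_synonyms('Missing'))  # []
```

For ~100 entries this is perfectly reasonable; dict lookups are O(1) and the memory cost is negligible at that scale.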
I am also looking at using NLTK and its WordNet module. However, that module takes a while to run, and it seems to offer no way of adding my own synonyms the way I need.
You could organize your thesaurus in a graph fashion. First, keep all the words in a dictionary mapping word -> index, and then build the synonym links as an adjacency-list graph, since the graph will be sparse.
w = {'Fast': 0, 'Strong': 1, 'Able': 2, 'Active': 3, 'Big': 4, ...}
t = {0: [1, 2, 3, ...], ...}
This would scale better for large data sets, since storing small integers is cheaper than repeating the same strings in every synonym list.
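As a rough sketch of that layout (the variable names and sample words here are made up for illustration), you can build the index map and the adjacency lists from a string-keyed dict like the one in the question:

```python
# Sketch of the int-indexed layout: assign each word an index,
# then store synonym links as lists of indices (an adjacency list).
raw = {
    'Fast': ['Quick', 'Rapid'],
    'Strong': ['Robust', 'Solid'],
}

# Collect every word (keys and synonyms) and number them.
words = []   # index -> word
index = {}   # word -> index
for key, syns in raw.items():
    for w in [key] + syns:
        if w not in index:
            index[w] = len(words)
            words.append(w)

# Adjacency list: key index -> list of synonym indices.
graph = {index[k]: [index[s] for s in syns] for k, syns in raw.items()}

# Look up synonyms of 'Fast' by going index -> neighbours -> words.
print([words[i] for i in graph[index['Fast']]])  # ['Quick', 'Rapid']
```

Each word string is stored exactly once, and all the cross-references are small integers.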
In an actual thesaurus, individual words may belong to multiple sets of synonyms. For example, "fast" as in quick might be in one list, while "fast" as in secure might be in another.
I would map each word to a list of "sense groups," and then each sense group would map to a list of words.
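A minimal sketch of that two-level mapping, with made-up group ids and a hypothetical `synonyms_of` helper:

```python
# Sense groups: each group id maps to its member words, and each
# word maps back to the groups it belongs to. Group ids are made up.
groups = {
    0: ['Fast', 'Quick', 'Rapid'],   # fast as in quick
    1: ['Fast', 'Secure', 'Firm'],   # fast as in firmly fixed
}

# Invert: word -> list of group ids it appears in.
word_to_groups = {}
for gid, members in groups.items():
    for w in members:
        word_to_groups.setdefault(w, []).append(gid)

def synonyms_of(word):
    """All words sharing any sense group with `word`, excluding itself."""
    result = []
    for gid in word_to_groups.get(word, []):
        for w in groups[gid]:
            if w != word and w not in result:
                result.append(w)
    return result

print(synonyms_of('Fast'))  # ['Quick', 'Rapid', 'Secure', 'Firm']
```

This mirrors how WordNet organizes synsets: a word's synonyms depend on which sense of the word you mean.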