
Manipulating strings in python list

Tags: python

I have a list of tweets that is grouped into chunks of tweets within the list like so:

[[tweet1, tweet2, tweet3],[tweet4,tweet5,tweet6],[tweet7, tweet8, tweet9]]

I want to count the number of occurrences of each word within each subgroup. To do this, I need to split each tweet into individual words. I tried something similar to str.split(' ') on a subgroup, but I receive an error:

AttributeError: 'list' object has no attribute 'split' 

Is there a way to split each tweet into its individual words? The result should look something like:

[['word1', 'word2', 'word3', 'word2', 'word2'],['word1', 'word1', 'word3', 'word4', 'word5'],['word1', 'word3', 'word3', 'word5', 'word6']]
asked Dec 07 '25 09:12 by user3745115



1 Answer

If you have a list of strings:

tweets = ['a tweet', 'another tweet']

then you can split each element using a list comprehension:

split_tweets = [tweet.split(' ')
                for tweet in tweets]
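
One thing worth knowing when splitting real tweets: split(' ') splits on every single space, so repeated or leading/trailing spaces produce empty strings, while split() with no argument splits on any run of whitespace. A small sketch of the difference (the sample tweets here are made up for illustration):

```python
# Made-up sample tweets with irregular spacing.
tweets = ['a  tweet', ' another tweet ']

# Splitting on a single space keeps empty strings where spaces repeat
# or appear at the ends.
split_on_space = [tweet.split(' ') for tweet in tweets]

# Splitting with no argument collapses runs of whitespace and drops
# leading/trailing whitespace.
split_on_whitespace = [tweet.split() for tweet in tweets]

print(split_on_space)       # [['a', '', 'tweet'], ['', 'another', 'tweet', '']]
print(split_on_whitespace)  # [['a', 'tweet'], ['another', 'tweet']]
```

If empty strings would distort your word counts, split() is probably the safer choice.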

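Since you actually have a list of lists of tweets, calling split on a subgroup fails because the subgroup is a list, not a string. Nest the comprehension so that split is called on each tweet: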

tweet_groups = [['tweet 1', 'tweet 1b'], ['tweet 2', 'tweet 2b']]
tweet_group_words = [[word
                      for tweet in group
                      for word in tweet.split(' ')]
                     for group in tweet_groups]

This gives a list of lists of words, one inner list per group.
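
To get from there to the per-group occurrence counts the question asks for, one option is collections.Counter from the standard library. This is only a sketch of one way to do it; the name word_counts is mine, and it uses split() rather than split(' ') on the assumption that you do not want empty strings counted:

```python
from collections import Counter

tweet_groups = [['tweet 1', 'tweet 1b'], ['tweet 2', 'tweet 2b']]

# One Counter per group, mapping each word to the number of times it
# occurs across all tweets in that group.
word_counts = [Counter(word
                       for tweet in group
                       for word in tweet.split())
               for group in tweet_groups]

print(word_counts[0]['tweet'])  # 2
print(word_counts[0]['1b'])     # 1
```

A Counter behaves like a dict, and word_counts[i].most_common(n) returns the n most frequent words in group i.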

If you only care about which distinct words appear in each group (not how often each one occurs), build a set per group instead:

words = [set(word 
             for tweet in group
             for word in tweet.split(' '))
         for group in tweet_groups]
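
If what you need is the number of distinct words per group, taking len of each set should do it. A minimal sketch, with distinct_words and distinct_counts as illustrative names:

```python
tweet_groups = [['tweet 1', 'tweet 1b'], ['tweet 2', 'tweet 2b']]

# The set of distinct words in each group.
distinct_words = [{word
                   for tweet in group
                   for word in tweet.split(' ')}
                  for group in tweet_groups]

# How many distinct words each group contains.
distinct_counts = [len(words) for words in distinct_words]

print(distinct_counts)  # [3, 3]
```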
answered Dec 09 '25 22:12 by tobyodavies
