
Python: join the text values of a DataFrame column into one long string

I have a pandas.DataFrame, df1, as follows.

   date                  text                             name
     1      I like you hair, do you like it              screen1
     2      beautiful sun and wind                       screen2
     3      today is happy, I want to got school         screen3
     4      good movie                                   screen4
     5      thanks god                                   screen1

I want to build one long text string from the values in the text column of df1. The expected result is shown below:

    str_long = "I like you hair, do you like it beautiful sun and wind today is happy, I want to got school good movie thanks god"

Could anyone help me with this please?

tktktk0711 asked Oct 21 '25 18:10


2 Answers

Use the .str.cat() method of a data frame column (Series object):

df["text"].str.cat(sep=" ")

You can apply str.join() on a data frame column as well:

" ".join(df["text"])

Or you can call sum() on the Series. Note that sum() concatenates the strings with no separator, so the words at the boundaries between rows run together:

df["text"].sum()
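
To check that the first two approaches give the string the question asks for, here is a minimal sketch that rebuilds df1 from the question. The variable names via_cat and via_join are only illustrative, and the frame is typed in by hand from the table above.

```python
import pandas as pd

# df1 as shown in the question (typos in the sample text kept as-is).
df1 = pd.DataFrame({
    "date": [1, 2, 3, 4, 5],
    "text": [
        "I like you hair, do you like it",
        "beautiful sun and wind",
        "today is happy, I want to got school",
        "good movie",
        "thanks god",
    ],
    "name": ["screen1", "screen2", "screen3", "screen4", "screen1"],
})

# Series.str.cat with only `sep` joins all values of the column.
via_cat = df1["text"].str.cat(sep=" ")

# str.join accepts any iterable of strings, including a Series.
via_join = " ".join(df1["text"])

print(via_cat == via_join)  # both produce the same long string
print(via_cat)
```

Both variants insert exactly one space between rows, which is why they match the expected str_long.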
alecxe answered Oct 23 '25 07:10


Just use tolist()

' '.join(df['text'].tolist())

Example:

import pandas as pd

df = pd.DataFrame({'date': [1, 2, 3], 'text': ['I like your', 'beautiful sun', 'good movie']})

df
Out[68]: 
   date           text
0     1    I like your
1     2  beautiful sun
2     3     good movie

' '.join(df['text'].tolist())
Out[72]: 'I like your beautiful sun good movie'
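
One thing to watch for, which the question's data does not show: if the column can contain missing values, str.join() will fail on them because they are not strings. The sketch below assumes a hypothetical frame with one missing entry and drops it before joining; it also shows that str.cat() skips missing values by default.

```python
import pandas as pd

# Hypothetical variant of the example frame with one missing text value.
df = pd.DataFrame({"date": [1, 2, 3], "text": ["I like your", None, "good movie"]})

# str.cat() omits missing values when na_rep is left at its default (None).
via_cat = df["text"].str.cat(sep=" ")

# str.join() needs real strings, so drop the missing rows first.
via_join = " ".join(df["text"].dropna().tolist())

print(via_cat)   # 'I like your good movie'
print(via_join)  # 'I like your good movie'
```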
MaThMaX answered Oct 23 '25 06:10


