Perform a Python Split on a Pandas Dataframe

Question

I have the following dataframe:

import pandas as pd

data = {'Test_Step_ID': ['9.1.1', '9.1.2', '9.1.3', '9.1.4'],
        'Protocol_Name': ['A', 'B', 'C', 'D'],
        'Req_ID': ['SRS_0081d', 'SRS_0079', 'SRS_0082SRS_0082a', 'SRS_0015SRS_0015cSRS_0015d']
        }
df = pd.DataFrame(data)

I want to duplicate rows based on the "Req_ID" column: whenever a value contains several concatenated "SRS" IDs, it should be split into one row per ID, with all other column values kept the same. So I want two rows for SRS_0082 and SRS_0082a, and three rows for SRS_0015, SRS_0015c, and SRS_0015d.

Can someone help me here? I appreciate the help. Thanks in advance.

[EDITED]: The desired result was originally attached as an image, which is not available here.

mozway · Accepted Answer

Split on the zero-width position between "SRS" and a preceding character using the regex '(?<=.)(?=SRS)', then explode:

out = (df
  .assign(Req_ID=df['Req_ID'].str.split(r'(?<=.)(?=SRS)'))
  .explode('Req_ID')
 )

Output:

  Test_Step_ID Protocol_Name     Req_ID
0        9.1.1             A  SRS_0081d
1        9.1.2             B   SRS_0079
2        9.1.3             C   SRS_0082
2        9.1.3             C  SRS_0082a
3        9.1.4             D   SRS_0015
3        9.1.4             D  SRS_0015c
3        9.1.4             D  SRS_0015d
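
The output above keeps the original index, so the labels 2 and 3 repeat. If a fresh 0..n-1 index is preferred, a minimal sketch follows. It assumes pandas 1.1 or later (for the `ignore_index` parameter of `explode`) and Python 3.7 or later (for splitting on empty matches). It also applies a compiled `re` pattern row by row instead of calling `str.split`; that substitution is made here for illustration and is not part of the answer.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Test_Step_ID': ['9.1.1', '9.1.2', '9.1.3', '9.1.4'],
    'Protocol_Name': ['A', 'B', 'C', 'D'],
    'Req_ID': ['SRS_0081d', 'SRS_0079', 'SRS_0082SRS_0082a',
               'SRS_0015SRS_0015cSRS_0015d'],
})

# Same zero-width pattern as in the answer, compiled once.
pattern = re.compile(r'(?<=.)(?=SRS)')

out = (df
  # Turn each Req_ID string into a list of IDs.
  .assign(Req_ID=df['Req_ID'].map(pattern.split))
  # One row per list element; ignore_index=True renumbers the rows.
  .explode('Req_ID', ignore_index=True)
 )

print(out)
```

The only difference from the answer's result should be the index, which now runs from 0 to 6 instead of repeating 2 and 3.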

Regex:

(?<=.)  # lookbehind: some character must precede the split point (so no split at the start of the string)
(?=SRS) # lookahead: "SRS" must follow the split point
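
To see what the lookbehind contributes, the pattern can be tried on a single string with the standard `re` module. This is a small sketch outside pandas, assuming Python 3.7 or later, where `re.split` splits on empty matches; the variable names are illustrative only.

```python
import re

s = 'SRS_0015SRS_0015cSRS_0015d'

# Lookahead only: the empty match at position 0 also counts as a
# split point, so the result starts with an empty string.
lookahead_only = re.split(r'(?=SRS)', s)

# With the lookbehind, a character must precede the split point,
# which rules out position 0.
with_lookbehind = re.split(r'(?<=.)(?=SRS)', s)

print(lookahead_only)   # ['', 'SRS_0015', 'SRS_0015c', 'SRS_0015d']
print(with_lookbehind)  # ['SRS_0015', 'SRS_0015c', 'SRS_0015d']
```

Because neither assertion consumes characters, the "SRS" prefix stays attached to each piece and nothing has to be added back afterwards.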


Pravash Panigrahi · Answer

I have modified your code; you can try this:

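# Split on the literal 'SRS_' prefix; the prefix itself is consumed
df['Req_ID'] = df['Req_ID'].str.split('SRS_')

# One row per list element
df = df.explode('Req_ID')

# Drop the empty strings left over from the leading 'SRS_';
# copy so the assignment below works on an independent frame
df['Req_ID'] = df['Req_ID'].str.strip()
df = df[df['Req_ID'].ne('')].copy()

# Put the consumed prefix back
df['Req_ID'] = 'SRS_' + df['Req_ID']

print(df)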
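
The filtering and re-prefixing steps are needed because a plain split on 'SRS_' consumes the separator and leaves an empty first element. A minimal sketch on a single string, using only built-in string methods (the variable names are illustrative):

```python
raw = 'SRS_0082SRS_0082a'

# The string starts with the separator, so the first piece is empty.
parts = raw.split('SRS_')

# Drop empty pieces and restore the prefix that split() removed.
ids = ['SRS_' + p for p in parts if p]

print(parts)  # ['', '0082', '0082a']
print(ids)    # ['SRS_0082', 'SRS_0082a']
```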

Tags:

python

pandas

ruser
