How do I split a string into its first n digits and the rest?
Here's my data:
Id  actual_pattern
1   100101
2   10101
3   1010101
4   101
Here's the expected output:
cut_pattern1 is the first 4 digits of actual_pattern.
cut_pattern2 is whatever remains after cut_pattern1; if nothing remains, set cut_pattern2 = 0.
If cut_pattern2 contains any 1, set binary_cut2 = 1; otherwise set binary_cut2 = 0.
Id  actual_pattern  cut_pattern1  cut_pattern2  binary_cut2
1   100101          1001          01            1
2   10101           1010          1             1
3   1010101         1010          101           1
4   101             101           0             0
Create the new columns by slicing with the str accessor, use replace to turn empty strings into '0', and build the flag column with Series.str.contains cast to integers:
df['actual_pattern'] = df['actual_pattern'].astype(str)
df['cut_pattern1'] = df['actual_pattern'].str[:4]
df['cut_pattern2'] = df['actual_pattern'].str[4:].replace('','0')
df['binary_cut2'] = df['cut_pattern2'].str.contains('1').astype(int)
print(df)
   Id actual_pattern cut_pattern1 cut_pattern2  binary_cut2
0   1         100101         1001           01            1
1   2          10101         1010            1            1
2   3        1010101         1010          101            1
3   4            101          101            0            0
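For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and wraps the same three steps in a function. The helper name add_cut_columns and its parameter n are made up for illustration and are not part of pandas.

```python
import pandas as pd


def add_cut_columns(df, n=4):
    """Hypothetical helper: split actual_pattern after its first n characters."""
    out = df.copy()
    s = out['actual_pattern'].astype(str)
    # First n characters of each pattern.
    out['cut_pattern1'] = s.str[:n]
    # Remainder; an empty remainder is replaced by the placeholder '0'.
    out['cut_pattern2'] = s.str[n:].replace('', '0')
    # 1 if the remainder contains the character '1', otherwise 0.
    out['binary_cut2'] = out['cut_pattern2'].str.contains('1').astype(int)
    return out


# Sample data from the question, stored as integers here.
df = pd.DataFrame({'Id': [1, 2, 3, 4],
                   'actual_pattern': [100101, 10101, 1010101, 101]})
result = add_cut_columns(df)
print(result)
```

Working on a copy leaves the input frame untouched, and n can be changed if the cut should happen somewhere other than after the fourth digit.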
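EDIT:
Solution for the case from @Rick Hitchcock in the comments. The same code applies; note that a value such as 00001111 keeps its leading zeros only if actual_pattern is already stored as strings: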
df['actual_pattern'] = df['actual_pattern'].astype(str)
df['cut_pattern1'] = df['actual_pattern'].str[:4]
df['cut_pattern2'] = df['actual_pattern'].str[4:].replace('','0')
df['binary_cut2'] = df['cut_pattern2'].str.contains('1').astype(int)
print(df)
   Id actual_pattern cut_pattern1 cut_pattern2  binary_cut2
0   1         100101         1001           01            1
1   2          10101         1010            1            1
2   3        1010101         1010          101            1
3   4       00001111         0000         1111            1
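A pattern like 00001111 survives only when the column is read as text. The sketch below assumes the data comes from a CSV file; the csv_text content is invented for illustration. It compares the default integer parsing with dtype=str in read_csv.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = "Id,actual_pattern\n1,100101\n4,00001111\n"

# Default parsing infers integers, so the leading zeros are dropped.
as_int = pd.read_csv(io.StringIO(csv_text))
# Forcing the column to str keeps the digits exactly as written.
as_str = pd.read_csv(io.StringIO(csv_text), dtype={'actual_pattern': str})

lost = as_int['actual_pattern'].astype(str).tolist()
kept = as_str['actual_pattern'].tolist()
print(lost)  # ['100101', '1111']
print(kept)  # ['100101', '00001111']
```

With the integer version, astype(str) comes too late: the zeros are already gone, and cut_pattern1 for that row would be 1111 instead of 0000.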