sed 's/\t/_tab_/3g'
I have a sed command that replaces all excess tab delimiters in my text documents. The documents are supposed to have 3 columns, but occasionally there's an extra delimiter. I don't have control over the files.
I use the above command to clean up the documents. However, all my other operations on these files are in Python. Is there a way to do the same thing as the sed command in Python?
sample input:
Column1 Column2 Column3
James 1,203.33 comment1
Mike -3,434.09 testing testing 123
Sarah 1,343,342.23 there here
sample output:
Column1 Column2 Column3
James 1,203.33 comment1
Mike -3,434.09 testing_tab_testing_tab_123
Sarah 1,343,342.23 there_tab_here
You can read the file line by line, split each line on tabs, and if there are more than 3 items, keep the first two and join the third item and everything after it with _tab_:
lines = []
with open('inputfile.txt', 'r') as fr:
    for line in fr:
        split = line.split('\t')
        if len(split) > 3:
            tmp = split[:2]  # Slice the first two items
            tmp.append("_tab_".join(split[2:]))  # Append the rest joined with _tab_
            lines.append("\t".join(tmp))  # Use the updated line
        else:
            lines.append(line)  # Else, put the line as is
See the Python demo
The lines variable will contain something like
Mike -3,434.09 testing_tab_testing_tab_123
Mike -3,434.09 testing_tab_256
No operation here
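A shorter variant of the same idea, as a rough sketch: str.split takes a maxsplit argument, so you can split on only the first two tabs and then replace whatever tabs are left in the last piece. The function name replace_extra_tabs and its columns and token parameters are made up for this example; they are not part of the code above.

```python
def replace_extra_tabs(line, columns=3, token="_tab_"):
    """Replace every tab after the first (columns - 1) tabs with token.

    Hypothetical helper: mirrors sed 's/\\t/_tab_/3g' when columns is 3.
    """
    # Split at most (columns - 1) times, so the last piece keeps any surplus tabs
    parts = line.split("\t", columns - 1)
    # Only the last piece can still contain tabs; swap them for the token
    parts[-1] = parts[-1].replace("\t", token)
    return "\t".join(parts)


if __name__ == "__main__":
    sample = "Mike\t-3,434.09\ttesting\ttesting\t123\n"
    print(replace_extra_tabs(sample), end="")
```

Lines with three or fewer columns come back unchanged, so no length check is needed, and the trailing newline is kept because it stays inside the last piece. In the loop above you could then build the list with lines = [replace_extra_tabs(line) for line in fr].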