
Split a single column into two

Tags: python, bash, awk

The data format I have is as follows:

###John###
someData1
someData2
SomeData3
###Mike###
someData1
someData2
###Ford###
someData1
someData2
SomeData3
someData4
someData5
SomeData6

I want the output to be:

John  someData1
      someData2
      someData3

Mike  someData1
      someData2

Ford  someData1
      someData2
      someData3
      someData4
      someData5
      someData6

The problem is that the number of data lines (someData?) beneath each name varies and is not known in advance. The only thing I have to work with is the leading ### characters, which signify the beginning of a new name.

Each someData? entry is a single word. Any idea how to accomplish this?

0x0 asked Jan 02 '26 08:01


2 Answers

I'd use something like:

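def fixup(iterable):
    it = iter(iterable)
    for x in it:
        x = x.rstrip('\n')  # lines read from a file keep their newline
        if x.startswith('###'):
            # Header: emit the name together with the first data line after it.
            # next(it, '') avoids an error if the input ends on a header.
            yield '\n{0}\t{1}'.format(x.strip('#'), next(it, '').rstrip('\n'))
        else:
            yield '\t{0}'.format(x)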

This'll give you an extra newline on the first line, but that can easily be stripped off if you really want to.
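
For illustration, one way to drop that leading newline is to join the generator's output and strip it once. This is only a sketch: it assumes the input lines have already had their trailing newlines removed (for example via str.splitlines()), and the helper name render and the sample list are made up for the example.

```python
def fixup(iterable):
    # The generator from above, repeated so the sketch runs on its own.
    # next(it, '') is used so that input ending on a header does not raise.
    it = iter(iterable)
    for x in it:
        if x.startswith('###'):
            # Header line: emit the name plus the first data line after it.
            yield '\n{0}\t{1}'.format(x.strip('#'), next(it, ''))
        else:
            yield '\t{0}'.format(x)


def render(lines):
    # Hypothetical helper: join the pieces, then strip the leading newline
    # that the first header contributes.
    return '\n'.join(fixup(lines)).lstrip('\n')


sample = ['###John###', 'someData1', 'someData2', '###Mike###', 'someData1']
print(render(sample))
```

Because every header piece starts with '\n', joining on '\n' leaves a blank line between groups, and the single lstrip only affects the very first one.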

mgilson answered Jan 03 '26 21:01


An itertools approach:

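from itertools import groupby

with open('yourfile') as fin:
    # groupby yields alternating runs of header lines (k is True)
    # and data lines (k is False)
    for k, g in groupby(fin, lambda L: L.startswith('###')):
        if k:
            name = next(g).strip('#\n')
        else:
            # each line still ends with its own newline, hence end=''
            print('{}\t{}'.format(name, next(g)), end='')
            for line in g:
                print('\t{}'.format(line), end='')
            print()  # blank line between groups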
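
If the space-aligned layout shown in the question is wanted rather than tabs, the name can be padded to a fixed width instead. A rough sketch of that idea, not part of either answer: the function name align and the width parameter are invented here, and it assumes the whole input fits in memory.

```python
def align(lines, width=6):
    # Hypothetical helper: pad each name to `width` columns and put a
    # blank line between groups, as in the question's desired output.
    out = []
    name = ''
    first_in_group = False
    for raw in lines:
        line = raw.rstrip('\n')
        if line.startswith('###'):
            name = line.strip('#')
            first_in_group = True
            if out:
                out.append('')  # separator before every group but the first
        elif line:
            # Only the first data line of a group carries the name.
            prefix = name if first_in_group else ''
            out.append(prefix.ljust(width) + line)
            first_in_group = False
    return '\n'.join(out)


sample = ['###John###\n', 'someData1\n', 'someData2\n',
          '###Mike###\n', 'someData1\n']
print(align(sample))
```

A width of 6 matches the four-letter names in the sample; longer names would need a larger width, or one computed from the input.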

Jon Clements answered Jan 03 '26 20:01


