
Splitting file by columns

Tags:

linux

bash

I know about the cut command, which can extract one or more columns from a file. But what can I use to split a file into multiple files, one per column, so that each output file is named after the first line of its column?

Example

Columns are separated by TABs and can have different lengths. I would like the first file to contain the row names.

Probe File1.txt File2.txt File3.txt
"1007_s_at" 7.84390328616472 7.60792223630275 7.77487266222512
...

Also, the original file is extremely large, so I would like a solution that splits it in a single pass, that is, without calling cut repeatedly.

Sergej Andrejev asked Jan 29 '26 05:01


1 Answer

You can do it with one line of awk:

$ cat test.tsv
field1  field2  field3  field4
asdf    asdf    asdf    asdf
lkjlkj  lkjlkj  lkjlkj  lkjlkj
feh     feh     feh     bmeh

$ awk -F'\t' 'NR==1 {  for(i=1;i<=NF;i++) { names[i] = $i }; next } { for(i=1;i<=NF;i++) print $i >> names[i] }' test.tsv

$ ls
field1  field2  field3  field4  test.tsv

$ cat field4
asdf
lkjlkj
bmeh

Edited to include the TAB separator, courtesy of Glenn Jackman.
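
A note on `>>`: it appends, so running the command a second time adds the same rows again to the files left over from the first run. Here is a minimal sketch of one way around that, assuming you are fine with writing into a separate directory. Inside awk, `>` truncates a file only when it is first opened during a run and then keeps writing to it, so it can stand in for `>>` here. The `outdir` variable and the `split_demo` directory are names made up for this illustration.

```shell
# Recreate the sample input with real TAB separators.
printf 'field1\tfield2\tfield3\tfield4\n'  > test.tsv
printf 'asdf\tasdf\tasdf\tasdf\n'         >> test.tsv
printf 'lkjlkj\tlkjlkj\tlkjlkj\tlkjlkj\n' >> test.tsv
printf 'feh\tfeh\tfeh\tbmeh\n'            >> test.tsv

# Hypothetical output directory, so the column files stay out of the way.
mkdir -p split_demo

awk -F'\t' -v outdir=split_demo '
    # Header row: build one output path per column, then skip to the next line.
    NR == 1 {
        for (i = 1; i <= NF; i++) names[i] = outdir "/" $i
        next
    }
    # Data rows: write field i to the file for column i.
    # ">" truncates each file once per run, so reruns do not pile up rows.
    {
        for (i = 1; i <= NF; i++) print $i > names[i]
    }
' test.tsv

cat split_demo/field4
```

The final `cat` prints asdf, lkjlkj and bmeh, one per line, no matter how many times the script is run. Some awk implementations limit how many output files can be open at once, which may matter if the input has a very large number of columns.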


Addition

Removing double quotes from the fields:

awk -F'\t' 'NR==1 {  for(i=1;i<=NF;i++) { names[i] = $i }; next } { for(i=1;i<=NF;i++) {gsub(/"/,"",$i); print $i >> names[i] }}' example.tsv

Additional Addition

Removing double quotes from fields, only at the start or end of the field:

awk -F'\t' 'NR==1 {  for(i=1;i<=NF;i++) { names[i] = $i }; next } { for(i=1;i<=NF;i++) {gsub(/^"|"$/,"",$i); print $i >> names[i] }}' example.tsv
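
To see what the anchored pattern does differently, here is a small sketch run against the header and first row from the question. The second data row is made up purely to show a quote in the middle of a field, and `outdir` / `quotes_demo` are illustrative names, not part of the original commands.

```shell
# Header and first data row are taken from the question;
# the second data row is invented to show an embedded quote.
printf 'Probe\tFile1.txt\tFile2.txt\tFile3.txt\n' > example.tsv
printf '"1007_s_at"\t7.84390328616472\t7.60792223630275\t7.77487266222512\n' >> example.tsv
printf '"ab"cd"\t1\t2\t3\n' >> example.tsv

# Hypothetical output directory for the per-column files.
mkdir -p quotes_demo

awk -F'\t' -v outdir=quotes_demo '
    NR == 1 { for (i = 1; i <= NF; i++) names[i] = outdir "/" $i; next }
    {
        for (i = 1; i <= NF; i++) {
            gsub(/^"|"$/, "", $i)   # strip only a leading or trailing quote
            print $i > names[i]
        }
    }
' example.tsv

cat quotes_demo/Probe
```

The Probe file ends up with 1007_s_at on the first line and ab"cd on the second: the outer quotes are gone, but the one in the middle survives. With the unanchored `gsub(/"/,"",$i)` the second line would be abcd instead.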

MattH answered Jan 30 '26 20:01