I do have a CSV which looks like this*:
system,subject,value1,value2
example.org,thing 1,100,4
exmaple.org,thing 2,90,0
example.com,thing 1,200,0
example.com,thing 5,10,10
The header us actually not included, but shown here to make it easier to read the example.
And I want to split that up into two files:
example.org.csv with:
thing 1,100,4
thing 2,90,0
example.com.csv with:
thing 1,200,0
thing 5,10,10
My current solution works this way:
while read line; do
SYSTEM=$(echo "$line" | cut -d, -f1)
NOTTHESYSTEM=$(echo "$line" | cut -d, -f2-)
echo "${NOTTHESYSTEM}" >> "${SYSTEM}.csv"
done <$INPUT
But this is working very inefficient and doesn't perform well with bigger files.
In numbers this means that a 52050 line/ 9 MB file needs about 250 secounds to finish the split.
Any suggestions how to improve the script above are welcome.
Cheers
Using awk it will be simpler:
awk 'BEGIN{FS=OFS=","} {print $2, $3, $4 > $1 ".csv"}' "$INPUT"
Verification:
cat example.org.csv
thing 1,100,4
thing 2,90,0
cat example.com.csv
thing 1,200,0
thing 5,10,10
One way to do it, assuming you have a reasonably small number of systems:
cut -d, -f1 file.csv | \
sort -u | \
while read -r system; do
fgrep -w "$system" file.csv | cut -d, -f2- >"$system".csv
done
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With