I am trying to have an OUTFILE query run as a single process per value in an array, to speed up exporting data from MySQL; I'd like to run the script on multiple cores. My bash script is:
dbquery=$(mysql -u user -p -e "SELECT distinct(ticker) FROM db.table")
array=( $( for i in $dbquery ; do echo $i ; done ) )
csv ()
{
    dbquery=$(mysql -u user --password=password -e "SELECT * FROM db2.table2 WHERE symbol = '$i' INTO OUTFILE '/tmp/$i.csv' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'")
}
set -m
for i in `seq 28`; do # trying to run on 28 cores
    for j in ${array[@]}; do
        csv $j &
    done
    sleep 5 &
done
while [ 1 ]; do
    fg 2> /dev/null; [ $? == 1 ] && break
done
When I ran this, it did not export the files as I wished, and I cannot figure out how to kill the processes. Could you help me understand how to fix this so that it runs the OUTFILE query per ticker? Also, how do I kill the currently running script without killing other scripts and programs that are running?
You can use xargs to automatically handle job scheduling:
dbquery=$(mysql -u user -p -e "SELECT distinct(ticker) FROM db.table")
array=( $( for i in $dbquery ; do echo $i ; done ) )
csv ()
{
    # Use the ticker passed as the first argument ($1), not the unset
    # loop variable $i. The OUTFILE clause writes straight to disk, so
    # there is nothing useful to capture into a variable.
    mysql -u user --password=password -e "SELECT * FROM db2.table2 WHERE symbol = '$1' INTO OUTFILE '/tmp/$1.csv' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'"
}
export -f csv
echo "${array[@]}" | xargs -P 28 -n 1 bash -c 'csv "$1"' --
The problem with your approach is that, because the loops are nested, you start every process 28 times, rather than running each of them once, 28 at a time.
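As for stopping the script without touching anything else: when you launch a script from an interactive shell, job control gives it its own process group, and every background csv/mysql job it forks stays in that group, so you can signal the whole group at once. A minimal sketch, assuming the script is saved as export_tickers.sh (a hypothetical name):

# Launch the exporter as a background job; under job control its PID
# doubles as the ID of a fresh process group that will also contain
# every csv/mysql job it starts.
./export_tickers.sh &
pgid=$!

# Later: terminate the script and all of its children in one shot.
# The '--' ends option parsing, and the negative PID means
# "send SIGTERM to the entire process group".
kill -TERM -- -"$pgid"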