We have 500GB file with integer rows. How we can sort it with only 512Mb RAM using Node.js? I think something like that:
Some ideas?
UPDATE: Thanks to user some-random-it-boy This solution based on child-process with native sort util. I think it should work)
var fs = require('fs'),
spawn = require('child_process').spawn,
sort = spawn('sort', ['in.txt']);
var writer = fs.createWriteStream('out.txt');
sort.stdout.on('data', function (data) {
writer.write(data)
});
sort.on('exit', function (code) {
if (code) console.log(code); //if some error
writer.end();
});
I hate to give a non-js solution to a js question. But since you are using the node environment why not delegate this task to a process that has been designed just for that?
With your package child-process, call the sort
(docs here) command with whatever parameters you need.
Quoting from this answer:
According to the algorithm used by sort, it will use memory according to what is available: half of the biggest number between TotalMem/8 and AvailableMem. So, for example, if you have 4 GB of available mem (out of 8 GB), sort will use 2GB of RAM. It should also create many 2 GB files in /bigdisk and finally merge-sort them.
Which essentially was what you suggested doing, already implemented and in C running on bare hardware without any interpreters in between. I guess you can't get faster than that within your constraints :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With