Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Javascript and collections

Imaging you have 3 lists/arrays in javascript/NodeJs

  1. array contains 1.000.000 data-items
  2. array contains 100.000 data-items
  3. array contains 50.000 data-items

Each data-item is a object with 2 properties - Like a date and a price. All items in array 2 and 3 are subsets/sub-list of items from array 1.

My question: How do i in the fastest way - match all dates of each item from array 1 - with all the dates in array 2 and 3 - for each single item in array 1?

I'm used to .NET/C# - where something like 'contains(item)' is nice... right now in NodeJS i use 3 for-loops - which is way to slow...i need some kind of index or the like to speed up the process...

An example of data could be like:

Input:

array 1: 1,2,3,4,5,6,7,8,10
array 2: 2,3,5,7,9
array 3: 1,4,5,10

Out-put (written to a file):

1,'',1
2,2,''
3,3,''
4,'',4
..ect...
like image 829
PabloDK Avatar asked Jun 06 '26 04:06

PabloDK


2 Answers

var t = {}
// loop through once and create a constant time lookup
arr1.forEach(function(item){t[item.date] = {1: item.price, 2: null, 3:null})
arr2.forEach(function(item){if (t[item.date]) t[item.date].2 = item.price})
arr3.forEach(function(item){if (t[item.date]) t[item.date].3 = item.price})

This will be a linear operation, the tradeoff with sorting the data first may or may not be worth the time to do the sorting.

It would be about the same as a triple JOIN either way, the solution I have provided is O(N) where as nested loops might be O(N^3) a sorted solution would still probably be O(Nlog(N)) just a guess.

If the dates are already sorted you could potentially bucketize the dates or do some sort of radix search, it might speed it up a bit.

See: https://en.m.wikipedia.org/wiki/Radix_tree

You might also be able to do it with promises so it runs async:

var t = {}
// loop through once and create a constant time lookup
arr1.forEach(function(item){t[item.date] = {1: item.price, 2: null, 3:null})

var promiseArray = arr2.map(function(item){
    return Promise.resolve(item)
        .then(function(item){
              if (t[item.date]) t[item.date].2 = item.price})
         })
// concat the two promise arrays together 
promiseArray.concat(arr3.map(function(item){
    return Promise.resolve(item)
        .then(function(item){
              if (t[item.date]) t[item.date].3 = item.price})
         }))
// resolve all the promises
Promise.all(promiseArray)
    .then(function(){
        // t has results
        debugger
    })
like image 86
jmunsch Avatar answered Jun 08 '26 18:06

jmunsch


I'd try to firstly sort all arrays by your key property (Date, afaiu) (if not sorted yet), then use a single for loop over the 1st array with cursors in two other arrays, which would move to the next item only when the current item has been written to the output. This way there wouldn't be any "contains" search through the whole arrays.

Example:

var j = 0;
var k = 0;
for( var i = 0; i < array1.length; ++i ) {
    var out1 = array1[i].date;
    var out2 = j < array2.length && array2[j].date == out1 ? array2[j++].value : '';
    var out3 = k < array3.length && array3[k].date == out1 ? array3[k++].value : '';
    output( out1, out2, out3 );
}
like image 20
JustAndrei Avatar answered Jun 08 '26 18:06

JustAndrei



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!