
How to improve this hash function

Tags: r, hash

Is there any way to speed up the initialization of this hash? Currently it takes around 20 minutes on my machine.

# prepare hash()
hash <- list()

# mappedV is a matrix with more than 200,000 elements (definition omitted)
for (i in seq_len(nrow(mappedV))) {
  # the key is the row's values joined with '.'
  hash[[paste(mappedV[i, ], collapse = '.')]] <- 0
}

Before this piece of code I used a matrix, but that took more than 3 hours, so I won't complain about the 20 minutes. I am just curious whether there are better alternatives. I use the hash to count each of the 200,000 possible combinations.

PS: Concurrency may be one option, but it doesn't improve the hashing itself.

Christian asked Dec 05 '25 06:12


1 Answer

You'll often save significant time by pre-allocating a list of the desired length, rather than growing it at each iteration.

Behold:

X <- vector(mode="list", 1e5)
Y <- list()

system.time(for(i in 1:1e5) X[[i]] <- 0)
#    user  system elapsed 
#     0.3     0.0     0.3 
system.time(for(i in 1:1e5) Y[[i]] <- 0)
#    user  system elapsed 
#   48.84    0.05   49.34 
identical(X,Y)
# [1] TRUE

Because the entire list Y gets copied each time it's added to, appending additional elements only gets slower and slower as it grows in size.
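Applied to the question's loop, the same idea might look like the sketch below. It is only an illustration under assumptions: the small `mappedV` here is a hypothetical stand-in for the real matrix, and `keys` and `counts` are names introduced for this example. It builds every key in one vectorized `paste()` call and creates the named counters at full length up front, so nothing has to grow inside a loop.

```r
# Hypothetical stand-in for the question's mappedV (one combination per row).
mappedV <- matrix(c(1, 2, 3,
                    4, 5, 6,
                    7, 8, 9),
                  nrow = 3, byrow = TRUE)

# Build all keys at once: paste the columns together element-wise,
# which gives the same strings as paste(mappedV[i, ], collapse = '.') per row.
keys <- do.call(paste, c(as.data.frame(mappedV), sep = "."))

# Drop repeated keys (a list indexed by name would also keep one entry per key),
# then pre-allocate one zero counter per key and name them in a single step.
keys <- unique(keys)
counts <- setNames(numeric(length(keys)), keys)

# Incrementing a counter by key afterwards:
counts["4.5.6"] <- counts["4.5.6"] + 1
```

A named numeric vector is used here instead of a list because the counters are all plain numbers; if the goal is only to count how often each combination occurs, calling `table()` on the full vector of keys is another option worth trying.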

Josh O'Brien answered Dec 07 '25 04:12


