Parallelize user-defined function using apply family in R

Question

I have a script that takes too long to compute and I'm trying to parallelize its execution.

The script basically loops through each row of a data frame and performs some calculations, as shown below:

my.df = data.frame(id=1:9,value=11:19)

sumPrevious <- function(df,df.id){
    sum(df[df$id<=df.id,"value"])
}

for(i in 1:nrow(my.df)){
    print(sumPrevious(my.df,my.df[i,"id"]))
}

I'm just starting to learn how to parallelize code in R, so I first want to understand how I could do this with an apply-like function (e.g. sapply, lapply, mapply).

I've tried several things, but nothing has worked so far:

mapply(sumPrevious,my.df,my.df$id) # Error in df$id : $ operator is invalid for atomic vectors

tushaR · Accepted Answer

With the parallel package in R you can use the mclapply() function. You will need to adjust your code slightly so that the function takes a row index as its first argument and can be run in parallel.

library(parallel)
my.df = data.frame(id=1:9,value=11:19)

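sumPrevious <- function(i, df) {
    df.id <- df$id[i]   # id of the i-th row
    sum(df[df$id <= df.id, "value"])
}

# Number of worker processes; detectCores() reports the cores on this machine
no.of.cores <- detectCores()

mclapply(X = 1:nrow(my.df), FUN = sumPrevious, my.df, mc.preschedule = TRUE, mc.cores = no.of.cores)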

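This runs sumPrevious in parallel across no.of.cores cores on your machine and returns the results as a list. Note that mclapply() relies on forking, so on Windows it only works with mc.cores = 1.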
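For comparison, here is a minimal sketch of the same computation with sapply, with mapply, and with a socket cluster via parLapply, which should also work on Windows. It assumes the index-based sumPrevious(i, df) signature used in this answer; the worker count of 2 is an arbitrary choice for illustration. The mapply call in the question fails because mapply iterates over the columns of the data frame, so the function receives an atomic vector; passing the data frame through MoreArgs keeps it whole.

```r
library(parallel)

my.df <- data.frame(id = 1:9, value = 11:19)

# Same signature as in the answer: i is a row index, df is the data frame.
sumPrevious <- function(i, df) {
    df.id <- df$id[i]
    sum(df[df$id <= df.id, "value"])
}

# Serial version: arguments after FUN are passed on to sumPrevious.
serial.result <- sapply(seq_len(nrow(my.df)), sumPrevious, df = my.df)

# mapply version: MoreArgs passes the whole data frame to every call
# instead of iterating over its columns.
mapply.result <- mapply(sumPrevious, seq_len(nrow(my.df)),
                        MoreArgs = list(df = my.df))

# Socket cluster: separate R worker processes, no forking required.
cl <- makeCluster(2)  # 2 workers: arbitrary choice for this sketch
parallel.result <- parLapply(cl, seq_len(nrow(my.df)), sumPrevious, df = my.df)
stopCluster(cl)

print(serial.result)            # 11 23 36 50 65 81 98 116 135
print(unlist(parallel.result))  # same values as the serial run
```

For a data frame this small the parallel versions will likely be slower than sapply because of the cost of starting workers and shipping my.df to them; the benefit only appears when each call does substantial work.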

Tags: r, parallel-processing, lapply, sapply, mapply

Victor
