
stdout and stderr of parallel computation in R

I am using the package parallel to do computation. Here is a toy example:

library(parallel)
m = matrix(c(1,1,1,1,0.2,0.2,0.2,0.2), nrow=2)
myFun = function(x) {
  if (any(x<0.5)) {
    write("less than 0.5", stderr())
    return(NA)
  } else {
    write("good", stdout())
    return(mean(x))
  }
}
cl = makeCluster(2, outfile="/tmp/output")
parApply(cl, m, 2, myFun)
stopCluster(cl)

The problem is that both stdout and stderr are redirected to /tmp/output. The output file looks like this:

starting worker pid=51083 on localhost:11953 at 11:37:12.966
starting worker pid=51093 on localhost:11953 at 11:37:13.261
good
good
less than 0.5
less than 0.5

Is there any way to set up two separate files, one for stdout and one for stderr? And how can I suppress the first two "starting worker pid=..." lines?

RNA asked Oct 24 '25 04:10


1 Answer

The parallel package doesn't directly support sending stdout and stderr to separate files, but you can do it yourself:

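cl = makeCluster(2)

# Runs on each worker: open both files in append mode and divert output to them.
# The connections are stored in the worker's global environment so that
# shutdown() can find and close them later.
setup = function(outfile, errfile) {
  assign("outcon", file(outfile, open="a"), pos=.GlobalEnv)
  assign("errcon", file(errfile, open="a"), pos=.GlobalEnv)
  sink(outcon)                  # stdout
  sink(errcon, type="message")  # stderr
}

# Runs on each worker: restore the default sinks, then close the files.
shutdown = function() {
  sink(NULL)
  sink(NULL, type="message")
  close(outcon)
  close(errcon)
  rm(outcon, errcon, pos=.GlobalEnv)
}

clusterCall(cl, setup, "/tmp/output", "/tmp/errmsg")
parApply(cl, m, 2, myFun)
clusterCall(cl, shutdown)
stopCluster(cl)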

This also takes care of the "starting worker" lines: they are issued before setup is called, so they go to "/dev/null", which is where worker output is sent by default when outfile isn't specified.
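To check that the two streams really end up in different files, the whole thing can be run end to end and the files read back. This is only a sketch: it repeats the question's m and myFun so that it runs on its own, and it uses tempfile() paths (my choice, not part of the original) instead of /tmp/output and /tmp/errmsg so that it does not append to files that already exist.

```r
library(parallel)

# Assumption: temporary files stand in for /tmp/output and /tmp/errmsg.
outfile <- tempfile("out_")
errfile <- tempfile("err_")

# Toy data and function from the question.
m <- matrix(c(1, 1, 1, 1, 0.2, 0.2, 0.2, 0.2), nrow = 2)
myFun <- function(x) {
  if (any(x < 0.5)) {
    write("less than 0.5", stderr())
    NA
  } else {
    write("good", stdout())
    mean(x)
  }
}

# Per-worker redirection, as in the answer.
setup <- function(outfile, errfile) {
  assign("outcon", file(outfile, open = "a"), pos = .GlobalEnv)
  assign("errcon", file(errfile, open = "a"), pos = .GlobalEnv)
  sink(outcon)
  sink(errcon, type = "message")
}
shutdown <- function() {
  sink(NULL)
  sink(NULL, type = "message")
  close(outcon)
  close(errcon)
  rm(outcon, errcon, pos = .GlobalEnv)
}

cl <- makeCluster(2)
invisible(clusterCall(cl, setup, outfile, errfile))
res <- parApply(cl, m, 2, myFun)
invisible(clusterCall(cl, shutdown))
stopCluster(cl)

# Read both files back on the master to see what each one received.
out_lines <- readLines(outfile)
err_lines <- readLines(errfile)
print(out_lines)
print(err_lines)
```

Since the files are opened with open="a", every worker appends to the same two files; with many workers writing at once the lines may interleave, so one file per worker (for example with Sys.getpid() in the file name) might be preferable in that case.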

Steve Weston answered Oct 26 '25 18:10