To see the console messages printed by a function running in a foreach() loop, I followed advice I found elsewhere and added a sink() call like so:
   library(foreach)    
   library(doMC)
   cores <- detectCores()
   registerDoMC(cores)
   X <- foreach(i=1:100) %dopar%{
   sink("./out/log.branchpies.txt", append=TRUE)
   cat(paste("\n","Starting iteration",i,"\n"), append=TRUE)
   myFunction(data, argument1="foo", argument2="bar")
   }
However, at iteration 77 I got the error 'sink stack is full'. There are well-answered questions about avoiding this error when using for-loops, but not foreach. What's the best way to write the otherwise-hidden foreach output to a file?
This runs without errors on my Mac:
library(foreach)    
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
  sink("log.branchpies.txt", append=TRUE)
  cat(paste("\n","Starting iteration",i,"\n"))
  sink() #end diversion of output
  rnorm(i*1e4)
}
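To see why the closing sink() matters: every sink(file) call pushes a new diversion onto a stack, and only a bare sink() pops it again. R allows a limited number of nested diversions, so a worker that handles many iterations without popping eventually hits 'sink stack is full'. A minimal single-process sketch (the temporary file is only for illustration) that watches the stack depth with sink.number():

```r
log_file <- tempfile(fileext = ".txt")  # throwaway file for the demo

depth_before <- sink.number()   # current depth of the output sink stack

sink(log_file, append = TRUE)   # push one diversion
sink(log_file, append = TRUE)   # push a second one without popping the first
depth_open <- sink.number()     # two deeper than before

sink()                          # pop the second diversion
sink()                          # pop the first diversion
depth_after <- sink.number()    # back to the starting depth

# Printed to the console again, because both diversions were closed
print(c(before = depth_before, open = depth_open, after = depth_after))
```

In the question's loop the depth only ever grows, which is what the sink() at the end of each iteration prevents.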
This is better:
library(foreach)    
library(doMC)
cores <- detectCores()
registerDoMC(cores)
sink("log.branchpies.txt", append=TRUE)
X <- foreach(i=1:100) %dopar%{
  cat(paste("\n","Starting iteration",i,"\n"))
    rnorm(i*1e4)
}
sink() #end diversion of output
This works too:
library(foreach)    
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
  cat(paste("\n","Starting iteration",i,"\n"), 
       file="log.branchpies.txt", append=TRUE)
  rnorm(i*1e4)
}
As has been pointed out elsewhere, it is quite tricky to keep track of the sink stack. It is therefore advisable to use cat's ability to write directly to a file, as suggested in the answer above:
cat(..., file="log.txt", append=TRUE)
To save some typing you could create a wrapper function that diverts output to file every time cat is called:
catf <- function(..., file="log.txt", append=TRUE){
  cat(..., file=file, append=append)
}
So that at the end, when you call foreach you would use something like this:
library(foreach)    
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
  catf(paste("\n","Starting iteration",i,"\n"))
  rnorm(i*1e4)
}
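One thing to watch with a single shared log file is that several workers append to it at the same time, so lines from different iterations may end up interleaved. A possible variation, sketched here under the assumption that one file per worker process is acceptable, names the log after the worker's process ID. The helper log_by_pid() and the file name pattern are hypothetical, not part of foreach or doMC:

```r
library(foreach)
library(doMC)

registerDoMC(2)  # two workers, chosen arbitrarily for this sketch

# Hypothetical helper: append one timestamped line to a log file
# named after the process ID of the worker that calls it.
log_by_pid <- function(..., dir = ".") {
  file <- file.path(dir, sprintf("log.%d.txt", Sys.getpid()))
  cat(format(Sys.time()), ..., "\n", file = file, append = TRUE)
}

X <- foreach(i = 1:10) %dopar% {
  log_by_pid("Starting iteration", i)
  rnorm(i * 10)
}
```

Each forked worker then writes only to its own file, at the cost of having to look at several logs afterwards.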
Hope it helps!
Unfortunately, none of the approaches above worked for me. With sink() inside the foreach() loop, it kept throwing the "sink stack is full" error. With sink() outside the loop, the file was created but never updated.
To me, the easiest way of creating a log-file to keep track of a parallelised foreach()-loop's progress is by applying the good old write.table()-function.
    library(foreach)
    library(doParallel)
    availableClusters <- makeCluster(detectCores() - 1) # use all CPU threads but one (reserved for the OS)
    registerDoParallel(availableClusters) # register the cluster for the parallelisation
    x <- foreach(i = 1:100) %dopar% {
        log.text <- paste0(Sys.time(), " processing loop run ", i, "/100")
        write.table(log.text, "loop-log.txt", append = TRUE, row.names = FALSE, col.names = FALSE)
        # your statements here
    }
    stopCluster(availableClusters) # shut the workers down when done
And don't forget (as I did several times...) to use append = TRUE within write.table().
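With a cluster created by makeCluster(), another option may be the outfile argument, which tells the workers where to send their stdout and stderr (by default it is discarded, which is why cat() output from the workers never shows up). A minimal sketch, assuming two workers and a hypothetical log file name worker-log.txt:

```r
library(foreach)
library(doParallel)

# outfile redirects everything the workers print to the given file
cl <- makeCluster(2, outfile = "worker-log.txt")
registerDoParallel(cl)

x <- foreach(i = 1:4) %dopar% {
  # Plain cat() is enough here; no file argument and no sink() needed
  cat(format(Sys.time()), "processing loop run", i, "\n")
  i^2
}

stopCluster(cl)  # shut the workers down when done
```

The log also contains the workers' own start-up messages, and output from different workers can be interleaved.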