
Error in library(functional) : there is no package called ‘functional’ - While running MR using rmr2

I am trying to run a simple MR program using rmr2 on a single-node Hadoop cluster. Here is the environment for the setup:

Ubuntu 12.04 (32 bit)
R (Ubuntu comes with 2.14.1, so updated to 3.0.2)
Installed the latest rmr2 and rhdfs from here and the corresponding dependencies
Hadoop 1.2.1

Now I am trying to run a simple MR program as follows:

Sys.setenv(HADOOP_HOME="/home/training/Installations/hadoop-1.2.1")
Sys.setenv(HADOOP_CMD="/home/training/Installations/hadoop-1.2.1/bin/hadoop")

library(rmr2)  
library(rhdfs)

ints = to.dfs(1:100)  
calc = mapreduce(input = ints, map = function(k, v) cbind(v, 2*v))
from.dfs(calc)

The mapreduce job fails with the error message below in hadoop-1.2.1/logs/userlogs/job_201310091055_0001/attempt_201310091055_0001_m_000000_0/stderr:

Error in library(functional) : there is no package called ‘functional’  
Execution halted  
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1  
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)  
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)  

But sessionInfo() shows that the functional package has been loaded:

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_IN       LC_NUMERIC=C         LC_TIME=en_IN       
 [4] LC_COLLATE=en_IN     LC_MONETARY=en_IN    LC_MESSAGES=en_IN   
 [7] LC_PAPER=en_IN       LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C 

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rhdfs_1.0.6    rJava_0.9-4    rmr2_2.3.0     reshape2_1.2.2 plyr_1.8      
 [6] stringr_0.6.2  **functional_0.4** digest_0.6.3   bitops_1.0-6   RJSONIO_1.0-3 
[11] Rcpp_0.10.5

Update: I am able to run an R MR job that reads from and writes to STDIO without using the rmr2 and rhdfs libraries, as mentioned here. So, for now, my guess is that the problem is isolated to the rmr2 and rhdfs packages.

How can I get around this problem?

Praveen Sripati asked Sep 19 '25 11:09


2 Answers

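Install the dependencies for rmr2/rhdfs in a system directory instead of a custom per-user directory (~/R/x86_64-pc-linux-gnu-library/3.0). Packages in the per-user library are presumably not visible to the R process that Hadoop starts for each task, which would explain why the interactive session finds functional while the job does not. This can be done by running R with sudo and then installing the dependencies. Thanks to Antonio for the help in the RHadoop forums.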
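
To see which of the dependencies sit outside the system libraries, something like the following sketch may help. It uses only base R. The helper name pkg_library and the deps vector are illustrative (the package names are taken from the sessionInfo() output in the question), and the commented-out install line assumes that .Library.site is non-empty on your machine, which is worth checking first.

```r
# List the libraries this R session searches, in order.
print(.libPaths())

# Return the library directory a package is installed in, or NA if not found.
# (pkg_library is a hypothetical helper name, not part of rmr2 or rhdfs.)
pkg_library <- function(pkg) {
  path <- find.package(pkg, quiet = TRUE)
  if (length(path) == 0) return(NA_character_)
  dirname(path[1])
}

# Dependencies as they appear in the sessionInfo() output of the question.
deps <- c("functional", "Rcpp", "RJSONIO", "bitops", "digest",
          "stringr", "plyr", "reshape2", "rJava")

locations <- vapply(deps, pkg_library, character(1))
print(locations)

# System-wide libraries: the base library plus any site libraries.
system_libs <- normalizePath(c(.Library, .Library.site), mustWork = FALSE)

# Installed packages that live outside the system libraries.
found <- locations[!is.na(locations)]
outside <- names(found)[!(normalizePath(found) %in% system_libs)]
print(outside)

# From an R session started with sudo, these could then be reinstalled
# system-wide (assumption: .Library.site[1] exists and is the right target):
# install.packages(outside, lib = .Library.site[1])
```

Any package listed in outside is a candidate for reinstallation into a system library.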

Praveen Sripati answered Sep 21 '25 02:09


The most common solution to this kind of problem is reinstallation, since in your sessionInfo() output you are getting

**functional_0.4** 

while when I ran sessionInfo() I got

functional_0.4

I guess some dependencies are missing, so run the following from your R console

install.packages("functional", dependencies = TRUE)

to fix any problems caused by other packages.

P.S: Choose cloud-0 mirror from the available ones.

If that still does not help, I recommend installing r-base-dev as described in http://cran.r-project.org/bin/linux/ubuntu/README, though I don't have a reason to justify this:

sudo apt-get install r-base-dev

Thanks

igauravsehrawat answered Sep 21 '25 02:09