Not able to convert R data frame to Spark DataFrame

Tags: r, sparkr

When I try to convert my local R data frame to a Spark DataFrame using:

raw.data <- as.DataFrame(sc,raw.data)

I get this error:

17/01/24 08:02:04 WARN RBackendHandler: cannot find matching method class org.apache.spark.sql.api.r.SQLUtils.getJavaSparkContext. Candidates are:
17/01/24 08:02:04 WARN RBackendHandler: getJavaSparkContext(class org.apache.spark.sql.SQLContext)
17/01/24 08:02:04 ERROR RBackendHandler: getJavaSparkContext on org.apache.spark.sql.api.r.SQLUtils failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :

This question is similar to sparkR on AWS: Unable to load native-hadoop library.
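For reference, the candidate signature in the error, getJavaSparkContext(class org.apache.spark.sql.SQLContext), matches the older SparkR 1.x pattern in which a SQLContext, not sc, is passed as the first argument. A minimal sketch of that pattern follows; the setup is hypothetical and not from my actual code:

# Deprecated SparkR 1.x-style pattern (hypothetical setup, for illustration only)
library(SparkR)
sc <- sparkR.init(master = "local[*]")      # 1.x-style SparkContext
sqlContext <- sparkRSQL.init(sc)            # the SQLContext that getJavaSparkContext expects
raw.data <- data.frame(id = 1:3, value = c("a", "b", "c"))
df <- as.DataFrame(sqlContext, raw.data)    # first argument is the SQLContext, not sc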

Asked by Abhishek Gupta

1 Answer

You don't need to pass sc if you are using the latest version of Spark. I am using the SparkR package, version 2.0.0, in RStudio. Go through the following code, which connects the R session to a SparkR session:

# Point R at the Spark installation if SPARK_HOME is not already set
if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
  Sys.setenv(SPARK_HOME = "path-to-spark home/spark-2.0.0-bin-hadoop2.7")
}

# Load SparkR from the Spark installation's R library
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))

# Start (or connect to) a SparkR session against the standalone master
sparkR.session(enableHiveSupport = FALSE,
               master = "spark://master url:7077",
               sparkConfig = list(spark.driver.memory = "2g"))
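If you only want to test the conversion on a single machine rather than against a standalone cluster, a local-mode session works as well. This is a sketch, not from the original answer, and the driver-memory setting is just an example:

# Local-mode alternative (sketch): runs Spark on the current machine
sparkR.session(master = "local[*]",
               sparkConfig = list(spark.driver.memory = "2g"))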

The following is the output from the R console:

> data<-as.data.frame(iris)
> class(data)
[1] "data.frame"
> data.df<-as.DataFrame(data)
> class(data.df)
[1] "SparkDataFrame"
attr(,"package")
[1] "SparkR"
Answered by Saurabh Chauhan


