New posts in apache-spark

Best approach to Cassandra (+ Spark?) for Continuous Queries?

JAVA_HOME error with upgrade to Spark 1.3.0

java scala hadoop apache-spark

How to run Spark interactively in cluster mode

scala apache-spark

Why is Spark not distributing jobs to all executors, but to only one executor?

PySpark No suitable driver found for jdbc:mysql://dbhost

Why are my Tasks Succeeded above Tasks Total in Spark UI?

apache-spark

Apache Spark Lambda Expression - Serialization Issue

spark-1.4.1 saveAsTextFile to S3 is very slow on emr-4.0.0

amazon-s3 apache-spark emr

Saving Spark DataFrames with nested User Data Types

Create Custom Cross Validation in Spark ML

Spark Connector error: WARN NettyUtil: Found Netty's native epoll transport, but not running on linux-based operating system. Using NIO instead

Why won't this Spark sample code load in spark-shell?

scala apache-spark

Too many map keys causing out-of-memory exception in Spark

scala apache-spark

How to improve my recommendation results? I am using Spark ALS implicit

How to serialize a pyspark Pipeline object?

Can I create an RDD from a kafka topic if I do not know the until offset?

apache-spark apache-kafka

How to Set spark.sql.parquet.output.committer.class in pyspark

Performance of loading parquet files into case classes in Spark

PySpark how to read file having string with multiple encoding

python apache-spark pyspark

Why does SparkSQL require two literal escape backslashes in the SQL query?