
New posts in apache-spark

Usage of local variables in closures when accessing Spark RDDs

How do you read and write from/into different Elasticsearch clusters using Spark and elasticsearch-hadoop?

How to format data for the Spark MLlib k-means clustering algorithm?

How to extract complex JSON structures using Apache Spark 1.4.0 DataFrames

If one partition is lost, we can use lineage to reconstruct it. Will the base RDD be loaded again?

apache-spark rdd

Use Serializable lambda in Spark JavaRDD transformation

How does Scala compiler handle unused variable values?

Can I run a Time Series Database (TSDB) over Apache Spark?

Spark Mesos Cluster Mode using Dispatcher

apache-spark mesos

Getting SparkUncaughtExceptionHandler when running spark-perf

How to use Analytic/Window Functions in Spark Java?

Zeppelin throws java.lang.OutOfMemoryError: Java heap space

ClassNotFoundException: org.apache.spark.repl.SparkCommandLine

Submitting spark app as a yarn job from Eclipse and Spark Context

apache-spark hadoop-yarn

"java.io.IOException: Class not found" on long running Streaming application

How does Spark decide how to partition an RDD?

apache-spark pyspark rdd

How to resolve: very large size tasks in Spark

python apache-spark

Addressing issues with an Apache Spark application run in client mode from a Docker container

Exception when training data in PredictionIO

Using AWS credentials profiles with a Spark Scala app