apache-spark tutorials and guides

Spark Listener EventLoggingListener threw an exception / ConcurrentModificationException

Sep 18, 2021

apache-spark

Spark pivot without aggregation

Aug 23, 2022

apache-spark apache-spark-sql

Spark on K8s - getting error: kube mode not support referencing app dependencies in local

Oct 28, 2022

apache-spark kubernetes

How many RDDs does DStream generate for a batch interval?

Dec 26, 2016

apache-spark spark-streaming

Running a Job on Spark 0.9.0 throws error

Nov 21, 2019

java scala hdfs apache-spark

Apache Spark Joins example with Java

Nov 20, 2022

java join apache-spark optional

Spark SQL Stackoverflow

Sep 24, 2022

apache-spark apache-spark-sql

Using spark-submit, what is the behavior of the --total-executor-cores option?

Nov 14, 2022

multithreading hadoop apache-spark pyspark cpu-cores

Spark streaming checkpoints for DStreams

Nov 20, 2022

apache-spark spark-streaming checkpointing

Spark on Windows - What exactly is winutils and why do we need it?

Feb 23, 2022

hadoop apache-spark

Why Livy or spark-jobserver instead of a simple web framework?

Mar 26, 2022

apache-spark spark-jobserver livy

Failed to load implementation NativeSystemBLAS HiBench

Mar 25, 2022

apache-spark

Kill a single Spark task

Nov 06, 2022

apache-spark distributed-computing mesos

Apache Spark Python Cosine Similarity over DataFrames

Oct 24, 2022

python apache-spark pyspark apache-spark-sql cosine-similarity

Matrix Math With Sparklyr

Jul 08, 2019

r apache-spark apache-spark-mllib sparklyr

How to write JDBC Sink for Spark Structured Streaming [SparkException: Task not serializable]?

Mar 13, 2022

scala apache-spark spark-structured-streaming

Spark Structured Streaming ForeachWriter and database performance

Sep 05, 2022

database scala apache-spark jdbc spark-structured-streaming

Intermittent Timeout Exception using Spark

Nov 02, 2019

scala apache-spark

What is the difference between Spark's shuffle read and shuffle write?

Nov 17, 2022

apache-spark apache-spark-sql

Tips for properly using large broadcast variables?

Sep 25, 2021

python apache-spark pyspark pickle rdd
