apache-spark tutorials and guides

GenericRowWithSchema exception in casting ArrayBuffer to HashSet in DataFrame to RDD from Hive table

Jul 04, 2019

Concatenate Sparse Vectors in Spark?

Oct 19, 2022

scala apache-spark

JSON file parsing in PySpark

Sep 08, 2022

apache-spark dataframe pyspark apache-spark-sql pyspark-sql

How to check if array column is inside another column array in PySpark dataframe

Jun 26, 2022

apache-spark dataframe pyspark apache-spark-sql pyspark-sql

Count number of columns in PySpark DataFrame?

Nov 09, 2022

apache-spark machine-learning pyspark pyspark-sql

How to concatenate/append multiple Spark DataFrames column-wise in PySpark?

Jul 02, 2022

python apache-spark pyspark apache-spark-sql pyspark-sql

Spark _temporary creation reason

Jun 26, 2022

apache-spark

How to convert empty arrays to nulls?

Aug 20, 2022

apache-spark pyspark apache-spark-sql pyspark-sql

Escape New line character in Spark CSV read

Jul 09, 2022

python apache-spark dataframe pyspark

Python pandas_udf spark error

Aug 30, 2022

pandas apache-spark pyspark pyarrow

repartition() is not affecting RDD partition size

Apr 20, 2022

apache-spark rdd

Spark - write Avro file

Oct 23, 2022

apache-spark avro

How to create a Dataset from custom class Person?

Jan 23, 2017

apache-spark apache-spark-sql apache-spark-dataset

Running Apache Spark - log4j:WARN Please initialize the log4j system properly

Oct 31, 2022

java apache-spark log4j

Store aggregate value of a PySpark dataframe column into a variable

Nov 15, 2022

apache-spark pyspark

Spark: sum over list containing None and Some()?

Mar 14, 2022

scala apache-spark

How to set up cluster environment for Spark applications on Windows machines?

Oct 19, 2022

windows apache-spark mesos apache-spark-standalone

Avoiding multiple streaming queries

Feb 22, 2022

apache-spark spark-structured-streaming

Spark __getnewargs__ error ... Method or([class java.lang.String]) does not exist

Apr 15, 2022

apache-spark pyspark apache-spark-sql

How to set YARN queue for spark-shell?

Aug 21, 2022

apache-spark apache-spark-sql

New posts in apache-spark