apache-spark tutorials and guides

Why does calling cache take a long time on a Spark Dataset?

Mar 04, 2026

How to split columns into two sets per type?

Mar 03, 2026

scala apache-spark apache-spark-sql

Spark StructType for coalesce

Mar 03, 2026

scala apache-spark dataframe struct coalesce

Spark - Scala - Remove columns from a DataFrame based on a condition

Mar 03, 2026

scala apache-spark

How to divide the value of current row with the following one?

Mar 03, 2026

scala apache-spark apache-spark-sql window-functions

How to overcome the Spark spark.kryoserializer.buffer.max 2g limit?

Mar 04, 2026

apache-spark

Is there Spark Arrow Streaming = Arrow Streaming + Spark Structured Streaming?

Mar 02, 2026

apache-spark spark-structured-streaming pyarrow apache-arrow

What makes Spark fast if data size exceeds available memory?

Mar 03, 2026

hadoop apache-spark bigdata

How to pass complex Java Class Object as parameter to Scala UDF in Spark?

Mar 03, 2026

java scala apache-spark user-defined-functions

Spark custom aggregation: collect_list + UDF vs UDAF

Mar 03, 2026

apache-spark dataframe aggregate-functions user-defined-functions

Running Spark jobs from Spring RESTful services

Mar 03, 2026

java spring scala rest apache-spark

Fast way to process a JSON file in Spark

Mar 03, 2026

json scala apache-spark apache-spark-sql etl

Apache Zeppelin - modify default syntax highlighting

Mar 03, 2026

python scala apache-spark syntax-highlighting apache-zeppelin

Unable to resize Postgres 10 /dev/shm due to Kubernetes limiting shared memory

Mar 02, 2026

postgresql apache-spark kubernetes

Unable to run a JAR or Spark application on AWS EMR

Mar 02, 2026

scala amazon-web-services apache-spark emr

Getting java.lang.UnsupportedOperationException: Cannot evaluate expression in PySpark

Mar 03, 2026

apache-spark pyspark apache-spark-sql user-defined-functions

Using Spark to read specific columns' data from HBase

Mar 01, 2026

scala hbase apache-spark

How to join two data frames in Apache Spark and merge keys into one column?

Mar 02, 2026

apache-spark dataframe join pyspark apache-spark-sql

How to find out the driver IP in a Databricks cluster?

Mar 03, 2026

apache-spark databricks azure-databricks
