apache-spark tutorials and guides

Issue with Spark Java API, Kerberos, and Hive

Dec 21, 2025

Spark: writing partitions to HDFS with files of the same size

Dec 21, 2025

apache-spark apache-spark-sql

How to convert an RDD to a list efficiently without using the collect function

Dec 21, 2025

java scala apache-spark spark-streaming

Details of Stage in Spark

Dec 20, 2025

scala hadoop apache-spark apache-spark-sql rdd

Spark Structured Streaming using sockets: set the schema and display the DataFrame in the console

Dec 21, 2025

apache-spark pyspark apache-spark-sql spark-structured-streaming

Java 17 solution for Spark - java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.storage.StorageUtils

Dec 20, 2025

java apache-spark java-17

Spark Dataframe API: group by id and compute combinations

Dec 21, 2025

apache-spark apache-spark-sql

Is there an alternative solution without a cross join in Spark 2?

Dec 20, 2025

scala apache-spark user-defined-functions

Is it possible to scale data by group in Spark?

Dec 19, 2025

python apache-spark pyspark

How does Spark evict cached partitions?

Dec 20, 2025

apache-spark

Add minutes from another column to a string time column in PySpark

Dec 20, 2025

python apache-spark date pyspark timestamp

Spark is not loading all multiline JSON objects in a single file even with the multiline option set to true

Dec 20, 2025

apache-spark apache-spark-sql

How do I set spark.sql.debug.maxToStringFields?

Dec 20, 2025

python scala apache-spark pyspark environment-variables

Unable to perform aggregation on two values using groupByKey in Spark with Scala

Dec 20, 2025

scala apache-spark rdd

DataType.fromJson() Error: java.lang.IllegalArgumentException: Failed to convert the JSON string 'int' to a data type

Dec 18, 2025

json scala apache-spark

Getting java.lang.NoSuchMethodError: org.yaml.snakeyaml.Yaml.<init> while running spark based spring boot application

Dec 20, 2025

spring-boot apache-spark snakeyaml

Common metadata in a Databricks cluster

Dec 20, 2025

apache-spark databricks azure-databricks databricks-connect
