apache-spark tutorials and guides

Spark custom aggregation: collect_list+UDF vs UDAF

Mar 03, 2026

Running Spark jobs from Spring RESTful services

Mar 03, 2026

java spring scala rest apache-spark

Fast way to process a JSON file in Spark

Mar 03, 2026

json scala apache-spark apache-spark-sql etl

Apache Zeppelin - modify default syntax highlighting

Mar 03, 2026

python scala apache-spark syntax-highlighting apache-zeppelin

Unable to resize Postgres 10 /dev/shm due to Kubernetes limiting shared memory

Mar 02, 2026

postgresql apache-spark kubernetes

Unable to run a JAR or sparkApplication on AWS EMR

Mar 02, 2026

scala amazon-web-services apache-spark emr

Getting java.lang.UnsupportedOperationException: Cannot evaluate expression in PySpark

Mar 03, 2026

apache-spark pyspark apache-spark-sql user-defined-functions

Using Spark to read data from specific columns in HBase

Mar 01, 2026

scala hbase apache-spark

How to join two data frames in Apache Spark and merge keys into one column?

Mar 02, 2026

apache-spark dataframe join pyspark apache-spark-sql

How to find out the driver IP in a Databricks cluster?

Mar 03, 2026

apache-spark databricks azure-databricks

Spark transactional write operation using temporary directories

Mar 02, 2026

apache-spark amazon-s3 hdfs

Unable to configure ORC properties in Spark

Mar 02, 2026

java hadoop apache-spark hive cloudera

Spark DataFrame ORC Hive table reading issue

Mar 01, 2026

apache-spark hive apache-spark-sql orc hive-table

Grouping data using Scala/Apache Spark

Mar 01, 2026

scala apache-spark

Is there a Spark equivalent for pandas MultiIndex operations like set_index() or unstack()?

Mar 02, 2026

python pandas apache-spark pyspark apache-spark-sql

Python GraphFrames: trouble installing dependencies

Mar 02, 2026

python maven apache-spark pyspark graphframes

Is it possible to use a custom Hadoop version with EMR?

Mar 01, 2026

amazon-web-services apache-spark hadoop pyspark amazon-emr
