apache-spark-sql tutorials

Spark DataFrame ORC Hive table reading issue

Mar 01, 2026

Is there a Spark equivalent for pandas MultiIndex operations like set_index() or unstack()?

Mar 02, 2026

python pandas apache-spark pyspark apache-spark-sql

How to read a CSV into PySpark without a Java heap memory error

Feb 28, 2026

java-8 pyspark heap-memory apache-spark-sql

How to get the COUNT of emails for each id in Scala

Feb 28, 2026

sql scala apache-spark apache-spark-sql

How to merge two columns with a condition in PySpark?

Mar 01, 2026

apache-spark pyspark apache-spark-sql

Why does Zeppelin fail with "mismatched input ';' expecting <EOF>" in %spark.sql paragraph?

Feb 28, 2026

apache-spark apache-spark-sql parquet apache-zeppelin

org.apache.spark.sql.AnalysisException: cannot resolve given input column

Feb 28, 2026

apache-spark dataframe apache-spark-sql

How to append collection as new column to DataFrame with many columns?

Feb 28, 2026

scala dataframe apache-spark functional-programming apache-spark-sql

Missing data when ordering a PySpark Window

Feb 28, 2026

apache-spark pyspark apache-spark-sql

How to implement Slowly Changing Dimensions Type 2 (SCD2) in Spark using a SQL join

Feb 27, 2026

apache-spark apache-spark-sql

How to flatten long dataset to wide format (pivot) with no join?

Feb 27, 2026

apache-spark pyspark apache-spark-sql

Efficiently calculate top-k elements in Spark

Feb 27, 2026

apache-spark apache-spark-sql window-functions rank approximation

How to Apply Multiple Conditions on a Case-Otherwise Statement Using the Spark DataFrame API

Feb 24, 2026

r apache-spark dataframe apache-spark-sql

How to change a column type in an array struct with PySpark

Feb 26, 2026

pyspark apache-spark-sql pyspark-schema

How to use columns to create queries (e.g. WHERE clause)?

Feb 25, 2026

apache-spark pyspark apache-spark-sql

Convert Rows or Columns to a DataFrame

Feb 25, 2026

scala apache-spark apache-spark-sql data-manipulation

How to run VACUUM and OPTIMIZE SQL statements in Amazon Athena for an Apache Iceberg v2 table

Feb 24, 2026

amazon-web-services apache-spark-sql amazon-athena apache-iceberg

Creating a new Scala class that relies on GraphFrames without serialization issues

Feb 24, 2026

scala apache-spark apache-spark-sql
