apache-spark-sql tutorials

How to load history data when starting Spark Streaming process, and calculate running aggregations

Dec 06, 2025

Calculate time difference between consecutive rows in pairs per group in PySpark

Dec 05, 2025

apache-spark pyspark apache-spark-sql

Spark Scala DataFrame: describe non-numeric columns

Dec 05, 2025

scala apache-spark apache-spark-sql apache-spark-mllib data-analysis

Loop through RDD elements, read its content for further processing

Dec 06, 2025

apache-spark pyspark apache-spark-sql rdd

Use of the frequency argument in the percentile function in Spark SQL

Dec 06, 2025

sql statistics apache-spark-sql percentile

When to use RDD in Spark 2.0?

Dec 06, 2025

apache-spark apache-spark-sql apache-spark-2.0

Loading a data file with 3 spaces as delimiter using Spark's CSV reader in Java

Dec 06, 2025

java csv apache-spark apache-spark-sql

Change Unix (epoch) time to local time in PySpark

Dec 05, 2025

apache-spark timezone pyspark apache-spark-sql epoch

Counting consecutive occurrences of a specific value in PySpark

Dec 05, 2025

python apache-spark pyspark apache-spark-sql databricks

Remove trailing white space from elements in a list

Dec 05, 2025

python-3.x apache-spark pyspark apache-spark-sql

Simulating UDAF on PySpark for encapsulation

Dec 04, 2025

python apache-spark pyspark apache-spark-sql

Spark job not ending: show of DataFrame

Dec 03, 2025

python apache-spark apache-spark-sql pyspark

Add empty column to DataFrame in Spark with Python

Dec 04, 2025

python pyspark apache-spark-sql rdd

PySpark, order of column on write to MySQL with JDBC

Dec 03, 2025

mysql jdbc apache-spark pyspark apache-spark-sql

Pivot in Spark SQL

Dec 04, 2025

apache-spark-sql

How to get the N most recent dates in PySpark

Dec 03, 2025

python apache-spark pyspark apache-spark-sql

New posts in apache-spark-sql