apache-spark-sql tutorials

Spark SQL - IN clause

Sep 20, 2022

scala dataframe apache-spark-sql

How to pass a constant value to Python UDF?

Oct 26, 2022

python apache-spark pyspark apache-spark-sql user-defined-functions

Partitioning in spark while reading from RDBMS via JDBC

Sep 20, 2022

apache-spark jdbc apache-spark-sql partitioning

to_date fails to parse date in Spark 3.0

Sep 21, 2022

apache-spark pyspark apache-spark-sql spark3

How to zip two (or more) DataFrames in Spark

Sep 20, 2022

scala apache-spark dataframe apache-spark-sql

How to select and order multiple columns in a PySpark DataFrame after a join

Sep 20, 2022

python apache-spark pyspark apache-spark-sql

How to split pipe-separated column into multiple rows?

Sep 20, 2022

apache-spark apache-spark-sql

Spark: Find Each Partition Size for RDD

Sep 20, 2022

apache-spark pyspark apache-spark-sql spark-dataframe

How to use collect_set and collect_list functions in windowed aggregation in Spark 1.6?

Sep 20, 2022

scala apache-spark apache-spark-sql apache-spark-1.6

Spark merge/combine arrays in groupBy/aggregate

May 27, 2021

scala apache-spark apache-spark-sql

Spark Data frame search column starting with a string

Nov 10, 2022

apache-spark apache-spark-sql spark-dataframe

How to make the first row as header when reading a file in PySpark and converting it to Pandas Dataframe

Nov 11, 2022

python pandas apache-spark pyspark apache-spark-sql

How to specify the path where saveAsTable saves files to?

Nov 03, 2022

apache-spark pyspark apache-spark-sql

Aggregate function in spark-sql not found

Sep 19, 2022

scala apache-spark apache-spark-sql

How to count number of columns in Spark Dataframe?

Sep 19, 2022

scala apache-spark dataframe apache-spark-sql

How to construct a DataFrame from an Excel (xls, xlsx) file in Scala Spark?

Oct 20, 2022

excel scala apache-spark apache-spark-sql spark-excel

In Apache Spark, how to convert a slow RDD/dataset into a stream?

Sep 19, 2022

scala apache-spark apache-spark-sql spark-streaming

What is happening when Spark is calling ShuffleBlockFetcherIterator?

Sep 14, 2022

apache-spark apache-spark-sql

Spark: Most efficient way to sort and partition data to be written as parquet

Nov 17, 2022

apache-spark pyspark apache-spark-sql pyspark-sql

Read an unsupported mix of union types from an Avro file in Apache Spark

Apr 01, 2019

scala apache-spark apache-spark-sql spark-avro
