apache-spark-sql tutorials

How to sort array of struct type in Spark DataFrame by particular column?

Mar 16, 2022

Partitioning of DataFrame in PySpark using Custom Partitioner

Aug 24, 2022

pyspark apache-spark-sql

How to expire state of dropDuplicates in structured streaming to avoid OOM?

Nov 21, 2022

apache-spark duplicates apache-spark-sql out-of-memory spark-structured-streaming

Does Kryo help in SparkSQL?

Sep 16, 2022

apache-spark apache-spark-sql kryo

How to write a Dataset to Kafka topic?

Oct 25, 2022

scala apache-spark apache-kafka apache-spark-sql

How to use Spark lag and lead over group by and order by

Nov 15, 2022

apache-spark apache-spark-sql apache-spark-dataset

Adding a new column in the first ordinal position in a pyspark dataframe

Mar 06, 2022

python apache-spark pyspark apache-spark-sql

PySpark error: dataType <class 'pyspark.sql.types.StringType'> should be an instance of <class 'pyspark.sql.types.DataType'>

Nov 10, 2022

python apache-spark pyspark apache-spark-sql

Why is repartition faster than partitionBy in Spark?

Sep 12, 2022

apache-spark pyspark apache-spark-sql apache-spark-xml

Spark on embedded mode - user/hive/warehouse not found

Aug 31, 2022

hadoop apache-spark hive apache-spark-sql parquet

PySpark: split a column into multiple columns without pandas

Jun 01, 2022

python apache-spark pyspark apache-spark-sql

Can you copy straight from Parquet/S3 to Redshift using Spark SQL/Hive/Presto?

Oct 29, 2022

hadoop amazon-s3 apache-spark apache-spark-sql

Access names of fields in struct Spark SQL

Jun 05, 2022

scala apache-spark apache-spark-sql

Spark SQL's Scala API - TimestampType - No Encoder found for org.apache.spark.sql.types.TimestampType

Mar 07, 2022

scala apache-spark timestamp apache-spark-sql apache-spark-dataset

Spark dataframe add a row for every existing row

Nov 05, 2022

scala apache-spark apache-spark-sql explode

Pyspark transform method that's equivalent to the Scala Dataset#transform method

Aug 23, 2022

apache-spark pyspark apache-spark-sql apache-spark-dataset

How to query datasets in avro format?

Aug 22, 2022

apache-spark apache-spark-sql spark-avro

Hive and SparkSQL do not support datetime type?

Dec 09, 2018

sql hive apache-spark-sql

What's the difference between Dataset.col() and functions.col() in Spark?

Nov 13, 2022

apache-spark apache-spark-sql

How to transpose/pivot the rows data to column in Spark Scala? [duplicate]

Jun 12, 2022

scala apache-spark apache-spark-sql pivot
