apache-spark-sql tutorials

What's the difference between Dataset.col() and functions.col() in Spark?

Nov 13, 2022

apache-spark apache-spark-sql

How to transpose/pivot the rows data to column in Spark Scala? [duplicate]

Jun 12, 2022

scala apache-spark apache-spark-sql pivot

Counting number of nulls in pyspark dataframe by row

Nov 17, 2022

dataframe pyspark apache-spark-sql pyspark-sql

Spark: How does salting work in dealing with skewed data?

Oct 27, 2022

apache-spark join group-by apache-spark-sql skew

How to calculate size of dataframe in spark scala

Jun 23, 2022

apache-spark apache-spark-sql spark-streaming

Compute string length in Spark SQL DSL

Oct 21, 2022

apache-spark apache-spark-sql string-length

How to get default property values in Spark

Mar 31, 2022

scala apache-spark apache-spark-sql

Spark 2.0 DataSets groupByKey and divide operation and type safety

Aug 17, 2019

scala apache-spark apache-spark-sql apache-spark-dataset

Spark Dataframes- Reducing By Key

Oct 09, 2021

scala apache-spark apache-spark-sql apache-spark-dataset

How to use Scala UDF in PySpark?

Nov 16, 2022

python scala apache-spark pyspark apache-spark-sql

Scala/Spark dataframes: find the column name corresponding to the max

Nov 16, 2022

scala apache-spark dataframe apache-spark-sql argmax

Apache Spark how to append new column from list/array to Spark dataframe

Jun 14, 2022

scala apache-spark dataframe apache-spark-sql

How to flatten columns of type array of structs (as returned by Spark ML API)?

Aug 10, 2022

apache-spark apache-spark-sql apache-spark-ml

Spark: Return empty column if column does not exist in dataframe

Nov 06, 2022

apache-spark pyspark apache-spark-sql pyspark-sql

Apache Spark startsWith in SQL expression

Sep 07, 2022

scala apache-spark apache-spark-sql

Spark AnalysisException when "flattening" DataFrame in Spark SQL

Aug 25, 2022

apache-spark apache-spark-sql

How to find the max value of multiple columns?

Nov 07, 2022

scala apache-spark apache-spark-sql

Spark Convert Data Frame Column to dense Vector for StandardScaler() "Column must be of type org.apache.spark.ml.linalg.VectorUDT"

Mar 09, 2022

python apache-spark pyspark apache-spark-sql apache-spark-ml

Pyspark Dataframe Join using UDF

Feb 07, 2022

python apache-spark pyspark apache-spark-sql user-defined-functions

Spark SQL count(*) query store result

Nov 14, 2022

sql apache-spark apache-spark-sql
