apache-spark tutorials and guides

What are the compression types supported in Parquet?

Jun 09, 2022

Why is input_file_name() empty for S3 catalog sources in pyspark?

Jun 03, 2022

amazon-web-services apache-spark amazon-s3 pyspark aws-glue

Trouble installing Pyspark

Aug 24, 2022

python apache-spark

pyspark ImportError: cannot import name accumulators

Jun 17, 2022

python pycharm apache-spark

Rename pivoted and aggregated column in PySpark Dataframe

Oct 30, 2022

python apache-spark pyspark apache-spark-sql

Array Intersection in Spark SQL

Mar 22, 2022

apache-spark apache-spark-sql spark-dataframe hiveql apache-spark-dataset

Submit Spark job on Yarn cluster

Nov 20, 2022

scala apache-spark hadoop jobs

Get struct-type elements of a Row by name in Spark Scala

Aug 17, 2022

scala apache-spark apache-spark-sql

PySpark: Add a new column with a tuple created from columns

Feb 04, 2022

python apache-spark pyspark apache-spark-sql spark-dataframe

Caused by: java.lang.NullPointerException at org.apache.spark.sql.Dataset

Feb 24, 2022

scala apache-spark dataframe apache-spark-sql

How to divide or multiply every non-string column of a PySpark DataFrame by a float constant?

Jun 19, 2022

python apache-spark pyspark spark-dataframe pyspark-sql

Adding StringType column to existing Spark DataFrame and then applying default values

Oct 30, 2022

scala apache-spark dataframe apache-spark-sql

Why does Spark application fail with "IOException: (null) entry in command string: null chmod 0644"? [duplicate]

Jan 17, 2022

java apache-spark apache-spark-sql

When to use countByValue and when to use map().reduceByKey()

Jul 05, 2022

scala apache-spark rdd word-count

spark dataframe keep most recent record

May 27, 2022

python apache-spark

Difference between two rows in Spark dataframe

Oct 15, 2022

scala apache-spark apache-spark-sql

Add leading zeros to Columns in a Spark Data Frame [duplicate]

Jun 01, 2022

scala apache-spark spark-dataframe

Getting error: Route() in Route cannot be applied to String

Jan 01, 2019

java mongodb apache-spark

How to set timezone to UTC in Apache Spark?

Oct 06, 2022

java apache-spark pyspark apache-spark-sql jvm

How to slice and sum elements of array column?

Aug 22, 2022

scala apache-spark apache-spark-sql
