apache-spark tutorials and guides

How can I use graphframes with pyspark on AWS EMR?

Oct 25, 2022

Save Spark Dataframe into Elasticsearch - Can’t handle type exception

Aug 20, 2021

elasticsearch apache-spark elasticsearch-hadoop apache-spark-1.5

How to iterate over records in Spark with Scala?

Sep 25, 2022

scala apache-spark avro

Spark SQL performance - JOIN on value BETWEEN min and max

Nov 01, 2021

python apache-spark pyspark apache-spark-sql

Cannot create dataframe from list: pyspark

Sep 18, 2022

python apache-spark pyspark apache-spark-sql

How to modify a column value in a row of a spark dataframe?

Nov 18, 2022

apache-spark pyspark spark-dataframe

UDF to extract only the file name from path in Spark SQL

Jul 11, 2022

java scala apache-spark apache-spark-sql spark-dataframe

How to find mean of grouped Vector columns in Spark SQL?

Sep 29, 2018

apache-spark apache-spark-sql aggregate-functions user-defined-functions apache-spark-ml

Converting dataframe columns into list of tuples

Jun 18, 2022

scala apache-spark dataframe tuples

Add PySpark RDD as new column to pyspark.sql.dataframe

Oct 04, 2022

python apache-spark pyspark

SparkConf settings not used when running Spark app in cluster mode on YARN

May 29, 2022

apache-spark memory-management hadoop-yarn executor

Apache Spark subtract days from timestamp column

May 26, 2020

apache-spark dataframe apache-spark-sql timestamp

pyspark throws TypeError: textFile() missing 1 required positional argument: 'name'

Oct 31, 2021

python python-3.x apache-spark pyspark rdd

Saving dataframe records in a tab delimited file

Oct 04, 2022

apache-spark pyspark

How to extract number from string column?

Dec 13, 2019

scala apache-spark apache-spark-sql

In pyspark, is it possible to fillna with another column?

Nov 16, 2022

apache-spark pyspark

Filter only non-empty arrays in a Spark DataFrame

Oct 30, 2022

scala apache-spark apache-spark-sql

How to set up Mesos for running Spark on a standalone OS X machine

Nov 18, 2022

macos scala apache-spark mesos

Ungrouping a (key, list(values)) pair in Spark/Scala

Apr 25, 2022

list scala key apache-spark

Filter out rows with NaN values for certain column

Oct 29, 2022

scala apache-spark apache-spark-sql
