apache-spark tutorials and guides

Why is SparkListenerApplicationStart never fired?

Jan 19, 2022

apache-spark

Will Spark support Clojure?

Oct 20, 2022

clojure apache-spark spark-streaming nupic

mapPartitions returns empty array

Sep 14, 2022

apache-spark rdd

How to get the file name for a record in a Spark RDD (JavaRDD)

Mar 26, 2022

java hadoop apache-spark hdfs

Spark withColumn() performing power functions

Nov 03, 2022

python apache-spark pyspark

How to distinguish whether an operation in Spark is a transformation or an action?

Nov 07, 2022

apache-spark

'SparkContext' object has no attribute 'textfile'

Dec 07, 2019

hadoop apache-spark pyspark

Spark SQL - Generate array of arrays from the sql function

Feb 03, 2022

scala apache-spark apache-spark-sql

PySpark - Add a new column with a Rank by User

Nov 07, 2019

python apache-spark pyspark apache-spark-sql pyspark-sql

Spark Scala: retrieve the schema and store it

Mar 18, 2022

scala apache-spark apache-spark-sql spark-dataframe

How to write a DataFrame schema to file in Scala

Oct 21, 2022

scala apache-spark dataframe apache-spark-sql

How to Create a Database in Spark SQL

May 30, 2022

apache-spark apache-spark-sql

Invalidate metadata/refresh Impala from Spark code

Sep 11, 2022

hadoop apache-spark impala

Understanding Representation of Vector Column in Spark SQL

Nov 05, 2022

apache-spark apache-spark-sql apache-spark-mllib apache-spark-ml

How to read data from a DB in Spark in parallel

Jun 02, 2022

apache-spark jdbc apache-spark-sql

How to do aggregation on multiple columns at once in Spark

Sep 05, 2022

scala apache-spark

Spark JDBC df limit... what is it doing?

Sep 30, 2021

apache-spark apache-spark-sql

How to get the max length of a string column from a DataFrame using Scala?

Sep 06, 2022

scala apache-spark apache-spark-sql max

Custom partitioner in Spark (PySpark)

May 09, 2022

apache-spark pyspark

Check if an ArrayType column contains null

May 20, 2022

scala apache-spark dataframe null apache-spark-sql
