apache-spark tutorials and guides

How does Spark 2.0 handle column nullability?

Jun 13, 2022

Spark: Extracting summary for a ML logistic regression model from a pipeline model

Sep 27, 2022

python apache-spark pyspark pipeline logistic-regression

PySpark: Add a character in the middle of a string

Oct 01, 2022

python apache-spark split pyspark

How to implement Functor[Dataset]

Jan 11, 2022

scala apache-spark scala-cats scala-implicits apache-spark-encoders

Understanding Kryo serialization buffer overflow error

Nov 17, 2022

scala apache-spark kryo

Using UDF ignores condition in when

Oct 15, 2022

python apache-spark pyspark spark-dataframe user-defined-functions

Spark: select with key in map

Apr 19, 2022

apache-spark apache-spark-sql

How to bucketize a group of columns in pyspark?

Jun 29, 2022

python apache-spark pyspark

ERROR: User did not initialize spark context

May 06, 2022

apache-spark hadoop

Why does Spark's Word2Vec return a vector?

Jul 15, 2022

java apache-spark machine-learning word2vec apache-spark-ml

Set spark configuration

Feb 27, 2022

python-3.x apache-spark pyspark apache-spark-sql

PySpark explode stringified array of dictionaries into rows

Sep 25, 2022

python apache-spark dataframe pyspark apache-spark-sql

Convert UTC timestamp to local time based on time zone in PySpark

Oct 25, 2022

apache-spark pyspark apache-spark-sql

Delta Lake without Databricks Runtime

Sep 18, 2022

apache-spark hdfs databricks delta-lake

Stream-Static Join: How to refresh (unpersist/persist) static Dataframe periodically

Sep 25, 2021

scala apache-spark apache-spark-sql spark-streaming spark-structured-streaming

API compatibility between Scala and Python?

Jul 17, 2022

apache-spark pyspark

Spark fails when running pi.py example in yarn-client mode

May 23, 2022

apache-spark

Spark-csv data source: infer data types

Oct 25, 2022

apache-spark dataframe

Aggregation with Group By date in Spark SQL

Oct 30, 2022

sql group-by apache-spark aggregation

Convert Matrix to RowMatrix in Apache Spark using Scala

May 14, 2017

scala matrix apache-spark distributed

New posts in apache-spark