apache-spark tutorials and guides

SPARK, ML, Tuning, CrossValidator: access the metrics

Nov 15, 2022

apache-spark apache-spark-mllib apache-spark-ml

No suitable driver found for jdbc in Spark

Sep 19, 2018

mysql jdbc apache-spark apache-spark-sql

Why does SparkLauncher return immediately and spawn no job?

Sep 27, 2021

java apache-spark spark-launcher

SQL query Frequency Distribution matrix for product

Oct 24, 2022

sql apache-spark hive hiveql

How to load CSVs with timestamps in custom format?

Oct 15, 2022

apache-spark apache-spark-sql hortonworks-data-platform azure-hdinsight

Spark-shell meaning of displayed Number on Stage

Sep 07, 2022

apache-spark

Spark/Yarn: File does not exist on HDFS

Oct 10, 2021

hadoop apache-spark pyspark hadoop-yarn hadoop2

How to write streaming Dataset to Cassandra?

Mar 07, 2019

apache-spark pyspark spark-cassandra-connector spark-structured-streaming

Why is Spark not using all cores on the local machine?

Aug 22, 2022

apache-spark parallel-processing mapreduce

Running spark-submit with --master yarn-cluster: issue with spark-assembly

Nov 09, 2020

hadoop apache-spark hadoop-yarn

What controls how much of a Spark Cluster is given to an application?

Aug 30, 2022

resources apache-spark

Error when using multiple python files spark-submit

May 14, 2022

python apache-spark

How to get data from a specific partition in Spark RDD?

Nov 11, 2022

apache-spark rdd

Access to Spark from Flask app

Jan 20, 2018

python flask apache-spark pyspark

Number of Partitions of Spark Dataframe

Oct 15, 2022

apache-spark dataframe apache-spark-sql

Docker Container with Apache Spark in standalone cluster mode

Mar 08, 2018

apache-spark docker dockerfile

How to use a subquery for dbtable option in jdbc data source?

Sep 05, 2022

mysql apache-spark jdbc apache-spark-sql pyspark-sql

Why are so many spark-warehouse folders created?

Apr 03, 2022

hadoop apache-spark jdbc hive

Pass variables from Scala to Python in Databricks

Apr 20, 2022

python apache-spark pyspark apache-spark-sql databricks

Getting labels from StringIndexer stages within pipeline in Spark (pyspark)

Nov 12, 2022

python apache-spark pyspark
