apache-spark tutorials and guides

Spark MLlib FPGrowth job fails with Memory Error

Nov 01, 2022

apache-spark rdd apache-spark-mllib

Spark local vs HDFS performance

Nov 01, 2022

performance hadoop apache-spark

How to extract character n-grams based on a large text

Nov 02, 2022

scala apache-spark

Spark: how to get all configuration parameters

Nov 02, 2022

apache-spark

Scala reflection with Serialization (over Spark) - Symbols not serializable

Oct 31, 2022

scala serialization reflection apache-spark

Counting distinct texts in a Spark RDD with array objects

Oct 31, 2022

python apache-spark pyspark rdd

How to submit a python wordcount on HDInsight Spark cluster from Jupyter

Nov 01, 2022

python apache-spark pyspark azure-hdinsight jupyter-notebook

Spark Streaming: Application health

Nov 01, 2022

apache-spark garbage-collection performance-testing spark-streaming

Take part of an RDD and keep it an RDD

Nov 02, 2022

apache-spark pyspark

How to connect spark-shell to Mesos?

Nov 02, 2022

apache-spark apache-spark-sql mesos mesosphere

PHOENIX SPARK - Load Table as DataFrame

Oct 31, 2022

apache-spark dataframe phoenix

Iterating/looping over Spark parquet files in a script results in memory error/build-up (using Spark SQL queries)

Nov 01, 2022

loops apache-spark pyspark apache-spark-sql pyspark-sql

Python: send CSV data to Spark Streaming

Nov 02, 2022

python sockets apache-spark streaming

Scala Spark - creating nested json output from simple dataframe

Oct 30, 2022

json apache-spark apache-spark-sql spark-dataframe

Dynamic Set Algebra on Spark

Nov 01, 2022

scala apache-spark set pyspark boolean-expression

Multiprocessing a list of RDDs

Nov 01, 2022

python apache-spark pyspark list-comprehension

How to query on data frame where 1 field of StringType has json value in Spark SQL

Nov 01, 2022

json scala apache-spark apache-spark-sql

SPARK Exception thrown in awaitResult

Nov 01, 2022

sql join apache-spark

Elasticsearch-Hadoop library cannot connect to Docker container

Nov 01, 2022

scala elasticsearch apache-spark docker elasticsearch-hadoop

What are the mandatory options for loading Excel file?

Jun 10, 2021

excel scala apache-spark apache-spark-sql spark-excel
