apache-spark tutorials and guides

How to query when connecting MongoDB with Apache Spark

Sep 24, 2022

mongodb hadoop apache-spark

Hadoop DistributedCache functionality in Spark

Aug 30, 2022

hadoop apache-spark distribute distributed-cache

Merge more than 32 files in Google Cloud Storage

Jan 06, 2020

google-cloud-storage apache-spark google-compute-engine

reduceByKey using Scala object as key

Dec 03, 2019

scala apache-spark reduce

Launching a Spark program using an Oozie workflow

Nov 18, 2022

scala apache-spark workflow oozie

Custom join with non-equal keys

Nov 10, 2021

join apache-spark

Ordering an RDD[String]

Aug 29, 2022

scala apache-spark

Apache Spark app workflow

Jun 24, 2022

apache-spark workflow

How to create collection of RDDs out of RDD?

Nov 13, 2022

scala apache-spark

How do I install Python libraries automatically on Dataproc cluster startup?

May 12, 2022

hadoop apache-spark google-cloud-platform google-cloud-dataproc

Spark Streaming on EC2: Exception in thread "main" java.lang.ExceptionInInitializerError

Feb 10, 2022

scala maven amazon-ec2 apache-spark spark-streaming

Spark: difference between Maven artifacts spark-core_2.10 and spark-core_2.11

Dec 11, 2020

maven apache-spark

Apache Spark: Driver (instead of just the Executors) tries to connect to Cassandra

Oct 26, 2022

scala apache-spark cassandra

Efficient grouping by key using mapPartitions or partitioner in Spark

Nov 13, 2022

apache-spark grouping partition

Multiple Spark Workers on Single Windows Machine

Feb 13, 2019

scala apache-spark cluster-computing

Creating an RDD to collect the results of an iterative calculation

Jun 10, 2022

scala apache-spark akka scalaz apache-spark-mllib

How to determine if object is a valid key-value pair in PySpark

Jul 14, 2019

python apache-spark pyspark key key-value

Apache Spark - Memory Exception Error - IntelliJ settings

Oct 29, 2022

java intellij-idea apache-spark jvm virtual-machine

"error: type mismatch" in Spark with same found and required datatypes

Aug 26, 2020

scala apache-spark spark-graphx

How is the Spark select-explode idiom implemented?

Jan 14, 2020

apache-spark apache-spark-sql
