apache-spark tutorials and guides

How to load Tuple from Cassandra table?

Feb 14, 2022

apache-spark apache-spark-sql spark-cassandra-connector

Spark ML VectorAssembler() dealing with thousands of columns in dataframe

Feb 27, 2020

scala apache-spark classification pipeline

Finding connected components of a particular node instead of the whole graph (GraphFrame/GraphX)

Nov 17, 2022

apache-spark spark-dataframe spark-graphx graphframes

Filter pushdown using Spark SQL on a map-type column in Parquet

Nov 11, 2022

dictionary apache-spark predicate parquet

How to save a file in Feather format/storage from Spark?

Jul 04, 2017

pandas apache-spark dataframe spark-dataframe feather

Pyspark Column.isin() for a large set

May 02, 2022

python-3.x apache-spark ipython pyspark

Running spark-submit on YARN is imbalanced (only 1 node is working)

Oct 23, 2022

hadoop apache-spark cluster-computing hadoop-yarn

Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/spark/Logging

May 05, 2021

apache-spark

Real-time analysis of event logs with Elasticsearch

Sep 08, 2022

hadoop elasticsearch apache-spark machine-learning lambda-architecture

Apache Spark Maven dependencies for developing and releasing an app

Feb 27, 2022

java maven apache-spark

How to implement Stanford CoreNLP wrapper for Apache Spark using sparklyr?

Apr 21, 2021

r apache-spark stanford-nlp sparklyr

Using Pycuda with PySpark - nvcc not found

May 08, 2022

apache-spark pyspark pycuda

Spark UI DAG stage disconnected

Sep 09, 2022

scala apache-spark

Large scheduler delay in Apache Spark tasks using deploy mode cluster

May 23, 2022

apache-spark cluster-computing scheduler

Spark HashingTF result explanation

Nov 23, 2021

scala apache-spark apache-spark-mllib apache-spark-ml

About a java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy

Sep 16, 2021

scala apache-spark snappy

Cosine similarity of word2vec more than 1

Nov 09, 2022

python apache-spark pyspark

How to write a PySpark DataFrame with null values to CSV

Sep 10, 2019

python apache-spark pyspark

Spark master memory requirements related to data size

Oct 25, 2022

apache-spark

How to join two Spark Datasets into one with Java objects?

Sep 05, 2022

java apache-spark apache-spark-dataset apache-spark-encoders
