
New posts in apache-spark

How to fix "A protocol message was rejected because it was too big" from Google Protobuf in Spark on Mesos?

How do I get a PySpark DataFrame made using HiveContext in Spark 1.5.2?

Integrating Spark SQL and Apache Drill through JDBC

How to load Tuple from Cassandra table?

Spark ML VectorAssembler() dealing with thousands of columns in dataframe

Finding connected components of a particular node instead of the whole graph (GraphFrame/GraphX)

Filter pushdown using spark-sql on a map-type column in Parquet

How to save file in Feather format\storage from Spark?

Pyspark Column.isin() for a large set

Running spark-submit on YARN gives an imbalanced load (only 1 node is working)

Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/spark/Logging

apache-spark

Real-time analysis of event logs with Elasticsearch

Apache Spark Maven Dependencies for release and develop an app

java maven apache-spark

How to implement Stanford CoreNLP wrapper for Apache Spark using sparklyr?

Using Pycuda with PySpark - nvcc not found

apache-spark pyspark pycuda

Spark UI DAG stage disconnected

scala apache-spark

Large scheduler delay in Apache Spark tasks using deploy mode cluster

Spark HashingTF result explanation

About a java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy

scala apache-spark snappy