New posts in apache-spark

Remove all records which are duplicate in spark dataframe

Apache Spark and Java error - Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2

Unzip folder stored in Azure Databricks FileStore

Java - Spark SQL DataFrame map function is not working

How do I register a function to sqlContext UDF in scala?

Why is the fold action necessary in Spark?

Spark saveAsTextFile() writes to multiple files instead of one

Creating a SparkSQL UDF in Java outside of SQLContext

Extract date from a string column containing timestamp in Pyspark

Spark DataFrames when udf functions do not accept large enough input variables

How to pass a list of paths to spark.read.load?

How can I use graphframes with pyspark on AWS EMR?

Save Spark Dataframe into Elasticsearch - Can’t handle type exception

How to iterate records spark scala?

Spark SQL performance - JOIN on value BETWEEN min and max

Cannot create dataframe from list: pyspark

How to modify a column value in a row of a spark dataframe?

UDF to extract only the file name from path in Spark SQL

How to find mean of grouped Vector columns in Spark SQL?

Converting dataframe columns into list of tuples