New posts in apache-spark-sql

Filtering a spark dataframe based on date

Reading csv files with quoted fields containing embedded commas

Applying UDFs on GroupedData in PySpark (with functioning python example)

DataFrame equality in Apache Spark

GroupBy column and filter rows with maximum value in Pyspark

How to get other columns when using Spark DataFrame groupby?

Fetching distinct values on a column using Spark DataFrame

How to convert DataFrame to RDD in Scala?

get specific row from spark dataframe

Spark - extracting single value from DataFrame

Why Spark SQL considers the support of indexes unimportant?

Get the size/length of an array column

What is the meaning of partitionColumn, lowerBound, upperBound, numPartitions parameters?

What is the difference between Apache Spark SQLContext vs HiveContext?

Spark extracting values from a Row

Joining Spark dataframes on the key

Difference between DataSet API and DataFrame API

Flattening Rows in Spark

dataframe: how to groupBy/count then filter on count in Scala

Spark Window Functions - rangeBetween dates