
New posts in apache-spark-sql

Re-using A Schema from JSON within a Spark DataFrame using Scala

How to do non-random Dataset splitting on Apache Spark?

How to find first non-null values in groups? (secondary sorting using dataset api)

Can we use multiple SparkSessions to access two different Hive servers?

Does Spark do one pass through the data for multiple withColumn?

java.lang.AssertionError: assertion failed: No plan for HiveTableRelation

Spark : Union can only be performed on tables with the compatible column types. Struct<name,id> != Struct<id,name>

How to use transform higher-order function?

Why is Scala's Symbol not accepted as a column reference?

Zeppelin SqlContext registerTempTable issue

Saving / exporting transformed DataFrame back to JDBC / MySQL

Spark - How can I get the Logical / Physical Query execution using - Thrift - Hive Interactor

Spark DataFrame not respecting schema and considering everything as String

Spark: Is there any rule of thumb about the optimal number of partitions of an RDD and its number of elements?

Spark sql top n per group

Add months to date column in Spark dataframe

How to select multiple columns of dataset, given a list of column names?

Spark decimal type precision loss

How to find the max String length of a column in Spark using dataframe?

Spark: How to aggregate/reduce records based on time difference?