New posts in apache-spark-sql

Spark : Size exceeds Integer.MAX_VALUE When Joining 2 Large DFs

Changing column data type to factor with sparklyr

How to add jdbc drivers to classpath when using PySpark?

pyspark apache-spark-sql

When to execute REFRESH TABLE my_table in spark?

PySpark.sql.filter not performing as it should

What problems can arise from a Spark non-deterministic Pandas UDF

Derby version mismatch between Spark and Hive : Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Spark SQL package not found

Re-using A Schema from JSON within a Spark DataFrame using Scala

How to do non-random Dataset splitting on Apache Spark?

How to find first non-null values in groups? (secondary sorting using dataset api)

Can we use multiple SparkSessions to access two different Hive servers?

Does Spark do one pass through the data for multiple withColumn?

java.lang.AssertionError: assertion failed: No plan for HiveTableRelation

Spark : Union can only be performed on tables with the compatible column types. Struct<name,id> != Struct<id,name>

How to use transform higher-order function?

Why is Scala's Symbol not accepted as a column reference?

scala apache-spark-sql

Zeppelin SqlContext registerTempTable issue

Saving / exporting transformed DataFrame back to JDBC / MySQL

Spark - How can I get the Logical / Physical Query execution using Thrift - Hive Interactor