apache-spark-sql tutorials

Spark SQL 1.5 build failure

Sep 15, 2022

How to get an Iterator of Rows using DataFrame in Spark SQL

Aug 31, 2022

apache-spark apache-spark-sql

How to perform "Lookup" operation on Spark dataframes given multiple conditions

Nov 02, 2022

scala apache-spark dataframe apache-spark-sql lookup

Use the result from crosstab (Spark DataFrame) for chi-square test in Spark MLlib

Oct 18, 2020

python apache-spark pyspark apache-spark-sql apache-spark-mllib

Zeppelin - Cannot query with %sql a table I registered with pyspark

Jun 10, 2022

apache-spark pyspark apache-spark-sql apache-zeppelin

Bulk data migration through Spark SQL

Dec 22, 2019

apache-spark apache-spark-sql spark-dataframe

SparkSQL on HBase Tables

May 08, 2022

apache-spark hadoop apache-spark-sql hbase

Spark: Size exceeds Integer.MAX_VALUE when joining 2 large DFs

Mar 30, 2021

scala apache-spark apache-spark-sql

Changing column data type to factor with sparklyr

Sep 05, 2022

r apache-spark dplyr apache-spark-sql sparklyr

How to add jdbc drivers to classpath when using PySpark?

Aug 23, 2022

pyspark apache-spark-sql

When to execute REFRESH TABLE my_table in spark?

Oct 26, 2022

apache-spark hive apache-spark-sql

PySpark.sql.filter not performing as it should

May 15, 2022

python-2.7 apache-spark pyspark apache-spark-sql spark-dataframe

What problems can arise from a Spark non-deterministic Pandas UDF

Oct 23, 2022

python pandas apache-spark pyspark apache-spark-sql

Derby version mismatch between Spark and Hive : Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Nov 04, 2022

apache-spark apache-spark-sql

Spark SQL package not found

Dec 08, 2018

java maven apache-spark apache-spark-sql

Re-using A Schema from JSON within a Spark DataFrame using Scala

Mar 09, 2022

json scala apache-spark apache-spark-sql

How to do non-random Dataset splitting on Apache Spark?

Jun 06, 2022

apache-spark apache-spark-sql apache-spark-dataset apache-spark-2.0

How to find first non-null values in groups? (secondary sorting using dataset api)

Feb 06, 2022

apache-spark apache-spark-sql apache-spark-dataset

Can we use multiple SparkSessions to access two different Hive servers?

Sep 08, 2022

scala apache-spark hive apache-spark-sql

Does Spark do one pass through the data for multiple withColumn?

Oct 20, 2022

scala apache-spark apache-spark-sql
