apache-spark-sql tutorials

Is there a Spark SQL JDBC driver?

Feb 22, 2022

Spark job keeps showing TaskCommitDenied (Driver denied task commit)

Jul 17, 2019

apache-spark pyspark apache-spark-sql pyspark-sql apache-spark-2.0

How to calculate lag difference in Spark Structured Streaming?

Nov 17, 2022

apache-spark pyspark apache-spark-sql spark-structured-streaming

How do I upsert into HDFS with spark?

Sep 21, 2022

apache-spark apache-spark-sql hdfs bigdata

Select specific columns in a PySpark dataframe to improve performance

Nov 17, 2022

apache-spark pyspark apache-spark-sql

Quarter to date growth

Sep 08, 2022

python-3.x apache-spark pyspark apache-spark-sql

How to read and write multiple tables in parallel in Spark?

Oct 23, 2022

scala parallel-processing apache-spark apache-spark-sql

Best approach to check if Spark streaming jobs are hanging

Jan 04, 2022

apache-spark apache-spark-sql bigdata spark-streaming

How to run inference of a PyTorch model on a PySpark DataFrame (create new column with prediction) using pandas_udf?

Oct 30, 2022

pandas apache-spark pyspark apache-spark-sql pytorch

Saving a >>25T SchemaRDD in Parquet format on S3

Feb 24, 2019

amazon-s3 apache-spark parquet apache-spark-sql

Spark - Shuffle Read Blocked Time

Nov 15, 2022

apache-spark pyspark apache-spark-sql

DataFrame partitionBy on nested columns

Sep 12, 2022

apache-spark apache-spark-sql spark-dataframe

Divide elements of column by a sum of elements (of same column) grouped by elements of another column

May 22, 2022

scala apache-spark apache-spark-sql

Implementing MERGE INTO SQL in PySpark

Oct 14, 2022

sql merge pyspark apache-spark-sql

TypeError: 'JavaPackage' object is not callable

May 25, 2021

apache-spark pyspark apache-spark-sql

Spark pulling data into RDD or dataframe or dataset

Dec 17, 2020

hadoop apache-spark apache-spark-sql spark-dataframe data-ingestion

Is there any way to get the output of Spark's Dataset.show() method as a string?

Oct 26, 2022

apache-spark apache-spark-sql

UDF causes warning: CachedKafkaConsumer is not running in UninterruptibleThread (KAFKA-1894)

Oct 25, 2022

apache-spark pyspark apache-kafka apache-spark-sql spark-streaming

Does Spark support BigInteger type?

Aug 23, 2019

java scala apache-spark apache-spark-sql

Spark: Prevent shuffle/exchange when joining two identically partitioned dataframes

Mar 04, 2022

apache-spark join pyspark apache-spark-sql pyspark-dataframes
