apache-spark tutorials and guides

Issues with Scala ScriptEngine inside spark submit application

Mar 10, 2026

Delta Lake partitioning strategy for event data

Mar 10, 2026

apache-spark databricks partitioning delta-lake

Type checking on user input Scala Spark

Mar 10, 2026

scala apache-spark typechecking

What is the Master URL in pyspark?

Mar 10, 2026

python apache-spark

How to read sequence files exported from HBase

Mar 10, 2026

apache-spark export hbase sequence pyspark

spark kafka security kerberos

Mar 10, 2026

security apache-spark apache-kafka

Spark: udf to get dirname from path

Mar 09, 2026

scala apache-spark

How to convert spark dataset to scala seq

Mar 10, 2026

scala apache-spark scala-collections apache-spark-dataset

Is it possible to change a column name in Spark SQL in Hive?

Mar 09, 2026

sql apache-spark hive

Spark HiveContext : Insert Overwrite the same table it is read from

Mar 10, 2026

apache-spark hive pyspark hivecontext

Read spark dataset only first n columns

Mar 09, 2026

apache-spark apache-spark-sql

Spark job optimization: Is there a way to tune a Spark job that has too many joins?

Mar 09, 2026

apache-spark apache-spark-sql

No Module Named 'delta.tables'

Mar 09, 2026

python apache-spark pyspark delta-lake

Pyspark write to External Hive table in S3 is not parallel

Mar 09, 2026

apache-spark amazon-s3 hive pyspark emr

Does Spark benefit from `sortBy` in persistent table?

Mar 07, 2026

apache-spark pyspark apache-spark-sql

How to enable Catalyst Query Optimiser in Spark SQL?

Mar 09, 2026

apache-spark query-optimization apache-spark-sql

Spark count number of words within group by

Mar 09, 2026

sql scala apache-spark apache-spark-sql apache-spark-dataset

Databricks - Create Function (UDF) in Python

Mar 08, 2026

python apache-spark databricks
