New posts in apache-spark
Must Spark Streaming finish processing the previous batch of data before it can process the next batch?
Feb 25, 2026
apache-spark
spark-streaming
Programmatically reduce log in a spark shell
Feb 25, 2026
scala
shell
apache-spark
get multiple columns within a map: rdd
Feb 25, 2026
scala
apache-spark
rdd
Python Spark How to find cumulative sum by group using RDD API
Feb 25, 2026
python
apache-spark
pyspark
rdd
Creating a new scala class that relies on GraphFrames without serialization issues
Feb 24, 2026
scala
apache-spark
apache-spark-sql
Spark OutOfMemoryError
Feb 24, 2026
apache-spark
Spark partition by key [duplicate]
Feb 24, 2026
apache-spark
rdd
partitioning
How to find position of substring column in another column using PySpark?
Feb 24, 2026
apache-spark
pyspark
apache-spark-sql
Spark Scala scala.util.control.Exception catching and dropping None in map
Feb 24, 2026
scala
exception
apache-spark
rdd
Can Spark in Foundry use Partition Pruning
Feb 23, 2026
apache-spark
palantir-foundry
Is this a suitable way to implement a lazy `take` on RDD?
Feb 23, 2026
scala
apache-spark
How to List Iceberg Tables in a Catalog
Feb 23, 2026
apache-spark
aws-glue
apache-iceberg
Google Cloud Dataproc Serverless (batch) PySpark reads Parquet file from Google Cloud Storage (GCS) very slowly
Feb 22, 2026
apache-spark
google-cloud-platform
google-cloud-storage
google-cloud-dataproc
google-cloud-dataproc-serverless