New posts in apache-spark
scala: Handle tuple where second element of tuple is an array of strings
Dec 15, 2025
scala
apache-spark
rdd
Spark Thrift Server uses as many worker threads as are available
Dec 15, 2025
java
apache-spark
thrift
Save Spark RDD to Hive Table
Dec 14, 2025
hadoop
apache-spark
apache-spark-sql
Create a Spark DataFrame from a nested JSON file in Scala [duplicate]
Dec 14, 2025
scala
apache-spark
dataframe
nested
apache-spark-sql
How to avoid continuous "Resetting offset" and "Seeking to LATEST offset"?
Dec 14, 2025
java
apache-spark
apache-kafka
spark-structured-streaming
Spark aggregations where output columns are functions and rows are columns
Dec 14, 2025
python
apache-spark
apache-spark-sql
pyspark
AnalysisException: Found duplicate column(s) in the data to save
Dec 14, 2025
apache-spark
pyspark
apache-spark-sql
databricks
How can I read LIBSVM models (saved using LIBSVM) into PySpark?
Dec 14, 2025
apache-spark
pyspark
libsvm
apache-spark-ml
Scala to Java 8 MLeap Translation
Dec 14, 2025
apache-spark
machine-learning
real-time
apache-spark-mllib
mleap
ERROR AzureNativeFileSystemStore: DirectoryIsNotEmpty
Dec 13, 2025
scala
azure
apache-spark
hadoop
azure-hdinsight
How can I distribute my task to all worker nodes in GCP? I am using PySpark
Dec 11, 2025
python
apache-spark
google-cloud-platform
pyspark
google-cloud-dataproc