New posts in apache-spark
ModuleNotFoundError in PySpark Worker on rdd.collect()
May 26, 2022
python
apache-spark
pyspark
pyspark-sql
RuntimeError: Unsupported type in conversion to Arrow: VectorUDT
Jan 24, 2022
pandas
apache-spark
dataframe
pyspark
pyarrow
How to print the decision path / rules used to predict sample of a specific row in PySpark?
Sep 05, 2021
apache-spark
pyspark
apache-spark-ml
Table loaded through Spark not accessible in Hive
Dec 15, 2018
apache-spark
hadoop
hive
pyspark
hortonworks-data-platform
pyspark: Method isBarrier([]) does not exist
Mar 25, 2022
python
apache-spark
pyspark
PySpark error: AnalysisException: 'Cannot resolve column name
Oct 16, 2022
apache-spark
exception
pyspark
What problems can arise from a Spark non-deterministic Pandas UDF
Oct 23, 2022
python
pandas
apache-spark
pyspark
apache-spark-sql
AttributeError: 'AioClientCreator' object has no attribute '_register_lazy_block_unknown_fips_pseudo_regions'
Oct 04, 2022
python
python-3.x
amazon-web-services
apache-spark
amazon-s3
How to bundle many files in S3 using Spark
Jun 08, 2022
scala
hadoop
amazon-s3
apache-spark
Spark groupBy OutOfMemory woes
Jan 30, 2019
apache-spark
How to set the number of partitions for newAPIHadoopFile?
Nov 08, 2022
hadoop
apache-spark
How to make Spark Streaming (Spark 1.0.0) read the latest data from Kafka (Kafka Broker 0.8.1)
Apr 04, 2022
apache-spark
apache-kafka
spark-streaming
offset
kafka-consumer-api
Cannot deploy local Spark job, worker fails with EndPointAssociationError
Jan 13, 2020
scala
akka
apache-spark
How to configure automatic restart of the application driver on Yarn
Oct 28, 2022
apache-spark
hadoop-yarn
spark-streaming
Derby version mismatch between Spark and Hive : Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Nov 04, 2022
apache-spark
apache-spark-sql
Spark executor lost because of a timeout even after setting a long timeout value of 1000 seconds
Oct 18, 2022
apache-spark
Run 3000+ Random Forest Models By Group Using Spark MLlib Scala API
Aug 05, 2021
r
scala
apache-spark
apache-spark-mllib
Understanding treeReduce() in Spark
Mar 01, 2022
python
apache-spark
pyspark
rdd
reduce
Find name of currently running SparkContext
Aug 27, 2022
scala
apache-spark
What does the Spark UI light blue part of Tasks progress bar indicate?
Sep 15, 2021
user-interface
hadoop
apache-spark