New posts in apache-spark
How to calculate sum and count in a single groupBy?
Sep 06, 2022
scala
apache-spark
apache-spark-sql
How to create a udf in PySpark which returns an array of strings?
Jan 16, 2022
python
apache-spark
pyspark
apache-spark-sql
user-defined-functions
Why does starting StreamingContext fail with “IllegalArgumentException: requirement failed: No output operations registered, so nothing to execute”?
Jan 04, 2019
java
apache-spark
spark-streaming
Rolling your own reduceByKey in Spark Dataset
Sep 06, 2022
scala
apache-spark
mapreduce
In Apache Spark, why does RDD.union not preserve the partitioner?
Sep 06, 2022
apache-spark
partitioning
hadoop-partitioning
PySpark and broadcast join example
Sep 06, 2022
python
apache-spark
apache-spark-sql
pyspark
Spark union column order
Sep 28, 2022
apache-spark
pyspark
apache-spark-sql
pyspark-sql
How to find Spark's installation directory?
Sep 30, 2022
java
ubuntu
apache-spark
Join two ordinary RDDs with/without Spark SQL
Sep 05, 2022
scala
join
apache-spark
rdd
apache-spark-sql
Multiple condition filter on dataframe
Sep 05, 2022
python
apache-spark
dataframe
pyspark
apache-spark-sql
Left Anti join in Spark?
Sep 05, 2022
scala
apache-spark
SQL query in Spark/Scala: Size exceeds Integer.MAX_VALUE
Jan 20, 2022
sql
apache-spark
amazon-ec2
emr
Why does Spark application fail with “ClassNotFoundException: Failed to find data source: kafka” as uber-jar with sbt assembly?
Oct 19, 2022
scala
apache-spark
sbt
sbt-assembly
spark-structured-streaming
Is it possible to alias columns programmatically in spark sql?
Sep 05, 2022
scala
apache-spark
apache-spark-sql
How to add any new library like spark-csv in Apache Spark prebuilt version
Sep 05, 2022
python
apache-spark
apache-spark-sql
PySpark: modify column values when another column value satisfies a condition
Sep 05, 2022
apache-spark
pyspark
apache-spark-sql
Environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON
Sep 05, 2022
python
python-3.x
apache-spark
pyspark
How to write the resulting RDD to a CSV file in Spark Python
Sep 19, 2022
python
csv
apache-spark
pyspark
file-writing
How to configure high performance BLAS/LAPACK for Breeze on Amazon EMR, EC2
Sep 05, 2022
apache-spark
amazon-ec2
amazon-emr
scala-breeze
jblas
How does Spark running on YARN account for Python memory usage?
Sep 05, 2022
python
apache-spark
hadoop
pyspark
hadoop-yarn