I have a simple Spark app that reads some data, computes some metrics, and then saves the result (the input and output are Cassandra tables). This piece of code runs at regular intervals (i.e., every minute).
I have a Cassandra/Spark cluster (Spark 1.6.1), and after a few minutes the temporary directory on the master node of the Spark cluster fills up, and the master refuses to run any more jobs. I am submitting the job with spark-submit.
What am I missing? How do I make sure that the master node removes the temporary folder?
Spark uses that directory as scratch space and writes temporary map output files there. The location can be changed; take a look at the spark.local.dir setting.
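For example, here is a minimal sketch of pointing the scratch space at a larger disk; the path /mnt/big-disk/spark-tmp and the app name are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Redirect Spark's scratch space (shuffle/map output spill files)
// to a volume with enough room. The path below is a placeholder.
val conf = new SparkConf()
  .setAppName("metrics-job") // placeholder name
  .set("spark.local.dir", "/mnt/big-disk/spark-tmp")

val sc = new SparkContext(conf)
```

The same setting can be passed on the command line with `spark-submit --conf spark.local.dir=/mnt/big-disk/spark-tmp ...`, and it accepts a comma-separated list of directories if you want to spread the I/O across several disks. Note that in cluster deployments this value may be overridden by the cluster manager's environment (e.g. SPARK_LOCAL_DIRS in standalone mode), so setting it in spark-defaults.conf or spark-env.sh on each node is often the more reliable option.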