I sometimes see the following error message when running Spark jobs:
13/10/21 21:27:35 INFO cluster.ClusterTaskSetManager: Loss was due to spark.SparkException: File ./someJar.jar exists and does not match contents of ...
What does this mean? How do I diagnose and fix this?
After digging around in the logs I also found "no space left on device" exceptions. When I then ran df -h and df -i on every node, I found a partition that was full. Interestingly, this partition does not appear to be used for data; it was only storing jars temporarily. Its name was something like /var/run or /run.
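For reference, here is a minimal sketch of how you could check disk space and inode usage across all workers at once (this assumes the standard Spark EC2 layout with passwordless SSH and a /root/spark/conf/slaves file; adjust paths to your deployment):
for SLAVE in $(cat /root/spark/conf/slaves) ; do echo "=== $SLAVE ===" ; ssh $SLAVE 'df -h ; df -i' ; done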
The solution was to clean old files off that partition and to set up some automated cleaning. I think setting spark.cleaner.ttl to, say, one day (86400 seconds) should prevent it from happening again.
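For example, the TTL can be set cluster-wide or per job; the value is in seconds. This is a sketch assuming Spark 1.x with the usual conf/spark-defaults.conf location:
# In /root/spark/conf/spark-defaults.conf (cluster-wide, one day):
# spark.cleaner.ttl    86400
# Or per job on the command line:
spark-submit --conf spark.cleaner.ttl=86400 --class com.example.YourJob yourJob.jar
(The class and jar names above are placeholders, not anything from the original question.)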
Running on AWS EC2 I periodically encounter disk space issues, even after setting spark.cleaner.ttl to a few hours (we iterate quickly). I decided to solve them by moving the /root/spark/work directory onto the instance's mounted ephemeral disk (I'm using r3.large instances, which have a 32 GB ephemeral disk at /mnt):
readonly HOST=some-ec2-hostname-here
ssh -t root@$HOST spark/sbin/stop-all.sh
ssh -t root@$HOST "for SLAVE in \$(cat /root/spark/conf/slaves) ; do ssh \$SLAVE 'rm -rf /root/spark/work && mkdir /mnt/work && ln -s /mnt/work /root/spark/work' ; done"
ssh -t root@$HOST spark/sbin/start-all.sh
As far as I can tell, as of Spark 1.5 the work directory still does not use the mounted ephemeral storage by default. I haven't tinkered with the deployment settings enough to see whether this is even configurable.
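For what it's worth, a possible alternative to the symlink approach is to point the standalone worker at the ephemeral disk directly and let it clean up after itself. This is only a sketch based on the standalone-mode options documented for Spark 1.x (SPARK_WORKER_DIR and the spark.worker.cleanup.* properties); I haven't verified it against the EC2 scripts:
# In /root/spark/conf/spark-env.sh on every node, then restart the cluster:
export SPARK_WORKER_DIR=/mnt/work
# Optionally have the worker purge old application directories itself (TTL in seconds):
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=86400"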