In Spark, I know that errors are recovered by recomputing the RDDs, unless an RDD is cached. In that case, the computation can restart from the cached RDD.
My question is: how are errors recovered in MapReduce frameworks (such as Apache Hadoop)? Say a failure occurred in the shuffle phase (that is, after map and before reduce). How would it be recovered? Would the map step be performed again? Is there any stage in MapReduce where output is stored in HDFS, so that computation can restart only from there? And what about a Map after Map-Reduce? Is the output of reduce stored in HDFS?
MapReduce handles task failures by restarting the failed task and re-computing all input data from scratch, regardless of how much data had already been processed.
Failure of the jobtracker is the most serious failure mode. Classic Hadoop MapReduce (MRv1) has no mechanism for dealing with failure of the jobtracker (it is a single point of failure), so in this case the job fails.
What happens when the node running the map task fails before the map output has been sent to the reducer? In this case, the map task is assigned to a new node and the whole task is run again to re-create the map output.
Given that failures are common at large scale, these frameworks exhibit some fault tolerance and dependability techniques as built-in features. In particular, MapReduce frameworks tolerate machine failures (crash failures) by re-executing all the tasks of the failed machine, which is possible by virtue of data replication.
What you are referring to is classified as a task failure, which could be the failure of either a map task or a reduce task.
In case of a particular task failure, Hadoop schedules the failed map or reduce task on another computational resource.
A failure in the shuffle and sort process is basically a failure of the reduce task on that particular node, and the task is set to run afresh on another resource (btw, the reduce phase begins with the shuffle and sort process).
Of course, it would not keep re-allocating tasks indefinitely if they keep failing. Two properties determine how many attempts of a task are acceptable:
mapred.map.max.attempts for map tasks and mapred.reduce.max.attempts for reduce tasks (in newer Hadoop releases these are named mapreduce.map.maxattempts and mapreduce.reduce.maxattempts).
By default, if any task fails four times (or whatever you configure in those properties), the whole job is considered failed. - Hadoop: The Definitive Guide
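The attempt limit described above can be sketched in a few lines of Python. This is only an illustration of the retry policy, not Hadoop's scheduler: `run_with_retries`, `flaky_task`, `JobFailed`, and `MAX_ATTEMPTS` are hypothetical names made up for this example.

```python
MAX_ATTEMPTS = 4  # hypothetical constant mirroring the default of four attempts


class JobFailed(Exception):
    """Raised when one task has used up all of its attempts."""


def run_with_retries(task, max_attempts=MAX_ATTEMPTS):
    """Run `task` until it succeeds or `max_attempts` attempts have failed.

    Returns (result, attempt_number) on success.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(attempt), attempt
        except RuntimeError:
            # A real framework would reschedule the attempt on another node here.
            continue
    raise JobFailed(f"task failed {max_attempts} times, failing the whole job")


def flaky_task(fail_first_n):
    """Build a toy task that fails on its first `fail_first_n` attempts."""
    def task(attempt):
        if attempt <= fail_first_n:
            raise RuntimeError(f"attempt {attempt} failed")
        return f"done on attempt {attempt}"
    return task


if __name__ == "__main__":
    print(run_with_retries(flaky_task(2)))  # succeeds on the third attempt
```

A task that fails on all four attempts makes `run_with_retries` raise `JobFailed`, which corresponds to the whole job being marked as failed.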
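In short, since shuffle and sort is part of the reduce task, a failure there only triggers a new attempt of the reduce task. Map tasks are not re-run, as they are considered completed (unless the node holding their output has itself failed before the output was fetched).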
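To make the distinction concrete, here is a toy Python simulation. It is a sketch under simplifying assumptions, not how Hadoop is implemented: `MiniJob` and its methods are hypothetical, a dictionary stands in for map output kept on a node's local disk, and a single reducer handles all keys.

```python
from collections import defaultdict


class MiniJob:
    """Toy word-count job that tracks how often map tasks are executed."""

    def __init__(self, splits):
        self.splits = splits
        self.map_outputs = {}  # split index -> (key, value) pairs; stand-in for local disk
        self.map_runs = 0

    def run_map(self, i):
        self.map_runs += 1
        self.map_outputs[i] = [(word, 1) for word in self.splits[i].split()]

    def lose_map_output(self, i):
        """Simulate losing the node that held the output of map task i."""
        self.map_outputs.pop(i, None)

    def run_reduce(self, fail_in_shuffle=False):
        # Shuffle: fetch every map output; re-run a map only if its output is gone.
        grouped = defaultdict(list)
        for i in range(len(self.splits)):
            if i not in self.map_outputs:
                self.run_map(i)
            for key, value in self.map_outputs[i]:
                grouped[key].append(value)
        if fail_in_shuffle:
            raise RuntimeError("reduce attempt failed during shuffle")
        return {key: sum(values) for key, values in grouped.items()}


def demo():
    job = MiniJob(["a b a", "b c"])
    for i in range(len(job.splits)):
        job.run_map(i)
    try:
        job.run_reduce(fail_in_shuffle=True)  # first reduce attempt dies in shuffle
    except RuntimeError:
        pass
    result = job.run_reduce()                 # retry reuses the existing map outputs
    runs_after_retry = job.map_runs
    job.lose_map_output(0)                    # node holding map output 0 is lost
    job.run_reduce()                          # only map task 0 has to run again
    return result, runs_after_retry, job.map_runs
```

In `demo()`, the map run count stays at 2 across the failed and retried reduce attempts, and rises to 3 only after the output of map task 0 is lost.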
Is there any stage in MapReduce where output is stored in the HDFS, so that computation can restart only from there?
Only the final output is stored in HDFS. Map outputs are classified as intermediate and are not stored in HDFS; they are kept on the local disk of the node that ran the map task. HDFS would replicate any data stored in it, and there is little reason to have HDFS manage intermediate data that is of no use once the job has completed. Cleaning it up would add overhead as well. Hence map outputs are not stored in HDFS.
And what about a Map after Map-Reduce. Is output of reduce stored in HDFS?
The output of the reducer is stored in HDFS. For Map, I hope the above description suffices.
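For the "Map after Map-Reduce" case, the practical consequence is that a follow-up job reads the first job's reducer output from HDFS, so if the follow-up job fails it can be restarted without redoing the first job. A minimal Python sketch, assuming a plain dictionary as a stand-in for HDFS (the `hdfs` dict, the paths, and both job functions are hypothetical, not a real HDFS client):

```python
from collections import Counter

hdfs = {}  # path -> list of records; stand-in for durable HDFS storage


def word_count_job(input_path, output_path):
    """First job: map and reduce collapsed into one step for brevity."""
    counts = Counter()
    for line in hdfs[input_path]:
        counts.update(line.split())
    hdfs[output_path] = sorted(counts.items())  # reducer output lands in "HDFS"


def filter_job(input_path, output_path, min_count):
    """Second job: a map-only step that reads the first job's stored output."""
    hdfs[output_path] = [(w, c) for w, c in hdfs[input_path] if c >= min_count]


if __name__ == "__main__":
    hdfs["/input"] = ["a b a", "b c"]
    word_count_job("/input", "/job1-out")
    # If filter_job failed, it could simply be re-run from /job1-out;
    # word_count_job would not need to run again.
    filter_job("/job1-out", "/job2-out", min_count=2)
    print(hdfs["/job2-out"])
```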