
What is the difference between Hadoop and Spark? [closed]

As Spark is gaining traction in the market nowadays, I can see Spark's major use cases over Hadoop, such as:

  1. Iterative algorithms in machine learning
  2. Interactive data mining and data processing
  3. Spark is a fully Apache Hive-compatible data warehousing system that can run up to 100x faster than Hive.
  4. Stream processing: log processing and fraud detection in live streams for alerts, aggregates and analysis (a small streaming sketch follows this list)
  5. Sensor data processing: where data is fetched and joined from multiple sources, in-memory datasets are really helpful because they are easy and fast to process.
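To make use case 4 concrete, here is a minimal Spark Structured Streaming sketch. The socket source, host/port, and the "FRAUD" marker are assumptions for illustration only, not anything from the question:

```python
# Minimal sketch (assumed socket source and "FRAUD" marker): count suspicious
# events as they arrive from a live text stream, e.g. to drive alerts.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("stream-demo").getOrCreate()

# Read a live text stream from a socket (hypothetical source for this sketch)
events = (
    spark.readStream
    .format("socket")
    .option("host", "localhost")
    .option("port", 9999)
    .load()
)

# Flag suspicious events and keep a running count
alerts = events.filter(col("value").contains("FRAUD")).groupBy().count()

# Print the running count to the console on every update
query = (
    alerts.writeStream
    .outputMode("complete")
    .format("console")
    .start()
)
query.awaitTermination()
```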

My question is:

  1. Is Spark going to replace Hadoop in the coming days?
  2. Does Hadoop work concurrently while Spark runs in parallel? (Is that true?)
asked Oct 27 '25 by Roshan Bagdiya


2 Answers

Spark differs from Hadoop in the sense that it lets you integrate data ingestion, processing and real-time analytics in one tool. Moreover, Spark's map/reduce framework differs from standard Hadoop Map/Reduce because in Spark intermediate map/reduce results are cached, and an RDD (an abstraction for a fault-tolerant distributed collection) can be kept in memory when the same results need to be reused (iterative algorithms, group by, etc.).
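A minimal PySpark sketch of that caching point (the HDFS path and the "ERROR"/"timeout" filters are hypothetical): once an RDD is cached, several actions can reuse it without re-reading the input, which is what makes iterative workloads cheaper than re-running a full Map/Reduce pass each time.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cache-demo").getOrCreate()
sc = spark.sparkContext

# RDD: a fault-tolerant, distributed collection
lines = sc.textFile("hdfs:///data/events.log")          # hypothetical path
errors = lines.filter(lambda l: "ERROR" in l).cache()   # keep result in memory

# Both actions below reuse the cached RDD instead of re-scanning the file
print(errors.count())
print(errors.filter(lambda l: "timeout" in l).count())

spark.stop()
```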

My answer is really superficial and does not answer your question completely, but it points out some of the main differences (there are many more in reality). The official Spark and Databricks sites are well documented, and your question is already answered there:

https://databricks.com/spark/about

http://spark.apache.org/faq.html

answered Oct 29 '25 by eugenio calabrese


Hadoop today is a collection of technologies, but at its essence it is a distributed file system (HDFS) and a distributed resource manager (YARN). Spark is a distributed computational framework that is poised to replace Map/Reduce, another distributed computational framework that:

  1. used to be synonymous with Hadoop
  2. ships with Hadoop out of the box for backward compatibility (before YARN, the Map/Reduce framework also served as Hadoop's resource manager)

Specifically, Spark is not going to replace Hadoop, but it will probably replace Map/Reduce. Hadoop, Map/Reduce and Spark are all distributed systems (and all run in parallel).
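A minimal sketch of that layering (the paths and cluster settings are assumptions for illustration): HDFS provides storage, YARN provides resource management, and Spark takes over from Map/Reduce as the computation layer running on top of both.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("wordcount-on-yarn")
    .master("yarn")                      # schedule executors through YARN
    .getOrCreate()
)

# Read input from HDFS (Hadoop's storage layer), compute with Spark
counts = (
    spark.sparkContext.textFile("hdfs:///data/books/*.txt")  # hypothetical path
    .flatMap(lambda line: line.split())
    .map(lambda word: (word, 1))
    .reduceByKey(lambda a, b: a + b)
)

counts.saveAsTextFile("hdfs:///output/wordcount")  # write results back to HDFS
spark.stop()
```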

answered Oct 29 '25 by Arnon Rotem-Gal-Oz


