Why Do We Need Apache Spark?
Impediments of Hadoop

Spark was developed to overcome the limitations of Hadoop MapReduce. We should look at Spark as an alternative to Hadoop MapReduce rather than a replacement for Hadoop, for the reasons below:

- Intermediate result handling
- Processing technique - near real-time
- Polyglot
- Scalable deployment and storage
- Powerful caching and good speed (in-memory)
- Parallelism
- Lazy evaluation (see the sketch at the end of this section)

Advantage of Spark over Other Frameworks

In-Memory Processing - Spark is often many times faster than other processing engines such as MapReduce and Tez because it keeps data in memory (RAM) and processes it there. Iterative algorithms run faster because intermediate results stay in memory and data is not written to disk between jobs, whereas Hadoop MapReduce writes intermediate results to disk. (A caching sketch follows at the end of this section.)

Processing Technique

Sequential processing is the FIFO technique used before RDBMS systems were introduced: data/jobs are processed one after another in order. Random processing is what an RDBMS does. It will...
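As a rough sketch of the in-memory processing point above, the Scala snippet below caches an RDD so that repeated actions reuse the in-memory copy instead of recomputing it (or, in Hadoop MapReduce, re-reading intermediate results from disk). The local master setting and the generated numbers dataset are illustrative assumptions, not taken from these notes.

```scala
import org.apache.spark.sql.SparkSession

object CachingSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would run on a cluster.
    val spark = SparkSession.builder()
      .appName("caching-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical input: a generated range standing in for a real dataset.
    val numbers = sc.parallelize(1 to 1000000)

    // cache() marks the RDD to be kept in memory after its first computation.
    val squared = numbers.map(x => x.toLong * x).cache()

    // Both actions below reuse the cached, in-memory data instead of
    // recomputing the map step from scratch.
    val total = squared.reduce(_ + _)
    val count = squared.count()

    println(s"sum of squares = $total over $count elements")
    spark.stop()
  }
}
```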
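Lazy evaluation can be sketched the same way: transformations such as filter and map only record a lineage, and nothing is read or processed until an action is called. The input path data/input.txt is a placeholder assumption.

```scala
import org.apache.spark.sql.SparkSession

object LazyEvaluationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lazy-evaluation-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Placeholder path; point this at a real file to run end to end.
    val lines = sc.textFile("data/input.txt")

    // These transformations are lazy: no data is touched yet,
    // Spark only records the lineage of operations.
    val errors = lines.filter(_.contains("ERROR"))
    val messages = errors.map(_.split("\t").last)

    // The action below triggers execution of the whole pipeline.
    println(s"error lines: ${messages.count()}")

    spark.stop()
  }
}
```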