I think this article from O'Reilly does a great job of giving a short but sweet tour of this awesome Big Data architecture. Just understanding what Hadoop means deserves its own blog post. Enjoy reading.
I especially loved this line:
"That’s MapReduce: you map the operation out to all of those servers and then you reduce the results back into a single result set."
Well, I didn't really tell you why I loved the line above. I have always been confused by this fancy word, MapReduce, and wondered how it works. A few days back I read about Python's map and reduce functions and thought, "Hadoop's MapReduce can't be far from this," but I never dug deeper into the Hadoop side of the story. Now that I have read that line from the article, it all makes sense. We pass the functions and the data to the Hadoop framework; it distributes the work across various nodes (the mapping here), then collects the results from those nodes and aggregates them into the final output (the reducing here). The user of the Hadoop framework doesn't have to do the homework. They just need to hand the work to Hadoop and wait for the results. This is an oversimplification of what goes on in there, but I am writing down what I am thinking right now. I will surely come back and write more as I learn more about it. I like writing down my initial thoughts so that I can see how I have grown from knowing nothing to knowing something about the topic.
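To make the idea concrete for myself, here is a tiny sketch using plain Python's built-in map and functools.reduce. To be clear, this is not Hadoop's API and it runs on a single machine; it only mimics the shape of the idea. The sample chunks and the helper names count_words and merge_counts are made up for illustration, with each chunk standing in for the slice of data one node might hold.

```python
from functools import reduce

# Hypothetical sample data: each string stands in for the slice of data
# that one node in a cluster might hold.
chunks = [
    "big data is big",
    "hadoop handles big data",
]


def count_words(chunk):
    """Map step: count the words in a single chunk, independently of the others."""
    counts = {}
    for word in chunk.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial word counts into one."""
    merged = dict(left)
    for word, n in right.items():
        merged[word] = merged.get(word, 0) + n
    return merged


# "Map" the operation out over every chunk...
partial_counts = list(map(count_words, chunks))

# ...then "reduce" the partial results back into a single result set.
total_counts = reduce(merge_counts, partial_counts, {})

print(total_counts)
```

Here each call to count_words could in principle run on a different machine, because it looks at only its own chunk. As I understand it so far, that independence is what lets a framework like Hadoop spread the map work across nodes.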
Go to the link here:
https://www.oreilly.com/ideas/what-is-hadoop