Mastering Hadoop: Book Review

I came across the book Mastering Hadoop, published by Packt and authored by Sandeep Karanth. Here is my detailed review of the book.

SUMMARY
This book covers "Hadoop", the most popular massively parallel processing (MPP) framework, and its ecosystem. It is an intermediate-level book in which the author goes into depth not only on the principal subject but also on most of the supporting ecosystem, such as Hive, Pig, stream processing, etc. The book has 374 pages and 12 chapters; the ToC alone spans 7 pages! It offers conceptual material as well as hands-on lab exercises, with a lot of code included.


OPINION
The book starts with the genealogy of Hadoop, where the author nicely narrates the evolution of web search to its current state and then the various releases of Hadoop. It gives good reasoning as to why Hadoop 2.0 was an essential step forward from the previous version. It then covers the architecture, starting from a high-level 3-layered view and drilling down step by step to the cluster and node level. It describes all the features of Hadoop 2.x nicely and then talks about 4 major Hadoop distros.

Chapter 2 packs in the concepts of the MapReduce (MR) algorithm, such as merges and spills of intermediate outputs, stragglers, job counters, and data joins. On the labs front, it explains a MapReduce example in great detail and also explains a custom RecordReader implementation. Some tips are really handy, like the heuristic formula for calculating the optimum number of reducers. Note that the chapter assumes the reader has basic knowledge of the algorithm and focuses on the advanced concepts.
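To give a feel for what such a reducer heuristic looks like, here is a minimal sketch. It is not the book's formula, which I am not reproducing here; it is the rule of thumb from the Apache Hadoop MapReduce tutorial, which suggests 0.95 or 1.75 multiplied by (number of nodes * maximum containers per node). The function name and parameters are my own, chosen for illustration.

```python
def suggested_reducers(nodes: int, containers_per_node: int, factor: float = 0.95) -> int:
    """Rule-of-thumb reducer count: factor * (nodes * containers per node).

    Assumption: this follows the Apache Hadoop MapReduce tutorial, not the book.
    factor=0.95 lets all reducers start as soon as the maps finish;
    factor=1.75 lets faster nodes run a second wave for better load balancing.
    """
    if nodes < 1 or containers_per_node < 1:
        raise ValueError("nodes and containers_per_node must be positive")
    slots = nodes * containers_per_node
    # Round down, but never suggest fewer than one reducer.
    return max(1, int(factor * slots))


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes with 8 containers each.
    print(suggested_reducers(10, 8))
    print(suggested_reducers(10, 8, factor=1.75))
```

Treat the result as a starting point only; the right number still depends on the job and the cluster configuration.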

Chapter 3, on Pig, covers the execution process of a Pig Latin script and its semantics in depth, along with many tips for optimizing query performance. It also shows practical ways to use Pig for joining, as a combiner, and as an abstract data analyzer for Directed Acyclic Graphs (DAG).

Chapter 4 covers Hive, another way to scoop data out of Hadoop, in the conventional SQL-like style of the RDBMS world. This is also covered in considerable detail, from its architecture to HiveQL semantics, execution steps, and optimization tips like indexing, partitioning, etc. It also covers extension points such as UDF, UDAF and UDTF.

The chapter on Hadoop serialization and I/O talks about SerDe techniques. After discussing Hadoop's own implementation and the JDK's implementation, it gradually introduces Apache Avro, clearly stating its advantages and giving a detailed example. It explains the steps for Avro/Pig and Avro/Hive integration.

Chapters 6 and 7 talk about YARN and Storm. The YARN chapter introduces the new architecture, with an example of writing a client and scheduling a job, plus ways to monitor it. The Storm chapter talks about low-latency processing (aka real-time processing), compares Hadoop MR and Apache Storm with the help of process diagrams, and explains the concepts of spout, bolt and topology with the help of a Java-based example. It ends with installation on Hadoop.

Then it moves on to off-premise (cloud-based!) Hadoop offerings like Amazon's AWS-based EMR and Microsoft's Azure-based HDInsight, with enough comparison points as well as enough configuration steps.

The chapter on Hadoop replacements debates the pros and cons of HDFS and possible extensions like AWS S3 which can make it more powerful. Adding more points on alternative systems such as Cassandra, Ceph and GlusterFS would have been a value addition here.

Then it delves into features like HDFS Federation and Hadoop Security with its four pillars: Authentication, Authorization, Auditing and Data Protection, each explained in great detail.

And here comes chapter 12, Analytics using Hadoop. Machine learning is a very interesting topic to have in this book, though I am not sure it needed that level of detail. You need some statistical knowledge to understand some topics in the chapter, as it talks about terms and algorithms like tf-idf and k-means clustering. At the end, it talks about data analysis libraries: RHadoop and Mahout. Overall this chapter provides a good handle on analytics.
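For readers who have not met tf-idf before, the idea is small enough to sketch in plain Python. This is not code from the book and it does not use Hadoop or Mahout; it assumes the common variant where term frequency is the count divided by document length and inverse document frequency is log(N / document frequency). The function name and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf for a list of tokenized documents.

    docs: list of token lists. Returns one dict (term -> score) per document.
    Assumed variant: tf = count / doc length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    # Hypothetical two-document corpus.
    corpus = [["hadoop", "yarn", "hadoop"], ["hadoop", "storm"]]
    for doc_scores in tf_idf(corpus):
        print(doc_scores)
```

A term that appears in every document scores zero, which is the point of the measure: words common to the whole corpus say little about any one document.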

The book ends with the appendix "Hadoop for MS Windows". Thanks to Hortonworks, you can now get a Hadoop distro on the Windows platform as well as their PaaS offering on MS Azure; more details follow in this appendix.

The author clearly has rich experience in the field and succeeds in conveying the depth of the subject through this book. The source code for the book is also available on GitHub.

In an otherwise crowded field of Hadoop beginners' books, this one is different in catering to the intermediate level. I wish this effort all the very best.


WHO SHOULD READ THIS BOOK
Anyone with prior knowledge of Hadoop 1.x can easily upgrade to Hadoop 2.x and YARN. But even someone with a little knowledge of databases and Java can read this book to explore this new ecosystem and enhance existing skills.
