
Getting Control of BIG DATA

I got a copy of HBR’s October edition and liked its featured article, “Getting Control of BIG DATA”. It is a fantastic article that gives deep insight into the subject, and the credit goes to the authors: Andrew, a research scientist, and Erik, an MIT professor.

Big data definitely has huge business potential, but at the same time it poses a few business challenges, especially from the perspective of corporate culture.

Here are a few bullets of general interest:

  • Why Big Data? Analyzing it leads to better predictions, and better predictions yield better decisions. In sector after sector, companies that figure out how to combine domain expertise with data science will pull away from their rivals.
  • Web fact: 2.5 exabytes of data are created daily, and this number doubles every 40 months.
  • Retail fact: Walmart collects 2.5 petabytes of data every hour from its customer transactions.
  • Case Study (Retail) - Sears: Sears uses a Hadoop cluster for data analytics. The time needed to generate personalized promotions came down from 8 weeks to 1 week, saving cost and time. It is interesting to note that they store the data directly on Hadoop clusters and analyze it in real time.
  • Case Study (Aviation) - PASSUR: Improved airline ETAs, which helped improve staff efficiency; the benefit is worth several million dollars a year at each airport.
  • Big data can shift the corporate culture by muting the HiPPOs (Highest-Paid Person's Opinion): rather than relying on “intuitive” decisions from people high up in the organization, decisions would be based on real data analytics.
  • It would redefine the domain expert: as the big data movement advances, the value of domain experts will shift from their HiPPO-style answers to their data-specific questions.
  • New lucrative role of Data Scientist: a person who can extract treasure out of messy, unstructured data.
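To make the "doubling every 40 months" figure from the bullets above a bit more concrete, here is a small Python sketch. It assumes smooth exponential growth starting from 2.5 exabytes per day, which is my simplification and not something the article states; the function names are made up for illustration.

```python
def projected_daily_exabytes(months_from_now, current=2.5, doubling_months=40.0):
    """Project the daily data volume (in exabytes), assuming it doubles
    smoothly every `doubling_months` months. The smooth curve is an
    assumption made for this sketch."""
    return current * 2 ** (months_from_now / doubling_months)


def implied_annual_growth(doubling_months=40.0):
    """Yearly growth rate implied by a given doubling period."""
    return 2 ** (12.0 / doubling_months) - 1


if __name__ == "__main__":
    for months in (0, 40, 80, 120):
        volume = projected_daily_exabytes(months)
        print(f"{months:>3} months: {volume:.1f} EB/day")
    print(f"Implied annual growth: {implied_annual_growth():.1%}")
```

Under that assumption, a 40-month doubling period works out to roughly 23% growth per year, and the daily volume would be about eight times larger after ten years.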

The article contains an in-depth discussion of these points, and it is worth grabbing a copy from the newsstand and reading it through if you are serious about business in upcoming technologies.





Popular posts from this blog

Mastering Hadoop: Book Review

I came across the book Mastering Hadoop, published by Packt and authored by Sandeep Karanth. Here is my detailed review of the book:

This book is based on the most popular massively parallel processing (MPP) framework, "Hadoop", and its ecosystem. This is an intermediate-level book in which the author goes in depth not only on the principal subject but also on most of the supporting ecosystem components such as Hive, Pig, stream, etc. The book has 374 pages and 12 chapters; the ToC itself spans 7 pages! It offers conceptual as well as hands-on lab experience, with a lot of code included.

The book starts with the genealogy of Hadoop, where the author nicely narrates the evolution of web search to its current state and then the various releases of Hadoop. It gives good reasoning as to why Hadoop 2.0 was essential for moving ahead from the previous version. It covers the architecture, starting from the high-level three-layer view and drilling down step by step to the cluster and node level. Describes all the features of Hadoo…

Hadoop Ecosystem

When it comes to Hadoop, some people still believe it is a single out-of-the-box system that caters to all big data problems. Unless you are thinking of a third-party commercial distribution, this is not correct. In reality, Hadoop on its own is just HDFS and MapReduce. If you want a production-ready Hadoop system, you will also have to consider Hadoop's friends (or components), which make it a complete big data solution.
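For readers new to the MapReduce half of that statement, the sketch below imitates the programming model with a word count in plain Python. It runs in memory on a single machine and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are invented for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key. In a real cluster the
    framework does this across machines; here it is a local dictionary."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big potential", "data beats opinion"]
    print(word_count(sample))
```

In an actual Hadoop deployment the input would typically live in HDFS and the map and reduce steps would run in parallel on many nodes; this sketch only shows the shape of the computation.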

Most of the components are Apache projects, but a few are non-Apache open source or, in some cases, even commercial. This ecosystem is continuously evolving, with a large number of open source contributors. The following diagram gives a high-level overview of the Hadoop ecosystem.

Figure 1: Hadoop Ecosystem

The Hadoop ecosystem is logically divided into five layers which are self-explanatory. Some of the ecosystem components are explained below:

Data Storage is where the raw data resides. There are multiple file systems sup…

Giveaway contest! Win a copy of "Pentaho for Big Data Analytics" book (CLOSED)

I am very excited to launch a giveaway contest for the book Pentaho for Big Data Analytics.

HOW TO ENTER: To enter the contest, simply visit the book page once and leave your comments at the bottom of this blog. To ensure you get a copy of Pentaho for Big Data Analytics, consider purchasing it on Amazon.

PRIZE: Two lucky winners will receive a paperback copy of Pentaho for Big Data Analytics by Manoj R Patil & Feris Thia. (Winners not residing in the US or Europe will receive the e-book.)

THE RULES:

  • One entry per e-mail address.
  • Contest will run from 2/27/14 through 3/9/14.
  • Two lucky winners will be selected on or around 3/11/14.
  • Open to all residents.

ABOUT THE BOOK: The book will help you achieve the following objectives: