
Amazing Pentaho!

I use PRD (Pentaho Report Designer) to build a few PDF reports. Until now I generated hundreds of reports manually from the Pentaho BI server, which was a very time-consuming task. I then came across a new transformation output step in Kettle, "Pentaho Reporting Output", which helps automate report generation. The diagram below shows a sample Kettle transformation:

Kettle Transformation using Report Output
Initially I tried to integrate a PRD 3.8.x prpt file with PDI (aka Kettle) 4.3 and could not make it work. Later, with help from the Pentaho forum, I found that the tools in the Pentaho family are typically not compatible across all versions; backward compatibility in particular can sometimes be an issue. The simple reason for this is probably cost, as it is the community edition. When I created the report with the latest PRD 3.9, I could integrate it with Kettle without any hassle.

Jaspersoft and Pentaho are two leading open source BI solutions available in the market. But I found the Pentaho family more mature, as it has fantastic integration support with leading solutions in this area like Palo, Hadoop, MongoDB and Cassandra, just to name a few. Its tools have also grown organically, like its ETL tool Pentaho Data Integration, aka Kettle. I think that even though Pentaho does not have very good Eclipse plugin support, overall as a BI solution it has tremendous potential to easily become an enterprise-level solution.

