Open Source Business Intelligence

Open source solutions have become serious alternatives to traditional proprietary licensed software, with over 25 open source projects providing a wide variety of tools for data warehousing and full BI suites. Clarise Z. Doval Santos and Joseph A. di Paolantonio have been studying open source projects related to data analytics, data warehousing and business intelligence for over five years. This lens, with its supporting blog and wiki on the subject, provides the results of that research and serves as an additional tool for finding and recording information related to open source solutions for BI. This research is sponsored by InterActive Systems & Consulting, Inc., which provides strategic consulting and project management through InterASC Professional Services for BI, collaboration & distributed workgroup solutions, and hosting of open source applications through the TeleInterActive Network. Since 1995, Clarise and Joseph have worked together helping people gather data, turn it into information through analysis, and share the results through collaboration tools.

OSBI Explosion 

2005 was the Year of Starting OSBI Projects

The year 2005 saw the start of 10 new open source projects related to data analytics, specifically business intelligence and data warehousing. Five of these were BI suites, providing either several unified components or comprehensive solutions for BI. We first started to investigate open source projects for data warehouse components in 2000. We found one ETL tool, JetStream; one OLAP engine, Mondrian; and jPivot, which gave a front end to Mondrian. 2006 saw some growth and some convergence. In 2007, there are over 45 projects related to data analytics covering databases, ETL/EAI, reporting, OLAP, portals, dashboards, data mining, GIS and visualization.

Podcasts from TeleInterActive (TIA) Press 

Podcast Links from TIA Press on Open Source and OSBI

Pentaho SQR for Bugzilla This podcast discusses the newly launched Pentaho open source project, Software Quality Reports (SQR) for Bugzilla.

OSC Podcast Pentaho Overview Part 3 concludes our conversation with James Dixon and Lance Walter of Pentaho. We explore the community, which is one of the most important aspects of an open source project, and how Pentaho supports its community.

OSC Podcast Pentaho Overview Part 2 continues our conversation with James Dixon and Lance Walter of Pentaho. We explore open source licensing, advantages and challenges in this second of three parts.

OSC Podcast Pentaho Overview is with James Dixon and Lance Walter of Pentaho. It provides the first of three podcasts giving an overview of Pentaho.

What is Open Source: Talks about licensing, platforms, and uses of Open Source software.

Why Open Source: Talks about why open source is important to businesses, IT shops and software developers, and explores the TCO of Open Source projects.

OSBI Links 

Open Source Solutions BI Wiki
The Open Source Solutions BI wiki will provide research, articles and the archive for the OSBI Daily, as they are developed. For more information please contact us.
Open Source Solutions Blog
This blog provides timely information about open source solutions for business intelligence, collaboration, project management & data analytics, and exposes the process of three authors writing a book collaboratively. Posts range from news & status of OSBI projects and interviews with OSBI project teams and community members to descriptions of OSS related events.
Open Source Business Conference
The Open Source Business Conference is a semi-annual event held in the spring on the West Coast and in the fall on the East Coast.
OSBC Wiki
The Open Source Business Conference has a Wiki on SocialText that supplements the information on the website.
Business Intelligence for Business People
This lens by Tom Hudock is a good source of non-technical information about "Business Intelligence (BI), Performance Management, and Data Warehousing (DW)".

Links to OSS BI Suites 

Communities, Projects and Companies supporting OSS BI Suites

One thing with which we're struggling is how to define a BI Suite. Must it be a comprehensive, end-to-end solution? Since we don't know of any commercial BI Suite that started out as a full tilt boogie, everything from ETL to Portal solution, we're not going to demand that of open source software BI Suites. So, if a project unites more than one tool for creating a BI solution, 'tis a suite. We think.
BEE Project
BEE is one of the first open source BI Suites, having been around since 2002. It provides ETL, ROLAP, reporting and integration with the R Project. It is written in Perl and primarily supports MySQL.
JasperSoft BI Suite
The Jasper BI Suite provides a framework for report automation and ad hoc reporting, as well as full OLAP and ETL capabilities. Components include JasperReports, iReport, JasperServer, JasperAnalysis and JasperETL.
Openi
Openi provides a web-driven interface to OLAP, relational, statistical and data mining sources giving BI integrators user interface, report definition and connector tools.
Pentaho
Pentaho BI Suite provides a framework for a full array of capabilities: Reporting, Analysis, Dashboards, Data Integration, Data Mining and Workflow.
SpagoBI
SpagoBI is a BI platform drawing its components from the ObjectWeb consortium. Tools include metadata management, ETL, Reporting, Analysis, and Dashboards.

Links to OSS ETL Tools 

Communities, Projects and Companies supporting OSS ETL Tools

Extract, Transform and Load is often the most difficult and time consuming aspect of a data warehouse project. Tools that help the BI integrator to create, manage and maintain the rules for extraction of disparate data from multiple sources, transformation into a standard and clean data set, and the timely loading into the data repository, ODS or data warehouse are very important. Some of these tools provide EAI capability as well.
KETL
KETL is an ETL for high volume transactions developed by Kinetic Networks.
Enhydra Octopus
Enhydra Octopus is part of the ObjectWeb GForge project, providing JDBC Data Transformations.
Pequel ETL
Pequel ETL is, according to their SourceForge description, a comprehensive and high performance data processing/transform system. It features a simple, user-friendly event driven scripting interface that transparently generates & executes highly efficient Perl/C code. Uses: ETL, datawarehousing, statistics, and data-cleansing.
Clover ETL
Clover ETL is an open source Java based framework for building data transformations (ETL applications).
CpluSQL
The cplusql distributed ETL tool extracts and transforms row based data from databases and flat files for terabyte scale datawarehouse loading.
JetStream
JetStream is the first open source ETL tool that we used. It is described as a Java Extraction Transformation Service for Transmitting Records & Exchanging Application Metadata: a Java-based ETL/EAI tool.
Apatar
Apatar ETL tool's modular architecture delivers 1. Visual job designer/mapping 2. Connectivity to all major data sources 3. Flexible Deployment Options (GUI, or server engine with JVM, or embedded).
KETTLE
Don't confuse KETL and KETTLE - they're not related. K.E.T.T.L.E (Kettle ETTL Environment) is a meta-data driven ETTL tool (Extraction, Transformation, Transportation & Loading). Kettle is also available as Pentaho Data Integration.
openDigger
OpenDigger is a Java-based compiler for the xETL language. xETL is a language specifically designed to read, manipulate and write data in any format and database. With OpenDigger/xETL you can build powerful Extraction-Transformation-Loading (ETL) programs.
Talend
Talend Open Studio is a mature product, three years in the making before coming out for download. It includes:
* developer tools: to create processes
* administrator: to manage distributed processes on a grid architecture
* launcher tools: to launch processes
* PAM: Process Activities Monitor

The ETL languages are Perl and Java, but Perl provides many more connectors than do the Java libraries.
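
The extract, transform and load steps described at the top of this section can be sketched in a few lines. This is only a minimal illustration in Python using the standard library, not the approach taken by any of the tools listed here. The sample data, the sales table and the column names are all made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a CSV export with inconsistent formatting.
RAW_CSV = """customer,region,amount
 Acme ,west,100.50
Bolt,WEST,20
Crate,east,
"""


def extract(text):
    """Extract: read rows from a CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim names, standardize regions, drop rows with no amount."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        clean.append(
            (row["customer"].strip(), row["region"].strip().upper(), float(amount))
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into a (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text):
    """Run all three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


if __name__ == "__main__":
    conn = run_etl(RAW_CSV)
    for row in conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"):
        print(row)
```

Real ETL tools add what this sketch leaves out: connectors to many sources, scheduling, error handling and management of the transformation rules.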

Links to OSS OLAP Tools 

Communities, Projects and Companies supporting OSS OLAP Tools

On-Line Analytical Processing tools come in several flavors: MDDB OLAP or MOLAP, Relational OLAP or ROLAP, and then there is "H is for hybrid" HOLAP. There are open source software projects for each type. The list below includes engines or servers as well as front-ends for OLAP or MDDB use.
Mondrian
Mondrian is one of the oldest open source BI components, having been registered in 2001. It is also used as the OLAP engine in other open source software OLAP and BI Suite projects such as JasperAnalysis and Pentaho Analysis. Pentaho provides support for the Mondrian forums.
JasperAnalysis
JasperAnalysis is part of the Jasper BI Suite, available from the JasperForge through JasperIntelligence. Based upon Mondrian and jPivot, JasperAnalysis provides full OLAP standards compliance and analytical capabilities.
PALO
PALO is a recent entry to the open source software OLAP field. It's different in that it is essentially an add-in for Microsoft Excel. PALO provides a MDDB for Excel, with future plans to allow access through other APIs as well. From their homepage... "Palo is an advanced data store for Microsoft Excel that allows you to handle large amounts of Excel data on a small number of worksheets. In addition, it also allows you to share Excel data real-time with your colleagues."
Pentaho Analysis Mondrian
Pentaho Analysis uses Mondrian at its core to provide for variable analysis, graphical representations of data, and drill down.
JPivot
JPivot is a JSP tag library supporting XMLA that provides a front-end OLAP table to the Mondrian OLAP engine, allowing typical OLAP functions such as slice-and-dice, drill-down and roll-up.
pocOLAP
pocOLAP is a web-based, cross-tab reporting tool written in Java that also allows for drill-down. The name comes from "poco", meaning "little" in Italian and Spanish.
OpenOLAP for MySQL
Currently a Japanese only version of OpenOLAP ported from PostgreSQL to MySQL. The PostgreSQL version is hosted on sourceforge.jp.
OpenOLAP
OLAP tool for PostgreSQL
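
The typical OLAP functions mentioned above (slice-and-dice, drill-down and roll-up) can be illustrated on a tiny fact table. This is a rough sketch in plain Python, not how Mondrian, JPivot or any other engine listed here works internally. The fact rows, the dimension names and the helper functions are invented for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales)
FACTS = [
    (2006, "West", "Widgets", 100),
    (2006, "East", "Widgets", 80),
    (2007, "West", "Widgets", 120),
    (2007, "West", "Gadgets", 50),
    (2007, "East", "Gadgets", 70),
]
DIMENSIONS = ("year", "region", "product")


def aggregate(facts, by):
    """Sum the sales measure, grouped by the named dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


def slice_facts(facts, dimension, value):
    """Slice: keep only the facts where one dimension has a fixed value."""
    position = DIMENSIONS.index(dimension)
    return [row for row in facts if row[position] == value]


if __name__ == "__main__":
    # Roll-up: summarize everything to the year level.
    print(aggregate(FACTS, ["year"]))
    # Drill-down: break each year out by region.
    print(aggregate(FACTS, ["year", "region"]))
    # Slice: look only at the West region, by product.
    print(aggregate(slice_facts(FACTS, "region", "West"), ["product"]))
```

Grouping by fewer dimensions is a roll-up, and grouping by more is a drill-down. A real OLAP server does the same kind of aggregation over far larger data sets.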

Links to OSS Reporting Tools 

Communities, Projects and Companies supporting OSS Reporting Tools

Reporting tools can be simple or complex, web-based or not, with designers or not. Here's the list.
JasperReports
JasperReports is one of the oldies as well, starting in 2001. More recently a company, JasperSoft, has been formed to invest in JasperReports, as well as to provide support, training and various other services.
Agata Report
From their web site... "Agata Report is a Database Reporting Tool and EIS tool, MIS tool (graph generation), like Crystal Reports. It's written in PHP-GTK and allows you to edit and get SQL results from several databases (PostgreSQL, MySQL, Oracle, SyBase, MsSql, FrontBase, DB2, Informix and InterBase) as PostScript, plain text, HTML, XML, PDF, or spreadsheet (CSV) formats through its graphical interface. You can also define levels, subtotals, and a grand total for the report, merge the data into a document, generate address labels, or even generate a complete ER-diagram from your database."
DataVision
DataVision is an Open Source Report Writer that allows drag-and-drop report design through its GUI. It is written in Java and can connect to any database supporting JDBC.
OpenReports
From their website... "OpenReports is a flexible open source web reporting solution that allows users to generate dynamic reports in a browser. OpenReports uses JasperReports, an excellent full featured open source reporting engine, and was developed using leading open source components including WebWork, Velocity, Quartz, and Hibernate."
OpenRPT
OpenRPT is a full featured, cross-platform SQL report writer that stores its report definitions as XML, and has a WYSIWYG report writer that can be used in stand-alone or embedded fashion.
JFreeReport
jFreeReport is a standalone Java report library with a nice series of capabilities and a decent community around it. In January 2006, jFreeReport became a part of the Pentaho suite.
iReport
iReport is now part of the JasperSoft tools and is available as an individual download or as part of the Jasper BI Suite.

Links to OSS Database Projects 

Communities, Projects and Companies supporting OSS RDBMS Projects

There are quite a few open source RDBMS, though few are optimized for query within a VLDB environment.
EnterpriseDB
The EnterpriseDB project takes PostgreSQL and adds Oracle and PL/SQL compatibility to it, making a rather powerful RDBMS open source solution.
Derby
Derby is the Apache database project, written in pure Java to have a small footprint.
Firebird
Firebird is a RDBMS written in C and C++ that provides many ANSI SQL-99 features as well as stored procedures and triggers. It is based on the source code released by Borland -> Inprise -> Borland in 2000, and has existed in one form or another since 1981.
Ingres
CA released the source code for the venerable Ingres RDBMS under their own CATOS license, and formed the new Ingres Corporation in November of 2005.
MySQL
MySQL is reportedly the most deployed open source RDBMS out there. It has proven suitable for VLDB implementations allowing a robust store now, query anytime architecture.
PostgreSQL
Professor Michael Stonebraker of the University of California at Berkeley created Postgres as a successor to his other database, Ingres, in 1986. Postgres became Illustra, which joined Informix, which was acquired by IBM. It also found new life in the lab as Postgres95, which was redone and open sourced in 1996 as PostgreSQL. Click on the "About" and then "History" link from the main site for the full story. Fun stuff, and I even remember it all happening. PostgreSQL is being revamped in a branch distribution specifically for data warehousing in the Bizgres project - see the BI Suites links.
Oracle Berkeley DB (Sleepycat)
Oracle bought Sleepycat Software. Sleepycat's open source DB is now released as Oracle Berkeley DB. The three flavors are:
Berkeley DB: A transactional storage engine for un-typed data in basic key/value data structures
Berkeley DB Java Edition: A pure Java version of Berkeley DB optimized for the Java environment
Berkeley DB XML: A native XML database with XQuery-based access to documents stored in containers and indexed based on their content

Links to OSS BI Development Tools 

Communities, Projects and Companies supporting OSS Roll Your Own

There are open source tools, platforms and standards to help you "roll your own" BI solutions. These are often very good starting points for experienced development teams, and can help to fill in the gaps in both proprietary and open source solutions.
Eclipse BIRT
Eclipse is the IDE for Java and J2EE, and BIRT is, basically, its reporting plug-in.
EFEU
EFEU is a programming environment to develop C-programs and libraries. It is often pointed to as facilitating the development of ETL and reporting software.
JpGraph
JpGraph is an OO Graph drawing library for PHP that is very useful for data visualization and presentation.
PostgreSQL MDDB
The linked article describes using EFEU with PostgreSQL to create a multi-dimensional database for use in OLAP.

Links to DW Sources 

TDWI News
Developments in the business intelligence and data warehousing industry tracked by The Data Warehousing Institute (TDWI).
DM Review Portal
DMReview.com, the website of DM Review magazine, provides a portal of information on business intelligence, analytics, integration and data warehousing.

Books on Amazon for Open Source, DW or BI 

Succeeding with Open Source (Addison-Wesley Information Technology Series)
Essential Open Source Toolset
Open Source Software: Implementation and Management
Open Source Enterprise Solutions: Developing an E-Business Strategy
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
Building the Data Warehouse (3rd Edition)
Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications (Addison-Wesley Information Technology Series)
Open Source Solutions For Small Business Problems (Networking Series)
Open Source GIS: A GRASS GIS Approach (The Springer International Series in Engineering and Computer Science)
Beginning MapServer: Open Source GIS Development (Expert's Voice in Open Source)
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
Business Intelligence: The Savvy Manager's Guide (The Savvy Manager's Guides)
e-Data: Turning Data Into Information With Data Warehousing (Addison-Wesley Information Technology Series)
Open Source for the Enterprise: Managing Risks, Reaping Rewards
Data Mining and Business Intelligence: A Guide to Productivity
Mastering Data Warehouse Design: Relational and Dimensional Techniques

OSS DW Jobs on SimplyHired 

This is an RSS feed from SimplyHired resulting from the search for "Open Source" "Data Warehouse" job postings, without regard to location.

Archive of OSBI Daily 

This module will display past entries from OSBI Daily

The entire archive is hosted on our Open Source Solutions Wiki.

2007.05.09
SpagoBI 2.2.0 with some spanking new features is available for download. Mule 1.4 has been released with new BPM support features. JasperSoft wins the 2007 Duke's Choice Award at the JavaOne Developer Conference, with James Gosling naming the JasperReports Open Source project among the Coolest Products in the World for Java Technology.

2007.05.04
The Open Source ThinkTank 2007 was held in March by the Olliance Group and DLA Piper, who have available for immediate download a PDF of their 14-page Executive Summary Report. The Executive Summary is very informative, showing increased acceptance of Open Source Solutions, continued barriers to adoption, and impact upon proprietary software vendors.

2007.04.27-05.03
Just didn't find anything that was all that interesting, and didn't feel like making something up.

2007.04.24-26
Here are the feeds you need to find the goodies from MySQL2007: PlanetMySQL Feed, MySQL from Technorati, MySQL on IceRocket, and MySQL on Google.

2007.04.23
We've put together what we hope is a comprehensive guide to open source BI and data warehousing at the 2007 MySQL Conference and Exposition.

Feeds we Read on Enterprise Open Source 

Communities, Bloggers, Analysts and the Press Talking about Open Source

This feed is our blogroll of all the feeds we've found that discuss commercial or enterprise open source solutions.