Key Roles involved in a BI Data Warehouse Project
Source: Dylan's BI Study Notes [link]
To develop or deploy a BI solution for your organization, you need to have the right people involved at the right time. Here are the typical roles involved in a BI data warehouse project.
- Project Sponsor
- Project Manager
- Functional Analyst
- SME
- BI Architect
- ETL Developers
- DBA
The job description and responsibilities are listed in this table: (more…)
Yet another theme change
Source: Pete-s random notes [link]
Emplea Libre, the Free Software Job Fair
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
More and more companies are working with free software, yet there are not many people qualified in it, and it is often hard for job seekers and employers in this field to find each other.
To improve this situation, Emplea Libre, the Free Software Job Fair, has been created. It is the first job fair held for the free software market, and it runs from December 13 to 16 in Cáceres.
Thanks to it, developers, students and business owners in the free software world will have a space to meet and share ideas that will no doubt be useful to all of them and will also help improve the community.
Needless to say, we are very interested in taking part, although we do not know whether we will be able to travel to Cáceres on those dates. In any case, if anyone is interested, please check our job openings and do not hesitate to write to us.
All the information about the Emplea Libre event:
What is it?
The event
Objectives
Where and when
Who is it for?
Participating companies
The SaaS Analytics Sales Process
Source: Keep It Simple [link]
Part 3 of Bob Warfield’s interview with LucidEra’s CEO Ken Rudin was posted today.
Topics covered:
- How does SaaS affect the sales process?
- The LucidEra technology platform
- Multitenancy
The conclusion?
“LucidEra is a fascinating business. They’re in an interesting place now that the old school BI vendors have largely been bought. Even more interesting is their approach of delivering a solution rather than a tool, and getting the customer up and running as the first stage of the “sales” cycle. Add to that some radical new technology in the form of the column store LucidDB, and we’re seeing a total reinvention of what BI means.”
Mini minis from SAS UK
Source: The sascom magazine blog [link]
Take a course, get a car. Enjoy the ride.
Details on The RedDatabase Symposium
Source: Mark Rittman's Oracle Weblog [link]
I’ve finally got around to updating the events page on our website, taking down the old Open World and ODTUG events and uploading details of the UKOUG Conference and some of the engagements already booked in for 2008.
One of the events next year that I’m particularly looking forward to is the RedDatabase Symposium in The Hague on Monday 10th March - Wednesday 12th March 2008, and Munich on June 2nd - 4th 2008.

The speaker list looks like a who’s who of database, middleware and BI development (and me, of course…) and includes Jonathan Lewis, Mogens Norgaard, Julian Dyke, Lucas Jellema, Carel-Jan Engel and Daniel Fink. My session is on the subject of “Analytics and Data Warehousing Using Oracle 11g and OBIEE”.
Here’s the abstract for the session:
“Analytics and Data Warehousing Using Oracle 11g and OBIEE
Mark Rittman, Rittman Mead Consulting
This one-day seminar takes delegates through the new BI & data warehousing features found in Oracle Database 11g, Oracle Business Intelligence Enterprise Edition 10g and Oracle’s OLAP and analytical tools.
Topics:
- ETL and DW development using Oracle 11g and OWB 11g
- New Data Warehousing and ETL features in Oracle Database 11g
- New features in Oracle Warehouse Builder 11g
- ETL techniques using Oracle’s DW tools
- Leveraging alternate ETL tools such as ODI, ESB and Oracle Streams
- Introducing Oracle BI Server and the Oracle BI EE architecture
- Core BI Server functionality and its relation to the Oracle 11g Database
- Creating the BI Server metadata layer
- Caching, Summary Management
- Creating metadata over federated data sources
- Creating reports using Oracle Answers
- Publishing reports to Oracle Interactive Dashboards
- Advanced Techniques using Answers
- The OBIEE Presentation Server SOAP interface
- Oracle OLAP and Oracle Essbase fundamentals
- Oracle OLAP 11g Storage model and SQL Interface
- Using Oracle OLAP 11g Analytic Workspaces as DW summaries (MVs)
- Key Essbase fundamentals and architecture
- Using Essbase as an Oracle BIEE data source
- Benefits and role of OLAP and the two Oracle OLAP engines”
Even though I’m only speaking on the one day, I’ll probably go over for all three just to be able to hear the other presenters. Also, Munich and The Hague are pretty cool places and should make pretty good venues for the two events. Should be good.
Finally, I’m looking for an organization (or organizations) to help me put together a seminar tour in the USA / North America, and in the Middle-East / Gulf States. It’d be on the same material as above (perhaps spread over more than one day) - if you’re interested, or know of someone who might be able to help me organize something, let me know at mark.rittman@rittmanmead.com and we can have a chat.
Join the ODTUG BI&DW SIG Oracle Mix Group
Source: Mark Rittman's Oracle Weblog [link]
One of the big community announcements at Oracle Open World this year was Oracle Mix, a sort of cross between Facebook and LinkedIn for Oracle customers and developers. Kent Graziano and I have now set up the Oracle Mix ODTUG BI&DW SIG Group and anyone can join up once they’re an Oracle Mix member.

If you’ve come along to any ODTUG BI&DW SIG meeting, or if you’re just interested in shooting the breeze about any aspect of Oracle BI&DW tools development, sign up and join in the conversation. We’re also looking for speakers and topics for ODTUG Kaleidoscope 2008 in New Orleans, so if you’ve got some ideas add them to the mix, and if you’re thinking of speaking, pop on over to the call for papers and submit something now!
Analytics: The Critical Other Half of your CRM Solution
Source: Keep It Simple [link]
Be sure to register for this informative panel discussion on December 5th at the salesforce.com offices in San Mateo. The topic of this Sales Operations Forum meeting is the importance of sales analytics to CRM success.
When: December 5, 2007
Time: 8:30 - 10:00 AM
Topic: Analytics: The Critical Other Half of your CRM Solution
We hope to see you there!
Mr. Guthrie Rolls The .NET Road Map
Source: Loosely Coupled Human Code Factory [link]
I’m not even going to write much on this road map topic; I’m just looking forward to ASP.NET MVC. In almost every application I have lined up, even a lot of the prospective business intelligence work, this architectural pattern will be ridiculously useful! Separation of concerns is of vital importance in so many aspects. I’m all about being organized and neat in my code, to the point of a slight obsession. The fact that the MVC enables a lot of this is absolutely…(read more)
How to Be Heard?
Source: Loosely Coupled Human Code Factory [link]
In almost every business, even when there are only two participants, there is always a need to communicate. Teaching others to speak well, write well, and in general communicate their point clearly and decisively is very important. I have found, however, that one soft skill that is just as important, if not more so, is making sure that whoever you are communicating with is hearing what you are saying. All too often in work environments people communicate the entire day with…(read more)
Email Distribution of Reports using BI Publisher and Discoverer
Source: Mark Rittman's Oracle Weblog [link]
One of the presentations I’m giving on the Monday at the UKOUG Conference is a joint talk with Mike Durran, the Oracle Discoverer Product Manager. Mike is going to do an update on Discoverer migration and interoperability with Oracle BI Suite Enterprise Edition, and I’m going to show off one of the first fruits of this program, the ability for Discoverer worksheets to be used as a data source for Oracle BI Publisher. Mike and I discussed this feature, and we thought one of the best ways of showing off what this feature can do would be to do something Discoverer historically couldn’t do - schedule reports and distribute them via email. This sort of functionality is built-in to BI Publisher, so whilst Mike’s been putting the slides together, I’ve put together a demo to show off this integration.
The process starts off with a regular Discoverer worksheet, in this case from the Videostore Video Tutorial Workbook. Discoverer is at version 10.1.2.2.0 whilst BI Publisher is at 10.1.3.3 - see this previous posting on getting everything set up for Discoverer / BI Publisher integration.

Then it’s a case of starting up Microsoft Word, connecting to BI Publisher Enterprise as an OID user that has access to both Discoverer and BI Publisher, and then selecting the Discoverer workbook from the “New Template” dialog.

Once that’s done, the BI Publisher Desktop add-in grabs a sample dataset and presents me with a blank template to start working on. I use the Chart Wizard to create a simple bar chart using “Profit Sum” as the measure, “Department” as the label and “Year” as the series, like this:

Then I create a crosstab to accompany it, add a title and some commentary text, and drag and drop a logo onto the top right-hand part of the page, like this:

Now if I’m going to display this template in HTML format (as opposed to say, PDF) I need to make sure this image is available on a web server somewhere, so I copy it to the following location:
$ORACLE_BI_HOME\j2ee\home\applications\xmlpserver\xmlpserver\obi_logo.jpg
which makes it available at the URL:
http://winxpvm:7781/xmlpserver/obi_logo.jpg
and I double-click on the image in the Word document, select “Web” from the “Properties” tab, and enter the following code:

I then save the template as an RTF file, and preview it in HTML form. Everything looks fine.

Then, I use the BI Publisher Desktop menu to upload the template to the BI Publisher Enterprise server, and it’s now ready to run within Enterprise. Before I do that though, I log in as an administrator and set up the Scheduler. This involves creating a database schema then passing the connection details to the Scheduler setup wizard, which then creates all the scheduler tables for me.

After I restart BI Publisher so that the new settings can take effect, I go back in as the administrator and set up email delivery. For the purposes of the demo I’m using a freeware email server that installs on Windows called ArgoSoft Mail Server which lets me demo email integration without an internet connection; I enter the connection details into the BI Publisher Enterprise web page and my email connection is set up.

Now it’s a case of logging back in to BI Publisher Enterprise as the report owner, locating the report and then clicking on the link to schedule it.

This brings up a screen where I can set the frequency of report runs, provide an email address for the report to be sent to, and provide any other details relating to the running of the report and the different destinations it can go to, including FTP, WebDav and the filesystem.

Once the scheduled job is submitted, I can check back on the schedule details and see that the report has run correctly:

and then finally, I can start up my email application and see that the report has been delivered.

As well as BI-style reports, BI Publisher can also produce reports in any format, including mail-merge letters, a feature that’s often required by customers migrating off tools like Cognos Impromptu. Overall, this is a useful feature for customers who’ve got a lot of investment in Discoverer reports and metadata, but want a bit more flexibility in the way they can use the data.
OLAP Workshop Part 2 : Understanding OLAP Technology
Source: Oracle Business Intelligence Blog [link]
In the last posting I hopefully explained some of the basic concepts behind OLAP. In this posting I want to explore how those basic concepts are exposed by the various OLAP aware ETL and reporting tools provided by Oracle and other BI vendors.
Architecture of Oracle OLAP
For a long time now Oracle has been unique in the marketplace.
Top 5 Things Smaller Businesses Don’t Know About BI
Source: Keep It Simple [link]
This article by Laurie Sullivan lays out 5 of the most important things small and midsize businesses need to know about business intelligence. While I found it interesting that she lists off the usual suspects in the traditional, on-premise BI market and makes no mention of the SaaS opportunity, her list (with my commentary) is instructive nevertheless:
- Not all BI projects start with an expensive and lengthy investment. (Can you say, on-demand analytic applications? Her advice is still to build the nuclear on premise with scaled-down BI suites. You own the solution, and you own the problem in terms of integration, on-going maintenance, upgrades, training, etc.)
- Data warehouses aren’t required to build a successful business intelligence project. (Amen! This is a fundamental belief of LucidEra.)
- Multibillion-dollar companies analyze only 20 percent of the data they collect and store. Small and midsize businesses can outperform much larger organizations by increasing that percentage. (Hallelujah!! As we like to say, “Simplify! Simplify! Simplify!”)
- BI projects can transition into revenue-generating investments by designing reports and selling information to partners and customers. (Violent agreement here, except for the “sunk costs” that most of these projects get mired in. At Business Objects we used to try to avoid the term “data warehouse” at all costs due to the IT pain most of these “projects” bring to mind within many organizations.)
- Combining BI with service-oriented architecture (SOA) can streamline business processes and allow more people across a company to access and benefit from the data. (Now we’re back in IT-implementation speak. This is where people in SMB / midsized companies lose interest. If I’m a VP of sales at a high-growth company, do I want to hear about SOA or how the right business analytics can help me increase my deal size, close more deals faster, and improve sales effectiveness? Until the conversation moves away from tools and towards analytic solutions, the BI market will continue to struggle to get the attention of the people who need it the most.)
Data Integration Challenge – Understanding Lookup Process – III
Source: Business Intelligence – A Practitioner’s View [link]
In Part II we discussed ‘when to use’ and ‘when not to use’ each type of lookup process: the Direct Query lookup, the Join based lookup and the Cache file based lookup. Now we shall look at the points to consider for better performance of each of these ‘lookup’ types.
In the case of Direct Query, the following points are to be considered:
• Index on the lookup condition columns
• Selecting only the required columns
In the case of Join based lookup, the following points are to be considered:
• Index on the columns that are used as part of Join conditions
• Selecting only the required columns
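The two Direct Query points can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for whatever database holds the lookup table; the table name `product_lookup`, its columns and the index name are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database standing in for the system that hosts the lookup table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_lookup ("
    "product_code TEXT, product_name TEXT, category TEXT, long_description TEXT)"
)
conn.executemany(
    "INSERT INTO product_lookup VALUES (?, ?, ?, ?)",
    [(f"P{i:04d}", f"Product {i}", "misc", "x" * 200) for i in range(1000)],
)

# Point 1: index on the lookup condition column.
conn.execute("CREATE INDEX idx_product_code ON product_lookup (product_code)")


def direct_lookup(code):
    """Direct Query lookup that fetches only the column the data flow needs."""
    # Point 2: select only the required column, not SELECT *.
    row = conn.execute(
        "SELECT product_name FROM product_lookup WHERE product_code = ?",
        (code,),
    ).fetchone()
    return row[0] if row else None


# Ask SQLite how it would run the lookup; the last field of each plan row
# is a human-readable description of the access path.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT product_name FROM product_lookup WHERE product_code = ?",
    ("P0042",),
).fetchall()
plan_text = " ".join(str(r[-1]) for r in plan)
```

Inspecting `plan_text` is one way to confirm that the lookup condition is served by the index rather than a full table scan; other databases expose similar information through their own explain facilities.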
In the case of Cache file based lookup, let us first try to understand the process of how these files are built and queried.
The key aspects of a Lookup Process are the
• SQL that pulls the data from lookup table
• Cache memory/files that holds the data
• Lookup Conditions that query the cache memory/file
• Output Columns that are returned back from the cache files
Cache file build process:
Depending on the product (Informatica or Datastage), when a lookup process is being designed we define the ‘lookup conditions’ or ‘key fields’ and also a list of fields that need to be returned by the lookup query. Based on these definitions, the required data is pulled from the lookup table and the cache file is populated with it. The cache file structure is optimized for data retrieval on the assumption that the cache file will be queried on a certain set of columns called the ‘lookup conditions’ or ‘key fields’.
In the case of Informatica, the cache consists of a separate index file and data file: the index file has the fields that are part of the ‘lookup condition’ and the data file has the fields that are to be returned. Datastage cache files are called Hash files and are optimized based on the ‘key fields’.
Cache file query process:
Irrespective of the product of choice, the following steps are involved internally when a lookup process is invoked.
Process:
- Get the Inputs for Lookup Query, Lookup Condition and Columns to be returned
- Load the cache file to memory
- Search for the record(s) matching the Lookup condition values; in the case of Informatica this search happens on the ‘index file’
- Pull the required columns matching the condition and return them; in the case of Informatica, the result of the ‘index file’ search is used to locate and retrieve the data from the ‘data file’
In the search process, based on the memory availability there could be many disk hits and page swapping.
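The build and query steps can be sketched in a few lines of Python. This is only an illustration of the index/data split described above, not how Informatica or Datastage store their cache files; the function names and the sample customer rows are hypothetical.

```python
from bisect import bisect_left


def build_cache(rows, key_fields, return_fields):
    """Build a sorted index 'file' and a parallel data 'file' from lookup rows."""
    ordered = sorted(rows, key=lambda r: tuple(r[f] for f in key_fields))
    # Index file: only the lookup condition (key) fields, kept in sorted order.
    index_file = [tuple(r[f] for f in key_fields) for r in ordered]
    # Data file: only the fields to be returned, in the same order as the index.
    data_file = [tuple(r[f] for f in return_fields) for r in ordered]
    return index_file, data_file


def query_cache(index_file, data_file, key):
    """Search the index for the key, then fetch the matching data record."""
    pos = bisect_left(index_file, key)  # binary search on the sorted index
    if pos < len(index_file) and index_file[pos] == key:
        return data_file[pos]
    return None  # no record satisfies the lookup condition


lookup_rows = [
    {"cust_id": 3, "name": "Cara", "region": "EU", "notes": "unused"},
    {"cust_id": 1, "name": "Abe", "region": "US", "notes": "unused"},
    {"cust_id": 2, "name": "Bo", "region": "APAC", "notes": "unused"},
]
index_file, data_file = build_cache(lookup_rows, ["cust_id"], ["name", "region"])
hit = query_cache(index_file, data_file, (2,))
miss = query_cache(index_file, data_file, (9,))
```

If several lookup rows share the same key, this sketch returns only the first of them.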
So in terms of performance tuning we could look at two levels
- how to optimize the cache file building process
- how to optimize cache file query process
The following table lists the points to consider for better performance of a cache file based lookup:
| Category | Points to consider |
| Optimize cache file building process | • While retrieving the records to build the cache file, sort the records by the lookup condition; this sorting speeds up the index (file) building process, because the search tree of the index file is built faster with less node realignment • Select only the required fields, thereby reducing the cache file size • Reuse the same cache file for multiple requirements with the same or slightly varied lookup conditions |
| Optimize cache file query process | • Sort the records that come from the source by the lookup condition columns before querying the cache file; this keeps page swapping and disk hits low. If input records arrive in continuous sorted order, the required index data is more likely to already be in memory, and disk swapping is reduced • A dedicated, separate disk reserves space for the lookup cache files and also improves disk write and read response • Avoid querying a recurring lookup condition repeatedly by sorting the incoming records by the lookup condition |
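The last point, avoiding repeated queries for a recurring lookup condition by sorting the input, can be illustrated with a minimal sketch. It assumes the source rows are already sorted by the lookup key; `lookup_sorted_stream` and `counting_lookup` are hypothetical names, and the dictionary inside `counting_lookup` merely stands in for the cache file.

```python
def lookup_sorted_stream(source_rows, key_field, lookup_fn):
    """Yield (row, lookup_result) pairs, reusing the previous result whenever
    the key repeats. Assumes source_rows is already sorted by key_field."""
    last_key = object()  # sentinel that never equals a real key
    last_result = None
    for row in source_rows:
        key = row[key_field]
        if key != last_key:
            # Key changed: this is the only place the cache is actually queried.
            last_result = lookup_fn(key)
            last_key = key
        yield row, last_result


calls = []  # records every real query made against the cache


def counting_lookup(key):
    calls.append(key)
    return {"A": "Alpha", "B": "Beta"}.get(key)


source = [{"code": "B"}, {"code": "A"}, {"code": "B"}, {"code": "A"}, {"code": "A"}]
source.sort(key=lambda r: r["code"])  # sort by the lookup condition first
results = [name for _, name in lookup_sorted_stream(source, "code", counting_lookup)]
```

With five sorted input rows and two distinct keys, the cache is queried twice rather than five times.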
Data Integration Challenge – Understanding Lookup Process –II
Source: Business Intelligence – A Practitioner’s View [link]
Most of the leading products like Informatica, DataStage support all the three ways of lookup process in their product architecture. The following table lists ‘when to use’ and ‘when not to use’ the particular type of lookup process.
Advantage Cache Lookup:
The advantages of using cache file based lookups are that
- The cache file holds only the fields needed by the lookup process, so querying it returns faster than querying the lookup table, which might have more fields
- The data structure of the cache file is designed so that the ETL server can query it directly, without an additional layer such as SQL
Although user manuals generally say that cache files are best suited to low lookup volumes, in practice I have seen cache files deliver more performance value when the number of lookup records is huge.
Dynamic Cache: Informatica has the concept of a Dynamic Cache, and so do the Hash files of Datastage: you can insert, update or delete records in these cache files. The ability to update the cache file is useful when we want to keep the cache file and the lookup table in sync.
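As a rough sketch of the idea, a dynamic cache can be thought of as a keyed store that accepts inserts, updates and deletes while the load is running. The class below is a hypothetical in-memory illustration; it does not reflect the actual Informatica or Datastage implementations or their APIs.

```python
class DynamicLookupCache:
    """In-memory stand-in for a lookup cache that can change during a load."""

    def __init__(self, initial_rows):
        # initial_rows: iterable of (key, value) pairs pulled from the lookup table
        self._rows = dict(initial_rows)

    def get(self, key):
        """Return the cached value for key, or None if it is not cached."""
        return self._rows.get(key)

    def upsert(self, key, value):
        """Insert or update a record and report which action was taken."""
        if key not in self._rows:
            self._rows[key] = value
            return "insert"
        if self._rows[key] != value:
            self._rows[key] = value
            return "update"
        return "unchanged"

    def delete(self, key):
        """Remove a record; return True if it was present."""
        if key in self._rows:
            del self._rows[key]
            return True
        return False


cache = DynamicLookupCache([(1, "Acme")])
actions = [
    cache.upsert(2, "Globex"),    # new key
    cache.upsert(1, "Acme Ltd"),  # existing key, changed value
    cache.upsert(2, "Globex"),    # existing key, same value
]
removed = cache.delete(1)
```

The action returned by `upsert` is the kind of signal a data flow could use to decide whether the same change also needs to be applied to the lookup table, which is how the two stay in sync.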
Handling Multiple Return Records: Handling the return of multiple records by a lookup process is still a challenge that, to the best of my knowledge, has not been implemented in any of the leading products. Perhaps in release 9 Informatica’s lookup will have a parameter for defining the number of records to return as an array, as in its Normalizer transformer.
In Part III we shall see some of the things to be considered for better performance when using the lookup process.
Business Intelligence Utopia - Enabler 4: Service Oriented Architecture
Source: Business Intelligence – A Practitioner’s View [link]
Service Oriented Architecture (SOA), with its closest identifiable alter ego “Web Services”, is another example of a hyped-up, much-maligned technology buzzword that takes up at least 2 or 3 slides in any “bleeding-edge” technology presentation. Having said that, what I have learned about Service Oriented Architecture concepts so far is enough to warrant listing it as enabler no. 4 for Business Intelligence Utopia.
There are many powerful ways through which SOA can add significant value to the BI environment. The kind of BI, performance management and data integration artifacts that can be developed and published as web services include: Queries, Reports, OLAP slice services (MDX queries), Scoring and predictive models, Alerts, Scorecards, Budgets, Plans, BAM agents, Decisions (i.e., automated decision services), Data integration workflows, Federated queries and much more. You can get more information at the link: http://www.b-eye-network.co.uk/view-articles/4729
But the idea that fascinates me with respect to BI on SOA, is the concept of “Analytical Smorgasbord”. Imagine a scenario where the business user can assemble their own analytical components from a mélange of available ones, resulting in complete customization of information for the user to take his/her decisions. Each of these available analytical components is self-contained and performs a particular piece of BI functionality. These components are ‘Web-Services’ and the SOA in such an enterprise is all about –
a) How are these components created?
b) How do the components interact?
c) How is the information published and consumed, in a secure manner?
The concept of “Analytical Smorgasbord” truly empowers the business users and is a powerful way to enable what Gartner terms “Information Democracy” in the enterprise. It is important to note that the concept of analytical aggregation changes the Data Warehousing paradigm in a profound way - From “Pulling data” to “Seeking data”. In simpler terms, the end-user analytics should go and fetch data wherever it is rather than expecting all data to be consolidated into one data repository (typically a data warehouse or data mart). More on this in future posts, under the topic of “Guided Analytics”.
The true intent of this post is to encourage the BI community to start looking at SOA from the end-user analytical standpoint, so that web services do not remain a mere technology toy but really help in “Putting the business back in BI” - http://www.tdwi.org/Publications/display.aspx?id=7913
I have intentionally left out the technology details related to SOA. You can find wonderful resources on the web like this one: http://www.dmreview.com/portals/portal.cfm?topicId=1035908 It is becoming increasingly important for BI practitioners to acquire/develop knowledge of Web technologies, XML, SOAP, UDDI, etc. as different domains are converging at a rapid pace.
Enabler 4 in the “Power of Ten” is more precisely defined as – Service Oriented Architecture enabling the creation of BI “Analytical Smorgasbord”.
Business Intelligence Utopia - Enabler 5: Extensible Data Models
Source: Business Intelligence – A Practitioner’s View [link]
Enabler 5 in my list for Business Intelligence Utopia is the ubiquitous, hard-working “Data Model”. The data model is the heart of any software system and at a fundamental level provides placeholders for data elements to reside.
Business Intelligence systems with all their paraphernalia – Data Warehouses, Marts, Analytical & Mining systems etc. – typically deal with the largest volume of data in any enterprise, and hence data models are highly venerated in the Data Warehousing world.
At a high level, a good Data Warehouse data model has the following goals: (Corollary – If you are looking for a data modeler look for the following traits)
1) Understand the business domain of the organization
2) Understand at a granular level the data generated by the business processes
3) Realize that business data is an ever-changing commodity – So the placeholder provided by the data model should be relevant not only for the present but also for the future
4) Can be described at a conceptual and logical level to all relevant stakeholders
5) Should allow for non-complicated conversion to the physical world of databases or data repositories that are manipulated by software systems
Extensible Data models deal with all the 5 points mentioned above and more specifically have future-proofing as one of their stated goals. Such extensible models are also “consumption agnostic”, i.e. they provide comparable levels of performance irrespective of the way data is being consumed.
It is important for BI practitioners to understand the goals of their data models before embarking on specific implementation techniques. Entity-Relationship & Dimensional modeling (http://www.rkimball.com) have been the lingua franca of BI data modelers operating at the conceptual and logical levels. Newer techniques like Data Vault (http://www.danlinstedt.com/) also provide some interesting thoughts on building better logical models for Data Warehouses.
At the physical implementation level, relational databases still form the backbone of the BI infrastructure, supplemented by multi-dimensional data stores. Even in the relational world, traditionally dominated by row-major relational vendors like Oracle, SQL Server etc. there are column-major relational databases of the likes of Sybase IQ with claims of being built ground-up for data warehousing.
In this article on column major databases - http://www.databasecolumn.com/2007/09/one-size-fits-all.html, there is reference to a new DW specific database architecture called Vertica. It makes for a fascinating read - http://www.vertica.com/datawarehousing. The physical layer is also seeing a lot of action with the entry of data warehousing appliance vendors like Netezza, Datallegro etc. (http://www.dmreview.com/article_sub.cfm?articleId=1009168).
The intent of this post can be summed up as:
a) Understand the goals of building data models for your enterprise – Make it extensible and future proof
b) Know the current techniques that help envisage and build data models
c) Be on the look-out for new developments in the data modeling and database world – There is a lot of interesting action happening in this area right now.
Extensible data models, combined with the right technique for implementing them, are listed as Enabler 5 in the “Power of Ten” for Business Intelligence Utopia.
Data Integration Challenge – Understanding Lookup Process–I
Source: Business Intelligence – A Practitioner’s View [link]
One of the basic ETL steps that we use in most ETL jobs during development is the ‘Lookup’. We shall discuss what a lookup is, when to use it, how it works, and some points to consider while using a lookup process.
What is a lookup process?
During the process of reading records from a source system and loading them into a target table, if we query another table or file (called the ‘lookup table’ or ‘lookup file’) to retrieve additional data, then it’s called a ‘lookup process’. The ‘lookup table or file’ can reside on the target or the source system. Usually we pass one or more column values that have been read from the source system to the lookup process in order to filter and get the required data.
How do ETL products implement the lookup process?
There are three ways ETL products perform a ‘lookup process’:
- Direct Query: Run the required query against the table or file whenever the ‘lookup process’ is called
- Join Query: Run a query joining the source and the lookup table/file before starting to read the records from the source
- Cached Query: Run a query to cache the data from the lookup table/file local to the ETL server as a cache file. When data flows from the source, run the required query against the cache file whenever the ‘lookup process’ is called
Most of the leading products like Informatica, DataStage support all the three ways in their product architecture. We shall see the pros and cons of this process and how these work in part II.
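To make the three approaches concrete, here is a minimal sketch that implements each of them against the same tiny dataset, using Python's built-in sqlite3 module as a stand-in for the source and lookup systems. The `orders` and `customers` tables and the function names are hypothetical; real ETL products implement these strategies internally and very differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, cust_id INTEGER);      -- source
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY,
                            cust_name TEXT);                      -- lookup table
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, 10), (4, 99);
    INSERT INTO customers VALUES (10, 'Acme'), (20, 'Globex');
""")


def direct_query_lookup():
    """Direct Query: one query against the lookup table per source record."""
    out = []
    source = conn.execute(
        "SELECT order_id, cust_id FROM orders ORDER BY order_id"
    ).fetchall()
    for order_id, cust_id in source:
        hit = conn.execute(
            "SELECT cust_name FROM customers WHERE cust_id = ?", (cust_id,)
        ).fetchone()
        out.append((order_id, hit[0] if hit else None))
    return out


def join_query_lookup():
    """Join Query: source and lookup table are joined before reading records."""
    return conn.execute(
        "SELECT o.order_id, c.cust_name FROM orders o "
        "LEFT JOIN customers c ON c.cust_id = o.cust_id ORDER BY o.order_id"
    ).fetchall()


def cached_query_lookup():
    """Cached Query: load the lookup data once, then probe the local cache."""
    cache = dict(conn.execute("SELECT cust_id, cust_name FROM customers"))
    source = conn.execute("SELECT order_id, cust_id FROM orders ORDER BY order_id")
    return [(order_id, cache.get(cust_id)) for order_id, cust_id in source]
```

All three functions return the same rows; what differs is where the work happens: one query per record, one joined query up front, or one cache load followed by local probes.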
Business Intelligence Utopia - Enabler 3: Data Governance
Source: Business Intelligence – A Practitioner’s View [link]
The “Power of Ten” introduced earlier in this forum is a list of pre-requisites to deliver the real promise of BI. We have already seen the first two – Changes to OLTP systems and Real time Data Integration.
The third enabler in the list is ‘Data Governance’. With increasing volumes of data coupled with regulatory compliance issues, the topic of Data Governance is very much in vogue, to the extent that anybody can look intelligent (beware!) by coining new terms like Data Clarity, Data Clairvoyance etc.
Data Governance at a very fundamental level is all about understanding the data generated by business, managing the quantity / quality of data and leveraging it to make sound business decisions for the future. From my view, the steps needed in a practical data governance program are:
1) Organizational entity, headed by a Chief Data Officer (CDO), whose task is to formulate and implement decisions related to Data Management across multiple dimensions, viz. Business Operations, Regulatory compliance etc.
2) Comprehensive understanding of the data ‘value chain’ – From the source of origination to its consumption. It is important to understand that the origination and / or consumption can also be outside the organizational boundaries.
3) Understand the types of data within the enterprise by following a ‘divide-and-conquer’ strategy. One of my previous posts on this blog illustrates one way of dividing data into ‘mutually exclusive collectively exhaustive’ (MECE) categories.
4) Profile data on a regular basis to statistically measure its quality.
5) Set up a Business Intelligence infrastructure that effectively harnesses data assets for making decisions that affect (positively, of course!) the short, medium & long-term nature of business.
6) Continuous improvement program to ensure that data is optimally leveraged across all aspects of business. A data governance maturity model like the one illustrated here - http://datagovernanceblog.com/5-days-to-a-data-maturity-model-for-data-governance-day-1, can be envisaged for your organization.
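Step 4 above, profiling data to measure its quality statistically, can be sketched very simply. The metrics chosen here (row count, missing values, completeness and distinct values) are just common examples, and the `profile_column` function and sample records are hypothetical; a real profiling exercise would cover many more checks.

```python
def profile_column(rows, column):
    """Compute a few basic quality statistics for one column of a dataset."""
    values = [row.get(column) for row in rows]
    total = len(values)
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]
    missing = total - len(present)
    return {
        "rows": total,
        "missing": missing,
        "completeness": (total - missing) / total if total else 0.0,
        "distinct": len(set(present)),
    }


customers = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": ""},
    {"id": 3, "country": "US"},
    {"id": 4, "country": None},
]
country_profile = profile_column(customers, "country")
```

Running the same profile on a regular schedule and tracking how the numbers move over time is what turns a one-off check into a measurement of data quality.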
‘Competing on Analytics’ (http://www.babsonknowledge.org/analytics.pdf), a classic Harvard Business Review article by Thomas Davenport, illustrates the power of fact-based business decision-making. For businesses to realize that power, it is important to recognize that good data, and not just ‘any’ data, is a source of competitive advantage.
Data Governance is fundamental to making organizations better and that is the reason that it figures as number 3 in my list of ten enablers for BI Utopia. Informative articles on Data Governance are present at the following link. http://www.bi-bestpractices.com/categories/1274/
Business Intelligence Utopia - Enabler 2: Real Time Data Integration
Source: Business Intelligence – A Practitioner’s View [link]
Business Intelligence practitioners tend to have a lot of respect and reverence for transaction processing systems (OLTP), for without them the world of analytical apps simply does not exist. That explains my previous post introducing the first enabler for BI Utopia – The Evolution of OLTP systems to support Operational BI.
In this post, I introduce the second enabler in the “Power of Ten” – Real Time Data Integration.
Data Integration, in the BI sense, is all about extracting data from multiple source systems, transforming it using business rules and loading it into data repositories built to facilitate analysis, reporting, etc.
Given that the raw data has to be converted to a different form more amenable for analysis & decision-making, there are 2 basic questions to be answered:
- From a business standpoint, how fast should the ‘data-information’ conversion happen?
- From a technology standpoint, how fast can the ‘data-information’ conversion happen?
Traditionally, with BI being used more for strategic decision-making, a batch mode of data integration with a periodicity of a day or longer was acceptable. But increasingly, businesses demand that the conversion happen much faster, and technology has to support it. This leads to the concept of “Real Time BI” or more correctly “Right Time BI”. (http://www.tdwi.org/research/display.aspx?ID=7095)
Since the answer to the first question "How Fast" is fast becoming “as fast as possible”, the focus has shifted to the technology side. One area where I foresee a lot of activity, from a Data Warehouse architectural standpoint, is in the close interaction of messaging tools like IBM Websphere MQ etc. with data integration tools. At this point in time, though the technology is available, there aren’t too many places where messaging is embedded into the BI architectural landscape.
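The interaction between messaging and data integration can be sketched as a consumer that drains small batches of events from a queue, transforms them and loads them as they arrive instead of waiting for a nightly run. In this sketch Python's in-process `queue.Queue` merely stands in for a real message broker such as IBM Websphere MQ, whose API is not shown; the event fields, the `transform` rule and the `warehouse` list are all hypothetical.

```python
import queue


def drain_batch(q, max_items):
    """Take up to max_items messages that are currently waiting on the queue."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # nothing more is waiting right now
    return batch


def transform(event):
    """Apply a simple business rule: convert cents to a currency amount."""
    return {
        "order_id": event["order_id"],
        "amount_usd": round(event["amount_cents"] / 100, 2),
    }


# Source systems publish events as they happen.
events = queue.Queue()
for order_id, cents in enumerate([1250, 990, 40000], start=1):
    events.put({"order_id": order_id, "amount_cents": cents})

# The integration job loads small batches as soon as they are available.
warehouse = []  # stand-in for the target data repository
while True:
    batch = drain_batch(events, max_items=2)
    if not batch:
        break
    warehouse.extend(transform(e) for e in batch)
```

The batch size is the knob that trades throughput against latency: smaller batches move closer to “real time”, while larger ones behave more like a traditional batch load.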
The bottom line is that there is significant value gained by ensuring that raw business data is transformed to information by the BI infrastructure as fast as possible, the limits being prescribed by business imperatives. The best explanation I have come across of the value of information latency is the article by Richard Hackathorn (http://www.tdan.com/view-articles/5132).
Active Data Warehousing is another topic closely related to Real Time Data Integration and you can get some perspective on it through the blog on Decision management by James Taylor: http://www.ebizq.net/blogs/decision_management/2006/06/decision_technologies_and_acti.php
