BI Blogs

Bringing together Business Intelligence voices from across the web

The Analyst’s New Confusion II

Posted on the March 31st, 2010. Read times

Source: James Dixon's Blog [link]

Watch Pentaho’s Next Generation BI Webinar

Posted on the March 31st, 2010. Read times

Source: James Dixon's Blog [link]

The Analyst’s New Confusion

Posted on the March 31st, 2010. Read times

Source: James Dixon's Blog [link]

OWB 11gR2 – Flexible and extensible

Posted on the March 31st, 2010. Read times

Source: Oracle Warehouse Builder (OWB) Weblog [link]


More business analytics 101

Posted on the March 31st, 2010. Read times

Source: The sascom magazine blog [link]

Last year, this Analytics 101 seminar with Tonya Balan was one of the most popular Webcasts produced by BetterManagement. Based on its popularity and the popularity of the tweets and blog post I wrote while watching it, we asked Tonya to write a column on the same topic. That article, Understanding analytics, was by far the reader favorite in the first quarter 2010 issue of sascom magazine.

Given the interest in these introductory topics, I’m excited to tell you about a whole new series of Webcasts with the analytics 101 theme, including text analytics 101, forecasting 101 and data integration 101. You can sign up to watch the new seminars once a month starting in April or explore archived Webcasts on many of the same topics.

On the other hand, if you’re looking for a quick, immediate introduction to analytics to watch or share with colleagues, step through this analytics interactive tour, or watch individual analytics intro pieces on the SAS YouTube Channel.

Data Protection in Cloud and Hosted environments

Posted on the March 30th, 2010. Read times

Source: Blog: Dan E. Linstedt [link]

Let’s face it: cloud computing, grid computing and ubiquitous computing platforms are here to stay. More and more data, mind you, will make its way onto these platforms, and enterprises will continue to find themselves in a world of hurt if they suffer security breaches. If we think today’s hackers are bad, just wait… they’re after the mother lode: all customer data, massive identity theft, etc. I’m not usually one for doom and gloom; after all, we have good resources, excellent security, VPNs and firewalls, right? In this entry we’ll explore the notion of what it *might* take to protect your data in a cloud/distributed or hosted environment. It’s a thought-provoking future experiment - maybe it would take a black swan?

Fun with numbers. Big numbers.

Posted on the March 30th, 2010. Read times

Source: The sascom magazine blog [link]

How big is a billion? How many cars can you buy with a trillion dollars? How long would it take you to count to one billion?

A few years ago when we were talking about billions of terabytes of data (or maybe it was billions of exabytes, I’m not quite sure), our video team put together a whole series of videos to better illustrate the actual size of a billion, including the one below, and two others that showed a billion dollar bills and a billion credit cards.

I thought of these videos today when I came across this new number quotes site on the lifehacker blog.

Now, if you can just remember how many zeros are in a billion, you can look it up and come up with a few dozen different ways to visualize or think about such a large number.
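As a rough worked answer to the counting question above, here is a small Python sketch. It assumes a pace of one number per second with no breaks, which is an assumption of this example and not a figure from the videos.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # average year length, leap years included


def years_to_count(n, seconds_per_number=1.0):
    """Years of non-stop counting needed to reach n at a fixed pace."""
    return n * seconds_per_number / (SECONDS_PER_DAY * DAYS_PER_YEAR)


# One billion, at the assumed pace of one number per second.
print(round(years_to_count(10**9), 1))                  # 31.7 (years)
# One million at the same pace, expressed in days for comparison.
print(round(years_to_count(10**6) * DAYS_PER_YEAR, 1))  # 11.6 (days)
```

So a million takes under two weeks, while a billion takes roughly three decades at the same pace.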

Pentaho Agile BI Demo

Posted on the March 30th, 2010. Read times

Source: James Dixon's Blog [link]

Agile BI Webcast Tomorrow – 2pm Eastern

Posted on the March 30th, 2010. Read times

Source: James Dixon's Blog [link]

Vendor of Business Metadata & Technical Metadata

Posted on the March 30th, 2010. Read times

Source: Blog: Dan E. Linstedt [link]

I’ve just reviewed some technology from a vendor who manages business and technical metadata as a service platform.  Let me say: I’m impressed.  There are many current issues these days with different BI and EDW implementations, specifically around managing, entering, and governing the business and technical metadata.  For years I’ve found myself using Excel and Word to handle these tasks; while great tools - there are some problems for the Data Steward in the form of governance, management, and usefulness of metadata.   In this entry, I’ll discuss a new solution from IData called the Data Cookbook.

Complex type support in process flow – XMLTYPE

Posted on the March 29th, 2010. Read times

Source: Oracle Warehouse Builder (OWB) Weblog [link]


Different Approaches To Handling Tied Ranks

Posted on the March 29th, 2010. Read times

Source: Chris Webb's BI Blog [link]

Even if blogging about MDX feels, these days, a bit like blogging about COBOL (I’ll be returning to DAX soon, I promise), here’s an interesting MDX problem I came across the other day that I thought was worth writing about.

Calculating ranks is one of those things in MDX that is slightly trickier than it first appears. There’s a RANK function, of course, but in order to get good performance from it you need to know what you’re doing. It’s fairly widely known that with normal ranks what you need to do is to order the set you’re using before you find the rank of a tuple inside that set. Consider the following query on Adventure Works:

WITH
MEMBER MEASURES.REGULARRANK AS
RANK([Customer].[Customer].CURRENTMEMBER,
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC))

SELECT
{[Measures].[Internet Sales Amount], [Measures].REGULARRANK}
ON COLUMNS,
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC)
ON ROWS
FROM [Adventure Works]

It’s unbearably slow (in fact I killed the query rather than wait for it to complete) because, in the calculated member, what we’re doing is ordering the set of all customers every time we calculate a rank. Obviously we don’t need to do this, so the solution to this problem is of course to order the set just once, use a named set to store the result, and then reference the named set in the calculated member as follows:

WITH

SET
ORDEREDCUSTOMERS AS
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC)

MEMBER MEASURES.REGULARRANK AS
RANK([Customer].[Customer].CURRENTMEMBER,
ORDEREDCUSTOMERS)

SELECT
{[Measures].[Internet Sales Amount], [Measures].REGULARRANK}
ON COLUMNS,
ORDEREDCUSTOMERS
ON ROWS
FROM [Adventure Works]

This query now executes in 5 seconds on my laptop. You probably knew all this already though.
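The difference between the two queries can be sketched outside MDX. The Python below is only an analogy under stated assumptions: the four customers and their sales are hypothetical, and the counter simply records how often the sort runs. It says nothing about how the Analysis Services engine is implemented.

```python
# Hypothetical data: customer -> Internet Sales Amount.
sales = {"A": 30.0, "B": 20.0, "C": 20.0, "D": 10.0}


def order_desc(sales, counter):
    """Sort customers by sales, descending, and record that a sort happened."""
    counter["sorts"] += 1
    return sorted(sales, key=sales.get, reverse=True)


def ranks_resorting(sales, counter):
    """Like the first query: re-sort the whole set for every rank requested."""
    return {c: order_desc(sales, counter).index(c) + 1 for c in sales}


def ranks_sorted_once(sales, counter):
    """Like the named-set query: sort once, then only look up positions."""
    ordered = order_desc(sales, counter)
    return {c: pos for pos, c in enumerate(ordered, start=1)}


slow_counter, fast_counter = {"sorts": 0}, {"sorts": 0}
slow = ranks_resorting(sales, slow_counter)
fast = ranks_sorted_once(sales, fast_counter)
print(slow == fast, slow_counter["sorts"], fast_counter["sorts"])  # True 4 1
```

Both functions return the same ranks, but the first performs one sort per customer, which is the pattern that makes the first query so slow over the full customer set.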

But what happens if you need to handle tied ranks? The approach above doesn’t give you tied ranks because the RANK function, in its two-parameter form, simply finds the position of a tuple in a set, and no two tuples can occupy the same position in a set. That’s why you get results like this:

image

Even though Courtney A. Edwards and Jackson L. Liu have the same value for Internet Sales Amount, their ranks are 799 and 800 respectively, because Courtney A. Edwards comes before Jackson L. Liu in the ORDEREDCUSTOMERS set.
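To make the distinction concrete, here is a small Python sketch contrasting a purely positional rank with a tied rank. The sales values and the two extra customers are hypothetical, and the tied rank is written directly from its definition (one plus the number of customers with strictly higher sales) rather than efficiently.

```python
# Hypothetical values; only the two named customers come from the example above.
sales = {
    "Customer X": 500.0,
    "Courtney A. Edwards": 120.0,
    "Jackson L. Liu": 120.0,
    "Customer Y": 80.0,
}

ordered = sorted(sales, key=sales.get, reverse=True)

# Positional rank: where the customer sits in the ordered set.
positional = {c: pos for pos, c in enumerate(ordered, start=1)}

# Tied rank: 1 + number of customers with strictly greater sales.
tied = {c: 1 + sum(v > sales[c] for v in sales.values()) for c in sales}

print(positional["Courtney A. Edwards"], positional["Jackson L. Liu"])  # 2 3
print(tied["Courtney A. Edwards"], tied["Jackson L. Liu"])              # 2 2
print(tied["Customer Y"])                                               # 4
```

Note that the customer after the tie keeps rank 4, not 3; the tied members share the rank of the first of them and the following ranks are not compressed.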

BOL tells us of the three-parameter form of RANK that does give us tied ranks. This is how you use it:

WITH

SET
ORDEREDCUSTOMERS AS
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC)

MEMBER [Measures].REGULARRANKTIED AS
RANK([Customer].[Customer].CURRENTMEMBER
,[Customer].[Customer].[Customer].MEMBERS,[Measures].[Internet Sales Amount])

SELECT
{[Measures].[Internet Sales Amount], [Measures].REGULARRANKTIED}
ON COLUMNS,
ORDEREDCUSTOMERS
ON ROWS
FROM [Adventure Works]

But, unfortunately, the performance is as bad as the original version of the non-tied rank calculation, i.e. incredibly bad, because once again we’re sorting the set every time we calculate a rank. So how can we get tied ranks and good performance?

The first approach I tried was to use a recursive calculation, which used the named set approach to calculate the non-tied rank and then checked to see if the CurrentMember on Customer had the same value for Internet Sales Amount as the member before it in the set of Ordered Customers; if it did, it displayed the rank of the previous Customer. Here it is:

WITH

SET
ORDEREDCUSTOMERS AS
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC)

MEMBER MEASURES.REGULARRANK AS
RANK([Customer].[Customer].CURRENTMEMBER, ORDEREDCUSTOMERS)

MEMBER MEASURES.TIEDRANK AS
IIF(
[Measures].[Internet Sales Amount] =
(ORDEREDCUSTOMERS.ITEM(MEASURES.REGULARRANK-2), [Measures].[Internet Sales Amount])
AND MEASURES.REGULARRANK>1
, (ORDEREDCUSTOMERS.ITEM(MEASURES.REGULARRANK-2), MEASURES.TIEDRANK)
,MEASURES.REGULARRANK)

SELECT
{[Measures].[Internet Sales Amount], MEASURES.TIEDRANK,[Measures].REGULARRANK}
ON COLUMNS,
ORDEREDCUSTOMERS
ON ROWS
FROM [Adventure Works]

Now this particular query performs pretty well – 6 seconds on my laptop, only marginally worse than the non-tied rank. And it gives the correct results; the middle column of values below shows the tied rank:

image

Unfortunately, the performance of this approach varies a lot depending on the number of tied ranks that are present in the set. If we slice the query by the year 2001, when there were a lot more customers with tied ranks, as follows:

WITH

SET
ORDEREDCUSTOMERS AS
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC)

MEMBER MEASURES.REGULARRANK AS
RANK([Customer].[Customer].CURRENTMEMBER, ORDEREDCUSTOMERS)

MEMBER MEASURES.TIEDRANK AS
IIF(
[Measures].[Internet Sales Amount] =
(ORDEREDCUSTOMERS.ITEM(MEASURES.REGULARRANK-2), [Measures].[Internet Sales Amount])
AND MEASURES.REGULARRANK>1
, (ORDEREDCUSTOMERS.ITEM(MEASURES.REGULARRANK-2), MEASURES.TIEDRANK)
,MEASURES.REGULARRANK)

SELECT
{[Measures].[Internet Sales Amount], MEASURES.TIEDRANK,[Measures].REGULARRANK}
ON COLUMNS,
ORDEREDCUSTOMERS
ON ROWS
FROM [Adventure Works]
WHERE([Date].[Calendar Year].&[2001])

…then performance gets really bad once again.
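One way to picture why ties hurt the recursive calculation is the Python sketch below. It mirrors the logic of TIEDRANK with hypothetical data and counts evaluations. Whether the real engine re-evaluates cells in exactly this way is an assumption of the sketch, not something measured here.

```python
def tied_rank(position, ordered, sales, stats):
    """Mirror of TIEDRANK: position is the 1-based REGULARRANK of a customer."""
    stats["calls"] += 1
    if position > 1 and sales[ordered[position - 1]] == sales[ordered[position - 2]]:
        # Same value as the previous customer: reuse that customer's tied rank.
        return tied_rank(position - 1, ordered, sales, stats)
    return position


# Hypothetical data with a run of three tied customers (B, C, D).
sales = {"A": 30.0, "B": 20.0, "C": 20.0, "D": 20.0, "E": 10.0}
ordered = sorted(sales, key=sales.get, reverse=True)

stats = {"calls": 0}
ranks = [tied_rank(p, ordered, sales, stats) for p in range(1, len(ordered) + 1)]
print(ranks, stats["calls"])  # [1, 2, 2, 2, 5] 8
```

Five customers need eight evaluations here because each tied customer walks back to the start of its run, so longer runs of ties mean deeper recursion.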

Then I came up with a new approach. After ordering the set of all Customers, I made a second pass over it and created a second set with exactly the same number of items in it: for every customer in the first set, in the second set I added the current Customer if that Customer did not have a tied rank; if the Customer did have a tied rank, I added the first Customer in the original set that shared its tied rank. So if there were four customers, A, B, C and D, and if A had sales of 1, B had sales of 2, C had sales of 2 and D had sales of 3, then this new set would contain the members A, B, B, D. I could then say, for Customer C, that it was the third Customer in the original set, but the third item in the new set was B, and that was the Customer whose rank I needed to display for Customer C. So each item in this second set gives us the member whose rank we need to display for the member in the same position in the set of ordered Customers.
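The A, B, C, D example can be sketched in Python as follows. This is an illustration of the idea only, using the four customers and sales values from the paragraph above in the order given there; the function names are made up for this sketch.

```python
def representatives(ordered, sales):
    """For each position, the first customer in the run sharing that value."""
    reps = []
    for i, customer in enumerate(ordered):
        if i > 0 and sales[customer] == sales[ordered[i - 1]]:
            reps.append(reps[-1])   # tied: reuse the first customer of the run
        else:
            reps.append(customer)   # not tied: the customer represents itself
    return reps


def tied_ranks(ordered, sales):
    """Rank of each customer = position of its representative in the ordered set."""
    position = {c: pos for pos, c in enumerate(ordered, start=1)}
    reps = representatives(ordered, sales)
    return {c: position[rep] for c, rep in zip(ordered, reps)}


sales = {"A": 1, "B": 2, "C": 2, "D": 3}
ordered = ["A", "B", "C", "D"]          # already-ordered set, as in the text

print(representatives(ordered, sales))  # ['A', 'B', 'B', 'D']
print(tied_ranks(ordered, sales))       # {'A': 1, 'B': 2, 'C': 2, 'D': 4}
```

The second set is built in a single pass, so the work does not depend on how many ties there are.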

Here’s the MDX:

WITH

SET
ORDEREDCUSTOMERS AS
ORDER(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]
, BDESC)

MEMBER MEASURES.REGULARRANK AS
RANK([Customer].[Customer].CURRENTMEMBER, ORDEREDCUSTOMERS)

SET CUSTOMERSWITHTIES AS
GENERATE(
{INTERSECT({ORDEREDCUSTOMERS.ITEM(0)} AS FIRSTTIE,{}), ORDEREDCUSTOMERS} AS ORDEREDCUSTOMERS2
, IIF(
({ORDEREDCUSTOMERS2.CURRENT AS CURRCUST}.ITEM(0), [Measures].[Internet Sales Amount]) =
({ORDEREDCUSTOMERS2.ITEM(ORDEREDCUSTOMERS2.CURRENTORDINAL-2) AS PREVCUST}.ITEM(0), [Measures].[Internet Sales Amount])
, IIF(
(PREVCUST.ITEM(0), [Measures].[Internet Sales Amount])
=
(FIRSTTIE.ITEM(0), [Measures].[Internet Sales Amount])
, {FIRSTTIE}
, {PREVCUST AS FIRSTTIE})
, {CURRCUST})
, ALL)

MEMBER MEASURES.HASTIE AS
RANK([Customer].[Customer].CURRENTMEMBER, CUSTOMERSWITHTIES)

MEMBER MEASURES.TIEDRANK AS
(MEASURES.REGULARRANK, CUSTOMERSWITHTIES.ITEM(MEASURES.REGULARRANK-1))

SELECT
{[Measures].[Internet Sales Amount], MEASURES.TIEDRANK,[Measures].REGULARRANK}
ON COLUMNS,
ORDEREDCUSTOMERS
ON ROWS
FROM [Adventure Works]

The named set CUSTOMERSWITHTIES is where the interesting stuff happens. I’m iterating over the set ORDEREDCUSTOMERS using the GENERATE function, and using inline named sets to store the current Customer in the iteration, the previous Customer, and the first Customer containing the shared tied rank (see here for a similar example of using named sets). It consistently executes in 12 seconds regardless of how you slice the query, so it’s not as good as the best performance of the recursive approach but it’s much, much better than the worst performance of the recursive approach. If anyone has any other ideas on how to solve this problem, I’d love to hear them. I’m still sure there’s a better way of doing this…

Of course, what I really want is for the Formula Engine to be able to optimise queries containing set functions like Order in scenarios like this – I’d want it to know that when a particular set operation returns the same result for a block of cells, it should only perform that set operation once. However, even this wouldn’t necessarily be good enough in all cases – there are plenty of situations where you need to perform the same expensive set operation like a sort or a filter in multiple similar calculations, and you’d like to share the result of this set operation between calculations. For example, you might have a calculated member that counted the number of Customers who bought something both this month and in the previous month, and a second calculated member that counted the number of Customers who not only bought this month and in the previous month but also spent more than $1000. In both cases you end up finding the set of Customers who bought this month and last month, which may take a long time to do. This is why I think it would be useful to be able to have calculated members return set objects, which can then be cached, so you can share the set between multiple other calculated members; if you agree, please vote on this Connect.
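The kind of sharing described above can be sketched in Python. Everything in this sketch is hypothetical: the purchase data, the month keys, and the reading of "spent more than $1000" as spend in the current month.

```python
# Hypothetical data: customer -> {month: amount spent}.
purchases = {
    "A": {"2010-02": 400.0, "2010-03": 900.0},
    "B": {"2010-03": 1500.0},
    "C": {"2010-02": 50.0, "2010-03": 1200.0},
}


def bought_both_months(purchases, this_month, prev_month):
    """The expensive set operation: customers who bought in both months."""
    return frozenset(
        customer
        for customer, by_month in purchases.items()
        if this_month in by_month and prev_month in by_month
    )


# Compute the shared set once...
repeat_buyers = bought_both_months(purchases, "2010-03", "2010-02")

# ...then reuse it in two separate calculations.
repeat_count = len(repeat_buyers)
big_spender_count = sum(
    1 for customer in repeat_buyers if purchases[customer]["2010-03"] > 1000
)

print(repeat_count, big_spender_count)  # 2 1
```

The two counts differ only in a cheap final filter; the costly step, finding the repeat buyers, runs once and its result is shared.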

Sustainability cheat sheet for your industry

Posted on the March 29th, 2010. Read times

Source: The sascom magazine blog [link]

Hoping to hop on the sustainability band wagon but not sure where to start? Accenture has a great new research report available that includes sustainability priorities listed by industry.

For example, retailers are concerned with heavy carbon-emitting suppliers and rising sea levels that could interrupt the flow of materials from suppliers. Telcos, on the other hand, are paying attention to alternative energy sources and product differentiation.

What about your industry? And how can you execute a strategy for sustainability? Read the full report.

According to Accenture, this research shows that companies adopting sustainable business strategies and practices drive value by:

  • Growing revenue through new products and services.
  • Reducing costs through efficiency gains.
  • Managing operational and regulatory risk more effectively.
  • Building intangible assets such as their brand, reputation and collaborative networks.

Oracle Warehouse Builder 11gR2 – Using Code Templates – Loading Metadata from Essbase to Relational Tables – Part 2

Posted on the March 29th, 2010. Read times

Source: Rittman Mead Consulting [link]

In the first part of this series here, I showed how to go about using the ODI RKMs from within Warehouse Builder to reverse engineer Essbase metadata. Though I did promise to come back sooner with the rest of the series, a client engagement kept me from doing this earlier. Better late than never, I guess. Now that we know how to reverse engineer Essbase metadata within OWB, let’s look at a way of using the Essbase Integration Knowledge Modules for loading Essbase hierarchies to relational tables.

The first step in this process is to ensure that we have the Essbase metadata properly reverse engineered. Once that is done, the next step is to copy the Essbase-specific ODI Knowledge Modules from the {ODI_HOME}/impexp folder to the {ORACLE_HOME}/owb/misc/CodeTemplates directory.

Picture 3

After we have copied the knowledge modules over to OWB, we need to import them by creating a new Code Template Module as shown below. For more details on Code Templates, refer to Mark’s earlier blog entry on this here.

Picture 4

Once we have imported all the knowledge modules, we need to ensure that the Control Center Agent is started correctly. The Control Center Agent basically spawns an RMI instance (part of OC4J) through which all executions will be done.

Picture 5

Configure the DEFAULT_AGENT within OWB to point to this Control Center Agent

Picture 6

After configuring the Agent, we deploy all the Code Templates (KMs). This step is not necessary, but it will help in testing whether the agent is set up correctly.

The steps listed above are basically prerequisites to get us started on the actual deployment. Now let us start with a very simple example of loading the Sample->Basic cube’s Year dimension into a database table. To keep this simple, let’s create a database table with exactly the same structure as the reversed Year dimension:

CREATE TABLE YEAR_PC
(
PARENTNAME VARCHAR2(80),
MEMBERNAME VARCHAR2(80),
ALIAS VARCHAR2(80),
DATASTORAGE VARCHAR2(80),
TWOPASSCALC VARCHAR2(80),
CONSOLIDATION VARCHAR2(80),
UDA VARCHAR2(80),
FORMULA VARCHAR2(255),
COMMENTS VARCHAR2(80)
)
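The PARENTNAME and MEMBERNAME columns give the table a parent-child shape. As a rough illustration of how rows in that shape describe a hierarchy, here is a Python sketch. The sample rows, and the use of None as the parent of the top member, are assumptions made for this example and are not taken from an actual load.

```python
from collections import defaultdict

# Hypothetical (PARENTNAME, MEMBERNAME) rows in the shape of YEAR_PC.
rows = [
    (None, "Year"),
    ("Year", "Qtr1"),
    ("Year", "Qtr2"),
    ("Qtr1", "Jan"),
    ("Qtr1", "Feb"),
    ("Qtr1", "Mar"),
    ("Qtr2", "Apr"),
]


def children_by_parent(rows):
    """Group member names under their parent, keeping row order."""
    children = defaultdict(list)
    for parent, member in rows:
        children[parent].append(member)
    return children


def walk(children, member, depth=0):
    """Yield (depth, member) pairs for the subtree rooted at member."""
    yield depth, member
    for child in children.get(member, []):
        yield from walk(children, child, depth + 1)


children = children_by_parent(rows)
for depth, member in walk(children, "Year"):
    print("  " * depth + member)
```

Printing with indentation shows Year at the top, the quarters beneath it and the months beneath each quarter.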

After the creation of the database table, we start with creating a new “Template Mapping” where we can map the source (Reversed Year Dimension) and the target (table created above). In the execution view of this Template mapping, for the Essbase source use the imported “Hyperion Essbase to SQL” code template. For the database table, use the “SQL to Control Append” code template.

Picture 8

Picture 9

If you notice, when we drag and drop the Essbase Year Dimension into the mapping editor, we lose the case sensitivity of the columns. This is shown below. The ODI Knowledge Modules are case-sensitive and hence if you execute this, you will start getting errors. To work around this, we will have to update the Knowledge Modules as shown below. The credit for this goes to  David Allan of the OWB team again. Basically your code templates should look as shown below (updated part of the code template)

Picture 11

Picture 10

Now just deploy the mapping and start executing the Template Mapping. This will load the entire Year Hierarchy from Essbase to a database table as shown below

Picture 1

Picture 12

Picture 13

This is pretty neat and quite simple as well once we get the hang of the pieces involved. The trickiest part was updating the Code Template itself, which is where David helped me out. I was actually expecting a lot of changes to the KMs to make them work with OWB, at least for Essbase-related loads. I was pleasantly surprised to see everything working almost out of the box. As I have said before, I think this capability of OWB is extremely good and efficient as well, at least as far as I have tested. Next in this series is data loads into and from Essbase.

One of us is goin’ down

Posted on the March 28th, 2010. Read times

Source: Michael Tarallo - Open Source BI Guru [link]

Just having some fun

An Update on the BI Forum, OBIEE EMG and US Conferences

Posted on the March 28th, 2010. Read times

Source: Rittman Mead Consulting [link]

It’s been particularly hectic over the past few weeks, with several new projects kicking off, training being delivered around Europe and a couple of trips over to the States, but I’m back in the UK now gradually catching up with blog posts and articles. As it’s been so long since I posted an update on the blog, here’s news of a few things that might be of interest to readers, including news on conferences and our new venture led by Venkat Janakiraman.

First up is the BI Forum 2010, which we’re running in Brighton in May and which has now reached capacity.

Like last year, we’re hosting an expert-level OBIEE, ODI and Essbase conference in our home town, with numbers limited to 50 so that we can maximize opportunities for networking. This year we have Kurt Wolff coming over as our special guest, who will be running a one-day masterclass on the day before the event opens proper, and we also have Phil Bates, Oracle Business Intelligence Enterprise Edition Architect, providing the opening keynote after Kurt’s masterclass. Many thanks to everyone who has registered and we look forward to seeing you all in a couple of months, and if you weren’t able to register in time we’ll post all the proceedings on our website after it has closed.

I’m also pleased to announce the OBIEE Enterprise Methodology Group, a collaboration between a number of OBIEE industry gurus to provide a forum for design, architecture and performance-optimization discussions, and based on an original idea for the Oracle ADF community by Chris Muir and Simon Haslam (thanks, Chris and Simon!)

Designed to complement the OTN OBIEE Forum, this group, organized in conjunction with the ODTUG BI&DW SIG and independent of any vendor or partner, aims to foster discussion of “big picture” OBIEE questions and has a number of moderators who will aim to keep the conversation “on-track”. Membership is free (although we do ask that you fill in a survey at the time of applying, so we can better understand members’ backgrounds) and the site is hosted on Google Groups, at http://groups.google.com/group/obiee-enterprise-methodology. If this sounds of interest to you, go over to the website and sign up, and post a question or design tip to start things going.

Moving on to conferences, there are two great events coming up in the States in the next few months. Collaborate’10, and in particular the “Get Analytical with BIWA Training Days 2010” event that’s running as part of the IOUG Forum at Collaborate, takes place in Las Vegas from April 18th-22nd at the Mandalay Bay Hotel.

Dan Vlamis and Shyam Nath, two good friends of mine and members of the BIWA board, have worked particularly hard to put an excellent tools, database and applications BI content stream together for the event, and Rittman Mead will be participating strongly over the week of the conference. I will be running a one-day “Deep Dive” event on the Sunday as part of the IOUG Forum, where I’ll be taking a detailed look at some key OBIEE design techniques including data modeling and support for Essbase, whilst Peter Scott, Stewart Bryson and I will be running regular conference sessions during the week.

There are also a number of “hands-on” sessions during the main conference including a chance to try out the new Simba Technologies MS Excel interface for Oracle OLAP, a hands-on with Stewart Bryson on Oracle Warehouse Builder 11g Release 2, and other sessions including ones on Oracle Data Mining and Oracle BI Enterprise Edition. You can register for Collaborate’10 through the BIWA website using this link, which gives you access to additional member benefits and discounts.

On a similar topic, there’s another event in the States, and one that I’ve been particularly closely involved in: ODTUG Kaleidoscope 2010, which is being held in Washington, D.C. from June 27th – July 1st.

Before I joined the ODTUG Board in late 2009 I also volunteered as the Kaleidoscope BI, DW and Hyperion Reporting Tools content lead, helping set the agenda along with the Kaleidoscope conference committee. In particular, I was especially keen to make Kaleidoscope the premier US event for OBIEE, OWB and ODI content, building on the work that Edward Roske, Tim Tow and the Hyperion SIG committee did last year for the various Hyperion and Essbase streams and leveraging ODTUG’s focus on developers and tools.

For this year’s Kaleidoscope, we’ve got a number of exclusive sessions from Oracle product managers, including sessions on ODI 11g, the BI Apps / Essbase Integrator, the new OBIEE-optimized SmartView, as well as lots of sessions on OBIEE 11g. As Kaleidoscope is a relatively small conference (compared to Collaborate) we’ve also got a great opportunity to get all of the BI, DW and Hyperion speakers and delegates together during the event, and I’ll be planning a social event for any delegates and speakers who want to venture into DC for one of the evenings. And, even better, the World Cup will be on with several of the second round matches due to be played at the start of Kaleidoscope week, so we’ll get a chance to pull up a chair, crack open a Sam Adams and watch a few of the games whilst discussing Oracle BI technology.

Again, you can register for Kaleidoscope using this link, where you’ll also be able to sign-up for a free one-day Oracle EPM and Essbase Symposium on the Sunday before the main conference opens.

Finally, a bit of news on Rittman Mead overseas. Venkat Janakiraman, who you’ll certainly know from this blog and his previous Oracle blog on Wordpress.com, recently opened up our new office in Bangalore as Rittman Mead Consulting Pvt Ltd. Venkat’s been a fantastic person to work with, and he’s now building a local team in India to work alongside Rittman Mead in the UK, Benelux and the USA on projects, training and support. We’ve been very impressed over the past few years with some of the super-smart people coming out of the Oracle IT industry in India, and Venkat will be building up an expert-services team that will support our activities in Europe and the US as well as running their own projects in India and Asia-Pacific, starting with Ram Chaitanya and Jay Gandhi who joined us in the past few weeks.

If you’re based around Bangalore and have a consulting, pre-sales or training background in Oracle BI and EPM and would be interested in working with Venkat, drop him a line and we’d be pleased to hear from you. Similarly, if you are looking for expert services, training and support within APAC, drop us a line and we’d be pleased to help.

Q1 2010 downloads, training, and informational material

Posted on the March 28th, 2010. Read times

Source: Dan English's BI Blog [link]

Just wanted to do a quarterly update on some of the items that I have downloaded and reviewed over the past few months so far.  Most of these I have tweeted about, but I know that everyone isn’t leveraging twitter yet.  I know I was skeptical at first, but since the PASS 2009 Summit I am on board and enjoying it.

So here is the list of some of the items I have checked out so far this year and these are in no particular order:

As you can see there is kind of a reoccurring theme here…SharePoint.

Hope you find some of these informational and helpful.

Open Core Business Model Revisited

Posted on the March 26th, 2010. Read times

Source: James Dixon's Blog [link]

Confused about open source Dual-Licensing?

Posted on the March 25th, 2010. Read times

Source: James Dixon's Blog [link]

GetPivot, Microsoft’s bet

Posted on the March 25th, 2010. Read times

Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]

