It’s not too late for predictive insight
Source: The sascom magazine blog [link]
Last week was a busy one in the world of business analytics. The biggest news for SAS is our ranking as a leader in predictive analytics and data mining by an independent analyst firm. In The Forrester Wave: Predictive Analytics and Data Mining Solutions, Q1 2010, SAS earned top overall ranking in all three categories – current offering, strategy and market presence – including perfect scores for functionality, professional services, licensing and cost, direction, and company financials criteria.
Separately, new research launched last week from our friends at Accenture shows that most global organizations are failing to use analytics to make informed decisions and predict future business events.
What are we to make of these two divergent studies? On the one hand, you have a report that says there are at least nine qualified vendors that provide mature products for predictive analytics. On the other, you have survey results that say most executives are still primarily trusting their instincts. In fact, their organizations are encumbered with “siloed data, outdated technology and a lack of analytical talent.”
Are most businesses in a hopeless situation, buried in data, organized chaotically and too overwhelmed to find insights? Or are there still opportunities to re-prioritize and start making decisions based on predictive insights?
Data Warehouse Fault Tolerance Part 1: Resuming
Source: Rittman Mead Consulting [link]
In the introduction to this series of posts, I spoke briefly about data warehouse fault tolerance and the unique challenges resulting from high data volumes combined the batch load window required to create them. I then defined the goal: a layered approach allowing simple errors to be caught early before they turn in to serious conditions.
Resuming is the ability to continue effortlessly after an error. The important thing is that there should be no aftermath from the error: our process should pause gracefully until the error is corrected. The Oracle Database has offered out of the box functionality for resuming since version 9i in the form of Resumable Space Allocation. Resumable operations are supported for SELECT queries, DML and DDL, and can be enabled at either the system or the session level. To enable at the system level, the RESUMABLE_TIMEOUT database parameter should have a non-zero value.
SQL> alter system set resumable_timeout=3600; System altered. SQL>
To enable resumable operations at the session level, the statement follows this basic syntax, with the TIMEOUT and NAME clauses being optional:
ALTER SESSION ENABLE RESUMABLE <TIMEOUT n> <NAME string>;
The TIMEOUT value is specified in seconds, and if omitted, the default value of 7200 is used, or 2 hours. The NAME clause gives the resumable session a user-friendly name for when we are monitoring for resumable sessions (as we will see later) to see which of our processes is suspended. Enabling resumable operations for the session level requires that the RESUMABLE permission has been granted:
SQL> grant resumable to stewart; Grant succeeded. SQL>
Resumable operations can also be enabled with the Oracle utilities… such as SQL-Loader, Export/Import and Datapump. The command-line parameters RESUMABLE, RESUMABLE_NAME and RESUMABLE_TIMEOUT exist to mimic the functionality mentioned above.
Now for a demonstration. I’ll create a situation that is ripe for a space allocation error: I’ll put an empty copy of the SALES fact table from the SH schema in a tablespace with only 250K of space:
SQL> create tablespace target datafile '/oracle/oradata/bidw1/target01.dbf' size 250K; Tablespace created. SQL> create table target.sales tablespace target as select * from sh.sales where 1=0; Table created. SQL>
Now I’ll load some records into the table, which should cause it to suspend. To prepare my session, I need to enable resumable operations. Since I always instrument my code, I’ll register my process with the database. After that, I have an easy way to guarantee consistency when referring to processes. Now, I can use the registered name for my resumable session as well:
SQL> exec dbms_application_info.set_module('SALES fact load','insert some rows');
PL/SQL procedure successfully completed.
SQL>
SQL> DECLARE
2 l_module VARCHAR2(48) := sys_context('USERENV','MODULE');
3 BEGIN
4 EXECUTE IMMEDIATE
5 'alter session enable resumable timeout 18000 name '''||l_module||'''';
6 END;
7 /
PL/SQL procedure successfully completed.
SQL>
I start loading the records in hopes of a suspended session:
SQL> insert into target.sales select * from sh.sales;
So now, I open up another session, and I start another transaction against the TARGET.SALES table, just to pile on the TARGET tablespace:
SQL> exec dbms_application_info.set_module('SALES fact load2','insert more rows');
PL/SQL procedure successfully completed.
SQL>
SQL> DECLARE
2 l_module VARCHAR2(48) := sys_context('USERENV','MODULE');
3 BEGIN
4 EXECUTE IMMEDIATE
5 'alter session enable resumable timeout 18000 name '''||l_module||'''';
6 END;
7 /
PL/SQL procedure successfully completed.
SQL> insert into target.sales select * from sh.sales;
I’ll have a look in the DBA_RESUMABLE view (there is also a USER_RESUMABLE version) for my suspended sessions. Even though I could get all the following information with a single SQL statement, I broke it up for better visibility on the blog:
SQL> select name, start_time, suspend_time, status from dba_resumable; NAME | START_TIME | SUSPEND_TIME | STATUS ----------------- | -------------------- | -------------------- | ------------ SALES fact load2 | 02/06/10 10:33:33 | 02/06/10 10:33:33 | SUSPENDED SALES fact load | 02/06/10 10:29:03 | 02/06/10 10:29:03 | SUSPENDED 2 rows selected. Elapsed: 00:00:00.07 SQL> select name, sql_text from dba_resumable; NAME | SQL_TEXT ----------------- | ----------------------------------------------- SALES fact load2 | insert into target.sales select * from sh.sales SALES fact load | insert into target.sales select * from sh.sales 2 rows selected. SQL> select name, error_msg from dba_resumable; NAME | ERROR_MSG ----------------- | ------------------------------------------------------------------------ SALES fact load2 | ORA-01653: unable to extend table TARGET.SALES by 8 in tablespace TARGET SALES fact load | ORA-01653: unable to extend table TARGET.SALES by 8 in tablespace TARGET 2 rows selected. SQL>
The Oracle Database also publishes server alerts concerning suspended transactions using the Server-Generated Alerts infrastructure. This infrastructure uses the AWR toolset, the server package DBMS_SERVER_ALERT for getting and setting metric threshholds, and the queue table ALERT_QUE to hold alerts that have been published from AWR. Custom processes could be written to mine ALERT_QUE for these alerts, but the easiest way to configure and view server alerts is using Oracle Enterprise Manager (OEM). On the Alerts section of the main OEM page, we can see three different alerts generated by the Oracle Database:
If we click on the “Session Suspended” link, we can see the multiple alerts generated in this category:
Another alert generated indirectly by the suspended transaction is the “Configuration” class event caused by our session “waiting” to proceed. The Oracle wait event interface can show us information about the suspend waits on the system:
SQL> SELECT event, 2 SUM(time_waited) time_waited, 3 SUM(total_waits) total_waits, 4 AVG(average_wait) average_wait 5 FROM gv$session_event 6 WHERE lower(event) LIKE '%suspend%' 7 GROUP BY event 8 ORDER BY time_waited ASC 9 / EVENT | TIME_WAITED | TOTAL_WAITS | AVERAGE_WAIT ---------------------------------------------- | ----------- | ----------- | ------------ statement suspended, wait error to be cleared | 305373 | 1377 | 221.78 1 row selected. SQL>
To free up the space issue, I’ll enable autoextend on the TARGET tablespace. Then, I’ll take a look and see if anything has changed:
SQL> alter database datafile '/oracle/oradata/bidw1/target01.dbf' 2 autoextend on next 10M maxsize 1000M; Database altered. SQL> select status, resume_time, name from dba_resumable; STATUS | RESUME_TIME | NAME ------------ | -------------------- | ----------------- NORMAL | 02/06/10 10:56:49 | SALES fact load2 NORMAL | 02/06/10 10:56:49 | SALES fact load 2 rows selected. SQL>
The Resumable Space Allocation features includes the AFTER SUSPEND trigger, which allows the specification of a system-wide trigger that will fire whenever a transaction is suspended. The typical use for this functionality is alerting as suspended operations don’t write anything to the alert log.
UPDATE: I made a mistake here… suspended transactions do in fact cause entries in the alert log, and so does the RESUME process detailed below.
There are some features in the DBMS_RESUMABLE package that may make sense when writing an AFTER SUSPEND trigger:
SQL> desc dbms_resumable PROCEDURE ABORT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- SESSIONID NUMBER IN FUNCTION GET_SESSION_TIMEOUT RETURNS NUMBER Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- SESSIONID NUMBER IN FUNCTION GET_TIMEOUT RETURNS NUMBER PROCEDURE SET_SESSION_TIMEOUT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- SESSIONID NUMBER IN TIMEOUT NUMBER IN PROCEDURE SET_TIMEOUT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- TIMEOUT NUMBER IN FUNCTION SPACE_ERROR_INFO RETURNS BOOLEAN Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- ERROR_TYPE VARCHAR2 OUT OBJECT_TYPE VARCHAR2 OUT OBJECT_OWNER VARCHAR2 OUT TABLE_SPACE_NAME VARCHAR2 OUT OBJECT_NAME VARCHAR2 OUT SUB_OBJECT_NAME VARCHAR2 OUT SQL>
This package adds functionality for writing custom processes in the AFTER SUSPEND trigger. The SPACE_ERROR_INFO function returns specifics about the table and tablespace affected by the space error. A series of checks could be coded enabling specific actions depending on which objects were affected. A suspended process can be ended prematurely with the ABORT procedure, or more time can be added using the SET_TIMEOUT procedure. I actually had one client explain how she had written an AFTER SUSPEND trigger that compiled information about the tablespace affected so that an “ALTER DATABASE… RESIZE…” command could be issued to add more space to the affected datafile. I didn’t have the heart to tell her that she had basically written a feature that already existed in the database: AUTOEXTEND.
So what are the best practices to take away from this? Quite simply… all ETL mappings and flows, as well as database maintenance processes, should use Resumable Space Allocation, preferably using the NAME clause in conjunction with DBMS_APPLICATION_INFO. Setting a RESUMABLE_TIMEOUT value at the system level can be scary, because a single suspended transaction could cause locks that reverberate all the way through the system. But is this really a concern in a BI/DW environment? Are there any processes in our batch load window or with any of our operational maintenance processes that we wouldn’t want to enable for resumable operations, no matter how many processes back up waiting for them to complete? It could spell bad news if we used any kind of synchronous replication technology to move data to the DW instance, but short of that, I can’t think of any. Please let me know if you have alternative viewpoints.
I’ve never found much reason to use the AFTER SUSPEND trigger though. Data warehouses should have production-type monitoring running already, just like other production systems. OEM is more than satisfactory for basic monitoring and alerting, and with the Server-Generated Alerts introduced in 10g, forms a complete product for Oracle environments. But regardless of which monitoring solution is used, it should be able to issue simple queries against the database and alert based on the results of those queries. A select against the DBA_RESUMABLE table provides all the information required to send out an alert, and with features such as AUTOEXTEND, I just can’t see a requirement for the ability to issue procedural code because a transaction is suspended.
UPDATE: as pointed out above, since suspended transactions do in fact show up in the alert log, this is good news for integrating Resumable Space Allocation into an existing environment. Assuming that there’s proper alert log monitoring with paging functionality already in place, implementing resumable operations can simply use that infrastructure already in place.
Keep your eyes open for the next of the “Three R’s” in BI/DW fault tolerance: Restarting.
SSAS 2008 R2 – Improvements for slow Metadata
Source: Darren Gosbell [MVP] - Random Procrastination [link]
There was a thread last year on the SSAS MSDN forum SSAS 2008 — Why is metadata so slow??? in relation to slow metadata queries on a cube with 250+ measure groups. The good news is that despite the fact that the SSAS team has largely been focussed on PowerPivot for the SQL Server 2008 R2 release, they have also tried to address a few of the customer pain points in SSAS. Below is part of a recent response to this thread from Akshai Mirchandani from the SSAS development team:
“As an update to this discussion, we did implement an improvement to the way this EstimatedSize property works in the upcoming SQL Server 2008 R2 release — the latest CTP includes this change. It would be great to hear from any of you who can try this out to see what the impact is on the performance in real-world situations.
This change does require you to update all your partitions and dimensions in some way — the server will persist the estimated size of those objects at that point and not require scanning their file sizes every time you discover metadata. ProcessUpdate on the dimensions and ProcessClearIndexes+ProcessIndexes on the partitions should do the trick. Without updating the dimension/partition objects things will still work, but without the performance gain…
Additional note: due to these types of changes, you may not be able to take a 2008 R2 database downlevel to a 2008 server.As most of you have probably heard, AS 2008 R2 is primarily targeted at the self-service user with the new PowerPivot solution — but we’ve tried to tackle a few of the pain points for our existing customers along the way. Some examples:
- This discovery of metadata issue
- Processing of a single (typically small) partition in the context of a very large number of metadata objects (e.g. 1000s of partitions) should now be significantly quicker. The algorithm that analyzes dependencies was optimized for this scenario.
- Also, R2 now supports calculated members in subselects — and the new Excel 2010 pivot tables will let you make use of that feature.”
If are having issues with slow metadata for a large database/cube then it would probably be worth giving the November CTP of SQL 2008 R2 a try. If you have any feedback for this issue feel free to post a comment either here or on the original forum thread.
#5. Release of Hyperion System 9
Source: Look Smarter Than You Are [link]
Pick one from each column: A CFO’s post-conference to-do list
Source: The sascom magazine blog [link]
When I attend a conference as a participant, one constructive practice I have acquired is to, immediately after completing the post-conference evaluation form, create my own personal “take away” worksheet. Nothing complicated, just two columns which can be labeled left/right, debit/credit, sooner/later, or plain/peanut. Don’t attempt to create this worksheet until after you’ve seen all the presentations and participated in all the discussions, workshops and roundtables - there is too much of a tendency to get overly excited about what you are immersed in at the moment; wait until you have gained some perspective.
The left hand column consists of those things that you’ve learned you are doing wrong and need to stop doing, or need to replace with an easy to implement change/alternative, and those things you feel you need to change or improve that can be accomplished without a project plan.
The right hand column consists of those areas where you want to improve, enhance or make major process or cultural changes that might, and probably will, require a project plan. Three to five items in each column, and once completed, circle ONE, and only one, from each column, for immediate action. This circled item is the one to which you will commit yourself to accomplishing no matter what; even if most or many of the others turn out to be impractical or inapplicable, you WILL commit to one short term and one medium term result. Lastly, write the words “WATCH IT!” underneath the list (to be explained).
In my participant ‘role’ as head of finance at the CFO Corporate Performance Management Conference held earlier this week in New York, this is what my final list looked like:
Continue reading “Pick one from each column: A CFO’s post-conference to-do list”
Crosstabs en Pentaho Reporting
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
EPM Connection New Jersey - Agenda and Single Day Rate
Source: Look Smarter Than You Are [link]

Speaking of the second day, John O’Rourke kicks off with another keynote at 8:45 on what’s coming in Oracle EPM in 2010. There are then three more breakout sessions until 1PM. There are a couple of general session things happening from 1-2:30PM before the event wraps up, but I honestly can’t tell at the moment what’s really going on during that 90 minute period.
Advanced Find and Search in OWB
Source: Oracle Warehouse Builder (OWB) Weblog [link]
Oracle - Best Ability to Execute Among EPM Suites
Source: Look Smarter Than You Are [link]
Gartner has released their annual “Magic Quadrant” report on CPM (Corporate Performance Management also known as EPM) suites. To the shock of almost no one, there are only three vendors in the “Leaders” quadrant which means they have high ability to execute and completeness of vision. Those three are IBM (with the Cognos products), SAP (with the Business Objects and OutlookSoft products), and of course, Oracle (with the Hyperion products).
…There has not been any major advancement in product vision during the same time frame, and Oracle needs to show evidence of thought leadership in CPM to maintain its position relative to other megavendors and some of the CPM specialists. Its IP functionality and planned financial-close functionality partly address this.
Gartner Magic Quadrant for BI Platforms 2010
Source: Oracle Business Intelligence Blog [link]
Analyst firm Gartner published its Magic Quadrant for Business Intelligence Platforms for 2010 (Gartner RAS Core Research Note G00173700) last week. The full report can be accessed on Gartner’s web site here.
Some salient points from the research note:
Oracle has been positioned in the Leaders Quadrant.
Oracle has established the Oracle BI Enterprise Edition (OBIEE) platform as the “BI standard
Evento innosfera sobre Open Innovation en Centro Cultural Blanquerna
Source: Information Management [link]
A
SPModule and Sharepoint 2010; the power of the PowerShell
Source: Laura Gibbons Scorecard Application & Six Sigma Blog [link]
Check out my Google sidewiki post here: http://www.google.com/sidewiki/entry/flowergirldujour/id/Swp7WYEPjDdbMeaYQkPn8Hir4MY
And definitely download this module so that you can use SPModule…
Here is what the interface looks like:
From Zach Rosenfield’s blog, who is the Sharepoint Program Manager at Microsoft:
Welcome to the introduction of SPModule. SPModule is a Windows PowerShell module written by members of the SharePoint Product Group in our spare time. SPModule is an example of how we would envision you accomplish various common tasks within Windows PowerShell in a SharePoint 2010 environment. We hope to position various best practices from these scripts and we hope in the long term to reference these also within technet. These blog posts serve simply as our first location of sharing them, and this post will be updated once we have the samples hosted within technet. The scripts themselves are not officially supported, but we will entertain questions and suggestions through this blog until we get it onto technet.
How do I get started?
First download the zip that contains the scripts from here:
http://sharepoint.microsoft.com/blogs/zach/Script%20Library/Modules/SPModule/SPModule.zip
Next, unpack the zip file onto a share in your environment. Before you get to use the scripts, you’ll need to make a decision around signing. By default, Windows PowerShell is configured securely such that it will not run unsigned scripts. You can choose to either sign them yourself with a self-signed certificate or run Windows PowerShell in a mode where you do not verify signatures. We do not recommend running Windows PowerShell in this state. However if you are in an isolated environment, you may choose to do so. If you follow my last post about signing files, you can use those instructions to sign the entire “Module” in a single command:
function Add-Signing($file){ Set-AuthenticodeSignature $file @(Get-ChildItem cert:\CurrentUser\My -codesigning)[0] }
ls -r -Include ("*.ps1","*.psd1","*.psm1") |%{ Add-Signing $_ }
Please note that if you have not installed SharePoint, then you need to lower the Execution Policy to “AllSigned” using this command: Set-ExecutionPolicy AllSigned. This is done by installing the SharePoint bits so if you’ve already installed this is not needed.
Then open Windows PowerShell as an administrator (right click on the link and select “Run as administrator”). If you already have SharePoint 2010 installed, you could use the SharePoint 2010 Management Shell instead. Once the window opens, the first thing we need to do is add the path to the module to your Windows PowerShell module path (presuming you created a folder called “SPModule” on your server):
$env:PSModulePath = “C:\SPModule;” + $env:PSModulePath
Next we need to import the modules:
Import-Module SPModule.misc
Import-Module SPModule.setup
When you import the SPModule.misc module, you will invoke a update check. In 1.0, this will check a file in the script library above to see if there is a newer version available. If you are notified that there is, you can go to that location and download the newer version. Once the Import-Module commands are done, you’re ready to use SPModule.
So, what does SPModule give me?
The 1.0 version of SPModule provides a few major new commands and a number of smaller supporting commands. Here’s how you can get the list of commands in the module:
Get-Command –Module SPModule.*
The major commands of 1.0 are Install-SharePoint, New-SharePointFarm, Join-SharePointFarm, and Backup-Logs. They do exactly what their names would lead you to expect (Backup-Logs collects all the logs on the local machine not the whole farm). The rest are for more advanced scenarios or are used by these larger functions—please be careful using commands you don’t understand Here’s some quick examples to get you started:
Install SharePoint Bits (including Prereqs) on a
Install-SharePoint -SetupExePath “\\servername\SharePoint2010-Beta\setup.exe” -PIDKey “PKXTJ-DCM9D-6MM3V-G86P8-MJ8CY”
New-SharePointFarm –DatabaseAccessAccount (Get-Credential DOMAIN\username) –DatabaseServer “SQL01” –FarmName “TestFarm”
Join-SharePointFarm -DatabaseServer “SQL01” -ConfigurationDatabaseName “TestFarm_SharePoint_Configuration_Database”
Backup-Logs -outp “$env:userprofile\Desktop\SharePointLogs.zip”
Note: Backup-Logs may have trouble putting the subzip files into the final zip. We are aware of this issue and are working on this for the next release. For now, we will detect the situation and keep the subzip files that had a problem”
Oracle BI EE 10.1.3.4.1 – Modeling Measure Dimensions on Relational Sources – Essbase Style
Source: Rittman Mead Consulting [link]
One of the biggest differences between Essbase and most other reporting/OLAP tools like BI EE/Oracle OLAP etc is in the way Measures are treated(comparison only in terms of data modeling and reporting and not on physical implementation). In Essbase, every data/value(cell) is attributed by a set of dimensions. Even a measure is considered as an attribute of another dimension. But in the case of Oracle OLAP, BI EE etc, one requires to specifically identify a set of measures and load/report against them. The problem with this approach is the fact that in many cases while doing the reporting we would like to treat the measures like a normal dimension and vice versa. Just to illustrate this point, the most commonly used component in BI EE Answers is the Pivot View component. A Pivot View basically converts a set of dimension attributes as a set of measures(typical analysis) by spreading all the dimension attributes in the column section.

Similarly, in many cases, we want to treat the measures as a set of dimension attributes as shown below

Again pivot table component can be used to achieve this. Both SQL and MDX can swap the attributes to Columns and Rows pretty efficiently(basically one can manually handcraft SUM, FIRST etc kind of aggregation on any valid column). But once we designate a set of static measures within the reporting tool, the model can become constrained to an extent. Of course, there are ways around it like using Pivot tables, Answers based aggregation, UNIONs etc. The main drawback with a measure based model is the fact that one cannot efficiently drill on Measure Hierarchies. The other drawback is since our database related data structures are more oriented towards static measure modeling (facts and dimensions), it is very difficult to switch the measure hierarchy for reporting.
For example, if you had gone through or attended our training days event last year, we had shown how to go about switching measure hierarchies in BI EE for an Essbase source. Its a very common requirement when you have Essbase as a data source. Many users might choose Accounts dimension as a measure dimension for reporting. Some users might choose Scenario as the measure dimension for reporting. Essbase provides that flexibility and the same can be leveraged from BI EE as well. But in the case of relational sources, it can be quite difficult to achieve this. There are a couple of reasons for this
1. If we model a separate measure dimension in our warehouse, the fact tables will contain only one measure. The size of the fact table will grow multiplied by the number of measures in the measure dimension.
2. Its not practical(& to an extent does not make sense) to model it this way for relational data models at the database level.
Though we might stick with a fact/dimensional model in the data warehouse, i think the reporting tools should provide the capability to switch the measure dimensions even for relational sources(PIVOT and UNPIVOT operations should be supported from the RPD). Currently we do not have this capability in BI EE (from a repository standpoint). Though BI EE does not support this currently, the BI Server repository can be tweaked to an extent to achieve this using Conforming Dimensions. Lets look at a couple of scenarios. One is modeling the measure dimension for relational sources and the other is in switching the measure dimension itself.
Measure Dimension:
Lets understand this through our usual example SH schema that gets shipped with oracle database. Most common way of designing a repository for this SH schema is given below

Its a very simple repository containing a Sales measure, a cost measure and all the dimensions. If you look at this, UNIT_COST and AMOUNT_SOLD are designated as measures(which is how it exists in the physical database). Lets assume that a requirement is to treat both these measures as attributes of a measure column i.e we want to be in a position to treat both the measures as part of a single dimension. The report below basically explains what is needed

The above report has a column called Measure which will basically un-pivot the measure values. To achieve this, we start with creating a new custom dimension within the RPD called as Measure Dim. This will basically have one column called as Measure which will be fed through 2 logical table sources (one is Sales and the other is Costs). This is to signify that each LTS will be used to serve each measure



Then we create 2 custom fact tables (one for sales measure and the other for cost measure) and then force the conforming dimension join as shown below

To combine both the measures through a common single fact, we create another logical fact table with one logical column. This will be fed through a logical calculation of UNIT_COST and AMOUNT_SOLD as shown below


This will provide an extra dimension for your reporting (measure dimension). Remember, this will work as long as you have measures from different physical tables. If you have measures from the same physical table then you will have to alias the table as many times as the number of measures (which is not recommended). Also, currently we cannot make BI EE to push the PIVOT and UNPIVOT operators directly from the BMM.

Swapping Measure Dimensions:
The next requirement is, now that we have converted the measures as a set of attributes derived from a column, to convert each member of the CHANNEL_CLASS column as a measure i.e we want 3 measures Direct, Indirect and Others. To do this, replicate the measure in each fact table 3 times. Basically each measure will have a FILTER function applied to filter only on the necessary attributes. For example, the Sales and Cost fact tables are shown below

The Direct measure will have a FILTER function applied as shown below

Similar filters will be applied for Indirect and Others measures as well for both the fact tables. Then in the main fact table, we need to combine the individual measures again.

This will allow you to create a report containing the actual measures as Dimension attributes and the actual dimension attributes as measures (all of this in a normal table view)

If you look at the SQL, you will notice that there will be conforming dimension join done on the Amount Sold and Unit Price attributes and then they are combined using a main query
WITH
SAWITH0 AS (select sum(case when T1776.CHANNEL_CLASS = 'Direct' then T1929.AMOUNT_SOLD end ) as c1,
sum(case when T1776.CHANNEL_CLASS = 'Indirect' then T1929.AMOUNT_SOLD end ) as c2,
sum(case when T1776.CHANNEL_CLASS = 'Others' then T1929.AMOUNT_SOLD end ) as c3,
'Amount Sold' as c4,
T1916.PROMO_NAME as c5
from
PROMOTIONS T1916,
CHANNELS T1776,
SALES T1929
where ( T1776.CHANNEL_ID = T1929.CHANNEL_ID and T1916.PROMO_ID = T1929.PROMO_ID and (T1776.CHANNEL_CLASS in ('Direct', 'Indirect', 'Others')) )
group by T1916.PROMO_NAME),
SAWITH1 AS (select sum(case when T1776.CHANNEL_CLASS = 'Direct' then T1784.UNIT_COST end ) as c1,
sum(case when T1776.CHANNEL_CLASS = 'Indirect' then T1784.UNIT_COST end ) as c2,
sum(case when T1776.CHANNEL_CLASS = 'Others' then T1784.UNIT_COST end ) as c3,
'Unit Price' as c4,
T1916.PROMO_NAME as c5
from
PROMOTIONS T1916,
CHANNELS T1776,
COSTS T1784
where ( T1776.CHANNEL_ID = T1784.CHANNEL_ID and T1784.PROMO_ID = T1916.PROMO_ID and (T1776.CHANNEL_CLASS in ('Direct', 'Indirect', 'Others')) )
group by T1916.PROMO_NAME)
select distinct case when SAWITH0.c4 is not null then SAWITH0.c4 when SAWITH1.c4 is not null then SAWITH1.c4 end as c1,
case when SAWITH1.c5 is not null then SAWITH1.c5 when SAWITH0.c5 is not null then SAWITH0.c5 end as c2,
nvl(SAWITH1.c1 , 0) + nvl(SAWITH0.c1 , 0) as c3,
nvl(SAWITH1.c2 , 0) + nvl(SAWITH0.c2 , 0) as c4,
nvl(SAWITH1.c3 , 0) + nvl(SAWITH0.c3 , 0) as c5
from
SAWITH0 full outer join SAWITH1 On SAWITH0.c4 = SAWITH1.c4 and SAWITH0.c5 = SAWITH1.c5
order by c1, c2
Though this will work in most cases, there are some drawbacks with this approach
1. The SQLs generated will not be the best performing ones. As the number of measures grow, the number of conforming dimension joins will increase thereby producing poor performance. Unfortunately BI EE cannot push the PIVOT/UNPIVOT operators supported from Oracle 11g
2. If there are a lot of measures from a single fact table then this requires multiple fact table aliases to force the conforming dimension join across all dimension sources (For the measure dim). This might make the RPD un-necessarily big.
You might not find this technique to be of practical use, but this demonstrates the capabilities of the BI Server to model some complex requirements. I am hoping that sometime in the near future we should see more of pure multi-dimensional modeling on relational sources to be supported by BI EE.
Nine optimization lessons from TARGET
Source: The sascom magazine blog [link]
I kept hearing that the Big Ideas session at NRF was standing room only, but until now I hadn’t read a good recap. Retail TouchPoints comes through, describing the guest intelligence presentation from Eric Bibelnieks, group manager for guest data and analytical services at TARGET Corporation.
Pre-conference materials said, “In this session, you’ll learn how Target uses guest intelligence to optimize marketing and merchandising processes – with the ultimate goals of enhancing guests’ shopping experiences, driving store performance and creating brand cachet that’s the envy of the industry.”
Read the Retail TouchPoints article to find more details about these nine lessons TARGET learned along the way about optimization:
- Analytics.
- Data Quality & Standards.
- Financial Resources.
- Clearly Defined Roles.
- Training.
- Hire Analytical, Reporting and Data Analysis Talent.
- Prototype, Learn then Develop.
- Test, Learn then Scale.
- Balance the Art and Science of Retail.
Looking for more tidbits from NRF? The Big Idea blog ran a great post about the business analytics talk with Chris McCann and Tom Davenport, titled ion.
Gartner Magic Quadrant 20101T de Business Intelligence
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
Creating Options
Source: Frank Buytendijk Blog [link]
Buena comparativa entre pentaho, jasper y palo
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
OLAP HeartBeat, para analizar Analysis Services, gratis
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
OData and Microsoft BI
Source: Chris Webb's BI Blog [link]
I first came across OData last year when Jamie Thomson blogged about it; since then I hadn’t really thought about it much until I came across Douglas Purdy’s blog and specifically his post from yesterday which really brought home how important it is to the future of Microsoft BI. I would urge you to watch the video “OData: The Movie” that he mentions in his post because it gives a really good introduction to the way that OData can be used in BI scenarios; if you don’t have the time, all you really need to know is that it’s OData that makes it possible for PowerPivot to consume data from SSRS reports and Sharepoint lists.
Just watching this video made my mind boggle with the possibilities of OData, although since I’m a long way from being an expert in this area I won’t bother to detail all of these fantasies as they’re probably rubbish. However, just consider how much easier life would be for the PowerPivot user of the future if the internet was full of sites that supported OData; certainly, when I’ve looked at the new UK government websites such as the recently-launched http://data.gov.uk/ and http://data.london.gov.uk/, or even sites like the Guardian Data Store that I blogged about last year, I’ve felt that the lack of a standard format to consume this data (a badly-formatted Google spreadsheet is not ideal) has seriously limited these sites’ usefulness. And what if Bing came up with a service like the sadly-useless Google Squared, where you could search for data on the web and return it in a structured, OData format?
I’d also love to see Analysis Services support OData too, in some shape or form. Perhaps it could be used to solve some of the same problems with XMLA that Julian Hyde lists in his recent post on xmla4js. Wouldn’t it also be cool if there was a standard interface for publishing an MDX query up to SSAS from any client tool, and then be able to consume it via OData (similar to what I suggested here)? You’d then enable scenarios like this: power user creates query using an advanced SSAS client tool, then publishes it up to SSAS whereupon it becomes something similar to a SQL stored proc with the results available as an OData feed, and therefore can be consumed either in your own code (eg in a website), by PowerPivot users, or by something like SSRS (which would then know nothing about the query used but would just be used to format the resultset).
Bread-and-Butter: Classic OWB Implementations in the News
Source: Oracle Warehouse Builder (OWB) Weblog [link]

