SAS executives not afraid to sweat
Source: The sascom magazine blog [link]
Contributed by Beverly Brown

It’s fashionable these days for executives to say they want their employees to be healthy. Employer-sponsored wellness initiatives are popular, and it’s easy to understand why, given the bottom-line benefits of reducing health-care costs. But how many executives actually work out alongside their workers?
At SAS, where employee wellness has long been a corporate priority, that’s become a more common sight of late. For example:
Continue reading “SAS executives not afraid to sweat”
PASSMN June 15, 2010 Monthly Meeting
Source: Dan English's BI Blog [link]
The next Minnesota SQL Server User Group meeting is coming up in a couple of weeks on Tuesday, June 15. This month Lara Rubbelke will be presenting on the new features in SQL Server Reporting Services 2008 R2 – Report Builder 3.0, new visualizations (map report item, sparklines, data bars, indicators), new shared features (report parts and datasets), new data sources, new rendering enhancements, and more. If you have any questions related to SSRS that you would like to have discussed and covered in the discussion please send them in advance to support@mnssug.org.
In order to RSVP for the event you will need to login to the national PASS site and click on the RSVP link. This month our sponsor for the event is Magenic again. Last month there was a mix-up with the food and beverage order the Papa John’s actually got sent to Davannis.
PASSMN May 2010 Meeting - SSAS Design Best Practices & Incremental Processing
- Meeting sponsor: Magenic
- Location: 8300 Norman Center Drive, 9th Floor, Bloomington, MN 55437
- Please RSVP here for the event to assist in making sure there is enough snacks and beverages ordered.
- Live Meeting URL: https://www.livemeeting.com/cc/usergroups/join?id=MC6Q87&role=attend&pw=D4%28%3Eq3%3AWJ (to get the audio you will need to download and install the full Live Meeting client)
3:00 –3:15 : Announcements
3:15 –3:25 : Sponsor
3:25 –4:45 : SSRS 2008 R2 (Lara)
4:45 –5:00 : Survey Collection and Giveaways
5:00 –5:30 : Networking
Presentation:
What’s New in SQL Server 2008 R2 Reporting Services (presented by Lara Rubbelke) - Amp up your Reporting Services with the many great enhancements in SQL Server 2008 R2. Lara will take you on a tour of all that is new in R2 including collaboration features, new visualizations, rendering enhancements, and new data sources. After this session you will wonder why they call R2 an "interim" release!
Hope to see you there!
Also, last month we were finally able to get our first successful Live Meeting recording - PASSMN May 2010 Follow-up. And this week the Microsoft BI Conference and TechEd is going on down in New Orleans, so there should be a lot of content and information being posted this week through Twitter (#msbic and #teched – Twitter handles) and Blogs.
Videos about using and installing STpivot
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
Informacion util para empresas tecnologicas
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
SAS BI Clients 4.3 and SAS 9.3
Source: Blogging about all things SAS [link]
I caught up with James Foster from SAS Australia last week and got a bit of an insight into where the SAS products are going in the next few months.
At the moment we are at SAS 9.2 M3 for the SAS Platform (eBI, eDI, EM etc).
Later this year (Mid Q3 ish?) a new set of the BI Clients will be released under the 4.3 banner (so Enterprise Guide 4.3, Web Report Studio 4.3 etc). These clients will bring massive enhancements to the BI client functionality.
Two that I have already mentioned are the enhancements Chris has outlined will be coming in Enterprise Guide 4.3 here and the new BI Dashboards James mentioned at SUNZ earlier this year.
To leverage this new functionality you will need to be on SAS 9.2 M3, then install the desktop/developments clients and/or redeploy the new Web App versions.
James also mentioned that SAS 9.3 is currently under development as the next major release and maybe be out sometime next year.
Its great to see we don’t have to wait another 5 years (aka SAS 9.1 –> SAS 9.2) before SAS delivers new capability. I think given the BI competitors SAS (aka Oracle, SAP) have in the market the speed of development is going to be key.
Bring it on!
Thoughts on Analysis Services in the Cloud
Source: Chris Webb's BI Blog [link]
So far this year I’ve indulged myself a few times in a bit of futurology (here and here, for example) regarding directions the Microsoft BI stack might take. The one area I haven’t touched on recently, though, is what Analysis Services in the cloud might look like; I did speculate a bit here but that was a while ago now and before several relevant technologies had been announced. It’s certainly coming, and presumably somebody somewhere is working on it right now in some top-secret bunker in Redmond, so maybe a few public comments on what we the user community would want from it would be helpful…? Anyway, welcome or not, here are some thoughts…
So why would you want or need Analysis Services in the cloud rather than regular Analysis Services? I can think of a few things it should be able to do to justify its existence:
- It should be cheaper than hosting all the infrastructure in-house and it should be scalable to the Nth degree. OK, so these are the standard reasons dragged out for cloud-based anything, but with Analysis Services there are two obvious times when resource usage peaks - processing and when a big/complex query runs – and equally there are times when the server can be completely quiet; so the idea of being able to make use of near infinite resources when you need them to make the processing/querying super-quick, but only pay for what you use, is very appealing. From this point it follows that when you need to use extra resources, you need a platform that can scale to be able to make use of those resources.
- As well as being able to work with ‘traditional’ data sources such as your corporate data warehouse, it should be able to work with cloud-based data sources be they relational or non-relational (like Amazon SimpleDB, Google’s recently-announced BigQuery, Azure Table Storage and all the rest), feeds (like OData or GData), linked data (RDF), web-based spreadsheets like Google Docs or the Excel web app, or completely unstructured data from anywhere on the web (maybe something like how Google Squared works). Supporting the integration of data modelled for all of these different types of database would be a challenge but I think it should be possible.
- It should be available as a data source for anywhere else on the web – your own apps, reports, web-based spreadsheets and so on – as well as desktop apps like Excel. The really important thing, for me at least, would be for it to expose an XMLA interface to allow ad-hoc querying (note that Excel can’t talk direct to an XMLA provider, it only does OLEDB for OLAP, but it’s possible to bridge the two and Simba already sell an OLEDB provider that does this); grudgingly, I’ll admit a SQL query interface would be useful too. The ability to expose data via an OData feed would be a must as well.
- I’d also like to see it support some basic ETL functionality too, because if it is as scalable and fast as I’d like to be then it would have an obvious secondary use for large-scale number crunching – aggregating data, doing lookups, sorting, many of the things that you might do today in the SSIS data flow or which you might look at Hadoop to do. Derived columns and lookups could all be done with DAX or MDX calculations; pivoting, sorting and filtering could all be done (and configured very easily with a good client tool) through the right MDX query. I can imagine it acting as a datasource for itself: you’d load data into a cube or a table or whatever, create a query on top of it which is made available as a feed, then take the data from that feed and load it into another cube/table, and so on.
- Following on from the last two points, it’s not enough to be able to act as a data source, Microsoft would need to come up with a decent web-based client tool specifically for use with it. And no, vanilla pivot tables on the web wouldn’t cut it, nor would SSRS in the cloud (not that that I wouldn’t want that) – you’d need to have the wow factor that something like Live Labs Pivot has as well as serious, power-user functionality like the Proclarity desktop client; it would probably need to be built using Silverlight or HTML5. I still think there’s an opportunity to rethink what a client tool could be here, blur the line between BI client tool, spreadsheet and database and come up with something really new.
n’t expect to get, a recognisable version of Analysis Services 2008 in the cloud in the way that SQL Azure is recognisable as server-based SQL Server. While I still see an important role for Analysis Services as we have it today as in corporate BI scenarios I don’t think there’s any point transferring it to the cloud with exactly the same functionality. Some things, like dimension security, would still be needed, but some things, like cell security, we could probably live without. PowerPivot in the cloud would make more sense as a starting point, so long as it was not just a straight copy of PowerPivot on the desktop: the ability to scale to really, really large data volumes, as in the first bullet above, would be the key feature and the only real reason why customers would want BI in the cloud. And it’s not just the scalability of simple queries either – queries that use complex calculations would need to scale too. You know what, though? When I look at DAX I can’t help but think it was designed with this requirement in mind; I can see how the evaluation of DAX expressions could be easily parallelised.
So all this sounds pretty ambitious, even a bit pie-in-the-sky. The way I see it, though, as far as BI-in-the-cloud goes there’s everything still to play for and if Microsoft doesn’t deliver then someone else will. Could it be Google, or Amazon, or some startup we’ve never heard of? Now that Google BigQuery has a SQL interface, it’s only going to be a matter of time before someone builds a BI app on top of it (I wonder if Mondrian can be made to work with it?) – and it certainly seems to be fast. Microsoft needs to think beyond using BI to defend the Excel/Sharepoint franchise and start thinking about the future! With ten years of BI experience behind it, Microsoft should be in a strong position to move forward into the cloud but it’s only going to succeed if it’s innovative. I understand there will be some new PowerPivot-related product announcements at the BI Conference this week; I’m keeping my fingers crossed.
As always, your comments and ideas are welcome…
ODTUG Kaleidoscope Scores John Heffron (and other social events)
Source: Look Smarter Than You Are [link]
There’s a ton of new social news for ODTUG Kaleidoscope this year, but let’s not bury the lead:
I don’t know about you, but after the nature-assisted disaster that was Collaborate’s “Beach Party Except In a Ballroom” event, I’m glad to see someone step up and spend some money on some A-list talent. John Heffron, winner of the second season of Last Comic Standing (and in my opinion, the funniest winner by far with only Alonzo Bodden coming close), will be the featured act at Wednesday night’s closing celebration.
|
|
We’ll be rocking while talking…Traditionally, opening receptions at conferences have string quartets or jazz trios soothing you to sleep with every note. ODTUG Kaleidoscope 2010 will bring you non-stop rock brought to you by… you. That’s right, we have two Rock Band PS3 setups on stages high above the reception crowd and the exhibit hall floor. Participants will be able to choose from hundreds of downloaded songs from Rock Band and Rock Band 2 as bands, made up of Kaleidoscope attendees, go head-to-head on drums, guitar, bass, and vocals. We only have one rule: the vocalist picks the song, but beyond that, anything goes as long as it won’t get us arrested.
Rockers wearing appropriate concert attire (still have your ’87 Guns ‘N Roses black tee lying around?) will be given preference as we form lines to rock out in front of God, country, and your fellow K’Scopers. Prizes will be awarded to the best performers, which are far more about impressing the crowd than it is playing the right notes, since we’ll have “No Fail Mode” turned on. Come one and all as you wow us with your rock and roll acumen, and yes, our amps do go all the way to 11. If you wish to sign up for your place in rock history ahead of others getting talked into it on-site, e-mail events@odtug.com. We who are about to rock, salute you.
Resources for developing analytic talent
Source: The sascom magazine blog [link]
In a recent Accenture survey of senior managers in the United States and the United Kingdom and Ireland, 51 percent of the respondents said their organizations have “more opportunities to use analytics to improve the business than we have analytical resources to address them.”
I heard this repeated again and again at the SAS Global Forum executive conference in Seattle, especially in the CIO presentations. One presenter, a director of analytical services at a large retail chain said, “There are never enough resources - money, people or data. Deal with it.”
But how do you deal with it? The presenter suggested training, hiring quality talent and a lot of testing and learning. I’ve tried to compile a good list of resources and articles here that offer advice on how to hire, train from within or start assembling an analytics center of excellence. If you have additional tips or articles to suggest, let me know.
Continue reading “Resources for developing analytic talent”
Gartner 2010 Business Intelligence Tools Magic Quadrant
Source: Laura Gibbons Scorecard Application & Six Sigma Blog [link]
For those of you who prefer not to register to receive this information, here is the 2010 Gartner Magic Quadrant rating the latest and greatest BI Platforms.
I love how many of the pure play newbies of last season like QlikTech moved from visionaries into the challengers role giving the big dogs on campus, MS, SAP/Business Objects and Oracle a run for their money. And while, value can be shown easily using a product which can consume and spit out dashboards as easily as making scrambled eggs in the morning, one has to wonder how much value it provides over time when the data to support such dashboards often still requires much manual intervention, ie. acquisition from source systems, cleansing, transformation and loading into a consumable format. Where’s the ROI in that? Most systems boast on the time savings achieved with implementation when calculating a BI system’s ROI.
But, not to knock them. I find them a great alternative for proof of concept work or when the manual nature of compiling the data isnt a concern or is someone’s role, and all that is needed is the icing to tell the cunning story (“Once upon a time, there was a SKU…And this SKU had many family members who lived in different houses in different regions of the world”)
Aah yes, if only all BI could be told as such a happy little anecdote of a story…A girl can wish can’t she?
Download Gartner 2010 Report Click Here
Report is also available in my SkyDrive library.
The ‘Function XXXX only works with contiguous date selections’ Error
Source: Chris Webb's BI Blog [link]
In my last post on DAX, I mentioned I’d come across the following error when running a query in PowerPivot:
ERROR - CALCULATION ABORTED: MdxScript(Sandbox) (22, 4) Function ‘DATEADD’ only works with contiguous date selections.
The DAX expression I was using was as follows:
=CALCULATE(SUM(FactInternetSales[SalesAmount])
, DATEADD(DimDate[FullDateAlternateKey], -1, YEAR)
, ALL(DimDate))
And the error was happening when I ran queries with Dates on either rows or columns and was using a slicer that had multiple years selected but only a few months, for example like this:
Clearly I had a non-contiguous date selection: I was expecting to see all the dates in January and February 2003 on rows, followed by all the dates in January and February 2004. But I couldn’t understand why I was getting an error on what, to me, seemed like a perfectly reasonable query!
So I asked my friends at Microsoft and Marius Dumitru explained that this wasn’t really an error, it was actually a deliberate feature designed to stop users running queries where calculations would return misleading results, or where the semantics of what was actually happening would be hard to understand. I can understand the motivation for doing this – for example, consider what would happen in the scenario above if there was no error, and I didn’t have dates on rows or columns – no time intelligence calculation would return any meaningful values here. But I still maintain that my scenario above should return values, because in this case the slicers are only serving to filter which dates are being displayed on a visible axis, and when the dates themselves are visible on an axis then it makes sense to see the result of the calculation.
Time to open a Connect, then, in the hope that this scenario will be allowed in a future version. Please vote here:
https://connect.microsoft.com/SQLServer/feedback/details/565032/loosen-restrictions-on-contiguous-date-selections-error
Eight data prep lessons for advanced analytics
Source: The sascom magazine blog [link]
Are you using advanced analytics to discover relationships and anticipate the future? According to TDWI, only 38 percent of organizations already are but 85 percent plan to do so in the next three years.
If you’re among that 85 percent, be sure to read and follow these data prep lessons before you get started:
- Scale up data integration to handle large analytic data volumes.
- Realize that reporting and analytics have different purposes and needs.
- Distinguish between data warehouses, data marts, and analytic databases.
- Design a data warehouse architecture that accommodates analytics.
- Prepare data to meet the needs of the analytic method you’ve chosen.
- Preserve analytic data’s rich details, because they enable discovery.
- Improve data after working with it, not before.
- Apply the products of advanced analytics to BI and DW activities.
The list above and the broader paper from TDWI (sponsored by SAS and Netezza) is billed as klist, but I’ve pulled the first step - use advanced analytics to discover relationships and anticipate the future - out here as the goal. These same steps were also covered recently in the Analytical Data Prep 101 Webcast.
The themes on the list are similar to what I’ve been hearing from executives and technology experts here at SAS as I’ve interviewed them for a recent CIO-focused publication that I’m editing.
In particular, #3 above - distinguish between data warehouses, data marts, and analytic databases - caught my attention, and the table on page 4 of the report answered some questions I had recently about when you might want to use an analytic database instead of a full-scale data warehouse. Read the full report on data requirements for advanced analytics. You might find answers to the same - or other - questions you’ve been asking as well.
PASSMN May 2010 Follow-up
Source: Dan English's BI Blog [link]
This past month I presented at our local Minnesota SQL Server User Group. I was a tad bit exhausted from presenting at TechFuse 2010 and the SQL Server 2008 R2 launch event here in Minneapolis earlier in the month. It was tough when each presentation was on a different topic and I filled in for the launch event and had less than a week to absorb the content. I didn’t plan for all of these to occur one week after the other either, just kind of happened. Needless to say I was a little burnt out, so hopefully my
content was good even though I probably sounded a bit less than enthused, which I apologize for. I also want to apologize for the food. The Papa Johns pizza was mistaken delivered to Davannis….I wish I could have been there to see that one.
For the evaluations I received the following scores (on a five point scale):
- Effectiveness of the demonstrations - 3.7
- Usefulness of the material presented - 4.0
- Speaker’s presentation skills - 3.5
- Correct blend of presentations and demonstrations - 4.0
- Overall Presentation Rating - 3.8
I have been holding out on posting the content because this was the first month we got an actual recording of the meeting. I have been working on converting it and making it available to view. My talk was on SSAS Design Best Practices and Incremental Processing. I highlighted some of the built-in items in BIDS along with BIDS Helper, did a few demos, and wrapped it up with a little PowerPivot comparison.
PASSMN May 2010 - SSAS Design & Incremental Processing from Dan English on Vimeo.
I wanted to provide some follow-up reference links that goes over some of my content or highlights items that I discussed:
- SQL Server Best Practices
- Designing Effective Aggregation with Analysis Services (video – Chris Webb)
- To Partition or Not – This is the Question
- In Search of Aggregations
- Project REAL: Data Lifecycle – Partitioning
- SSAS - Speed up dimensions using a NULL default cube measure
- Excel 2007 Document: Designing SQL Server 2005 Analysis Services Cubes for Excel 2007 PivotTables
- SQL Server 2005 Report Packs - SQL Server 2005 Integration Services Log Reports
I think I misspoke a couple of times. One time I stated that I don’t usually have the security information for SSAS ahead of time. That isn’t necessarily the case, usually it is a requirement up front. Sometimes though we add it in after the fact and it doesn’t take a lot of work to accommodate that either using SSAS roles. I believe I also mentioned that PowerPivot was row based compression and it is column. It was kind of funny, because the day after my presentation I ran into a blog posting about hierarchies in PowerPivot which I was talking about at the end of my presentation and did a posting in regards to it - Hierarchies in PowerPivot. You still don’t get the drill path like you would in SSAS, but you can get a similar look-and-feel.
The other big thing was the incremental processing setup I had posted about previously - Analysis Services (SSAS) Processing and Aggregations. I had originally set that package up to handle yearly incremental processing and before I published it I changed it to monthly. Well needless to say I forgot to take into account the yearly partitions that might be included and these were getting included in the full processing group instead of being ignored. I have made the change and have created a new version of the packages and replaced the old ones.
Downloads from my SkyDrive (presentation and SSIS package download):
Feel free to leave me any comments if you have any questions or ask me questions at one of the PASSMN monthly meetings.
Comparativa entre Kettle y Talend
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
STpivot, Jpivot with steroids
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
Oracle Financial Analytics Presentations and Photos
Source: Rittman Mead Consulting [link]
Rittman Mead Consulting have recently initiated a series of seminars that combine the companies in-depth technical knowledge with a greater focus on the business perspective. The first of these was held last September and focused on commercial considerations for BI Project Delivery. The second seminar in the series was held in Oracle’s City Office last Wednesday, May 26th and focused on the Oracle Financial Analytics module within the pre-built Business Applications suite.
The audience was made up of both Finance and IT representatives from the worlds of Retail, Finance, Transport, Leisure, Health, Hi-Tec and Local Government.

The underlying theme of the seminar was that recent trends in Operational Finance towards standardisation on single instance ERP systems and the creation of Shared Service Centres have in turn created an opportunity to leverage a pre-built, best practice BI Solution for Operational Finance that will save time and money.
One of the best received presentations of the morning was provided by Terry Gadd who is Head of Finance Systems at LV=. Terry has worked closely with the Senior Management team at LV= who have been responsible for the transformation of the Insurer from the old Liverpool Victoria brand to the new LV=, which has been central to a high profile TV ad campaign. Terry had responsibility for the Finance aspects of the back office transformation. This entailed moving old systems on to a vanilla install of E-Business Suite providing the perfect platform for Oracle Financial Analytics.

Damian Arnold, a Solution Architect at Rittman Mead then provided a full profile of the capabilities of Oracle Financial Analytics along with a live demo. The session prompted some good debate amomg the audience, not least around the roles that should be played by IT and Finance personnel. Mike Vickers, a Rittman Mead Project Manager followed this with an explanation how the solution fits within the wider BI strategy of the enterprise and why it will benefit an organisation to always keep the bigger picture in mind.
Here’s the slides from the event:
- Setting the Scene: An overview of the days agenda and the business and technology challenges & opportunities in the Finance Department
- Life in the real world: How LV transformed the finance department using Oracle Financial Analytics
- Defining an approach: To successfully leverage Financial Analytics a joint approach is required between the Finance Department & IT
- Post Implementation: How the principals of a Business Intelligence Competency Centre apply to Financial Analytics
Please let us know if you are interested in attending future seminars in this series by dropping us an email at info@rittmanmead.com, and we will keep you informed regarding dates and venues.
A man called [Sun] Barry (The Verve, circa 1992)
Source: The sascom magazine blog [link]
A common theme throughout the recent ACORD/LOMA Insurance conference in Las Vegas was that analytics is here to stay.
This was particularly evident in a moderated panel called Using business analytics to improve business performance that included Mark Gorman and Traveler’s Second Vice President Rachel Alt-Simmons. A lively discussion arose from this session. Topics included analytical centers of excellence, data quality, definition of a blended being, and even three-legged stools! Many references and ideas in this session can be found in recent research reports by The Gorman Group and an article by Accenture on Counting on Analytical Talent.
The analytics message was also supported by two respected analysts: Kimberley Harries-Ferrante from Gartner presented a session called Preparing for Insurance 2015 and beyond, which highlighted the need for clean accessible data, as well as, predictive techniques to mine data and help strategy planning. And, Karen Pauli from TowerGroup spoke about the power analytics can bring to the claims process.
Another recurrent event topic was social media and multi-channel distribution. Driving Dynamic Distribution Strategies, a session by Karen Furtado and Mark Breading from Strategy Meets Action Marketing, emphasized how insurance needs to adapt to support the various distribution channels available today, including social media like Facebook and Twitter. Analytics is fundamental to the success of these strategies to make certain insurers can gain insight into their optimal distribution model.
The widespread feeling from the conference was of general optimism for the insurance industry. However, most recognize that improving efficiency is essential for the industry’s future, and analytics is at the core to help insurers achieve this aim. This is good news for SAS.
A lighter, insurance note? The end of May is also witness to one of the crazier sporting events: Cheese Rolling. In this race, a cheese is rolled down a steep hill, followed by the race participants. The first participant down the hill wins the cheese. Did I say the hill was steep? Check out this video. Unfortunately the 2010 event was canceled; perhaps there were too many false Workers’ Comp claims.
Data mining in Excel 2007
Source: Ella Maschiach's BI Blog [link]
Mr. Donald Farmer gave a lecture yesterday, May 31st 2010, at the Israeli BI User Group. This is what I summed up from the session.
Predictive Analytics
A good predictive analytics project would have:
Actionable - you get useful information from it
Innovative - gives something new, new insight from the model
Trustworthy - the model makes sense
Seamless - part of your everyday operations, so that it will get used more easily by the users
"All models are wrong, but some are useful" George Box
No model is perfect, you can't predict everything, but even then the model is useful. Don't expect to build a perfect model, just try to build a useful model.
How do you build a useful model?
Right problem
Right criteria - a realistic criteria for the business, small improvements over time, working gradually
Right data
Right results - results that meet your criteria
Right Delivery - giving people a model that users can understand
|
Traditional BI |
Predictive Analytics |
|
Exploration |
Discovery |
|
Drill down |
Classification |
|
Trending |
Perdiction |
|
Force constraints |
Discover outliners |
|
Apply rules and models |
Find Patterns and relationships |
Right problem
Cross-sell and up-sell (selling something additional or selling them something similar but more expensive market basket analysis)
Customer acquisition (getting new customers, the demographics of your current customers)
Customer retention (keeping your customer with you, pattern of leaving)
|
Scenario |
Tasks |
|
Cross-sell and up-sell |
Association |
|
Customer acquisition |
Clustering |
|
Customer retention |
Classification, estimation |
Data Mining Add in for Excel 2007
Analyze - the tab for beginning Data Mining
Turn the data into a table and then you can use the Analyze tab
Market basket analysis - what can I sell with the current product. Use profit per product to analyze for better profit.
Analyze key influencers - understanding what are the main drivers of a action (what makes people buy).
Detect categories - finding groupings within the groupings (no use to include ID). We can also define how many categories we'd like to get.
Fill from example - filling in missing data about your customer according to other carasteristic that we have about them.
Highlight exception - findings exceptions, how far away from the center is the data for that customer.
I don’t like a reclaimable database disk copy
Source: Weblog for the Amis technology corner [link]
As of Oracle 10g it’s possible to recover (or wind forward) a RMAN disk copy of the database to a specific time in the past. This type of backup is quite suitable for big databases (> 100GB), because you have to perform a time consuming full backup only once. All other backups to disk will […]
Oracle BI EE 10.1.3.4.1 – Do we need measures in a Fact Table?
Source: Rittman Mead Consulting [link]
I know this blog title can be classified as a deviation from my more conventional ones but i have been planning on writing this for a while now, so hopefully this should be of some interest to everyone. As we start working with BI EE more and more, we get to know what its good features are and along with it we get exposed to its not so good ones(as with any tool) more. Recently i came across a requirement where BI EE was being implemented on a Data Warehouse that was built almost 4 years ago. The Data Warehouse had been built based on Kimball’s model and it had been satisfying the requirements of the customer so far(excel & MSAS reporting) unless and until BI EE came in as potential replacement for MSAS. When the customer first told me this, i was kind of confused on why BI EE failed to meet their requirement considering the fact that it was a very simple data warehouse(standard set of dimensions and 3 to 4 fact tables). On further analysis, there were 2 main requirements that stood out to be showstoppers in using BI EE
1. Ability to use any dimension member(Across dimensions) as a measure and the ability to use any measure as a dimension attribute – This is one key requirement that cannot be solved easily by BI EE currently. The problem with Business Model and Mapping layer of BI EE is the fact that we have to identify measures. Though this will work in most cases but if we want to do true multi-dimensional reporting, then the tool should ideally provide us with an ability to even treat measures as dimensions. For example, consider the report shown below

This is a very simple report where Amount Sold and Quantity Sold are measures (both in the data warehouse as well as in BI EE). But what if the customer requirement is to create a report as shown below

In the above case, Amount Sold and Quantity Sold measures in the first report have become dimension attributes. Also, the Channel dimension attributes Direct and Indirect have become measures in the second report.
In my customer scenario, end-users required the ability to treat any dimension as measure and vice versa. So, clearly they wanted a multi-dimensional reporting capability using their existing Data Warehouse and using BI EE. In retrospect, i would say this is a very common requirement for users who are used to Multi-dimensional tools like Essbase, MSAS etc. In some cases, with BI EE 10g, where we need to treat dimensions as measures, we just use Pivot Tables. But in cases, where measures in BI EE need to be shown as dimension attributes, then the only way to solve such requirements is to use set operations like using UNION etc.If we use them in Answers, we will start facing all sorts of issues like losing drills, losing the ability to use is prompted filters etc. There are other methods as well like the one that i have discussed before here. But the problem is the fact that we cannot make any method (BI EE modeling methods)generic enough to solve all these requirements.
2. The second requirement was to provide drilling ability on measures. For example, consider the report below. Here the 2 measures Amount Sold and Quantity Sold have been grouped under SALES. Similarly there are another 2 measures, Unit Price and Unit Cost which have been grouped under COSTS.

Drilling on COSTS or SALES should give us the individual measures.


Though the above example demonstrates a single step Measure hierarchy, the actual requirement was to have a multi-level measure hierarchy.
Ideally, considering the requirement, this would have been solved by using Essbase & Smart View. But considering the fact that the customer had already bought BI EE, a solution was to be arrived at using BI EE 10g. There were 2 things that we knew & were sure of
1. The data warehouse was working well. The customer was open to tweaking it a bit further to enable this requirement within BI EE.
2. Essbase provides this capability and will quite easily solve the customer’s requirement.
We even thought of probably asking the customer to wait for BI EE 11g, but again the actual problem was in the fact that measures were actually required by BI EE which i don’t think will go away in the near future since the fundamental design of BI EE RPD revolved around the need to identify measures and dimensions(we weren’t sure of what changes in the RPD were coming in 11g and hence we had to find a solution in 10g itself). This is where we had to devise a DW model using the concepts of Essbase and then retrofit it back to the DW as well as BI EE.
I will illustrate what we did for the customer considering the SH schema of Oracle which is based on Kimball’s DW methodology.

As you see, we have 2 fact tables(of differing grain) and multiple dimension tables linking to the facts. Each fact table has a set of pre-defined measures as shown below


From an end-user perspective, each of the measures above are nothing but attributes of data. So, in effect if you model this in Essbase, what essbase will do is, it will maintain an internal single measure called (<Data>) and then everything else are nothing but attributes of this single measure(i.e dimensions. How these are internally stored within Essbase is determined by the 2 storage options BSO and ASO). So, to model this similar to Essbase, we start with separating out measures into its own dimension table called Measure as shown below

Then we create 2 new Materialized Views, Sales_Data & Costs_Data, on top the Sales and Costs fact tables as shown below

So the actual relationship diagram between the fact and dimension tables will look as shown below

So far so good. What this enables us is a fact that we can now sort of treat all measures as dimension attributes. But how do we now provide the ability to treat dimension attributes as measures.
This is done by applying Filter function on the DATA column. For example, to create a measure called Direct Sales we create a logical column and then apply a function as shown below
FILTER("SH - Multiple Measure Dimensions"."Sales - Facts".Sales
USING
"SH - Multiple Measure Dimensions".Channels."Channel Class" = 'Direct')

This way, we can create as many logical measure columns as possible depending on what we need to treat as measures in the presentation layer. The good part about this is, we can use an attribute as a measure and also as a dimension.
The idea is we build a model containing the following presentation layer

For every dimension table, there will be a corresponding Fact Measure table with measures required for reporting.
So, to achieve the first requirement where we need to convert the measures as dimension attributes, we choose the following columns in the report


If we look at the SQL of this report, you would notice that FILTER function converts it into relevant case when statements thereby providing us with the ability to do multi-dimensional analysis
select T14286.PROMO_NAME as c1,
T14409.MEASURE as c2,
sum(case when T14224.CHANNEL_CLASS = 'Direct' then T14394.DATA end ) as c3,
sum(case when T14224.CHANNEL_CLASS = 'Indirect' then T14394.DATA end ) as c4,
T14286.PROMO_ID as c5,
T14409.MEASURE_ID as c6
from
MEASURES T14409,
PROMOTIONS T14286,
CHANNELS T14224,
SALES_DATA T14394
where ( T14224.CHANNEL_ID = T14394.CHANNEL_ID and T14286.PROMO_ID = T14394.PROMO_ID
and T14394.MEASURE_ID = T14409.MEASURE_ID and (T14224.CHANNEL_CLASS in ('Direct', 'Indirect'))
and (T14409.MEASURE in ('Amount Sold', 'Quantity Sold', 'Unit Cost', 'Unit Price')) )
group by T14286.PROMO_ID, T14286.PROMO_NAME, T14409.MEASURE, T14409.MEASURE_ID
order by c1, c2
Similarly to solve the second requirement, we just need to pull the relevant measure attributes from the measure dimension and then use the DATA field for our measure


There are significant advantages of this method as this does not fix us to any specific dimension/measure. It provides an ability where we can swap measures and dimensions at will.
So the question to everyone reading this is, do we need measures at all in the Fact table(apart from the implicit Data Fact column)? Majority of the tools like BOBJ, Cognos etc all require measures to be assigned explicitly. How about forking out the measures into their own dimension and then treat everything else as normal? Are there any downsides to this approach? The features offered seem to outweigh the conventional modeling method. But i am aware that there could be potential problems with this as well. One problem which i could see is the performance. The size of the fact tables can grow quite quickly if there are a lot of measures. Also, the end-users might have to get used to this model. Is there anything else that you see could become potential problems in the future? If so, do put your thoughts in the comments section.
Algunas normas en el trabajo
Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]
