BI Blogs

Bringing together Business Intelligence voices from across the web

What Is the Company’s Goal?

Posted on November 30th, 2008.

Source: Information Management [link]


Web 2.0 and Cloud Computing From the Couch

Posted on November 29th, 2008.

Source: Keep It Simple [link]

I enjoyed this relaxed panel discussion from the recent Web 2.0 summit featuring Marc Benioff from Salesforce.com (who takes some nice shots at Oracle and SAP), Dave Girouard from Google, Paul Maritz from VMware, and Kevin Lynch from Adobe:

Summize, a very useful search engine for Twitter.


Search OWB documentation, whitepapers, blog entries and forum all at once!

Posted on November 29th, 2008.

Source: Oracle Warehouse Builder (OWB) Weblog [link]


Shout-out to an OWB/BI blogger

Posted on November 29th, 2008.

Source: Oracle Warehouse Builder (OWB) Weblog [link]


BIWA Next Week

Posted on November 29th, 2008.

Source: Rittman Mead Consulting [link]

I am sitting in my room at a Heathrow Airport hotel on my way from Athens (Greece) to San Francisco. My body is still working on Eastern European time and I am beginning to dread the change to the Californian time zone. Still, having two days of ‘rest’ (writing time, really) to adjust will help.

I am off to the BIWA summit, which this year is at Redwood Shores. I will be speaking about using Oracle OLAP 11g Cube Organized Materialized Views in DW summary management (must get a shorter title!) on the first morning (11:10), which means that I can relax and soak in the sessions for the rest of the conference, but with up to 6 streams running in parallel there will be a lot that I miss.

Next week my colleagues will be posting from the UKOUG conference and I will do my part by posting some of my experiences at BIWA. So it looks like there will be a lot to read here on the blog.

Oh, and if you happen to see me at BIWA, say hello.

Upgrading your data integration efforts to enable Business Intelligence (BI) 2.0

Posted on November 28th, 2008.

Source: Data Doghouse - performance management, business intelligence, and data warehousing [link]

People have been using the term “business intelligence 2.0” for a few years, but it’s described in different ways. In Business Intelligence 2.0: Simpler, More Accessible, Inevitable Neil Raden says:

"…the current era of BI is coming to an end and will be
succeeded by a BI 2.0 era that promises simplicity, universal access,
real-time insight, collaboration, operational intelligence, connected
services and a level of information abstraction that supports far
greater agility and speed of analysis. The motivation for this "version
upgrade" for BI is the need to move analytical intelligence into
operations and to shrink the gap between analysis and action."

Charles Nichols writes, in BI 2.0: The Next Generation that:

"BI 2.0 is a term that encapsulates several important
new concepts about the way that we use and exploit information in
businesses, organizations and government. The term is also
intrinsically linked with real-time and event-driven BI but is really
about the application of these technologies to business processes."

BI 2.0 is not really about a new generation of BI tools to perform
analytics but about getting more comprehensive, consistent, correct and current data.


Xaware and Snaplogic

Posted on November 28th, 2008.

Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]


Thoughts on OBIEE Performance Optimization & Diagnostics

Posted on November 28th, 2008.

Source: Rittman Mead Consulting [link]

One of the sessions that I’m giving at next week’s UKOUG Conference is Optimizing Oracle BI EE Performance. Now I’ve given talks along these lines before but I’ve always ended up listing out all the things you can do to speed up OBIEE queries - turning on caching, using the aggregate persistence wizard and so on - rather than setting out a methodology. This time around though, I wanted to come up with a set of guidelines, an approach so to speak for optimizing OBIEE queries and the OBIEE architecture, so here are a few thoughts on how I might go about it. I also want to think about OBIEE tuning in a similar way to how you do database tuning, given how well developed an area that is, and one of the things you think about when you’re tackling a database performance problem is “what diagnostic data is available, to help me work out where the problem is?”

If you look at OBIEE, there are seven major sources of diagnostic data that I’m aware of:

1. The query log generated by the BI Server (NQSQuery.log)

This log file, depending on which level of logging you have enabled, tells you for each query what the logical request contained, the business area and subject area, the outcome of the query (i.e. did it complete successfully), the physical response time and the physical SQL query(s) used to retrieve the data.

With this, you can see whether the SQL that’s being generated looks sensible, whether a federated query is being used, and you can switch over to a tool like TOAD or SQL Developer and generate an explain plan for the query (which may however be different to the explain plan used for the query in the log, if statistics for example have been updated on the database).

2. The server logs generated by the BI Server (NQSServer.log)

This log file primarily contains messages about the BI Server startup and shutdown, whether usage tracking has started correctly, which subject areas are loaded and so on. It can be useful in diagnosing whether the BI Server is up or not, whether usage tracking is working and so on, but it’s not particularly useful from an optimization perspective.

3. The BI Presentation Server log (saw0log.log, etc)

This log is like the BI Server server log and mainly records the Presentation Server startup and shutdown, details of any server crashes, background tasks for purging the cache and so on.

4. Usage Tracking files and tables.

Now this is more interesting. The Usage Tracking repository, files/tables and associated reports can show you, per query (request), the total time elapsed, database time, row count, whether the query executed successfully, rows returned and so on. All of this information is stored in a file or database tables which means you can compare report run times across various days, which you can then use to start detecting performance drops or predict run times going into the future.


What this diagnostic source doesn’t give you though is the actual physical SQL queries generated by the BI Server, or the execution plans (BI Server ones or DB ones) corresponding to those. So if you want the full details on a query, you need to examine the log file (which may or may not have been switched on previously), whereas if you want access to a historical, always collected record of query run times, Usage Tracking is what you want.


5. The Grid Control BI Management Pack

I blogged about the BI Management Pack the other week, which extends Oracle Grid Control 10gR4 to cover the BI Server, BI Presentation Server, Scheduler and DAC Repository. Like Grid Control in general, it monitors system-level metrics such as uptime, memory usage, CPU usage and so on, plus the BI Management Pack add-in gives you access to some internal BI Server, Presentation Server, Scheduler and DAC counters that can be used to assess the performance of these servers.

For example, for the BI Server the BI Management Pack provides data on CPU usage, physical db connections, memory usage, execute requests, fetch requests and prepare requests. By dashboard, it provides the total time, database time, failed requests etc, with these details also being provided by user (note, not by report). You also get a bunch of cache metrics such as data cache hit ratio, cache hits vs misses, cache requests etc.

For the Presentation Server, you get metrics on memory usage, CPU usage and sessions, whilst for the DAC you get the total tasks, completed tasks, running tasks, failed runs, number of runs per execution plan together with average duration and so on. Finally, for all servers you get the uptime, completed requests and alerts.

So what’s this information useful for? From what I can see it’s mainly useful for assessing whether the host, or the individual OBIEE server components, are under excessive load. You can also use the service level monitoring feature of the tool to generate service-level tests (logging in, navigating to a dashboard page and running reports, for example) which Grid Control can then run every so often to test “real-world” performance, and you can also use it to save and then compare configurations so that you can see, for example, whether the slow-down is because someone’s turned off caching.

6. The Database and Application Server Monitoring Pages on Grid Control, or your own DBA scripts

As we all know, the major contributor to the performance of OBIEE queries is the database that it gets its data from. If the database is slow, or the schema design is wrong, you’re usually going to get slow-running queries however you tune the BI Server layer. Understanding the performance of the database is therefore pretty important, and you can do this with something as simple as a few scripts and Statspack, or if you’re on 10g or 11g and you’ve licensed the Diagnostics Pack, you can generate reports using ASH, AWR and ADDM and get Grid Control to automatically recommend database tuning steps for you.

So what does this tell us? Well, if you monitor the database at the time your queries are running, you can see the load on the database, the host it’s running on and so on, and if you’re quick enough you’ll be able to capture the SQL queries generated by the BI Server and see what execution plans, waits and so on were registered for the statement. If you use some of the tools provided with the diagnostics pack (the ones that work off of ASH, for example) you should be able to even get down to the session level for each query, and work out the reason (via wait events) your query ran slow.

7. The DAC Console and Repository

The final source of diagnostic data applies to situations where OBIEE is working against the Oracle BI Applications data warehouse, and you’re loading data into it via the Data Warehouse Administration Console (DAC). You should be able to tell from this whether last night’s load completed successfully, for example, which may explain why your users are complaining that the warehouse hasn’t been updated. You can also use the data in the DAC repository to measure over time how long each step in your DAC execution plan has taken to run (if you’re clever, you can bring the repository into OBIEE) so that you can trend the data and spot whether your loads are taking longer, using an iBot for example to alert you if things are looking a bit dodgy.

So, now we’ve got all this diagnostic data, how are we going to use it? It’s worth thinking at this point about some typical performance optimization scenarios, and so what we’ll do is imagine three calls you might get from users, as well as something where you’re being a bit more proactive.

Performance Optimization Scenarios

1) “The system seems slow compared to normal”

2) “This report is running slow”

3) “The system isn’t working”

4) You wish to proactively spot performance problems

“The System Seems Slow Compared to Normal”

This is something you might get from a user, or one of the admins or developers working with the OBIEE system, where they’re noticing that the dashboard is displaying a bit slower, reports run slower or the system just seems a bit sluggish. Now in the database world, you’d look at Grid Control or Database Control and see if the load on the server or database seems unusually high. You might look at the Average Active Sessions report on Database Control and notice that the number of sessions has gone through the roof, most of them are waiting on User I/O wait classes and you can generally see that things are looking a bit strange. If you’re on an earlier version of Oracle you may well generate a Statspack report, and note the system-wide activity on the database. That’s all you can do at the moment, as what you’ve been told is that the whole system is slow and no-one’s mentioned any particular report, transaction or screen that’s an isolated problem.

The equivalent steps you’d do here for OBIEE are firstly to look at the BI Server, Presentation Server and Scheduler pages on the Grid Control BI Management Pack screens, and see whether the load on the various components has suddenly gone up (or is high compared to the historical norm).

You can also run some reports against the Usage Tracking tables; if you’ve previously defined a few key reports to monitor, you could compare their recent run times with the same time yesterday, or earlier in the week, at least to confirm that the problem is definitely there. But as it’s a system-wide problem, chances are that the Grid Control pages will at least tell you what’s causing the problem (memory, CPU, disk) and allow you to quantify the problem. Given Grid Control’s ability to set up service level thresholds and alerts, it should also be able to let you know things have slowed down rather than having to wait for users to let you know.
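
As a rough illustration of that run-time comparison, here is a minimal Python sketch. It assumes the timings have already been pulled out of the Usage Tracking tables into plain lists; the report names, the numbers and the 1.5x threshold are made up for the example, and `slow_reports` is a hypothetical helper rather than anything that ships with OBIEE.

```python
from statistics import median


def slow_reports(baseline, recent, factor=1.5):
    """Return reports whose recent median run time exceeds factor x the baseline median.

    baseline and recent each map a report name to a list of run times in seconds.
    """
    flagged = {}
    for report, recent_times in recent.items():
        base_times = baseline.get(report)
        if not base_times or not recent_times:
            continue  # nothing to compare against
        base = median(base_times)
        now = median(recent_times)
        if base > 0 and now > factor * base:
            flagged[report] = (base, now)
    return flagged


# Hypothetical timings, as if exported from the Usage Tracking tables.
baseline = {"Sales Summary": [4.0, 5.0, 4.5], "Stock Levels": [10.0, 11.0, 9.0]}
recent = {"Sales Summary": [9.0, 10.0, 11.0], "Stock Levels": [10.5, 9.5, 10.0]}

print(slow_reports(baseline, recent))
```

Medians are used rather than means so that one freak run does not trigger a false alarm; that is a design choice for the sketch, not a recommendation from the tools themselves.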

“This Report is Running Slow”

Now this is a potentially more interesting problem. If a particular report, transaction or screen in an OLTP application runs slow, but the rest of the application runs fine (or perhaps has its own performance problems), you really need to trace or monitor the particular session it is running in and find out what has contributed most to response time. You can do this using TKPROF and Extended SQL Trace, or you can use utilities such as DBMS_MONITOR or the various ASH reports to work out what’s going on. Using SQL Trace usually has the drawback that you need to have enabled it in advance, whilst ASH and AWR run continuously in the background so that you’ve always got some session-level diagnostic data (I’m talking about the Oracle database here, by the way). Tools like Statspack aren’t much use here though as they deal with diagnostic data in the aggregate, which could mask out the real reason for your particular report or screen’s performance issue.

In the OBIEE world, it’s a similar thing. The “sort of” equivalent to a trace file is the NQSQuery.log log file, and again you have to have enabled it in the first place for data to be collected, and at the relevant logging level. What you may need to do is to go back to the user, enable logging beforehand and then get them to re-run the report, but this time you’ll have the logical query, the physical query, and if you’re particularly adventurous you can even start sifting through the BI Server execution plan.


You can also refer to the Usage Tracking data and pull up timings for this particular report, but as you don’t have access there to the physical SQL generated, it’s not really much use as typically one of the main reasons for a particular report running much slower is that a different physical SQL query has suddenly started to be used; also, you’ll need the physical SQL query to check it looks sensible and to potentially trace it back to the database, retrieve the execution plan and work out what’s gone wrong. Again, if you’re particularly keen, you can set up an “on logon” trigger on the BI Server that turns on proper database tracing for a particular user, get them to run the query again and then start taking a look through the database wait events.

“The System Isn’t Working”

This is a bit more of a dramatic problem, as some part, possibly the BI Server, possibly the Presentation Server, has gone down for some reason. It’s usually fairly straightforward to work out what part has gone down, more interesting perhaps is why the crash occurred. In the database world we can look at the alert log, or use Grid Control to read the alerts and establish just how much downtime we’ve experienced recently.

In the OBIEE world it’s a similar story, with the BI Server, Presentation Server, Scheduler and DAC Server all having logs that record startups, shutdowns and crashes. All of these logs are monitored by the Grid Control BI Management Pack which means you can work out what’s gone wrong without getting up from your seat.


I mentioned it previously, but one nice feature of Grid Control is that you can define service level tests, where Grid Control records a browser session where you log on to OBIEE, run a report and so on, with these service level tests being run every so often to spot firstly, if the system is up or not, and secondly whether response times have deteriorated.

You Want To Proactively Spot Performance Issues

This of course is getting a bit ahead of the game, where you want to spot problems before they happen and correct them, so that you can go up to your users and say, “your report is taking twenty seconds more to run each day than it used to, I’ve reorganized the database, added some summaries and now it’ll take just a few seconds to run. Aren’t I fantastic”. In the database world, you can use Grid Control features to monitor query performance, you can run your own tests, or if you’re particularly clever you can monitor the execution plans of your queries to spot if any of them change.

In the OBIEE world, we’re actually a bit better off than in the vanilla database world, as Usage Tracking stores the runtime for our reports which we can then hook up to an iBot to warn us if something’s looking strange. Going forward (and this is where Oracle’s BI database independence, vs. optimization towards Oracle becomes an issue), you can imagine some sort of system-wide advisor that monitors reports, recommends summaries for example at either the database level or BI Server level, recommends indexes and so on, but driven from the report level rather than the low-level SQL level.
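
To make the “taking twenty seconds more each day” idea concrete, here is a small Python sketch of the kind of check an alert could be based on. It assumes one run time per day for a single report, already extracted from Usage Tracking; `daily_slope`, `needs_alert` and the five-seconds-per-day threshold are hypothetical names and values chosen for the example, and the iBot wiring itself is not shown.

```python
def daily_slope(run_times):
    """Least-squares slope, in seconds per day, of run times listed one per day."""
    n = len(run_times)
    if n < 2:
        return 0.0  # not enough history to talk about a trend
    mean_x = (n - 1) / 2
    mean_y = sum(run_times) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(run_times))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def needs_alert(run_times, max_slope=5.0):
    """True when the report is slowing down faster than max_slope seconds per day."""
    return daily_slope(run_times) > max_slope


# Hypothetical daily run times (seconds) for two reports.
creeping = [30.0, 50.0, 70.0, 90.0, 110.0]
steady = [30.0, 31.0, 29.0, 30.0, 30.0]

print(daily_slope(creeping), needs_alert(creeping))
print(daily_slope(steady), needs_alert(steady))
```

Fitting a slope over several days, rather than comparing today with yesterday, means a single slow run caused by a busy server is less likely to raise an alert.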

So What Do We Do Now?

Now that we’ve got all this diagnostic data, and we can diagnose problems at the system-wide and report-wide level, what can we do to speed things up, or sort out the server so that it’s not always crashing? Well your initial approach should typically be to try and move as much of the “grunt work” down to the database, swapping federated queries for formally integrated reporting data using an ETL tool and a data warehouse, and with the database doing as much of the complicated calculations as possible, possibly even using an OLAP Server such as Essbase or Analysis Services to handle the summarization and inter-row calculations for you.

You can try caching your data, though in my experience this is usually the last desperate throw of the dice, or you can use the Aggregate Persistence Wizard to generate some BI Server-level aggregates, which can be useful if your data is unavoidably federated or the DBAs won’t let you create some materialized views (don’t forget to add indexes to the summaries created by the wizard though). If your data is federated and one of the tables in the join is much bigger than the other, you can try and set up parameterized nested loop joins via the Driving Table feature, and see if this reduces the memory usage on the BI Server.
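
The general idea behind a parameterized nested loop join can be sketched in a few lines of Python. This is only a toy model of the technique, not how the BI Server implements it: the table contents, the `cust_id` key, the batch size and both function names are invented for the example, and it assumes the keys in the small table are unique.

```python
def fetch_big_table(keys, big_table):
    """Stand-in for a filtered query against the large source (WHERE key IN (...))."""
    wanted = set(keys)
    return [row for row in big_table if row["cust_id"] in wanted]


def driving_table_join(small_rows, big_table, batch_size=2):
    """Join by pushing the small table's keys down to the big source in batches."""
    by_key = {row["cust_id"]: row for row in small_rows}  # assumes unique keys
    keys = list(by_key)
    joined = []
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        # Only rows matching this batch of keys ever come back from the big source.
        for big_row in fetch_big_table(batch, big_table):
            joined.append({**by_key[big_row["cust_id"]], **big_row})
    return joined


# Hypothetical data: a small "driving" table and a much larger fact-like table.
small = [
    {"cust_id": 1, "segment": "Gold"},
    {"cust_id": 3, "segment": "Silver"},
    {"cust_id": 5, "segment": "Gold"},
]
big = [{"cust_id": i % 6, "amount": i * 10} for i in range(12)]

joined = driving_table_join(small, big)
print(len(joined), sum(row["amount"] for row in joined))
```

The point of the pattern is that the joining layer never has to hold the whole of the larger table in memory, only the rows that match the driving table’s keys.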


If your problem is more due to resource constraints, you can either add more memory, CPU or disk bandwidth to the database or BI Server/Presentation Server layers, or you can cluster any part of the infrastructure to balance the load amongst several physical servers. Whatever you do though, the key thing is to work out what’s causing the performance problem in the first place, rather than just bung a load of indexes on or turn on caching, and the diagnostic tools mentioned above should give you a better idea as to where the problem is. If you’re interested in reading more, I’ve uploaded the presentation to our website, and I’m also working on an article for OTN on how the BI Management Pack works; I’ll post a link to this too when the article goes up, which will probably be just after Christmas.

VM articles on my new Wiki

Posted on November 28th, 2008.

Source: oramoss oracle [link]

I wanted a place to store notes that I could write up from anywhere…but weren’t necessarily relevant to put in a blog, so I now have a Wiki on my website.

Don’t get excited, I’m not planning on hosting a full blown wiki for open editing - it’s just for me.

Amongst the things on there are some short “How to” articles relating to VMware.

I’m sure I’ll have made mistakes along the way - feel free to point them out via this blog or email me and I’ll sort them out. Comments welcome as well.

NonEmpty() and that all-important second parameter

Posted on November 27th, 2008.

Source: Chris Webb's BI Blog [link]

Here’s a question which comes up all the time - it was asked at Mosha’s MDX seminar last week, and a friend of mine asked me about it recently too - what does the NonEmpty function do if you don’t specify the second parameter?

Let’s take a look at some example queries. I think everyone knows that you can use NON EMPTY before an axis definition to remove all the empty tuples on that axis, as with:

SELECT [Measures].[Internet Sales Amount] ON 0,
NON EMPTY
[Date].[Date].[Date].MEMBERS
ON 1
FROM [Adventure Works]
WHERE([Product].[Subcategory].&[1])

The problem comes when people assume that you can use the NonEmpty() function in the following way to get the same result:

SELECT [Measures].[Internet Sales Amount] ON 0,
NONEMPTY(
[Date].[Date].[Date].MEMBERS
)
ON 1
FROM [Adventure Works]
WHERE([Product].[Subcategory].&[1])

In a lot of cases you might not see any obvious differences between what the two uses return, but if you run the query above you can see a lot of empty rows returned so they clearly aren’t the same. So what’s happening? If you clear the cache, rerun this second query and then run a Profiler trace you can get a hint:

[Screenshot: Profiler trace]

Why are the Reseller Sales measure group partitions being hit? Because the Reseller Sales Amount measure is the default measure on the Adventure Works cube, and since we didn’t specify a measure in the second parameter for NonEmpty() it’s using the default measure to decide which dates have values or not. To fix this we can explicitly tell AS which measure to use:

SELECT [Measures].[Internet Sales Amount] ON 0,
NONEMPTY(
[Date].[Date].[Date].MEMBERS
,[Measures].[Internet Sales Amount])
ON 1
FROM [Adventure Works]
WHERE([Product].[Subcategory].&[1])

The moral here is always, always, always specify a measure in the second parameter for NonEmpty() whenever you use it. If you don’t you may get unexpected results back and you may also get poor performance, for example if the default measure comes from a very large measure group.
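
If it helps to see the behaviour outside of MDX, here is a toy Python model of it. It is only an analogy, not how Analysis Services evaluates the function: the cube is a plain dictionary, the dates and values are invented, the slicer in the WHERE clause is ignored, and `non_empty` is a hypothetical stand-in for NonEmpty().

```python
DEFAULT_MEASURE = "Reseller Sales Amount"  # the default measure named in the post


def non_empty(members, cube, measure=None):
    """Keep the members that have a value for the chosen measure.

    When no measure is passed, fall back to the cube's default measure,
    which is the behaviour described above for NonEmpty().
    """
    chosen = measure if measure is not None else DEFAULT_MEASURE
    values = cube.get(chosen, {})
    return [m for m in members if values.get(m) is not None]


# Invented data: each measure has values on a different set of dates.
cube = {
    "Internet Sales Amount": {"2004-07-01": 120.0, "2004-07-03": 80.0},
    "Reseller Sales Amount": {"2004-07-01": 900.0, "2004-07-02": 450.0},
}
dates = ["2004-07-01", "2004-07-02", "2004-07-03"]

# No second parameter: filtered by the default measure, not the one being displayed.
print(non_empty(dates, cube))
# Explicit measure: filtered by the measure the query actually shows.
print(non_empty(dates, cube, "Internet Sales Amount"))
```

In the first call, 2004-07-02 survives even though it has no Internet Sales Amount, which is the same effect as the empty rows returned by the second query above.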

Oh, and as a bonus tip, don’t ever use NonEmptyCrossjoin() with AS2005 or later. It’s difficult to use and frankly unpredictable in what it does sometimes; you can always do whatever you want with NonEmpty or Exists (when specifying a measure group in the third parameter) much more reliably and just as fast.

For more information on this topic, have a look at this old-but-good blog post from Mosha:
http://sqljunkies.com/WebLog/mosha/archive/2006/10/09/nonempty_exists_necj.aspx

Key SAS Configuration files

Posted on November 27th, 2008.

Source: Blogging about all things SAS [link]

Within the SAS9 environment there are a number of key configuration files, which are unfortunately normally scattered all over the place.

These are some that you need to know about:

  • C:\Program Files\SAS\SAS 9.1\sasv9.cfg
    Contains configuration information to be used when SAS is launched such as WORK and SASUSER directories
  • C:\SAS\Config51\Lev1\SASMain\ObjectSpawner\OMRConfig.xml
    Contains information about the startup of the object spawner
  • C:\SAS\Config51\Lev1\SASMain\omaconfig.xml
    Contains user information used when the metadata server initializes
  • C:\SAS\Config51\Lev1\SASMain\sasv9.cfg
    Contains configuration information about SAS session startup
  • C:\SAS\Config51\Lev1\SASMain\MetadataServer\Sasv9_MetadataServer.cfg
    Contains configuration information to be used when the SAS Metadata Server is launched such as OBJECTSERVER parameters and log file locations
  • C:\SAS\Config51\Lev1\SASMain\appserver_autoexec.sas
    Contains SAS session startup commands that should be run at the start of session startup, included in autoexec_solutions.sas
  • C:\SAS\Config51\Lev1\SASMain\ConnectServer\OMRConfig.xml
    Contains startup information used by the Connect Server upon initialization to retrieve the proper configuration information
  • C:\SAS\Config51\Lev1\SASMain\ObjectSpawner\OMRConfig.xml
    Contains startup information used by the Object Spawner upon initialization to retrieve the proper configuration information
  • C:\SAS\Config51\Lev1\SASMain\OLAPServer\sasv9_OLAPServer.cfg
    Contains SAS configuration parameters specific to the SAS OLAP Server and the sessions associated with the SAS OLAP Server
  • C:\SAS\Config51\Lev1\SASMain\ShareServer\sasv9_ShareServer.cfg
    Contains SAS configuration parameters specific to the Share Server and the sessions associated with the Share Server
  • C:\SAS\Config51\Lev1\SASMain\ShareServer\libraries.sas
    Contains initialization for the Share Server and library assignments that should be made accessible via the Share Server
  • C:\SAS\Config51\Lev1\SASMain\ShareServer\startShareServer.sas
    Contains initialization Share Server startup commands
  • C:\SAS\Config51\Lev1\SASMain\StoredProcessServer\sasv9_StorProcSrv.cfg
    Contains SAS configuration parameters specific to the SAS Stored Process Server and the sessions associated with the SAS Stored Process Server

Our thoughts and prayers are with our colleagues in India

Posted on November 26th, 2008.

Source: The sascom magazine blog [link]

On the eve of the Thanksgiving holiday in the US, we have new reason to give thanks. Early reports indicate that all members of the SAS family in India are safe following the terrorist attacks in Mumbai. We wish all of our friends there a swift return to peace and safety.

An arms race my customers don’t care about

Posted on November 26th, 2008.

Source: bayon blog [link]


PASS Summit 08

Posted on November 26th, 2008.

Source: Chris Webb's BI Blog [link]

I’m back from my trip to Seattle at the PASS Summit and have just about recovered from the jet lag, so now’s a good time to blog about the last week. What did I get up to?

  • Monday: Mosha’s MDX pre-conference seminar. This was the first time I’ve ever paid for any form of training out of my own pocket and I was not disappointed – it was everything I was expecting it to be, ie a great in-depth look at the inner workings of MDX. Over the subsequent days I was stopped by a few Microsoft people who asked me how it went, and I got the impression they thought it was going to be too detailed, but to be honest I actively wanted detailed information and so did the vast majority of other people present (the likes of George Spofford, Deepak Puri, Tomislav Piasevoli and so on were also there). Mosha may not be as slick as some of the more experienced speakers out there on the SQL conference circuit but he’s more than competent; the material was laid out in a logical order and his slides were clear. And the most important thing of all is that he is probably the only person in the world who has the knowledge to be able to give this kind of seminar. Some of the material was repeated from his blog but benefited greatly from being put into a wider context; some of the material was completely new to me, and given that I’m someone who’s lived and breathed MDX for the last ten years or so that’s quite something. I only wish he could have gone on for another day since he clearly had enough topics prepared to be able to do so.
  • Tuesday: I had off, so I headed over to Redmond for a few meetings. Heard some interesting things (as I did all week) but all were NDA, unfortunately. Maybe one interesting point though: for some reason I had assumed that there were separate development teams for Gemini and Analysis Services, but that’s not true - Gemini is Analysis Services, it’s all the same team.
  • Wednesday: Day 1 of the conference. The first session I attended was from a company called Meta Integration. At first I was a bit surprised since it was clearly a ‘vendor presentation’ – they were plugging their software, which is definitely not the done thing at a conference like PASS. I didn’t mind that much since it was an interesting product, and then as the presentation went on I came to the realisation that this was software that wasn’t on sale to the general public anyway. What Meta Integration do is provide tools for metadata integration and lineage to the big BI software suppliers and they make a big play of being independent from any one vendor. So, for example, you can track metadata from an ERwin model to an Informatica-based ETL process and so on through to Analysis Services cubes and Reporting Services reports; the cool thing is that if you make any changes upstream you can track which cubes and reports are going to be impacted even if you’re using a variety of BI tools from different vendors, as most companies are. This is a massive missing piece in the current MS BI stack and the fact that they already have it working today had many people in the audience salivating; what they want to do is license this software to Microsoft and their presence at the conference was, as far as I can see, a clever bit of PR to try to twist Microsoft’s arm into reaching a deal with them by showing Microsoft’s customers a glimpse of what might be possible in the next version of SQL Server. So I’ll come right out and say it: Microsoft, please license this software and don’t try to build it yourself! We need it ASAP!
  • Thursday: the day of my presentation. Compared to other conferences I’d had to do way more preparation for it than usual because I’d decided to talk about a topic that needed an awful lot of research rather than one I knew all about already. In fact I’d already decided that I should have submitted an abstract on cache warming for Analysis Services instead, which would have been much less work, but then on Monday Mosha made the typically gnomic remark that ‘cache warming was a bad thing’ so I’m glad I didn’t. Anyway the presentation itself went ok and despite being up against Kalen Delaney and Bob Ward I had a good crowd in the room. I was also very lucky that the SQLCat team’s presentation on monitoring Analysis Services, which covered a lot of the same topics, was scheduled for the next day so I wasn’t completely upstaged.
    To be honest, the SQLCat team had the best content out of all the speakers. In my case, where Carl Rabeler et al covered the same material as I did, they did it better than me. The same goes for other presentations: there were rather a lot covering MDX query performance and cube tuning (possibly too many) and the SQLCat session on this subject with Richard Tkachuk and Thomas Kejser was by far the best I saw. I suppose they do have the advantage of inside information and the fact that they do presentations like this for a living.
    I also stopped by some stands in the vendor exhibition space. There was a disappointing lack of pure BI vendors, but I suppose they all blew their conference budget on the BI Conference last month. I did talk to the guys on the Hyperbac stand - they work in the backup and compression space, and were suggesting their latest product Hyperbac Online (which Simon Sabin blogged about recently) might come in handy for Analysis Services. Hmm, I don’t know given that AS does its own compression, but I’d be interested to try it out.
  • Friday: spent a lot of time in private MVP sessions, which of course are all NDA. I also saw a good session from Donald Farmer on integrating data mining into other apps, which made me think that there would be much greater uptake on data mining if it was as easy to use AS data mining with AS cubes as it was to use the data mining Excel add-in. Donald demoed something similar to what I blogged about here; why can’t this type of thing be built into BIDS for cubes? Who has ever used the MDX Predict function? What about this old idea (although the SQLCat team noted that the old rule of thumb for partitioning whereby you should have no more than 20 million rows per partition no longer applied - you can get good, if not better performance with partitions three times that size)?
    The last session I saw was by Carolyn Chau and Sean Boon of the Reporting Services team, showing off all the new features of RS2008. As I’m sure you know, there’s a lot of cool new stuff in there but is it just me or is the tablix (which I discovered should be pronounced tay-blix rather than tab-lix, as I had been saying) a bit intimidating? Probably no more so than MDX to the uninitiated. Anyway, apart from all the things they’ve done for 2008 and the massive list of things they’ve got to do for the next version, Carolyn mentioned the interesting point that SMDL (the language for building Report Builder 1.0 models) would not be coming back, but the idea of a semantic layer for RS would - and it would probably use the Entity Data Model in some way. Which makes me think: what would happen if Analysis Services went in that direction too? Replace the dsv with the Entity Framework and make Analysis Services an extra layer of metadata for aggregating entities? I don’t know enough about this subject to speculate further, but it’s an interesting area - who knows, it could lead on to the resurrection of the idea of the Unified Dimensional Model, with a Gemini-powered AS as the super-fast caching layer for all your reporting/querying needs.

I think I made the right decision going to PASS rather than the BI Conference - there was more than enough BI content for me and the conference itself was well run and good fun. I’d really like it if PASS and the BI Conference were merged into one mega-conference so I didn’t have to choose.

Crystal Methodologies and Development Teams

Posted on the November 26th, 2008. Read times

Source: Sistemas Decisionales, algo mas que Business Intelligence [link]

It had been a long time since I last looked at Crystal Methodologies. It is not a single methodology but a family of them, centered on the people who have to develop the software; the team is the foundation of these methodologies, created by Alistair Cockburn.

I think the idea behind this approach is spot on. Cooking for four people is not the same as cooking for twenty, and planning a weekend for two is not the same as planning one for forty, so why do we use the same methodology for a group of three developers as for a group of fifteen? Developing applications should be like a game in which everyone cooperates, contributes their share of invention, and communicates. Why do we forget this part in development methodologies?

The development team is the key factor and is limited only by the resources available. The better its communication and inventiveness, the better it will use those resources. Crystal establishes a set of teamwork policies (Methods) aimed at improving these skills. Depending on the size of the team, one methodology or another applies, each designated by a color: Crystal Clear for 3-8 people, Crystal Yellow for 10-20 people, Crystal Orange for 25-50, and so on. Like all agile methodologies, it is based on iterative cycles of incremental development (from 1 to 4 months at most), to which it adds a meeting before and after each cycle to reflect on the project and on how that cycle went. Before the next cycle begins, at least two end users must independently review and validate what has been developed.

Of this initial approach, over time Alistair has developed only Crystal Clear in depth, and a book on it was published recently: Crystal Clear: A Human-Powered Methodology for Small Teams

But it really is a pity, because I do not think a methodology that adapts to the size and experience of the team would be a bad solution at all for Business Intelligence systems.

Outliers

Posted on the November 26th, 2008. Read times

Source: The sascom magazine blog [link]

It’s interesting to me that all the reviews I’m reading of Malcolm Gladwell’s new book Outliers are taking the time to define the word outlier.

Oh, what the heck. This isn’t a review, but I’ll do it too.

Outlier, noun.
1 : something that is situated away from or classed differently from a main or related body
2 : a statistical observation that is markedly different in value from the others of the sample

I think that’s the definition from the book.

Since I’m guessing I have at least a few statistically-minded readers here, I’m wondering what you all think of Gladwell’s spin on this word and how he has applied it to his theory about the factors of success. Namely that successful people often rise to success due to a random combination of many factors - not all of which are earned or deserved.

I admit that I haven’t read the book, so I’m not looking for reviews or critiques per se. I am a fan of Gladwell’s writing in general, but I’m mostly curious about his use of a statistical term to describe a somewhat un-scientific proposition.

My understanding of outliers is that they’re data points that are often thrown out of a data pool because they’re so far outside the norm that they’re difficult to explain, and including them in a data set used for analysis could throw off the results.
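To make that concrete, here is a minimal Python sketch. It assumes Tukey's 1.5 × IQR rule as the cutoff (one common convention, not something taken from the book) and uses made-up numbers; the function name is purely illustrative. It flags the far-out point and compares the mean with and without it.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles (Python 3.8+)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Invented sample: six ordinary observations and one extreme one.
sample = [48, 50, 51, 52, 53, 55, 250]

flagged = tukey_outliers(sample)                    # [250]
trimmed = [v for v in sample if v not in flagged]

mean_all = statistics.mean(sample)                  # pulled upward by 250
mean_trimmed = statistics.mean(trimmed)             # 51.5
```

With the extreme point included the mean lands near 80, a value none of the ordinary observations comes close to; without it the mean is 51.5. That is the sense in which an outlier can throw off the results.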

Should the same be said of successful people? Do we throw them out when we’re considering our goals for the future … unless we see ourselves as outliers too?

What other sociological studies can you think of that have attempted to study and explain outliers of any type, not just successful people?

I have more questions than answers here. That probably means I should go read the book.

Focus, Focus

Posted on the November 26th, 2008. Read times

Source: The sascom magazine blog [link]

A few days ago, I heard Chris Shigas, Vice President of French/West/Vaughn, speak at the Public Relations & Marketing Seminar 2008. I’m not sure if he was quoting someone else or if this was his wisdom, but he said something that really stuck in my head: “If you try to say something to everyone, you wind up saying nothing to anyone.”

I realize his words aren’t profound. Every young journalist is told to keep the reader in mind when crafting a story: Know your audience. Shigas’ message was directed to an audience of media professionals, specifically marketing and PR. But that message is also one for businesses: Know your audience.

At SAS, we believe corporations and SMBs can use business analytics to target their marketing campaigns strategically, not only differentiating the message but also getting it to the right person. Two great examples of companies that use SAS® Business Analytics to optimize and enhance their marketing messages are Vodafone Australia and ING DIRECT Canada.

Vodafone Australia, like many telecommunications companies, faces a nearly saturated market, so it has to differentiate its message or lose customer share. According to Tyrone O’Neill, General Manager of Insights and Innovation at Vodafone, before implementing SAS Marketing Optimization, offers may have been made quite broadly. “Now, however, we typically tell fewer people about a particular offer. In fact, we are communicating less frequently, but the intention is that each communication has a greater impact,” he said.

Vodafone’s focused message resulted in a response rate that was up to 10 times better than it had been before SAS and the company saved money on its marketing.

ING DIRECT Canada also found that more isn’t always better. According to Rene Bettio, Senior Manager of Data Base Marketing at ING DIRECT Canada, sending out more marketing campaigns doesn’t necessarily mean you’ll get more clients.

ING DIRECT Canada used SAS Marketing Automation to reduce the time it takes to create a campaign. Now, for instance, when a client’s GIC (Guaranteed Investment Certificate) is coming due, the automated system sends out an e-mail or letter providing information on the latest rates and options. “It’s all about presenting the right offer to the right client at the right time,” said Bettio.

Thinking back on the seminar, I realize that the reason Shigas’ advice kept tickling in my head was that it had greater significance. He wanted to talk to me about writing focused marketing messages, but my brain kept thinking about the fact that businesses need the tools to be able to focus their marketing messages if they are going to compete in this economy.

Claudia Imhoff Webinar, What You Need to Know about Open Source BI and Data Warehousing

Posted on the November 26th, 2008. Read times

Source: Todo BI: Business Intelligence, Data Warehouse, CRM y mucho mas... [link]

Claudia Imhoff, one of the leading specialists in the field of Business Intelligence (see blog, books), is offering a very interesting webinar.

Register:
Claudia Imhoff Webinar

In this webinar, in which she will take part alongside the Director of Pentaho, Claudia will review everything she explains in her paper on the adoption of Open Source BI in the market.
To give an example, some of the main figures she offers to show the relevance of Open Source BI are the following:

- According to Aberdeen, 25% of respondents are going to adopt Open Source BI in the next 12-24 months (we are talking about large companies).
- During the first months of this year, open source projects have been doubling over the previous quarter.
- Sun's purchase of MySQL is an example of the sector's importance.


In addition, we give you access to an interesting document:
Open Source Business Intelligence. A 2008 Progress Report

SSAS: Listing Attribute Relationships

Posted on the November 26th, 2008. Read times

Source: Darren Gosbell [MVP] - Random Procrastination [link]

Occasionally questions come up about how to extract certain pieces of metadata from Analysis Services. In general all the metadata that you would need on a day to day basis is pretty well covered by the standard schema rowsets. And in SSAS 2008 you can use the system DMVs to get at most of this data.

For example, if you want to get a list of the current user sessions on the server you can do the following…

SELECT * FROM $System.DISCOVER_SESSIONS

…and in SSAS 2005 you can use the same syntax with the DMV() function that is part of ASSP.

call ASSP.DMV("SELECT * FROM $System.DISCOVER_SESSIONS")

But there are some details which can only be accessed through the DISCOVER_XML_METADATA command, which returns a hierarchical result similar to what you get when you script an object from SSMS, and neither the DMVs in SSAS 2008 nor the DMV() function in ASSP handles this data. Unfortunately the hierarchical information is not the easiest thing to read quickly and is even harder to incorporate into a report.

This is where the DiscoverXmlMetadata() function comes in handy. I wrote this function to use a syntax similar to XPath in order to extract certain nodes. By default the function lists all of the properties of the node that matches the specified path; however, you can also add a pipe character (|) after any node and list extra properties that you would like returned.

The following call will return a list of all the attribute relationships in the current database:

call assp.DiscoverXmlMetadata("\Database\Dimensions\Dimension|Name\Attributes\Attribute|Name,Usage\AttributeRelationships\AttributeRelationship")

And if you want to view the relationships for just a single dimension you can use the optional parameter to pass in a predicate in the same form that you would use in an SQL query (provided that you compile the code yourself or use a version later than the current 1.2 release, as I only recently added this filter parameter):

call assp.DiscoverXmlMetadata("\Database\Dimensions\Dimension|Name\Attributes\Attribute|Name,Usage\AttributeRelationships\AttributeRelationship", "DimensionName='Product'")
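The path syntax used in the calls above can be easier to follow once it is broken into its parts. The following is a minimal Python sketch of how such a string decomposes into node names and the extra properties requested after each pipe character. It is only an illustration of the syntax as described here, not the ASSP implementation, and the function name parse_metadata_path is made up for this example.

```python
def parse_metadata_path(path):
    """Split a backslash-delimited path into (node, extra_properties) pairs.

    Hypothetical helper for illustration only; it does not query a server.
    """
    steps = []
    for segment in path.strip("\\").split("\\"):
        # Anything after '|' is a comma-separated list of extra properties.
        node, _, props = segment.partition("|")
        extra = [p.strip() for p in props.split(",") if p.strip()]
        steps.append((node.strip(), extra))
    return steps

path = (r"\Database\Dimensions\Dimension|Name\Attributes\Attribute|Name,Usage"
        r"\AttributeRelationships\AttributeRelationship")

for node, extra in parse_metadata_path(path):
    print(node, extra)
```

Read this way, the call walks from the Database node down to each AttributeRelationship, and additionally returns the Name of each Dimension and the Name and Usage of each Attribute along the way.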


The 5 Basic Tenets of Sales 2.0

Posted on the November 25th, 2008. Read times

Source: Keep It Simple [link]

Today InsideCRM published a nice summary of Sales 2.0 and How it Will Improve Your Business. It includes a summary of the basic characteristics of Sales 2.0. They are:

  1. Sales 2.0 is about acceleration
  2. Sales 2.0 is about collaboration
  3. Sales 2.0 is about professionalization
  4. Sales 2.0 is about accountability (this is where sales analytics come in!)
  5. Sales 2.0 is about alignment

You can read the entire article here. If you want to go further, you can preorder the soon-to-be-released Sales 2.0 book here. Have a great Thanksgiving!
