Wrapping Up In The Baltic States
Source: Mark Rittman's Oracle Weblog [link]
Well that’s the end of my brief trip to the Baltic States,
with my seminar in Estonia finished today and one in Lithuania
completed earlier this week. The seminar in Estonia was held at
Oracle’s offices in Tallinn, with ten people attending this time and
thirteen in Vilnius. As with Utrecht, most of the delegates were from
Oracle partners, looking to evaluate Oracle’s new tools and see how
they could be used with their customers.
src="http://static.flickr.com/118/255631438_db0e3c5376.jpg"
height="375" width="500"/>
One thing I noticed from speaking to delegates from both
events, is that it’s much more common in these countries for customers
to build their own solutions, manually using scripts and PL/SQL
packages, than to buy tools such as Warehouse Builder or even
Discoverer. In relative terms, tools such as Warehouse Builder are
quite expensive compared to employing staff, and as there are a lot of
university educated, technically aware people over here, it’s usual
practice to code a solution from scratch than to use a tool. One
example I was given was an Estonian bank who wrote their own banking
system themselves, rather than buy in a packaged solution - using tools
or packaged applications is the exception rather than the rule,
although I guess as labour costs rise, and the tools become more
productive, this will change over time.
It was a good couple of days though, with a couple of people
in the audience who had used Oracle OLAP or were keen to see the new BI
Suite Enterprise Edition, and with everyone wanting to see the new
toolset; also, now that this was the third time I’d run the seminar,
all the demos went more or less OK (except the Excel integration into
BI Suite EE one, that’s always tricky) - to the point where I actually
need to add some more material to fill in the time. By far the most
popular subjects are Discoverer and OWB - I might end up adding some
more in on Discoverer integration into OWB (perhaps the derivation of
business areas from OWB modules), or perhaps something on Process Flows
and deployment, this seems to be the area everyone wants to know about.
After the seminar finished, I had a chance to look around
Tallinn Old Town. Even more picturesque than Vilnius, not quite as nice
weather but a geniune walled town with lots of old churches, towers,
spires and windy old roads. I don’t think anyone could fail to be
impressed with this sight:
src="http://static.flickr.com/110/255632178_88c03cd16b.jpg"
height="375" width="500"/>
More photos on flickr.
I’m on the 7.45am flight from Tallinn to Gatwick tomorrow, so I should
get back fairly early and have a chance to spend the day with Scott and
Isabella, and my long-suffering wife Janet. Being away for the week
puts quite a burden on her (our kids are 4 and 1.5 years old), so it’s
good that my next few seminars will only take me away for a couple of
days each week. I’m trying to avoid doing too much travelling other
than the overseas visits, to be honest it’s all getting a bit too much
at the moment and I need to scale things back a little. I’ve
actually thrown the towel in on the aggregation techniques paper for
the UKOUG conference, looking at my diary for the next month, there’s
no way I can get it completed properly and have some sort of semblance
of a family life; I’ve also decided to not put a paper in for the IOUG
event next year, restricting myself instead to OOW and ODTUG.
Other than that, I’m still mulling over what to do with the
website - as I mentioned the other day, a couple of scripts on my site
were using up to 80% of the server resources, which means that I’ve had
to turn comments off for the time being. The problem with the site is
that the sheer size of it - over 1000 postings at the moment, 250k+
page views per month, plus the rather shortsightedly set-up template
that makes it impossible to add new pages, means that it’s due for a
rather radical overhaul soon. My preference is to archive off the old
pages, or at least the best of them, put them together as an
“anthology” PDF and make them available for download, then start
something new, probably using Wordpress. I’d like to keep the old
material - it’s an interesting history of Oracle BI&W over the
last four years - but I’ve got that urge at the moment to do something
new. The other interesting factor is that I’m just about to sign a
contract with a publisher to do a book on Oracle BI Suite, which could
well have some influence on what sort of web presence I have going into
2007. We’ll have to see what happens over the next few weeks.
ELT, ETL and Terabytes of Volume
Source: Blog: Dan E. Linstedt [link]
I’ve been working recently with the new release of a popular ETL tool that now handles ELT. It is an interesting approach but one that bears discussion. The systems we are working with are very large in terms of data sets. I’m not talking about large historical data (although that’s one issue) I’m talking about large daily data feeds. This customer has Terabytes of information to deal with on a daily basis. So if you’re like me, and you’ve used ETL for years, and want to know more about ELT, read on. This is a technical entry in my blog. I’ll get back to MDM very shortly.
SSIS Add Package Bug
Source: The Small ISV [link]
I run into a bug using SQL Server Integration Services 2005 yesterday.
As part of testing some new code for BI Documenter I was trying to deploy a new package using the Enterprise Manager and received the following message:
Invalid access to memory location. (Exception from HRESULT: 0×800703E6)
This error would occur regardless …
Green-field to Brown-field BI Environments – metadata analysis holds the key
Source: Cyril on Business Intelligence [link]
Our BI systems are getting on in years. What was an appealing virgin territory to the BI enthusiast has become a difficult pre-teen. Maybe bringing up BI systems is like raising children – every stage is the worst. Client executives are getting more aware of what’s possible as well and, of course, the data is still messy.
But the biggest challenge could well be the need to renovate existing systems, rather than have the luxury of starting afresh. Welcome to the brown-field.
I outlined the metadata complexity issue for BI analysts in my September 21 post. Earlier, in my June 30 post, I described the stages and tasks involved in a bottom-up approach to BI system design. I opined that the bottom-up, or rapid development, methods will remain dominant in the BI systems space for the foreseeable future. Now I am attempting to bring these two issues together.
To recap, with some adjustments to fit my current convictions and practice, the bottom-up approach to BI system development requires answers to the following queries:
1. Where are we now? E.g.
- What is the current, synchronized, state of the data warehouse and BI tables: the SQL database documentation, ETL (covering all the space from source transaction system to data warehouse), cubes and dimensions?
- What reports and query specifications are out there, what do they contain?
- What are the outstanding BI specification requests and user complaints/suggestions?
2. What is stopping us perform better in our BI reporting? E.g.
- Bad source data content?
- Inadequate ETL packages corrupting data warehouse content?
- Under-utilized data?
- Unused, wasted, data collection?
- Unavailable data to satisfy specified requirements?
3. What can we do to perform better? E.g.
- Repair the data sources
- Find and fix the ETL errors and omissions, starting with the most dysfunctional
- Check with the client executive base to see if the under utilized data should be reported more or better
- Review the available data with client executives to improve satisfaction
- Design new data warehouse content and associated reporting where possible and desirable (Of course, in this business everything is desirable, only a few things are essential!)
4. Prioritize the work schedule and get going – probably in a series of iterations with highest/fastest ROI first.
I know this is obvious, motherhood and flag, stuff to the experienced BI analyst. However, it is often not understood – I believe – that we are completely dependent on the stage 1 being done right. We must have comprehensive, accurate, up-to-date, access to the synchronized metadata. Further, we need to be able to explore this metadata, its relationships and sensitivities.
Iterative BI design is impossible if we cannot document the metadata adequately. Green-field sites are all migrating to the evolutionary mode as they age, and metadata is king of the brown-field BI environment. Once we have this documentation and analysis capability the rest of the project is fairly straightforward.
Documenting the relatively narrow BI software context isn’t enough; we must be able to reach out to the source data locations wherever possible.
Plus, passive documentation itself is also short of the full whack. It’s like being in the supermarket without the meal plan and recipes to tell us what we need to buy.
I reckon we need to have a “what-if” capability to allow us to explore the metadata world as documented, checking on data existence, link existence and integrity; and reviewing the Drill-down and Drill-through capabilities.
Of course, this capability is also very useful in the virgin, green-field site as well, but we usually have more flexibility to maneuver then.
There are many software documentation tools on the market, and I leave you, or others, to evaluate them for this purpose. However, it is my view that a more fully featured tool is required than those currently available.
To provide an evaluation yardstick, and to act as a design guide for my specific software design purposes, I’ve put together my checklist for a metadata repository tool that serves BI analysts in both their support and design roles. It also avoids the problems, e.g. the dog’s breakfast syndrome, raised in my last post and has the required “what-if” exploring capability.
My repository checklist is based on the Microsoft products, but others would be similar:
- Take regular snapshots of the relational database contents relevant to BI – Tables, Columns, etc.
- Also, snapshot the Integration Services packages: Data sources, Data Flows, Data destinations, Tasks
- Ditto the Analysis Services tables: Cubes, Dimensions, Report Specifications
- Ditto the Report Services content; Datasources, Datasets, parameters, drill-through reports
- Facilitate HTML and diagrammatic documentation for each of these snapshots as needed, with extended diagramming capability to capture design ideas and extensions.
- Provide a server application based analysis tool that:
- Tracks data lineage from SSRS, through SSAS and SSIS to the relational, legacy or other data source, and displays the transformations that occur on the way.
-
Provides cross component impact analysis capabilities, i.e. what is affected if I change the definition of this field – in the database, in the cube, in the report?
It’s taking a while to build this, but even the current status should be useful to the designer. If you’re interested you can monitor the progress and try it out yourself at www.bidocumenter.com. There is a new release due next week.
Automated Testing for Datamarts
Source: Mark Garner's Business Intelligence Blog [link]
Another tenet of Agile development is automated testing of code. We can apply this idea to our datamart as well.
What kinds of things need to be tested in a data warehouse? We don’t need to test transactions as this is the responsibility of the ETL system. What we do need to test is the quality of the data in the mart. This includes both measures in the fact table and data in the dimension tables.
There are two different times that we need to test our datamart. We need to test before our ETL load and also after. This allows us to first test the veracity of the structure of the datamart. We can then run the regular or standard ETL process into the fact or dimension table and then re-run the test with the new expected results. These two sets of tests should be run on known and static data. One set scripted out as insert statements for the fact table or dimension table and another set in an ODS or flat file for use in the ETL process run that is tested.
Why would we want to all the trouble of creating runnig all these tests? Having automated tests allows us to refactor and expand the datamart with more confidense. As we move through more and more sprints or development iterations, our test set grows giving us more confidense that new development doesn’t have adverse effects on already deployed functionality.
Some of the obvious tests we might run include sums of measures in the fact table sliced by many different dimensions, row counts in the fact table according to dimension slices and row counts of dimension tables by different dimension attributes.
The most important thing about testing the datamart is that the number of tests that are run during the testing phase continues to grow. As time passes, we should have more tests testing more functionality.
The tool that I have found that comes the closest to being able to perform these types of tests on SQL Server is called TSQLUnit. This tool places about 3 or 4 tables and 5 or 6 stored procedures in your database. You then call one of the stored procedures to run the tests. It loops through the database catalog and calls stored procedures with the prefix of ‘tsu_’. These are stored procedures that are written to perform the tests. The test results are then stored in one of the tables.
While this to me isn’t ideal, it is a really good start to testing a datamart. What would be better, in my opinion, would be a console application that does all the testing out of an XML file that defines queries and answers and runs all the tests and then reports answers.
Let me know what you think about TSQLUnit or comments about tests for a datamart.
SSIS Add Package Bug
Source: The Small ISV [link]
I run into a bug using SQL Server Integration Services 2005 yesterday.
As part of testing some new code for BI Documenter I was trying to deploy a new package using the Enterprise Manager I was receiving the following message:
Invalid access to memory location. (Exception from HRESULT: 0×800703E6)
This error would occur …
Revisiting old code
Source: Pete-s random notes [link]
Beware of legacy code that includes legacy workarounds for long-fixed bugs.
Partition exchange loading and column transposing …
Source: oramoss oracle [link]
I came across an interesting issue yesterday whereby a partition exchange load routine wasn’t working for my colleague Jon. The ensuing investigation seems to point at the possibility of a bug, whereby after exchanging a partition Oracle may have inadvertently transposed columns in the target table. I’ve created a test script to illustrate the problem and logged an SR with Metalink to see what they make of it.
In light of my recent post on DBMS_APPLICATION_INFO I’d better state that we found the problem on 10.2.0.2 on HP-UX and then tested it against 9.2.0.6 on HP-UX and found the same results so it seems the issue has been there a while in terms of Oracle releases.
The investigation started down a different avenue - as they often do - with the inability to complete a partition exchange due to the Oracle error:
ORA-14098: index mismatch for tables in ALTER TABLE EXCHANGE PARTITION
Reissuing the exchange partition command with the EXCLUDING INDEXES clause allowed the exchange to proceed so we figured there was definitely a problem with the indexes.
Next we tried dropping all the indexes on both sides of the exchange and recreating them…but when we retried the exchange partition it failed again with the same error.
Right, drastic measures time…we dropped all the indexes on both sides and ran the exchange which, with no indexes around, worked fine (with INCLUDING INDEXES clause in place as it would normally be on this process).
Next we proceeded to add the indexes one by one, starting with the b trees for the primary key and unique keys. This worked without a problem so we moved on to the bitmap indexes which seemed to be fine for the first few but then after adding a few of them it suddenly stopped working. Through a process of trial and error we were able to determine that if two specific bitmap indexes were on the tables then the exchange would fail with ORA-14098.
So, what was so special about those two bitmaps and the columns they were on. We selected the details from dba_tab_columns for the two columns involved in these bitmap indexes and realised that the only difference was that they were at a different column position (COLUMN_ID) on the tables - which begged the question “If these two tables are not quite the same then how can we be allowed to exchange them?”
Well, I guess, in our case, we were able to say that the two objects being exchanged had the same columns but just not necessarily in the same order (those of you who love classic comedy will I’m sure be recalling Morecambe and Wise with Andre Previn right about now)…should this mean we can swap the tables or not ?
To try and illustrate the scenario I built a test script…the output from a run follows…sorry it’s a bit long…but it does prove some useful conclusions (I think)
> @test_pel_bug_mismatch_columns> REM Drop the tables…> DROP TABLE jeff_test/ Table dropped. > DROP TABLE jeff_test_ptn/ Table dropped. > REM Create the tables…> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL 2 ,column1 NUMBER NOT NULL 3 ,column3 NUMBER NOT NULL 4 ,column2 NUMBER NOT NULL 5 ,column4 NUMBER NOT NULL 6 ) 7 PARTITION BY RANGE (partition_key_col) 8 ( 9 PARTITION p001 VALUES LESS THAN (1000001) 10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE) 11 ) 12 / Table created. > CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL 2 ,column1 NUMBER NOT NULL 3 ,column2 NUMBER NOT NULL 4 ,column3 NUMBER NOT NULL 5 ,column4 NUMBER NOT NULL 6 ) 7 / Table created. > REM Populate the table…> INSERT /*+ APPEND */ 2 INTO jeff_test_ptn(partition_key_col 3 ,column1 4 ,column2 5 ,column3 6 ,column4 7 ) 8 SELECT idx 9 , MOD(idx,2) 10 , MOD(idx,3) 11 , MOD(idx,4) 12 , MOD(idx,2) 13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT 2 / Commit complete. > INSERT /*+ APPEND */ 2 INTO jeff_test(partition_key_col 3 ,column1 4 ,column2 5 ,column3 6 ,column4 7 ) 8 SELECT idx 9 , MOD(idx,2) 10 , MOD(idx,5) 11 , MOD(idx,6) 12 , MOD(idx,2) 13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT 2 / Commit complete. > REM Count the rows in COLUMN2 and COLUMN3 for both tables…>> REM Should see 3 rows…all NUMBERS> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2; COLUMN2 COUNT(1)———- ———- 1 12000 2 12000 0 12000 3 rows selected. >> REM Should see 4 rows…all NUMBERS> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3; COLUMN3 COUNT(1)———- ———- 1 9000 2 9000 3 9000 0 9000 4 rows selected. >> REM Should see 5 rows…all NUMBERS> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2; COLUMN2 COUNT(1)———- ———- 1 7200 2 7200 4 7200 3 7200 0 7200 5 rows selected. >> REM Should see 6 rows…all NUMBERS> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3; COLUMN3 COUNT(1)———- ———- 1 6000 2 6000 4 6000 5 6000 3 6000 0 6000 6 rows selected. >> REM Now lets try and swap the partition and the table…> ALTER TABLE jeff_test_ptn 2 EXCHANGE PARTITION p001 3 WITH TABLE jeff_test 4 INCLUDING INDEXES 5 WITHOUT VALIDATION 6 / Table altered. >> REM Surprisingly, it lets us do the above operation without complaining.> REM Even worse…it transposes the values in COLUMN2 and COLUMN3…>> REM Should see 5 rows…but we see 6 rows> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2; COLUMN2 COUNT(1)———- ———- 1 6000 2 6000 4 6000 5 6000 3 6000 0 6000 6 rows selected. >> REM Should see 6 rows…but we see 5 rows> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3; COLUMN3 COUNT(1)———- ———- 1 7200 2 7200 4 7200 3 7200 0 7200 5 rows selected. >> REM Should see 3 rows…but we see 4 rows> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2; COLUMN2 COUNT(1)———- ———- 1 9000 2 9000 3 9000 0 9000 4 rows selected. >> REM Should see 4 rows…but we see 3 rows> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3; COLUMN3 COUNT(1)———- ———- 1 12000 2 12000 0 12000 3 rows selected. >> REM Now, lets try again but with COLUMN2 and COLUMN3> REM being of different datatypes…>> REM Drop the tables…> DROP TABLE jeff_test 2 / Table dropped. > DROP TABLE jeff_test_ptn 2 / Table dropped. > REM Create the tables…> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL 2 ,column1 NUMBER NOT NULL 3 ,column3 NUMBER NOT NULL 4 ,column2 DATE NOT NULL 5 ,column4 NUMBER NOT NULL 6 ) 7 PARTITION BY RANGE (partition_key_col) 8 ( 9 PARTITION p001 VALUES LESS THAN (1000001) 10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE) 11 ) 12 / Table created. > CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL 2 ,column1 NUMBER NOT NULL 3 ,column2 DATE NOT NULL 4 ,column3 NUMBER NOT NULL 5 ,column4 NUMBER NOT NULL 6 ) 7 / Table created. > REM Populate the table…> INSERT /*+ APPEND */ 2 INTO jeff_test_ptn(partition_key_col 3 ,column1 4 ,column2 5 ,column3 6 ,column4 7 ) 8 SELECT idx 9 , MOD(idx,2) 10 , SYSDATE + MOD(idx,3) 11 , MOD(idx,4) 12 , MOD(idx,2) 13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT 2 / Commit complete. > INSERT /*+ APPEND */ 2 INTO jeff_test(partition_key_col 3 ,column1 4 ,column2 5 ,column3 6 ,column4 7 ) 8 SELECT idx 9 , MOD(idx,2) 10 , SYSDATE + MOD(idx,5) 11 , MOD(idx,6) 12 , MOD(idx,2) 13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT 2 / Commit complete. > REM Count the rows in COLUMN2 and COLUMN3 for both tables…>> REM Should see 3 rows…all DATES> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2; COLUMN2 COUNT(1)——————– ———-27-SEP-2006 11:49:03 1200028-SEP-2006 11:49:03 1200026-SEP-2006 11:49:03 12000 3 rows selected. >> REM Should see 4 rows…all NUMBERS> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3; COLUMN3 COUNT(1)———- ———- 1 9000 2 9000 3 9000 0 9000 4 rows selected. >> REM Should see 5 rows…all DATES> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2; COLUMN2 COUNT(1)——————– ———-30-SEP-2006 11:49:03 720027-SEP-2006 11:49:03 720028-SEP-2006 11:49:03 720029-SEP-2006 11:49:03 720026-SEP-2006 11:49:03 7200 5 rows selected. >> REM Should see 6 rows…all NUMBERS> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3; COLUMN3 COUNT(1)———- ———- 1 6000 2 6000 4 6000 5 6000 3 6000 0 6000 6 rows selected. >> REM Now lets try and swap the partition and the table…> REM It will fail with error…> REM ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION>> ALTER TABLE jeff_test_ptn 2 EXCHANGE PARTITION p001 3 WITH TABLE jeff_test 4 INCLUDING INDEXES 5 WITHOUT VALIDATION 6 /ALTER TABLE jeff_test_ptn*ERROR at line 1:ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION >> REM So, it only fails when the columns have the same datatypes.>> REM Now, lets say they are the same datatype and look at how bitmap indexes> REM are affected in the PARTITION EXCHANGE process…>> REM First lets recreate the tables with COLUMN2 and COLUMN3> REM having the same datatype…> REM Drop the tables…> DROP TABLE jeff_test 2 / Table dropped. > DROP TABLE jeff_test_ptn 2 / Table dropped. > REM Create the tables…> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL 2 ,column1 NUMBER NOT NULL 3 ,column3 NUMBER NOT NULL 4 ,column2 NUMBER NOT NULL 5 ,column4 NUMBER NOT NULL 6 ) 7 PARTITION BY RANGE (partition_key_col) 8 ( 9 PARTITION p001 VALUES LESS THAN (1000001) 10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE) 11 ) 12 / Table created. > CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL 2 ,column1 NUMBER NOT NULL 3 ,column2 NUMBER NOT NULL 4 ,column3 NUMBER NOT NULL 5 ,column4 NUMBER NOT NULL 6 ) 7 / Table created. > REM Populate the table…> INSERT /*+ APPEND */ 2 INTO jeff_test_ptn(partition_key_col 3 ,column1 4 ,column2 5 ,column3 6 ,column4 7 ) 8 SELECT idx 9 , MOD(idx,2) 10 , MOD(idx,3) 11 , MOD(idx,4) 12 , MOD(idx,2) 13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT 2 / Commit complete. > INSERT /*+ APPEND */ 2 INTO jeff_test(partition_key_col 3 ,column1 4 ,column2 5 ,column3 6 ,column4 7 ) 8 SELECT idx 9 , MOD(idx,2) 10 , MOD(idx,5) 11 , MOD(idx,6) 12 , MOD(idx,2) 13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT 2 / Commit complete. >> REM Now lets create a bitmap index on COLUMN1 on both tables…> CREATE BITMAP INDEX jtp1 ON jeff_test_ptn(column1) LOCAL 2 / Index created. > CREATE BITMAP INDEX jt1 ON jeff_test(column1) 2 / Index created. >> REM …and now try PARTITION EXCHANGE…> ALTER TABLE jeff_test_ptn 2 EXCHANGE PARTITION p001 3 WITH TABLE jeff_test 4 INCLUDING INDEXES 5 WITHOUT VALIDATION 6 / Table altered. > REM It works fine.>> REM Now lets create a bitmap index on COLUMN4 on both tables…> CREATE BITMAP INDEX jtp4 ON jeff_test_ptn(column4) LOCAL 2 / Index created. > CREATE BITMAP INDEX jt4 ON jeff_test(column4) 2 / Index created. >> REM …and now try PARTITION EXCHANGE…> ALTER TABLE jeff_test_ptn 2 EXCHANGE PARTITION p001 3 WITH TABLE jeff_test 4 INCLUDING INDEXES 5 WITHOUT VALIDATION 6 / Table altered. >> REM It works fine.>> REM Now lets create a bitmap index on COLUMN2 on both tables…> CREATE BITMAP INDEX jtp2 ON jeff_test_ptn(column2) LOCAL 2 / Index created. > CREATE BITMAP INDEX jt2 ON jeff_test(column2) 2 / Index created. >> REM …and now try PARTITION EXCHANGE…> ALTER TABLE jeff_test_ptn 2 EXCHANGE PARTITION p001 3 WITH TABLE jeff_test 4 INCLUDING INDEXES 5 WITHOUT VALIDATION 6 /WITH TABLE jeff_test *ERROR at line 3:ORA-14098: index mismatch for tables in ALTER TABLE EXCHANGE PARTITION
ht, now what can we conclude from the above ?
1. If you don’t have indexes on your table then you can do an exchange even if the columns are in a different order - but you will silently transpose the columns. No errors are given and I think this is a bug.
2. You will only hit the problem in 1 if the columns which are transposed are of the same datatype - I’ve not checked for scale/precision/length - it may be, for example, that a VARCHAR2(20) and VARCHAR2(10) being transposed would raise an ORA-14097 error.
3. If you have indexes on columns which are transposed then the exchange will fail with ORA-14098. I don’t know whether this is a bitmap index specific thing as I’ve not tested it any further.
4. If you only have indexes on columns which are not transposed then you can do the exchange and there will be no errors - but your data is obviously transposed silently.
If It’s Wednesday, It Must Be Tallinn
Source: Mark Rittman's Oracle Weblog [link]
Well it’s stage two of my epic Seminar tour around the Baltic
States, and about an hour or so ago I arrived in Tallinn for tomorrow’s
session. Flying out of Vilnius was interesting, it’s quite an
old-fashioned airport that appears to still be run by the Russian
military - the only airport I know where you could starve to death, and
none of the foreign exchange booths (in a different building to
departures, and none after passport control) had an Estonian money…
Still, joking apart, it was a great two days in Lithuania, some good attendees and a
nice hotel to stay in courtesy of Oracle.
The seminar went pretty well (according to the feedback forms,
anyway) although I think the language thing was a bit of an issue -
I’ve provided fairly copious notes though so if you couldn’t understand
me (an easy thing to happen), at least I’ve written down more or less
everything in the notes (all 250+ pages of them) I give out.
Interestingly, unlike the first venue, Holland, none of the attendees
were particularly interested in the new BI Suite Enterprise Edition,
most of the interest was in data warehousing and Oracle Warehouse
Builder. I think tomorrow, if it’s the same in Tallinn, I’ll probably
drop one of the two Enterprise Edition sessions and do an extra OWB
one, perhaps go into more detail around data modelling / deployment /
mapping and so on. Let’s see how it goes.
Anyway, better get on with ironing a shirt and working out
where the local Oracle training centre is. Nothing much on the TV -
it’s all BBC World (v. boring), CNN (slightly more interesting),
Euronews or National Geographic - funny how you miss the rubbish TV at
home when you’re away..
Finally, I’ve had to disable both comments, and searching, on
this site for the time being as apparently the two perl scripts
(mt-comments.cgi, mt-search.cgi) that provided this function were
taking up 80% of the shared server resources, and my hosting contract
only allows at most 10% server usage. So, whilst I think of what else
to do - and I’m thinking pretty radical here, perhaps archiving off old
posts into a PDF document, and starting afresh with Wordpress - I’m
afraid you’ll just have to drop
me a line if you want to get in touch.
If It’s Wednesday, It Must Be Tallinn
Source: Mark Rittman's Oracle Weblog [link]
Well it’s stage two of my epic Seminar tour around the Baltic
States, and about an hour or so ago I arrived in Tallinn for tomorrow’s
session. Flying out of Vilnius was interesting, it’s quite an
old-fashioned airport that appears to still be run by the Russian
military - the only airport I know where you could starve to death, and
none of the foreign exchange booths (in a different building to
departures, and none after passport control) had an Estonian money…
Still, joking apart, it was a great two days in Lithuania, some good attendees and a
nice hotel to stay in courtesy of Oracle.
The seminar went pretty well (according to the feedback forms,
anyway) although I think the language thing was a bit of an issue -
I’ve provided fairly copious notes though so if you couldn’t understand
me (an easy thing to happen), at least I’ve written down more or less
everything in the notes (all 250+ pages of them) I give out.
Interestingly, unlike the first venue, Holland, none of the attendees
were particularly interested in the new BI Suite Enterprise Edition,
most of the interest was in data warehousing and Oracle Warehouse
Builder. I think tomorrow, if it’s the same in Tallinn, I’ll probably
drop one of the two Enterprise Edition sessions and do an extra OWB
one, perhaps go into more detail around data modelling / deployment /
mapping and so on. Let’s see how it goes.
Anyway, better get on with ironing a shirt and working out
where the local Oracle training centre is. Nothing much on the TV -
it’s all BBC World (v. boring), CNN (slightly more interesting),
Euronews or National Geographic - funny how you miss the rubbish TV at
home when you’re away; still, nothing will be quite as eyebrow-raising
at the pay TV channels I had to quickly page through in Utrecht whilst
trying to get to BBC1.
Wait Is on for Next-Gen Dashboards
Source: BI this week [link]
Are next-gen dashboards too good to be true? It all depends on what you mean by “dashboard,” industry watchers say.
TDWI Radio News: Spreading the Spreadsheet
Source: BI this week [link]
If spreadmarts cloud that coveted single version of the truth, can e.Spreadsheet let the sunshine in?
Survey Says: Microsoft a BI Force to be Reckoned With
Source: BI this week [link]
If both Gartner and IDC say a trend’s a trend, then it must be a trend, right?
BEA Revamps, Re-Launches EII tool
Source: BI this week [link]
Far from being a BEA-only deliverable, ALDSP is a bona-fide EII contender, analysts say.
Spreadsheet Add-in - Refreshing multiple queries
Source: Oracle Business Intelligence Blog [link]
This is a commonly requested new feature in the upcoming OracleBI Spreadsheet Add-in 10.1.2.2.
A new menu option now makes it easier to work with workbooks that contain more than one query.
As you might know, it is possible to create more than one query in a worksheet in a workbook. Each query can have it’s own connection, so it is even possible to run against more than one database from within
Using a trigger to grant access on new objects in …
Source: oramoss oracle [link]
End users on our warehouse log in using their own, personal oracle account and have the ability to create objects in their own schema - so they can test out analysis ideas without harming anyone else. I had to tune one of the pieces of SQL a user was running the other day and it involved links between some tables the user had created in their own schema and those in the public warehouse schemae and unfortunately, when I went to do get an execution plan using EXPLAIN PLAN it stopped me in my tracks because I couldn’t see the objects in the end user schema - I don’t have SELECT ANY TABLE privileges in my personal Oracle account.
So, the options I thought of were:
1. Get SELECT ANY TABLE privilege granted to me so that I could do the tuning for this query and, indeed, any query that I will come across. Not something the security people would be happy about which I can understand.
2. Get the end user to grant me SELECT access privilege on the objects - also possible but a bit painful if there are lots of objects and I’d need to go through this hassle every time I had a tuning requirement
3. Get the login details of the user and log in as them - a security minefield and not really practical.
So, no brilliant solutions there, until my teamleader Tank suggested that we create a trigger AFTER CREATE ON SCHEMA that would do the necessary grant(s). Phil from the DBA team had a look at it and found some code from Tom to do this and it works great.
It interested me as a solution because I remember reading recently on Tom’s blog how much he hates triggers and that he’d love to have them removed from the database. He did say that “triggers are so abused - and used so inappropriately” but there are some occasions when they are useful and perhaps this is a good example.
Why you should skim read deep stuff first!
Source: oramoss oracle [link]
Seems that my recent posting about constraint generated predicates is already thoroughly covered in Jonathans latest book (Ch 6, pp 145-6)…including the bit about specifying a “NOT NULL” predicate to get around the issue with when the column in question is declared as “NULL” and not “NOT NULL”.
Doug said to me recently that it was one of those books you probably should skim read first in order to get your brain used to what content is in it…so when you need to know about a particular topic you’ll (hopefully) remember that it was covered somewhere in such and such book…I think he’s probably spot on there.
DBMS_APPLICATION_INFO and V$SESSION
Source: oramoss oracle [link]
Tom noted on his blog recently that DBMS_APPLICATION_INFO is something we should be using in our code to provide instrumentation and monitoring information - something I fully agree with and have been doing for some time now.
The post showed how, if we’ve called DBMS_APPLICATION_INFO in our code, we can retrieve the values of the MODULE/ACTION/PROGRAM_ID by querying V$SQL with a nice example to illustrate the point.
After reading it I recalled that some of this kind of information is available on V$SESSION from 10.2.0.2 (Thanks to Yas for identifying the specific version) so I created a simple supplementary example to illustrate this:
First I created a package - I’m using a package in order to demonstrate a secondary point which will become clearer as we progress…
create or replace package pk1 as procedure p3; procedure p4;end pk1;/
create or replace package body pk1 asprocedure p3 isbegin dbms_application_info.set_module( ‘PK1 module’, ‘Startup’ ); dbms_application_info.set_action( ‘Running P3′ ); for x in ( select * from dual look_for_me_1 ) loop null; end loop; for x in ( select * from dual look_for_me_2 ) loop null; end loop; dbms_lock.sleep(20);end p3;
procedure p4 isbegin dbms_application_info.set_module( ‘PK1 module’, ‘Startup’ ); dbms_application_info.set_action( ‘Running P4′ ); for x in ( select * from dual look_for_me_2 ) loop null; end loop; dbms_lock.sleep(20);end p4;
end pk1;/
I’ve slightly changed the calls to DBMS_APPLICATION_INFO to show how the SET_ACTION works and also because if I don’t do it that way, the module name gets set by the first subprogram called in PK1 and should therefore not be specific to a subprogram.
Now lets get back the object name and ID - I’ve specifically excluded public synonyms in order to not cloud things:
select object_name, object_type, object_idfrom all_objectswhere object_name IN(’PK1′,’DBMS_LOCK’)and owner != ‘PUBLIC’ — don’t get public synonyms/
This gives us something like:
OBJECT_NAME OBJECT_TYPE OBJECT_ID———— ————- ———DBMS_LOCK PACKAGE BODY 4368 DBMS_LOCK PACKAGE 4265 PK1 PACKAGE BODY 76404 PK1 PACKAGE 76403
Now in another sql*plus session I can run the P4 procedure in package PK1 followed by the P3 procedure:
exec pk1.p4;exec pk1.p3;
After both calls complete I switch to my main session and issue a slightly modified version of the query Tom wrote:
select sql_text, action, module, program_id, program_line#from v$sqlwhere sql_text like ‘% LOOK_FOR_ME_%’ escape ‘’/
SQL_TEXT ACTION MODULE PROGRAM_ID PROGRAM_LINE#——————————– ———- ———- ———- ————-SELECT * FROM DUAL LOOK_FOR_ME_1 Running P3 PK1 module 76404 7 SELECT * FROM DUAL LOOK_FOR_ME_2 Running P4 PK1 module 76404 20
Which gives us the two cursors that have been run and as Tom indicated would happen, in my example, the LOOK_FOR_ME_2 cursor is first issued in the call to P4 so the Action is “Running P4″ even though the cursor is also issued by P3 subsequently.
Note that the PROGRAM_ID has the OBJECT_ID of the PK1 Package Body in V$SQL.
Now I can rerun the procedures again in the same order and because there is a small wait during the processing I can switch back to my main session to issue the following query against V$SESSION:
select plsql_entry_object_id, plsql_entry_subprogram_id, plsql_object_id, plsql_subprogram_id, module, actionfrom v$sessionwhere sid = 150 — this was the SID of my other session!/
During the call to P4 I see this:
PLSQL_ENTRY_OBJECT_ID PLSQL_ENTRY_SUBPROGRAM_ID PLSQL_OBJECT_ID PLSQL_SUBPROGRAM_ID MODULE ACTION ——————— ————————- ————— ——————- ———- ———-76403 2 4265 8 PK1 module Running P4
…and then whilst P3 is running I see this:
PLSQL_ENTRY_OBJECT_ID PLSQL_ENTRY_SUBPROGRAM_ID PLSQL_OBJECT_ID PLSQL_SUBPROGRAM_ID MODULE ACTION ——————— ————————- ————— ——————- ———- ———-76403 1 4265 8 PK1 module Running P3
So, the PLSQL_ENTRY* columns are showing that we’re running the second subprogram of the PLSQL package with OBJECT_ID 76403 - which is the PK1 Package Spec - so slightly different to the example from Tom with V$SQL - but understandable information nevertheless.
From the other two columns, we can also see the currently executing subprogram - 8 in the package with OBJECT_ID 4265 - which is the DBMS_LOCK Package Spec. If we DESCribe DBMS_LOCK we would see that the 8th subprogram is SLEEP - which tallies up with what we’re executing.
We can also see these subprogram_id values in the ALL_PROCEDURES view:
select object_name, object_type, subprogram_id, procedure_namefrom all_procedureswhere object_name IN(’PK1′,’DBMS_LOCK’)order by object_name, subprogram_id/
OBJECT_NAME OBJECT_TYPE SUBPROGRAM_ID PROCEDURE_NAME———– ———– ————- —————DBMS_LOCK PACKAGE 0 DBMS_LOCK PACKAGE 1 ALLOCATE_UNIQUEDBMS_LOCK PACKAGE 2 REQUESTDBMS_LOCK PACKAGE 3 REQUESTDBMS_LOCK PACKAGE 4 CONVERTDBMS_LOCK PACKAGE 5 CONVERTDBMS_LOCK PACKAGE 6 RELEASEDBMS_LOCK PACKAGE 7 RELEASEDBMS_LOCK PACKAGE 8 SLEEPPK1 PACKAGE 0 PK1 PACKAGE 1 P3PK1 PACKAGE 2 P4
So, V$SQL gives us useful information but we can also use V$SESSION in addition to get information by session including the subprogram within a package that we are executing.
As I said at the start, I’ve been using this for years and I’ve no idea why I don’t see it’s use more often. Hopefully the “Tom” effect will help to resolve this!
Compression and ITL
Source: oramoss oracle [link]
I’ve been a bit quiet on the blogging front lately - I’ve been trying to work out what happens inside the database blocks during compression as well as trying to run some benchmarking stuff based on Doug Burns latest parallel execution presentation.
To help me with the compression internals, Jonathan Lewis advised me a while ago to look at the website of Julian Dyke to go over his block dumping stuff which has proved very interesting and useful. I also found a presentation Julian did on compression based on Oracle 9iR2 which had some stuff I’d covered in my last presentation as well as plenty more detailed stuff as you’d expect from Julian.
Julian has a great picture of the structure and breakdown of the inside of a compressed block in his presentation which I’ve been trying to explore in more detail by testing with different block sizes and data. One of the things that has come to light is that there are quite a few factors involved in determining the compression that will be realized when using data segment compression. My original thinking was that the following factors would somehow play a part:
- Block size
- Number of rows
- Length of data values
- Number of repeats
- Ordering of data being pushed into the target data segment
But after looking at the picture of the block structure in Julians presentation it appears that the following could also play a part since they affect the amount of overhead in each block – which in turn affects the space left for data:
- ITL (Interested Transaction Lists – set by value of INITRANS on table create)
- Number of columns in the data segment
I’m currently developing some test scripts to go with the presentation which will show how each of these factors affects the level of compression achieved – might make their way into the presentation in the form of graphs just to illustrate the point.
I did have a TAR (sorry SR!) open with Oracle to see if they’d give me more details on the actual algorithm that is used during compression but after much delay and deliberation they (development) decided it was something they didn’t want to divulge.
Funniest thing I’ve found so far is that Julian shows the compressed block header for a block in a 9.2.0.1 database clearly showing “9ir2” literals – you’d think they’d change when you move to “10gr2” wouldn’t you ? Think again – it still shows “9ir2” in the 10gr2 block dump trace files!
Au revoir - ORA-01034
Source: oramoss oracle [link]
It’s been fun while it lasted!
I’m giving up the blogging for now…my personal family life gets busier by the day and I want to focus on my wife and son, Jude now.
I’ll still be presenting at the end of April at the Northern Server Technology Day but that will also be my last presentation for some time.
For those who have read my blog - thanks, I’ve enjoyed writing it and hopefully helping some of you along the way - it’s been a very educational experience thus far.
Take care
Jeff