Show Them the Money - Editorial - CIO
Source: OLAP/BI/IM stuff [link]
If this is what your CIO is reading, there’s a great opportunity to contribute as a DBA.
MDM and Consolidation of Data Sets
Source: Blog: Dan E. Linstedt [link]
MDM data often is dispersed across the organization. This begs the question: how can the MDM be a viable asset to the business base? Is the Master Data reused throughout the organization the way it should be? Is it defined in the right context (Master Metadata)? But technically, Master Data should be consolidated into a single global data center. MDM is not a tool, not a toy, not a process - it’s a way of doing business that includes tools, best practices, people, governance, metadata and single answers.
The nature of Appliances in the DW Space
Source: Blog: Dan E. Linstedt [link]
Many are now discussing the “Appliance” for Enterprise Data Warehousing. I’ve blogged on this in the past about definitions (or lack thereof), and what it means to be an appliance. I ran a google search on “appliance+data warehousing” and came up with a few vendors, and their thoughts on what an appliance is. We’ll try to sort through the claims in this entry - and drive it back to a simple level of understanding.
New Code Generator for Oracle Data Miner
Source: Oracle Data Mining and Analytics [link]
The following was announced today:
The beta release of Oracle Data Miner, a graphic user interface for Oracle Data Mining Release 10.1 and above, adds Oracle Data Miner PL/SQL Code Generator and is now available on OTN. The ODM PL/SQL Code generator enables companies…
[ This is a content summary only. Visit my website for full links, other content, and more! ]
New Code Generator for Oracle Data Miner
Source: Oracle Data Mining and Analytics [link]
The following was announced today:
The beta release of Oracle Data Miner, a graphic user interface for Oracle Data Mining Release 10.1 and above, adds Oracle Data Miner PL/SQL Code Generator and is now available on OTN. The ODM PL/SQL Code generator enables companies…
[ This is a content summary only. Visit my website for full links, other content, and more! ]
Making IT Work: When Technology Projects Break Down and What to do About It, Part 1
Source: OLAP/BI/IM stuff [link]
Windows Live Writer
Source: Mark Garner's Business Intelligence Blog [link]
I’m going to try using Windows Live Writer and see how it goes. It will be nice to see what happens when I publish.
Let’s try some fonts and sizes and stuff.
YTD Alternatives part 2
Source: Pete-s random notes [link]
A bit more testing on YDT by table function - this time using some more realistic tests
Backups are a Good Thing
Source: Clickstream [link]
Company taken out by bad database backups. My guess is that they didn’t test the other half, wherein things are restored from backup. Reminds me of the joke song that was circulating a few years ago:
Yesterday,
All those backups seemed a waste of pay.
Now
Analytics, BI and ECM should be ONE.
Source: BLOX Chronicles [link]
Internet Explorer and Firefox are key access points for BI users. It’s smart move, given that its easier to deploy and manage in a corporate environment and easy to migrate in case the company decides to extend it in the Internet or thru a private network to a satellite branch.
The success of Google Onebox is partly due to the complexity of getting the information from a myrad of this expensive and bloated Business Intelligence tools. Google Onebox allows users to search (with data security in mind) all the data available in and outside the organization.
ECM (enterprise content management) is another inovation that is catching up. With the abundance of blogs nowadays, we might have forgotten that behind it is a very user-friendly ‘content management system concept’, just like Blogger.
Since IBM recently acquired Filenet for $16bn, I am looking forward to the possibility of integrating analytics, business intelligence and content management systems into one. Though I doubt it, I am hoping because BI tools needs a lot of organizing to do.++++++++
Visit us at http://bloxchronicles.blogspot.com for more fruitfull BI, Analytics, Data Warehousing and Alphablox discussion.
Creating an SSIS Package Programmatically Part 1
Source: Mark Garner's Business Intelligence Blog [link]
Ever needed to create Integration Services (SSIS) packages in a .NET application? There aren’t a lot of examples out there. I thought I could contribute a little in that area.
This is an example that I wrote that identifies changes in two flat files and then acts upon whether the row is an INSERT, UPDATE or DELETE according to the FULL OUTER JOIN that happens in the package.
It is a little difficult to show tabbed code in a blog so you may want to paste this into
The application that I wrote is a VB console application. You’ll need to add references to these assemblies:
- Microsoft.SqlServer.Dts.Design
- Microsoft.SqlServer.DTSPipelineWrap
- Microsoft.SQLServer.DTSRuntimeWrap
- Microsoft.SQLServer.ManagedDTS
We’ll add a couple of import statements.
| Imports Microsoft.SqlServer.Dts.Runtime Imports PipelineWrap = Microsoft.SqlServer.Dts.Pipeline.Wrapper Imports RuntimeWrap = Microsoft.SqlServer.Dts.Runtime.Wrapper |
Next we’ll set up some local variables.
| Dim myPackage As Package Dim dataFlow As PipelineWrap.MainPipe Dim NewFileName As String = “NewFile.txt” Dim OldFileName As String = “OldFile.txt” Dim dbServerName As String = “IowaJHNDW07″ Dim dbName As String = “SSIS_Test_mgarner” |
First thing, we will create the package.
| Console.Write(”Creating Package…”) myPackage = New Package myPackage.PackageType = DTSPackageType.DTSDesigner90 myPackage.Name = “Custom Package” myPackage.Description = “Custom Package” myPackage.CreatorComputerName = System.Environment.MachineName myPackage.CreatorName = System.Environment.UserName myPackage.Variables.Add(”RowCount”, False, “User”, 0) Console.WriteLine(”Done”) |
Create Connection Managers…
| Console.Write(”Creating connection managers…”) Dim cmflatFileNew As ConnectionManager = myPackage.Connections.Add(”FLATFILE”) cmflatFileNew.Properties(”ConnectionString”).SetValue(cmflatFileNew, NewFileName) cmflatFileNew.Properties(”Format”).SetValue(cmflatFileNew, “FixedWidth”) cmflatFileNew.Properties(”DataRowsToSkip”).SetValue(cmflatFileNew, 0) cmflatFileNew.Properties(”ColumnNamesInFirstDataRow”).SetValue(cmflatFileNew, False) cmflatFileNew.Properties(”Name”).SetValue(cmflatFileNew, “FlatFileConnection New”) cmflatFileNew.Properties(”RowDelimiter”).SetValue(cmflatFileNew, vbCrLf) cmflatFileNew.Properties(”TextQualifier”).SetValue(cmflatFileNew, “”””)Dim ffNew As RuntimeWrap.IDTSConnectionManagerFlatFile90 = NothingffNew = TryCast(cmflatFileNew.InnerObject, RuntimeWrap.IDTSConnectionManagerFlatFile90)Dim newCol As RuntimeWrap.IDTSConnectionManagerFlatFileColumn90 Dim newName As RuntimeWrap.IDTSName90′ Create a few columns newCol = ffNew.Columns.Add() newCol.ColumnType = “FixedWidth” newCol.ColumnWidth = 8 newCol.MaximumWidth = 8 newName = TryCast(newCol, RuntimeWrap.IDTSName90) newName.Name = “PersonID” newCol.DataType = RuntimeWrap.DataType.DT_I4newCol = ffNew.Columns.Add() newCol.ColumnType = “FixedWidth” newCol.ColumnWidth = 20 newCol.MaximumWidth = 20 newName = TryCast(newCol, RuntimeWrap.IDTSName90) newName.Name = “FirstName” newCol.DataType = RuntimeWrap.DataType.DT_STRnewCol = ffNew.Columns.Add() newCol.ColumnType = “FixedWidth” newCol.ColumnWidth = 20 newCol.MaximumWidth = 20 newName = TryCast(newCol, RuntimeWrap.IDTSName90) newName.Name = “LastName” newCol.DataType = RuntimeWrap.DataType.DT_STRDim cmflatFileOld As ConnectionManager = myPackage.Connections.Add(”FLATFILE”) cmflatFileOld.Properties(”ConnectionString”).SetValue(cmflatFileOld, OldFileName) cmflatFileOld.Properties(”Format”).SetValue(cmflatFileOld, “FixedWidth”) cmflatFileOld.Properties(”DataRowsToSkip”).SetValue(cmflatFileOld, 0) cmflatFileOld.Properties(”ColumnNamesInFirstDataRow”).SetValue(cmflatFileOld, False) cmflatFileOld.Properties(”Name”).SetValue(cmflatFileOld, “FlatFileConnection Old”) cmflatFileOld.Properties(”RowDelimiter”).SetValue(cmflatFileOld, vbCrLf) cmflatFileOld.Properties(”TextQualifier”).SetValue(cmflatFileOld, “”””)Dim ffOld As RuntimeWrap.IDTSConnectionManagerFlatFile90 = Nothing ffOld = TryCast(cmflatFileOld.InnerObject, RuntimeWrap.IDTSConnectionManagerFlatFile90)’ Create a few columns newCol = ffOld.Columns.Add() newCol.ColumnType = “FixedWidth” newCol.ColumnWidth = 8 newCol.MaximumWidth = 8 newName = TryCast(newCol, RuntimeWrap.IDTSName90) newName.Name = “PersonID_Old” newCol.DataType = RuntimeWrap.DataType.DT_I4newCol = ffOld.Columns.Add() newCol.ColumnType = “FixedWidth” newCol.ColumnWidth = 20 newCol.MaximumWidth = 20 newName = TryCast(newCol, RuntimeWrap.IDTSName90) newName.Name = “FirstName_Old” newCol.DataType = RuntimeWrap.DataType.DT_STRnewCol = ffOld.Columns.Add() newCol.ColumnType = “FixedWidth” newCol.ColumnWidth = 20 newCol.MaximumWidth = 20 newName = TryCast(newCol, RuntimeWrap.IDTSName90) newName.Name = “LastName_Old” newCol.DataType = RuntimeWrap.DataType.DT_STR’ OLEDB Destination Dim cmDatabase As ConnectionManager = myPackage.Connections.Add(”OLEDB”) cmDatabase.Properties(”ConnectionString”).SetValue(cmDatabase, “Data Source=” + _ dbServerName + “;Initial Catalog=” + dbName + _ “;Provider=SQLNCLI.1;Integrated Security=SSPI;Auto Translate=False;”) cmDatabase.Properties(”Name”).SetValue(cmDatabase, “OLEDB to ” + _ dbServerName + “_” + dbServerName + “”)’ Add Dataflow Task Dim th As TaskHost = TryCast(myPackage.Executables.Add(”DTS.Pipeline”), TaskHost) th.Name = “Master DataFlow”dataFlow = TryCast(th.InnerObject, PipelineWrap.MainPipe) Console.WriteLine(”Done”) |
| Console.Write(”Creating flat file source new…”)Dim ffSourceNew As PipelineWrap.IDTSComponentMetaData90 ffSourceNew = dataFlow.ComponentMetaDataCollection.New()’ Set stock properties. ffSourceNew.ComponentClassID = “DTSAdapter.FlatFileSource”Dim instanceNew As PipelineWrap.CManagedComponentWrapper = ffSourceNew.Instantiate() instanceNew.ProvideComponentProperties()ffSourceNew.Name = “FFSource New” ffSourceNew.OutputCollection(0).IsSorted = True’ Associate the runtime connection manager ‘ The connection manager association will fail if called before ProvideComponentProperties ffSourceNew.RuntimeConnectionCollection(0).ConnectionManagerID _ = myPackage.Connections(”FlatFileConnection New”).ID ffSourceNew.RuntimeConnectionCollection(0).ConnectionManager _ = DtsConvert.ToConnectionManager90(myPackage.Connections(”FlatFileConnection New”))’ Acquire Connections and reinitialize the component instanceNew.AcquireConnections(Nothing) instanceNew.ReinitializeMetaData()For Each col As PipelineWrap.IDTSOutputColumn90 In _ ffSourceNew.OutputCollection(0).OutputColumnCollection If col.Name = “PersonID” Then col.SortKeyPosition = 1 End If Next’ If you wanted to exclude a column from being in the OutputColumnCollection, ‘ this is how you would do it ‘ ffSourceNew.OutputCollection(0).OutputColumnCollection.RemoveObjectByID( [ColumnID] )instanceNew.ReleaseConnections()Console.WriteLine(”Done”) |
Add Flat File Source for the “Old” File…
| Console.Write(”Creating flat file source old…”)Dim ffSourceOld As PipelineWrap.IDTSComponentMetaData90 ffSourceOld = dataFlow.ComponentMetaDataCollection.New()’ Set stock properties. ffSourceOld.ComponentClassID = “DTSAdapter.FlatFileSource”Dim instanceOld As PipelineWrap.CManagedComponentWrapper = ffSourceOld.Instantiate() instanceOld.ProvideComponentProperties()ffSourceOld.Name = “FFSource Old” ffSourceOld.OutputCollection(0).IsSorted = True’ Associate the runtime connection manager ‘ The connection manager association will fail if called before ProvideComponentProperties ffSourceOld.RuntimeConnectionCollection(0).ConnectionManagerID _ = myPackage.Connections(”FlatFileConnection Old”).ID ffSourceOld.RuntimeConnectionCollection(0).ConnectionManager _ = DtsConvert.ToConnectionManager90(myPackage.Connections(”FlatFileConnection Old”))’ Acquire Connections and reinitialize the component instanceOld.AcquireConnections(Nothing) instanceOld.ReinitializeMetaData()For Each col As PipelineWrap.IDTSOutputColumn90 In _ ffSourceOld.OutputCollection(0).OutputColumnCollection If col.Name = “PersonID_Old” Then col.SortKeyPosition = 1 End If Next’ If you wanted to exclude a column from being in the OutputColumnCollection, this is how you would do it ‘ ffSourceOld.OutputCollection(0).OutputColumnCollection.RemoveObjectByID( [ColumnID] )instanceOld.ReleaseConnections()Console.WriteLine(”Done”) |
Create the Merge Join Task…
| Console.Write(”Creating merge join…”)Dim mergeJoin As PipelineWrap.IDTSComponentMetaData90 mergeJoin = dataFlow.ComponentMetaDataCollection.New()’ Set stock properties. mergeJoin.ComponentClassID = “DTSTransform.MergeJoin”Dim instanceJoin As PipelineWrap.CManagedComponentWrapper = mergeJoin.Instantiate() instanceJoin.ProvideComponentProperties()’ set column properties for the “old” input dataFlow.PathCollection.New().AttachPathAndPropagateNotifications( _ ffSourceOld.OutputCollection(0), mergeJoin.InputCollection(0))Dim vInputOld As PipelineWrap.IDTSVirtualInput90 = mergeJoin.InputCollection(0).GetVirtualInput()For Each vColumnOld As PipelineWrap.IDTSVirtualInputColumn90 In vInputOld.VirtualInputColumnCollection instanceJoin.SetUsageType(mergeJoin.InputCollection(0).ID, vInputOld, vColumnOld.LineageID, PipelineWrap.DTSUsageType.UT_READONLY) NextdataFlow.PathCollection.New().AttachPathAndPropagateNotifications( _ ffSourceNew.OutputCollection(0), mergeJoin.InputCollection(1))Dim vInputNew As PipelineWrap.IDTSVirtualInput90 = mergeJoin.InputCollection(1).GetVirtualInput()For Each vColumnNew As PipelineWrap.IDTSVirtualInputColumn90 In vInputNew.VirtualInputColumnCollection instanceJoin.SetUsageType(mergeJoin.InputCollection(1).ID, vInputNew, vColumnNew.LineageID, PipelineWrap.DTSUsageType.UT_READONLY) Next’ set join properties instanceJoin.SetComponentProperty(”JoinType”, 0) instanceJoin.SetComponentProperty(”NumKeyColumns”, 1)Console.WriteLine(”Done”) |
In the next part, we’ll conditionally split based on the results of the Merge Join Task and then act upon those different flows that are created in the conditional split.
Mark
AOL hunts for spammer’s gold - U.S. Business - MSNBC.com
Source: OLAP/BI/IM stuff [link]
AOL hunts for spammer’s gold - U.S. Business - MSNBC.com: “WASHINGTON - AOL is preparing to dig for buried gold and platinum on property in Massachusetts owned by the parents of a man it sued for sending millions of unwanted spam e-mails to its customers.”
Year-to-date alternatives
Source: Pete-s random notes [link]
Challenged to do YTD with a table function, I gave it a shot! The results are not clear cut
10gR2: OBJECT_NAME missing from V$SQL_PLAN for INSERT statement
Source: oramoss oracle [link]
I’d written a query the other day to try and identify SQL statements that had been run which wanted a certain degree of parallelism (DOP) and had either run with less than the requested amount of PX slaves or had run serially - in order that we could identify whether the (poor) performance of a query was potentially due to it not getting the resources (PX) we expected it to.
I posted the query on
Database Mail in SQL Server 2005
Source: OLAP/BI/IM stuff [link]
Chris Webb’s BI Blog: Debunking the Beer and Diapers Story
Source: OLAP/BI/IM stuff [link]
Chris Webb’s BI Blog: Debunking the Beer and Diapers Story: “Mark Whitehorn (of ‘Fast Track to MDX’ fame) debunks that old ‘beer and diapers’ story that I’ve used plenty of times in data mining presentations in the Register today:
http://www.regdeveloper.co.uk/2006/08/15/beer_diapers/
Not that I’m going to let the small matter of the truth get in the way of a good story in future presentations!”
Survey: CIOs Put Out Help-Wanted Sign for Business-Savvy IT Professionals
Source: OLAP/BI/IM stuff [link]

Survey: CIOs Put Out Help-Wanted Sign for Business-Savvy IT Professionals: “Finding 3: CIOs are looking for business-savvy technologists to build new systems.
Large and midsize companies are ramping up hiring in programming and systems development; also in demand are professionals in project management, business-process redesign, business analysis and systems integration. The message to IT professionals is clear: A need exists in the U.S. for talented technicians who can be businesspeople, too, especially if they can function in a global economy. “
Microsoft Certified Architect : A sample question for infrastructure MCA candidates
Source: OLAP/BI/IM stuff [link]
Microsoft Certified Architect : A sample question for infrastructure MCA candidates: “The information about review board questions is still too vague and hazy? You need a specific example? OK, here we go…
Automatic software updating is one of the critical knowledge areas for any infrastructure architect. Grab a marker and sketch your proposed architecture for the planetary-scale patch dissemination service. It has to be fast, effective, massively scalable and capable of patching up to a billion of globally connected computers every month. Think Windows Update service on steroids.
Sketch the conceptual design of proposed system with enough detail to enable the discussion around the following questions:
How many physical tiers would you design into a proposed architecture? Why?
What non-functional (systemic) qualities of the system would be the most important? Why?
What traffic properties would you expect? How would you measure the patterns and how would you make the system adaptable to unforeseeable spikes?
What is your signalling strategy to inform all subscribed computers on time without choking the distribution?
What is your preferred patch dissemination strategy? What would be good alternatives? Why?
Compare the patch distribution system with traditional content distribution (such as Web content). What are the major differences?
Try to VERBALIZE the answers. Be short and to the point. Keep the structure of answers and balance the conceptual view with enough of supporting details. Would you be persuasive and influential enough to satisfy a picky CIO of massive ISP and the group of his closest advisors?
Now go and read the following article: http://research.microsoft.com/research/pubs/view.aspx?tr_id=1129.”
Triggers: A Key Tool for B2B Marketing and Risk Management
Source: OLAP/BI/IM stuff [link]
The bottom line is marketing cost savings. For example, consider a company that requests a list of all businesses that have a new lien or judgment in a month, which produce 27,000 businesses. Assume the average cost for a direct-mail piece, including postage and creative work, is 43 cents per piece. If the criteria exclude marketing to anyone who has a lien or a judgment, triggers can help the company avoid dropping that mail piece on those 27,000 businesses - translating into a savings of $11,610. Not only is the cost of the mailing saved, but the cost of processing a claim that results in a ‘no’ also is avoided.”
Information Brokers: The Missing Link?
Source: OLAP/BI/IM stuff [link]
