ETL Subsystem 26: Version Migration
Source: Tod means Fox [link]
This article is part of a series discussing the Kimball Group’s “34 Subsystems of ETL“. The Subsystems are a group of “Best Practices” for delivering a BI/DW solution. In my articles, I discuss how each Subsystem can be implemented in SSIS or hand coded in Visual FoxPro.
Depending on the lifecycle in place at your business, you will likely need to move your data warehousing project from the development environment into a production environment. I am currently using an alpha / beta / gamma system (with mixed emotions), where alpha is our development environment, beta is our integration, and gamma is production. I have also worked with systems that combined alpha and beta and had a special user acceptance environment instead.
Whatever you do, you’ll need to consider migrating from one environment to another.
Some things to consider:
- Environments: How similar are your environments (both hardware and software)? Are there major differences between the development servers and the production servers? Ideally, all environments should resemble each other (integration, UAT, and production environments should really be identical). You don’t want to develop on a 64-bit Windows Server 2008 and deploy to a 32-bit Windows Server 2003 machine!
- Security: Have you considered the different security and access requirements that exist between development and production environments? Will you, as the data warehousing team, have access to manage server jobs, file transfers, and file shares?
- Version Control: Have you set up your version control in such a way that if an installation fails, you can revert back to a previous version? This is tricky, so be careful to set aside enough time for planning. You also need to keep in mind potential compliance issues, as it may be necessary to “pull out” an older version to see how data might have been manipulated differently than the current installation.
- Resources: Who are the people in charge of the migration? Are they available when you need them?
- Documentation: Does your organization require installation documents, verification plans, quality assurance sign-offs, user acceptance sign-offs, and back-out plans?
- Protocol: What type of red tape must you cut through to get your data warehouse online? Who needs to sign-off on the deployment? What forms must be filled out? Is there an official change request procedure? ITIL and other systems can be good in theory but difficult to work with in practice. Take the time to learn what you need to do to get your system online.
- Packaging: How will you package your software and database scripts? Will you use an installation program or will you have someone manually copy files? Will scripts be run individually? Will you use a deployment tool such as MSBuild?
- Verification: Have you thought about how you can test the system in a way that flexes all new and updated features, while providing adequate regression to be sure you didn’t break something that was once working? Have you done a proper impact analysis on all new or changed components?
- Data: Will you be migrating data changes as well? How will you get these data changes from one environment to the next? Has everything been scripted and tested thoroughly?
SQL Server 2005 Integration Services (SSIS)
SSIS makes deployment of packages from one environment to another relatively easy. You merely create a single package and supply different configurations for each environment. I am currently using XML configurations, but you might do better with database configurations, environment variables, or using a table in SQL Server. The basic premise is simple: Create variables in your package for things like file locations and connection strings. Using the Package Configurations dialog (SSIS -> Package Configurations…), enable configurations, click Add and then follow the Wizard. Check the SQL Server Development Center for details.
Remember that you can always set a configurable item (e.g. variables, connections, tasks) from the command line or in the SQL Agent Job. For the Agent Jobs, open the Job Step properties and from the General page, select “Set values”. When entering your values to configure, use the following syntax:
Packagecontainer.property
Example:
Package.Variable[ServerName].value
For more insight on setting up jobs you can see my post “SSIS Packages and setting up Jobs“.
SSIS comes with a Deployment Utility that will bundle all the files necessary for deployment of the package into a single folder. Copy this folder to the destination server and run the Package Installation Wizard to install your packages. While this is a good idea, and does remove the possibility of human error during the installation process, I do not use the Deployment Utility. Call me old-school, but I prefer to do this bundling and installing myself through the dtexec command line utility — I like to see what is going on and it gives me an opportunity to catch surprises before they become trouble.
Hand Coding with Visual FoxPro (VFP9)
You can also use FoxPro to mimic the same kind of functionality you get in SSIS using package configurations. This is no different than using an ini file, a “setup” table, registry entries, etc. I would warn against using free tables if your deployment team does not have access to FoxPro. You would be better served using XML or some other plain text formatted file.
You can also automate much of your builds by utilizing VFPBuild. Find the related CodePlex page here.
From Here
As a developer, you are responsible for ensuring that the right version of your software makes it through the migration path with little or no hassle. It can be challenging enough dealing with hardware and software compatibility issues. Do your best to know the routine, package appropriately, and plan ahead!
Next post, I’ll talk about Work Flow Monitoring…