Hello. I’m new to SSIS. This forum and Kirk Haselden’s book are my teachers. I’m having a hard time grasping something basic to get started defining a set of packages to automate the ETL process, however, and perhaps I’m simply misunderstanding the best practices of SSIS.
I have source data in two different transactional databases, and use OleDb connection managers (and OleDb Source components in the Data Flow) to extract the data. I use a Script Task and several Lookup widgets in the Data Flow to transform the data, and output each to two different package-scope DataSet variables.
How do I join these two datasets in a third Data Flow task for loading into my data warehouse? It seems I can iterate through them in the Control Flow, but I can’t write a query against them in the Data Flow, since there is no connection manager that allows me to “connect” to a package-level variable. Should I instead be storing my extracted, transformed data in temporary database tables, and then joining these to do the final load?
Any advice greatly appreciated. Thanks in advance.
MilwaukeeCharlie wrote:
Hello. I’m new to SSIS. This forum and Kirk Haselden’s book are my teachers. I’m having a hard time grasping something basic to get started defining a set of packages to automate the ETL process, however, and perhaps I’m simply misunderstanding the best practices of SSIS.
I have source data in two different transactional databases, and use OleDb connection managers (and OleDb Source components in the Data Flow) to extract the data. I use a Script Task and several Lookup widgets in the Data Flow to transform the data, and output each to two different package-scope DataSet variables.
How do I join these two datasets in a third Data Flow task for loading into my data warehouse? It seems I can iterate through them in the Control Flow, but I can’t write a query against them in the Data Flow, since there is no connection manager that allows me to “connect” to a package-level variable. Should I instead be storing my extracted, transformed data in temporary database tables, and then joining these to do the final load?
Any advice greatly appreciated. Thanks in advance.
Charlie,
You're absolutely correct, there is no source adapter for these and with good reason too.
There is a workaround though:
Recordsets instead of raw files
(http://blogs.conchango.com/jamiethomson/archive/2006/01/04/SSIS_3A00_-Recordsets-instead-of-raw-files.aspx)
Another option (and the one I would use for performance reasons as explained here: http://blogs.conchango.com/jamiethomson/archive/2006/06/28/SSIS_3A00_-Comparing-performance-of-a-raw-file-against-a-recordset-destination.aspx) is to use raw files
By the way, the correct nomenclature is "components", not "widgets"
Good luck!
-Jamie