Today I heard an interesting answer to the question : what is the difference between ETL and ELT ?
The answer is : in ETL the transformations are processed by the ETL tools while in ELT the transformations are processed by the target datasources (RDBMS). I guess that the rising scalability of databases allows this kind of data integration to be possible.
I created this package while working with SSIS for the first time.
The objective of this package is to extract data from 3 flat files in .txt format and perform some controls before inserting the data into 3 tables,
one table (destination) for each file (source).
These files are Territoires.txt, Personnes.txt and Ventes.txt. The data is extracted in that order.
The tables are SimuTerritoires, SimuPersonnes and SimuVentes.
These controls aim to :
– check if the SalesQuota is null
– check if the TotalDue is empty. If so, assign the value 0
– check the Orderdate value. If it is empty, assign the value 01/01/2001
– check the referential integrity. In the Ventes.txt file, if the Territory ID is not found in the table SimuTerritoires, then reject the row.
Inside my script component, I have a very simple method which processes every row.
The Microsoft Visual C# 2008 code for that method is :
public override void Input0_ProcessInputRow(Input0Buffer Row)
if (Row.OrderDateConverti_IsNull == true)
DateTime.ParseExact("01/01/2001", "MM/dd/yyyy", System.Globalization.CultureInfo.InvariantCulture);
The data of the files comes originally from the AdventureWorks sample database (Territories, Persons and Sales tables). You can download the package file, Simulation2.dtx, here.
Here are a few screenshots :
I have been working with SQL Server 2008 Integration Services lately and I found a cheat sheet for SSIS that is quite helpful.
This cheat sheet is available in PDF format from PragmaticWorks at http://www.pragmaticworks.com/cheatsheet/.