Wednesday 15 August 2007

SSIS - DataReader issue

Just a quick note, unrelated to anything to do with the actual website.

Today, at work, I was working on a task that involves moving data from a production database to a Historic database. The production environment has been misused at times in the past, and as a result, there's numerous backups of tables residing in the database.

Due to regulations, just dropping the tables or wiping data is not an option. Hence, we chose to move it to the historic database. Now to accomplish this, I selected SSIS as our prefered tool. Stupidly enough, I chose to use ADO.NET as a data connection. The fun started when I was creating the actual "Data Flow" objects. I must have created a dozen of them when suddenly I was confronted with an error that SSIS was unable to translate from unicode to non-unicode data.

Huh?


I checked my CREATE TABLE scripts, and sure enough, the data types in both source and destination were set to char. I tried altering the Source's External columns, which seemed to work from the colum's properties, but actually didn't if you would check the datatype coming into the Source. I tried fiddling with the Source's Output column, but got an error message that I was not allowed to. I tried explicitly casting the column as char in the Source's SELECT statement. No luck.

A colleague came over and suggested using a Dereived Column object. Again, that didn't work. I ended up Googling, and came to an MSDN link. There, it stated somewhere halfway the page:

"One issue when using ADO.NET is that there is no ADO.NET data type corresponding to VARCHAR (non-Unicode character) in relational databases. Therefore, ADO.NET data sources output all character data as Unicode. This can cause problems with simple components that use ADO.NET sources and destinations that expect single-byte character types. You can work around this by using a Data Conversion transformation."

I tested this, and using a Data Conversion did indeed work. But what a lot of extra work to do something relatively simple. I decided to just switch all connections to OLEDB.

Conclusion: When dealing with non-unicode char columns, OLEDB might be the better choice.

No comments: