Sunday, February 26, 2012

A Little Data Validation, Please?

I've been working on an implementation where part of the process is importing about 2 million lines of  data using a tool developed by the software vendor. A large part of their data is stored in the database as a sql_variant (dynamic data type). Once the data is imported , the program processes some of this data and expects a specific data type. Unfortunately there is no data validation during the import process, so, when a bad piece of data gets into the database you get a nifty cryptic error line 'unable to convert nvarchar to int'. (Better error  messages would be nice, too.) The root cause of this problem, however, was ignoring data validation. When you use dynamic data types you have to do your own input validation, you can't count on database or the user to do it for you. I know this is very obvious but apparently it was missed in this case. Because of this the database is currently unusable, and someone will have to spend a significant amount of time figuring out where the bad data is.