Recently a coworker ran into an issue I had never seen before. He’s importing a large number of Excel spreadsheets into a SQL Server 2005 instance. One of the columns contains mixed data: 1, 2, 3, x, y, z, for example. What he found is that if the first 20 or so rows are strictly numeric, the connection treats the whole column as numeric and alphanumeric values are returned as NULLs. He’s querying using OPENROWSET, but it looks like this happens in SSIS as well.
His research turned up a few registry keys that come into play, plus a property he needed to set in his connection string. Adding “IMEX=1” to the connection string allows mixed data types to be imported, and how they are handled depends on the registry setting “ImportMixedTypes”. The default value is “Text”. One drawback to Text is that you’re limited to 255 characters. The only other available value is “Majority Type”. With Majority Type, a sample of the values is scanned and the data type that occurs most often in the sample is used for the whole column.
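To make the connection string part concrete, here is a minimal sketch in Python that just assembles the T-SQL text for an OPENROWSET query with IMEX=1 in the provider string. The function name, the workbook path, and the sheet name are made up for illustration, and I'm assuming the usual Jet 4.0 provider with an .xls workbook whose first row holds column names. It only builds the string; it doesn't run anything against SQL Server.

```python
def build_openrowset_query(workbook_path, sheet_name="Sheet1", imex=True):
    """Build the text of a T-SQL OPENROWSET query that reads one Excel sheet.

    workbook_path and sheet_name are hypothetical example inputs; they are
    inserted as-is, so this sketch does not escape quotes in them.
    """
    # "Excel 8.0" targets the .xls format; HDR=YES treats row 1 as column names.
    properties = ["Excel 8.0", f"Database={workbook_path}", "HDR=YES"]
    if imex:
        # IMEX=1 is the property discussed above for mixed-type columns.
        properties.append("IMEX=1")
    provider_string = ";".join(properties)
    return (
        "SELECT * FROM OPENROWSET("
        "'Microsoft.Jet.OLEDB.4.0', "
        f"'{provider_string}', "
        f"'SELECT * FROM [{sheet_name}$]')"
    )


if __name__ == "__main__":
    print(build_openrowset_query(r"C:\import\book1.xls"))
```

Passing imex=False leaves the property out, which makes it easy to compare the results of both versions of the query side by side.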
The number of rows scanned is based on the “TypeGuessRows” registry setting. By default it’s set to 8, but valid values are 0 to 16. If you set it to 0 then all rows are sampled. Setting it to a value between 1 and 16 samples that many rows.
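Here is a rough Python model of the behavior described so far, which helped me see why the alphanumeric values disappear. This is a simplified simulation and not the driver's actual algorithm: the function name and parameters are my own, numeric cells are represented as Python numbers and text cells as strings, and I'm assuming that with IMEX=1 and the default Text setting a column only switches to text when the sampled rows themselves contain both kinds of values.

```python
def read_column(cells, type_guess_rows=8, imex=False):
    """Simulate how a mixed column might come back from the Excel driver.

    Hypothetical model: None stands in for NULL, type_guess_rows=0 means
    "sample every row", and a tie in the sample falls back to text (an
    arbitrary choice for this sketch).
    """
    sample = cells if type_guess_rows == 0 else cells[:type_guess_rows]
    numeric = sum(isinstance(c, (int, float)) for c in sample)
    text = len(sample) - numeric

    if imex and numeric and text:
        # Mixed sample with IMEX=1 and ImportMixedTypes=Text: keep everything as text.
        return [str(c) for c in cells]
    if numeric > text:
        # Column typed as numeric: text cells are returned as NULL.
        return [c if isinstance(c, (int, float)) else None for c in cells]
    # Column typed as text: numeric cells are returned as NULL.
    return [c if isinstance(c, str) else None for c in cells]


if __name__ == "__main__":
    column = [1, 2, 3, 4, 5, 6, 7, 8, "x", "y", "z"]
    print(read_column(column))                                # x, y, z become None
    print(read_column(column, imex=True))                     # sample is all numeric, still None
    print(read_column(column, type_guess_rows=0, imex=True))  # every value kept as text
```

In this model, IMEX=1 on its own doesn't rescue the column when the sampled rows are all numeric. The sample has to be large enough to see the mixed values, which is why the row-sampling setting matters.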
Both values can be found under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel.
Here are a few articles that explain the issue better.
UPDATED 7/20/2010: You can read more about the actual issue here.