We’ve recently been tasked with converting a lot of MySQL Data into our system from a multi-tenant MySQL Database. We don’t have a fixed schedule to actually execute the imports because they’ll be on a “per customer” basis. Well, that sounded like a great task for SSIS. I set up the ODBC driver, connected, set the initial “sql_mode” options for our connections to (‘mssql, allow_invalid_dates’) and started to work.
First problem we ran into with an ADO.NET connection to MySQL and writing a SELECT * from schema.table was when we hit a MySQL “Date” column containing a value of ‘0000-00-00’. SSIS threw an error, not sure what to do. Thanks to some others who have solved this problem, I realized that within the MySQL Select statement, we could do something like:
CASE date_created WHEN '0000-00-00' THEN NULL else date_created END as date_createdto pass those dates over as NULL. That solves the implicit conversion to datetime (SQL 2005) and avoids the invalid dates. I ran something similar for a “Time” column to:
CAST(Time_Created as char(8) ) as Time_Created
So that solved one particular table export with about 20 or so CAST and CASE statements. Needless to say, I wasn’t looking forward to doing this for another 500 tables with a total of almost 6000 columns.
I finally set up a really basic query to generate most of the SELECT statements we would need to pull our MySQL data across without too much pain. Admittedly, a small step and I’d still need to copy/paste when I set up each new ADO.NET source, but it worked reasonably well. I’m adding the code snippet for MySQL here in case anyone else has a similar problem.
select
CONCAT(CASE WHEN ordinal_position = 1 THEN 'SELECT ' ELSE ', ' END,
CASE Data_Type WHEN 'date'
THEN CONCAT('CASE ',column_name,' WHEN ''0000-00-00'' THEN NULL else '
,column_name,' END as ',column_name)
WHEN 'time' THEN CONCAT('CAST(',column_name,' AS CHAR(8) ) as ',column_name)
ELSE column_name END, CASE WHEN Ordinal_Position <> MaxOrd THEN ''
ELSE CONCAT('
FROM MySchema.', c.table_name) END
) as Select_Column_Name
from information_schema.columns as c
JOIN (select table_name, MAX(ordinal_position) as MaxOrd
from information_schema.columns
WHERE Table_Schema = 'MySchema'
GROUP BY Table_Name) as t1
ON c.Table_Name = t1.Table_Name
where table_schema = 'MySchema'
order by c.table_name, ordinal_position LIMIT 0, 50000;
I’d love to hear other ideas if anyone has encountered this before and come up with a more elegant solution for translating “0” Date or Time data from MySQL into SQL Server.