We have written a number of SSIS packages that import data from CSV files using the Flat File Source.
It now seems that after these packages are deployed into production, the providers of these files may deliver files where the column order of the files changes (Don't ask!). Currently if this happens, our packages will fail.
For example, an additional column is inserted at the beginning of each row. In this case, the flat file source continues to use the existing column order, which obviously has a detrimental effect on the transformation!
Eg. Using a trivial example, the original file has the following content :
OurReference,Client,Amount 235,MFI,20000.00 236,MS,30000.00
The output from the flat file source is :
OurReference Client Amount 235 ClientA 20000.00 236 ClientB 30000.00
Subsequently, the file delivered changes to :
OurReference,ClientReference,Client,Amount 235,A244,ClientA,20000.00 236,B222,ClientB,30000.00
When the existing unchanged package is run against this file, the output from the flat file source is :
OurReference Client Amount 235 A244 ClientA,20000.00 236 B222 ClientB,30000.00
Ideally, we would like to use a data source that will cope with this problem – ie which produces output based on the column names, instead of the column order.
Any suggestions would be welcomed!
Not that I know of.
A possibility to check for the problem in advance is to set up two different connection managers, one with a single flat row. This one can read the first row and tell if it's OK or not and abort.
If you want to do the work, you can take it a step further and make that flat one-field row the only connection manager, and use a script component in your flow to parse the row and assign to the columns you need later in the flow.
As far as I know, there is no way to dynamically add columns to the flow at runtime - so all the columns you need will need to be added to the script task output. Whether they can be found and get parsed from the each line is up to you. Any "new" (i.e. unanticipated) columns cannot be used. Columns which are missing you could default or throw an exception.
A final possibility is to use the SSIS object model to modify the package before running to alter the connection manager - or even to write the entire package dynamically using the object model based on an inspection of the input file. I have done quite a bit of package generation in C# using templates and then adding information based on metadata I obtained from master files describing the mainframe files.