.net – Parse Delimited CSV in .NET

.netcsvparsingvb.net

I have a text file that is in a comma separated format, delimited by " on most fields. I am trying to get that into something I can enumerate through (Generic Collection, for example). I don't have control over how the file is output nor the character it uses for the delimiter.

In this case, the fields are separated by a comma and text fields are enclosed in " marks. The problem I am running into is that some fields have quotation marks in them (i.e. 8" Tray) and are accidentally being picked up as the next field. In the case of numeric fields, they don't have quotes around them, but they do start with a + or a – sign (depicting a positive/negative number).

I was thinking of a RegEx, but my skills aren't that great so hopefully someone can come up with some ideas I can try. There are about 19,000 records in this file, so I am trying to do it as efficiently as possible. Here are a couple of example rows of data:

"00","000000112260   ","Pie Pumpkin                             ","RET","6.99 ","     ","ea ",+0000000006.99000
"00","000000304078   ","Pie Apple caramel                       ","RET","9.99 ","     ","ea ",+0000000009.99000
"00","StringValue here","8" Tray of Food                             ","RET","6.99 ","     ","ea ",-00000000005.3200

There are a lot more fields, but you can get the picture….

I am using VB.NET and I have a generic List setup to accept the data. I have tried using CSVReader and it seems to work well until you hit a record like the 3rd one (with a quote in the text field). If I could somehow get it to handle the additional quotes, than the CSVReader option will work great.

Thanks!

Best Solution

I recommend looking at the TextFieldParserClass in .Net. You need to include

Imports Microsoft.VisualBasic.FileIO.TextFieldParser

Here's a quick sample:

        Dim afile As FileIO.TextFieldParser = New FileIO.TextFieldParser(FileName)
        Dim CurrentRecord As String() ' this array will hold each line of data
        afile.TextFieldType = FileIO.FieldType.Delimited
        afile.Delimiters = New String() {","}
        afile.HasFieldsEnclosedInQuotes = True

        ' parse the actual file
        Do While Not afile.EndOfData
            Try
                CurrentRecord = afile.ReadFields
            Catch ex As FileIO.MalformedLineException
                Stop
            End Try
        Loop