 

Adapting csv reader to read unicode characters

Tags:

c#

unicode

I'm having a problem with characters in a CSV file coming through as a black diamond with a ? in the middle (the Unicode replacement character, U+FFFD).

I have written the code to parse the CSV, but I don't understand why the Unicode characters aren't being read into the strings properly. It's probably something to do with my implementation:

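List<string[]> parsedFile = new List<string[]>();
bool skippedFirst = false;  // the first line is a header row and is skipped
string line;
StreamReader readFile = new StreamReader(path);

try {
  while ((line = readFile.ReadLine()) != null) {
    // assumes exactly three columns per row
    string[] row = { "", "", "" };
    int currentItem = 0;
    bool inQuotes = false;
    if (skippedFirst) {
      for (int i = 0; i < line.Length; i++) {
        if (!inQuotes) {
          if (line[i] == '\"')
            inQuotes = true;    // opening quote: commas are now literal
          else {
            if (line[i] == ',')
              currentItem++;    // unquoted comma: move to the next column
            else
              row[currentItem] += line[i];
          }
        } else {
          if (line[i] == '\"')
            inQuotes = false;   // closing quote
          else
            row[currentItem] += line[i];
        }
      }
      parsedFile.Add(row);
    }
    skippedFirst = true;
  }
} finally {
  readFile.Close();
}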
ediblecode asked Jan 25 '26 20:01



1 Answer

Specify the encoding when opening the file:

using (var sr = new StreamReader(@"c:\Temp\csvfile.csv", Encoding.UTF8)) {
  // read the file here, e.g. with sr.ReadLine()
}
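One caveat worth checking: new StreamReader(path) already decodes as UTF-8 when the file has no byte order mark, so replacement characters often mean the file was saved in some other encoding, and the encoding passed in has to match it. The sketch below is only an illustration under that assumption. It writes a small sample file as ISO-8859-1 (an arbitrary choice for the demo, your file may use something else such as Windows-1252) and reads it back twice. The names EncodingDemo and ReadLines and the sample data are hypothetical, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: reads every line of a file with an explicit encoding.
    public static List<string> ReadLines(string path, Encoding encoding)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(path, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Assumption for this demo: the CSV was saved as ISO-8859-1, not UTF-8.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string path = Path.GetTempFileName();
        try
        {
            // Header row plus one data row containing non-ASCII characters.
            File.WriteAllText(path, "name,city\ncaf\u00E9,Z\u00FCrich\n", latin1);

            string asUtf8 = ReadLines(path, Encoding.UTF8)[1];
            string asLatin1 = ReadLines(path, latin1)[1];

            // Decoding with the wrong encoding produces U+FFFD replacement characters.
            Console.WriteLine(asUtf8.IndexOf('\uFFFD') >= 0);
            // Decoding with the matching encoding recovers the original text.
            Console.WriteLine(asLatin1 == "caf\u00E9,Z\u00FCrich");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Both checks should print True. If Encoding.UTF8 still gives you diamonds, try the encoding the file was exported with instead.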

You might also want to look into FileHelpers for CSV parsing:

https://www.filehelpers.net/quickstart/

mfussenegger answered Jan 28 '26 11:01



