What is the encoding to read and write files with special characters such as en dash, left quotes, etc?

Tags: c#, file, encoding
I'm reading CSV files that contain special characters such as the en dash –, left double quotes “, and right double quotes ”, and I can't figure out how to read and write them correctly. I assumed the encoding was UTF-8 or Unicode, but the characters come out as a square or as a ? inside a diamond (I'm opening the files in Notepad++ to check). Is a different, more specific encoding needed? Here is the code I've been using so far; I've tried a few variations of it with different encodings:

string[] lines = File.ReadAllLines(filePathTxt.Text, Encoding.UTF8);
...
Stream s = new FileStream(filePath, FileMode.Append);
StreamWriter sw = new StreamWriter(s, Encoding.UTF8, 1000, true);

Input of:

Surveys – Public

Documents:,“A”

comes out as

Surveys � Public

Documents:,�A�

The problem also shows up in the debugger as soon as the file is read into the string array.

Edit: I've tried Unicode as well. I'm using Notepad++ on Windows 10. The problem is definitely in the read step, because if I add the following line to write a line of data manually:

 sw.WriteLine("Surveys – Public");

That line writes the dash fine, so the characters get corrupted on the initial read of the source CSV. I've tried reading with a few different encodings, and Notepad++ just reports the CSV as ANSI.

Brent Kilboy asked Nov 28 '25 19:11

1 Answer

Instead of:

StreamWriter sw = new StreamWriter(s, Encoding.UTF8, 1000, true);

use this:

StreamWriter sw = new StreamWriter(s, Encoding.Unicode, 1000, true);

I just tried it, and the output shows up correctly in Notepad++.

Here's the sample I used to test it:

        using (StreamWriter swClifor = new StreamWriter("test.txt", true, Encoding.Unicode))
        {
            string cString = "en dash –, left double quotes “, and right double quotes ”";
            swClifor.WriteLine(cString);
        }
Phil N DeBlanc answered Dec 01 '25 10:12
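
A note on the read side: the question reports that Notepad++ identifies the source CSV as ANSI and that the characters are already wrong in the debugger right after File.ReadAllLines. On a Western-locale Windows machine, "ANSI" usually means Windows-1252, where the en dash and the curly quotes are the single bytes 0x96, 0x93, and 0x94. Those bytes are not valid UTF-8 on their own, which would explain the replacement characters. The sketch below assumes that is what is happening here (the source does not confirm it). AnsiCsvReader and ReadWindows1252Lines are hypothetical names used only for illustration. It also assumes .NET Core 3.0 or .NET 5 and later, where code page 1252 has to be registered before use; on .NET Framework the RegisterProvider line should not be needed.

```csharp
using System;
using System.IO;
using System.Text;

public static class AnsiCsvReader
{
    // Hypothetical helper: reads a file that is assumed to be Windows-1252 ("ANSI").
    public static string[] ReadWindows1252Lines(string path)
    {
        // On .NET Core / .NET 5+ the legacy code pages must be registered first.
        // Registering the same provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding ansi = Encoding.GetEncoding(1252);
        return File.ReadAllLines(path, ansi);
    }

    public static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();

        // Build "Surveys – Public" as Windows-1252 bytes: the en dash is byte 0x96.
        byte[] sample = Encoding.ASCII.GetBytes("Surveys ? Public");
        sample[8] = 0x96;
        File.WriteAllBytes(source, sample);

        // Decode with the encoding the file was actually saved in.
        string[] lines = ReadWindows1252Lines(source);

        // Once decoded, the strings can be written in any Unicode encoding.
        File.WriteAllLines(target, lines, Encoding.UTF8);

        // Check that the en dash (U+2013) survived the round trip.
        string roundTrip = File.ReadAllLines(target, Encoding.UTF8)[0];
        Console.WriteLine(roundTrip == "Surveys \u2013 Public");

        File.Delete(source);
        File.Delete(target);
    }
}
```

If this assumption holds, the encoding passed to the StreamWriter matters less than the one used for reading: once the text is decoded correctly, either Encoding.UTF8 or Encoding.Unicode should be able to represent these characters on output.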