csv module returning a BOM for first column

Question

I have a csv file formatted like this:

type,type_mapping, style,style_mapping,Count
Residential,Residential,Antique,Antique,109
Antique,Residential,Antique,Antique,48
Apt/Garage,Commercial,Apt/Garage,Apartment,1

I am parsing it using the csv module in Python (version 3). Here is my code:

import csv

typeXref = dict()
with open('xref.csv') as csvData:
    csvRead = csv.reader(csvData)
    headers = next(csvRead)  # first row holds the column names

    # Map each data row to a dict keyed by column name
    for index, row in enumerate(csvRead):
        typeXref[index] = dict(zip(headers, row))

print(typeXref)

For some reason, the name of the first column in the header always comes back with the byte order mark \ufeff prepended.

408: {'\ufefftype': 'Residential', 'type_mapping': 'Residential', 
      ' style': 'Antique', 'style_mapping': 'Antique', 'Count': '109'}}

I assume this is due to the way I'm opening the file, reading the content with the csv module, or generating the file.

I can figure out how to decode that one field, but I would rather ensure I'm generating the file correctly and using the csv module properly.

Guillaume Lebreton · Accepted Answer

You have to tell open() that you are reading a UTF-8 file with a BOM:

with open('xref.csv', encoding='utf-8-sig') as csvData:
    ....

The utf-8-sig codec skips a leading BOM if one is present, so the first header comes back as 'type' instead of '\ufefftype'.
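To see the difference side by side, here is a minimal sketch. It assumes a throwaway file written to a temporary directory with a BOM in front; the sample rows and the read_headers helper are made up for illustration and are not part of the original question.

```python
import csv
import os
import tempfile

# Hypothetical sample data, loosely mirroring the file in the question.
SAMPLE = (
    "type,type_mapping,style,style_mapping,Count\n"
    "Residential,Residential,Antique,Antique,109\n"
)


def read_headers(path, encoding):
    """Return the header row of a CSV file opened with the given encoding."""
    with open(path, encoding=encoding, newline="") as csv_data:
        return next(csv.reader(csv_data))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "xref.csv")

    # Writing with 'utf-8-sig' puts a BOM at the start of the file,
    # which reproduces the situation described in the question.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(SAMPLE)

    with_bom = read_headers(path, "utf-8")         # BOM kept in first header
    without_bom = read_headers(path, "utf-8-sig")  # BOM skipped on read

print(repr(with_bom[0]))
print(repr(without_bom[0]))
```

Reading with plain utf-8 leaves the BOM glued to the first column name, while utf-8-sig drops it. Since utf-8-sig also reads files that have no BOM, it should be a safe choice when you are not sure how the file was produced.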

Tags:

python

python-3.x

csv

unicode

Dom DaFonte