 

Python SyntaxError: Non-UTF-8 [duplicate]

I converted my Python script to a Mac.app (via py2app). I try to run it and get the following error:

SyntaxError: Non-UTF-8 code starting with '\xcf' in file 
py2app/dist/myapp.app/Contents/MacOS/myapp on line 1, but no encoding declared; see 
http://python.org/dev/peps/pep-0263/ for details

I visited the PEP website and added the following to the first two lines of my script:

#!/usr/bin/python
# -*- coding: utf-8 -*-

I have also put my code into various online tools (such as this one) to check whether there are any non-UTF-8 characters but I'm not getting any issues.

I did copy some text from an Excel file; however, there were no special symbols that I was aware of.

The script is approximately 800 lines, so is there a way of identifying the problem that doesn't involve manually scanning the script line by line?

EDIT

Not exactly a fix, but converting my script into an executable instead of a .app has fixed the issue and it now runs correctly.

DDiran asked Sep 02 '25 10:09



1 Answer

Python 3 uses UTF-8 as the default source encoding, which simplifies working with code you get from the Internet (and from other packages). In UTF-8, \xcf is valid only as the first byte of a two-byte sequence, so it must be followed by a continuation byte (\x80 to \xbf), which is not the case here. That is what "Non-UTF-8 code starting with '\xcf'" means: the bytes beginning at \xcf are not a valid UTF-8 encoding of a code point.
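
To locate the offending bytes without reading the whole script by hand, something like the following might help. It is only a sketch: the function name find_non_utf8_lines and the sample bytes are made up for illustration, and it reports just the first bad byte on each line.

```python
def find_non_utf8_lines(data: bytes):
    """Return (line_number, column, byte_value) for the first invalid byte on each bad line."""
    hits = []
    # Splitting on b"\n" is safe: a newline byte never occurs inside a
    # valid multi-byte UTF-8 sequence.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first byte that could not be decoded.
            hits.append((lineno, err.start, line[err.start]))
    return hits


# Made-up sample: line 2 ends with a lone \xe9, line 3 contains a lone \xcf.
sample = b"x = 1\n# caf\xe9\ny = '\xcf'\n"
for lineno, column, value in find_non_utf8_lines(sample):
    print(f"line {lineno}, column {column}: byte 0x{value:02x}")
```

For a real file, pass its raw bytes, e.g. find_non_utf8_lines(open("myscript.py", "rb").read()), where myscript.py stands in for your own script.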

As noted in the comments, you can convert the file to UTF-8. In many cases you can ignore the original encoding, because such errors often come from comments (e.g. an author name). One way to convert it is through the encoding options in the Save As dialog of your original editor.
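
If you would rather convert the file from a script than from an editor, a minimal sketch could look like this. The helper names to_utf8_bytes and convert_file are hypothetical, and cp1252 is only a guess at the original encoding, so replace it with whatever your file really uses.

```python
from pathlib import Path


def to_utf8_bytes(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes with the guessed encoding and re-encode them as UTF-8."""
    # A wrong guess either raises UnicodeDecodeError or silently yields
    # the wrong characters, so the result still needs a manual check.
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of src to dst; the cp1252 default is only a guess."""
    Path(dst).write_bytes(to_utf8_bytes(Path(src).read_bytes(), source_encoding))
```

Writing to a separate destination file keeps the original untouched, which is useful if the guessed encoding turns out to be wrong.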

Alternatively, you can declare the encoding on the first or second line of your source file; see PEP 263 for how to do it. Note: Python looks for this declaration by matching a fixed byte pattern [because at that point it has no idea of the encoding], so copy the line exactly as it appears in that document. I think a line such as # -*- coding: latin-1 -*- should be OK, but it could misinterpret some characters, so test your program. If you do not know the original encoding, the easier way is to convert the original source to UTF-8 (because in any case you should check all strings in the source code to see whether you guessed the correct encoding).
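
To double-check which encoding Python will actually pick up from such a declaration, the standard library's tokenize.detect_encoding can be used. The snippet below is a small sketch; the helper name declared_encoding and the sample sources are made up for illustration.

```python
import io
import tokenize


def declared_encoding(source: bytes) -> str:
    """Return the source encoding Python detects from a BOM or a PEP 263 declaration."""
    # detect_encoding reads at most the first two lines through the readline callable.
    encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


with_cookie = b"#!/usr/bin/python\n# -*- coding: latin-1 -*-\nx = 1\n"
without_cookie = b"x = 1\n"
cookie_too_late = b"\n\n# -*- coding: latin-1 -*-\n"

print(declared_encoding(with_cookie))
print(declared_encoding(without_cookie))
print(declared_encoding(cookie_too_late))
```

Note that the name is normalized (latin-1 is reported as iso-8859-1), and a declaration placed on the third line is ignored, so the default utf-8 applies in that case.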

Giacomo Catenazzi answered Sep 04 '25 23:09
