
Preventing Python requests.post to encode strings to UTF-8

I am making an API call to an appliance, passing a message in a JSON payload via HTTP POST.

Despite not doing any character encoding, the string received is encoded in UTF-8.

Unfortunately, the appliance manufacturer requires no encoding for the message, and characters with accents are turned into 5-character codes :(

Here is the code:

import requests

payload = {
    "type": "send-message",
    "username": "myuser",
    "password": "mypass",
    "to": "456",
    "msg": "here are accents: é ç",
}

resp = requests.post("http://192.168.1.10/send_message.html", json=payload)

The result seen by the recipient doesn't show the accent characters correctly:

[image: received message]

Doing a tcpdump, I can see the HTTP POST made by requests.post contains the following payload:

{"type": "send-message", "username": "myuser", "password": "mypass", "to": "456", "msg": "here are accents: \u00e9 \u00e7"}

As you can see, the text has been encoded to UTF-8, which is not asked for anywhere in the code.
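A note on what the capture shows: the `\u00e9` sequences are JSON escape sequences, not UTF-8 bytes. As far as I know, `requests` serializes the `json=` argument with `json.dumps`, whose default `ensure_ascii=True` escapes every non-ASCII character. A minimal sketch with only the standard library (the shortened `payload` is just for illustration):

```python
import json

payload = {"msg": "here are accents: é ç"}

# Default settings: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(payload)
print(escaped)  # {"msg": "here are accents: \u00e9 \u00e7"}

# The escaped text is pure ASCII, so it contains no UTF-8 multi-byte sequences.
print(escaped.isascii())  # True

# A compliant JSON parser turns the escapes back into the original characters.
print(json.loads(escaped)["msg"])  # here are accents: é ç
```

So the payload is valid JSON; the problem only appears when the receiving side does not decode the escapes.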

If I try to force a decode with "here are accents: é ç".decode('utf-8'), I get the error AttributeError: 'str' object has no attribute 'decode', which makes sense because it's not encoded.

If I attempt to force ASCII: "here are accents: é ç".encode('ascii','ignore') then the accents will be lost.
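For reference, a short sketch of the difference between these two attempts. In Python 3 a `str` is already decoded text, so only `bytes` has `.decode()`. Encoding to ASCII with `'ignore'` silently drops anything outside ASCII, while encoding to UTF-8 keeps each accent as a two-byte sequence:

```python
text = "here are accents: é ç"

# 'ignore' silently drops characters that ASCII cannot represent.
ascii_bytes = text.encode("ascii", "ignore")
print(ascii_bytes)  # b'here are accents:  '

# UTF-8 keeps them: é becomes C3 A9 and ç becomes C3 A7.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'here are accents: \xc3\xa9 \xc3\xa7'

# bytes can be decoded back to str; str itself has no .decode() in Python 3.
print(utf8_bytes.decode("utf-8") == text)  # True
print(hasattr(text, "decode"))  # False
```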

Testing with CURL it works perfectly:

curl -X POST 'http://192.168.1.10/send_message.html' -H 'Content-Type: application/json' -d '{"type": "send-message","username": "myuser","password": "mypass","to": "456","msg": "here are accents: é ç" }'

Looking at the tcpdump of the curl attempt from the Linux CLI shows the JSON exactly as sent, and the appliance recognizes the accents and delivers them exactly as expected.

Imported into Wireshark, the string sent by curl, which is not UTF-8 formatted and is correctly interpreted, looks like this:

[image: Wireshark screenshot]

Is there a way to tell Python's requests.post NOT to translate to UTF-8, or do I have to re-code the HTTP POST?

Thank you so much in advance.

Love2Code asked Dec 14 '25 22:12


1 Answer

If for some reason the target API is not fully JSON-compliant, you can build the JSON request body manually and encode it in whatever encoding you like. The \uXXXX sequences in your capture are valid JSON escapes, which a compliant parser decodes back to the original characters. ensure_ascii=False will disable the translation of non-ASCII characters to escape codes, and you can specify the encoding if it is non-standard. The Wireshark screenshot shows the data is actually UTF-8-encoded, so that is what I've done below:

import requests
import json

payload = {
    "type": "send-message",
    "username": "myuser",
    "password": "mypass",
    "to": "456",
    "msg": "here are accents: é ç",
}

# data= sends the bytes as-is, so the Content-Type header must be set by hand.
headers = {'Content-Type': 'application/json'}
# ensure_ascii=False keeps é and ç as literal characters instead of \uXXXX escapes.
data = json.dumps(payload, ensure_ascii=False).encode('utf8')
resp = requests.post("http://192.168.1.10/send_message.html", data=data, headers=headers)
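To check what will go on the wire without sending anything to the appliance, the body can be inspected locally. This is a small sketch using only the standard library, with a shortened `payload`. The `latin-1` variant is purely a hypothetical illustration of choosing a non-standard encoding; nothing in the question suggests this appliance needs it:

```python
import json

payload = {"msg": "here are accents: é ç"}

# ensure_ascii=False keeps é and ç as literal characters in the JSON text.
text = json.dumps(payload, ensure_ascii=False)

# Encoding to UTF-8 produces the bytes that would be passed to requests via data=.
utf8_body = text.encode("utf-8")
print(utf8_body)  # b'{"msg": "here are accents: \xc3\xa9 \xc3\xa7"}'

# Hypothetical alternative for a device that expects single-byte Latin-1.
latin1_body = text.encode("latin-1")
print(latin1_body)  # b'{"msg": "here are accents: \xe9 \xe7"}'
```

The `C3 A9` and `C3 A7` byte pairs are what to look for in the tcpdump or Wireshark capture when comparing against the curl request.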
Mark Tolonen answered Dec 16 '25 16:12


