
Is PHP's json_encode guaranteed to produce ASCII string?

Tags: json, php, utf-8

Well, the subject says everything. I'm using json_encode to convert some UTF-8 data to JSON, and I need to transfer it to a layer that is currently ASCII-only. So I wonder whether I need to make that layer UTF-8 aware, or whether I can leave it as it is.

Looking at the JSON RFC, UTF-8 is also a valid charset in JSON output, although not recommended, i.e. some implementations can leave UTF-8 data inside. The question is whether PHP's implementation dumps everything as ASCII or opts to leave something as UTF-8.

Milan Babuškov asked Nov 24 '25 14:11



2 Answers

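Unlike JSON support in some other languages, json_encode() generates only ASCII by default: every non-ASCII character is written as a \uXXXX escape. Since PHP 5.4 you can opt out of that with the JSON_UNESCAPED_UNICODE flag, but unless you pass it, the output contains nothing other than ASCII.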
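
A minimal sketch to check this yourself. The sample data and the is_ascii() helper are made up for illustration, and the script assumes the source file is saved as UTF-8:

```php
<?php
// Sample UTF-8 input (illustrative values only).
$data = ['name' => 'Milan Babuškov', 'city' => 'Zürich'];

// Default behaviour: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($data);

// Opt-in raw UTF-8 output (flag available since PHP 5.4).
$raw = json_encode($data, JSON_UNESCAPED_UNICODE);

// Hypothetical helper: true when every byte is in the 7-bit ASCII range.
function is_ascii(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

echo $escaped, "\n";          // {"name":"Milan Babu\u0161kov","city":"Z\u00fcrich"}
var_dump(is_ascii($escaped)); // bool(true)
var_dump(is_ascii($raw));     // bool(false)
```

So as long as no flag such as JSON_UNESCAPED_UNICODE is passed, the encoded string should be safe to hand to an ASCII-only layer.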

Ignacio Vazquez-Abrams answered Nov 26 '25 03:11



According to the JSON article on Wikipedia, Unicode characters in strings are always "double-quoted Unicode with backslash escaping".

The examples in the PHP Manual on json_encode() seem to confirm this.

So any character outside ASCII should be escaped like this: \u00e9 for é (note, as @Ignacio points out in the comments, that this is the recommended way to deal with those characters, not a required one)

However, json_decode() converts the \uXXXX escapes back to their UTF-8 byte values, so whatever consumes the decoded data still has to cope with UTF-8. You may get in trouble there.
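
A short round-trip sketch of that point, using a made-up sample string and assuming the source file is saved as UTF-8:

```php
<?php
// Illustrative UTF-8 input; "š" occupies two bytes in UTF-8.
$original = 'Babuškov';

// Encoded form is pure ASCII: "Babu\u0161kov" (quotes included).
$json = json_encode($original);

// Decoding turns the \u0161 escape back into the two UTF-8 bytes.
$decoded = json_decode($json);

var_dump($decoded === $original); // bool(true)
var_dump(strlen($json));          // int(15)
var_dump(strlen($decoded));       // int(9)
```

The transport sees only ASCII, but the decoded value on the receiving side is UTF-8 again.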

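If you need to be sure, take a look at iconv(), which can convert your UTF-8 string into ASCII beforehand (dropping or transliterating unsupported characters, depending on whether you append //IGNORE or //TRANSLIT to the target charset; the exact result depends on the underlying iconv implementation).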

Pekka answered Nov 26 '25 03:11



