What does "Content-type: application/json; charset=utf-8" really mean?

Tags: mime-types, character-encoding

When I make a POST request with a JSON body to my REST service, I include Content-type: application/json; charset=utf-8 in the message header. Without this header, I get an error from the service. I can also successfully use Content-type: application/json without the ;charset=utf-8 portion.

What exactly does charset=utf-8 do? I know it specifies the character encoding, but the service works fine without it. Does this encoding limit the characters that can be in the message body?

asked Feb 13 '12 02:02

DenaliHardtail

2 Answers

The header just denotes what the content is encoded in. It is not necessarily possible to deduce the type of the content from the content itself, i.e. you can't necessarily just look at the content and know what to do with it. That's what HTTP headers are for: they tell the recipient what kind of content they're (supposedly) dealing with.

Content-type: application/json; charset=utf-8 designates the content to be in JSON format, encoded in the UTF-8 character encoding. Designating the encoding is somewhat redundant for JSON, since the default (only?) encoding for JSON is UTF-8. So in this case the receiving server apparently is happy knowing that it's dealing with JSON and assumes that the encoding is UTF-8 by default. That's why it works with or without the charset parameter.
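To make the two parts of the header value concrete, here is a minimal sketch of how a receiver might split it into a media type and a charset. It uses Python's standard email.message.Message to do the parsing; the helper name parse_content_type and the fallback to UTF-8 when no charset is given are assumptions for illustration, not something the service in the question is known to do.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased "type/subtype" part;
    # get_content_charset() returns the charset parameter, or None if absent.
    return msg.get_content_type(), msg.get_content_charset()


for value in ("application/json; charset=utf-8", "application/json"):
    media_type, charset = parse_content_type(value)
    # Assumed receiver behaviour: fall back to UTF-8 when no charset is sent.
    print(media_type, charset or "utf-8")
```

Both header values end up being treated as UTF-8 encoded JSON under that assumed fallback, which matches the behaviour described in the question.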

Does this encoding limit the characters that can be in the message body?

No. You can send anything you want in the header and the body. But if the two don't match, you may get wrong results. If you specify in the header that the content is UTF-8 encoded but you're actually sending Latin1 encoded content, the receiver may produce garbage data, trying to interpret Latin1 encoded data as UTF-8. Of course, if you specify that you're sending Latin1 encoded data and you're actually doing so, then yes, you're limited to the 256 characters you can encode in Latin1.
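The mismatch described above can be reproduced with plain byte strings, no HTTP involved. This is only a sketch: the sample text "café" and the variable names are made up for illustration, and a real receiver may handle decoding errors differently.

```python
text = "café"
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# Declared Latin-1, actually UTF-8: decoding succeeds but yields garbage.
mojibake = utf8_bytes.decode("latin-1")

# Declared UTF-8, actually Latin-1: a strict decoder rejects the bytes.
try:
    latin1_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

# A lenient decoder substitutes U+FFFD for the byte it cannot interpret.
lenient = latin1_bytes.decode("utf-8", errors="replace")

# Latin-1 has no code for the euro sign, so encoding it fails.
try:
    "€".encode("latin-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

print(repr(mojibake), strict_decode_failed, repr(lenient), euro_fits_latin1)
```

The last check illustrates the "limited to 256 characters" point: the restriction comes from the encoding actually used for the body, not from the header that names it.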

answered Sep 23 '22 21:09

deceze

To substantiate @deceze's claim that the default JSON encoding is UTF-8...

From IETF RFC4627:

JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.

Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets.
      00 00 00 xx  UTF-32BE
      00 xx 00 xx  UTF-16BE
      xx 00 00 00  UTF-32LE
      xx 00 xx 00  UTF-16LE
      xx xx xx xx  UTF-8
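The null-pattern check quoted from the RFC can be sketched in a few lines. The function name detect_json_encoding is hypothetical, and the sketch assumes what RFC 4627 assumes: the text starts with two ASCII characters (an object or array) and carries no byte order mark. Treating inputs shorter than four octets as UTF-8 is an additional assumption made here.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its first four octets."""
    if len(data) < 4:
        # Assumption: anything this short is treated as UTF-8.
        return "utf-8"
    # True where the octet is null, mirroring the 00 / xx table above.
    nulls = tuple(octet == 0 for octet in data[:4])
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    return patterns.get(nulls, "utf-8")           # xx xx xx xx


document = json.dumps({"name": "café"}, ensure_ascii=False)
for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    raw = document.encode(codec)  # the explicit -be/-le codecs add no BOM
    guessed = detect_json_encoding(raw)
    print(codec, "->", guessed, json.loads(raw.decode(guessed)))
```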

answered Sep 24 '22 21:09

Drew Noakes
