
What character encoding should I use for a HTTP header?

Tags:

http-headers

People also ask

What encoding do HTTP headers use?

HTTP header fields have historically been interpreted as ISO-8859-1 (nominally an extended version of ASCII that adds umlauts, diacritics, and other characters of Western European languages). The message body can use a different encoding, declared in the "Content-Type" header.

How do I set character encoding in HTTP header?

In PHP, call the header() function before generating any output, e.g.: header('Content-Type: text/html; charset=utf-8');

What characters are valid in HTTP header?

The value of the HTTP request header you want to set can only contain: Alphanumeric characters: a - z and A - Z. The following special characters: _ :;.,\/"'?!(){}[]@<>=-+*#$&`|~^%

Should HTTP headers be URL encoded?

Encoding untrusted or sensitive data before placing it in HTTP headers (and in other contexts such as database queries, URLs, and file paths) is a secure-coding best practice.


In short: Only ASCII is guaranteed to work. Some non-ASCII bytes are allowed for backwards compatibility, but are not supposed to be displayable.

HTTPbis gave up and specified that in the headers there is no useful encoding besides ASCII:

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.
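As a rough illustration of the quoted guidance, the sketch below sorts a raw field value into three buckets: plain ASCII, a value containing obs-text octets (0x80 to 0xFF, to be treated as opaque), or a value containing control characters. The function name `classify_field_value` and the three labels are made up for this example and do not come from the spec.

```python
def classify_field_value(raw: bytes) -> str:
    """Classify a header field value by the octets it contains.

    Returns "ascii" if every octet is printable US-ASCII, space, or tab,
    "obs-text" if octets 0x80-0xFF are present (opaque data per the quote),
    and "invalid" if any other control character appears.
    """
    has_obs_text = False
    for octet in raw:
        if octet >= 0x80:
            has_obs_text = True  # legal for compatibility, but not displayable
        elif octet == 0x09 or 0x20 <= octet <= 0x7E:
            continue  # tab, space, or printable ASCII
        else:
            return "invalid"  # bare control character or DEL
    return "obs-text" if has_obs_text else "ascii"


if __name__ == "__main__":
    print(classify_field_value(b"text/html; charset=utf-8"))
    print(classify_field_value("✰".encode("utf-8")))
```

A sender that wants to be safe would only emit values in the "ascii" bucket.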


Previously, RFC 2616 from 1999 defined this:

Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 [14].

and RFC 2047 is the MIME encoding, so it'd be:

=?UTF-8?Q?=E2=9C=B0?=

but I don't think that many (if any) clients support it.
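For anyone who wants to experiment with this form, Python's standard `email` package can produce and parse RFC 2047 encoded-words. The sketch below is only an illustration of the encoding itself; the helper names `encode_word` and `decode_word` are invented here, and it says nothing about whether a given HTTP client or server will accept the result.

```python
from email.charset import QP, Charset
from email.header import Header, decode_header


def encode_word(text: str) -> str:
    """Encode text as an RFC 2047 encoded-word using UTF-8 and "Q" encoding."""
    charset = Charset("utf-8")
    charset.header_encoding = QP  # ask for "Q" (quoted-printable) rather than base64
    return Header(text, charset).encode()


def decode_word(encoded: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded-words."""
    parts = []
    for data, charset in decode_header(encoded):
        if isinstance(data, bytes):
            parts.append(data.decode(charset or "ascii"))
        else:
            parts.append(data)  # plain, unencoded text comes back as str
    return "".join(parts)


if __name__ == "__main__":
    print(encode_word("✰"))
    print(decode_word("=?UTF-8?Q?=E2=9C=B0?="))
```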


Please read the comments first: this answer likely draws the wrong conclusions from the right sources and needs an edit.


You can use any printable ASCII characters, but no special characters such as ✰ (which is not ASCII).

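Tip: you can encode anything as JSON, as long as non-ASCII characters are written as \uXXXX escapes so that the output stays within ASCII.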
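A minimal sketch of the JSON tip, using Python's standard `json` module: with `ensure_ascii=True` (the default) every non-ASCII character is written as a `\uXXXX` escape, so the result contains only ASCII. The helper name `to_header_safe_json` is hypothetical, and whether a particular header's own syntax tolerates the quotes and braces JSON produces is a separate question this sketch does not address.

```python
import json


def to_header_safe_json(value) -> str:
    """Serialize a value to compact JSON containing only ASCII characters."""
    # ensure_ascii=True escapes every non-ASCII character as \uXXXX.
    return json.dumps(value, ensure_ascii=True, separators=(",", ":"))


if __name__ == "__main__":
    encoded = to_header_safe_json({"name": "✰"})
    print(encoded)              # {"name":"\u2730"}
    print(json.loads(encoded))  # the receiver recovers the original character
```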

Edit: it may not be obvious at first, but the character encoding declared in the header applies only to the response body, not to the header itself (otherwise it would cause a chicken-and-egg problem).
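To make the header/body split concrete, here is a small sketch in Python. The header block is decoded byte-for-byte as ISO-8859-1 (which cannot fail), and only the body is decoded with the charset named in Content-Type. The function names, the fallback charset of ISO-8859-1, and the simplified line splitting are assumptions made for this example, not a full HTTP parser.

```python
from email.message import Message


def body_charset(content_type: str, default: str = "iso-8859-1") -> str:
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def decode_response(header_bytes: bytes, body_bytes: bytes):
    """Decode header and body separately; the charset applies to the body only."""
    # Every octet maps to exactly one code point under ISO-8859-1.
    header_text = header_bytes.decode("iso-8859-1")
    content_type = "application/octet-stream"
    for line in header_text.split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type":
            content_type = value.strip()
    return header_text, body_bytes.decode(body_charset(content_type))


if __name__ == "__main__":
    headers = b"Content-Type: text/plain; charset=utf-8"
    _, body = decode_response(headers, "✰".encode("utf-8"))
    print(body)
```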


I'd like to sum up all the relevant definitions as per the spec linked by Penchant.

message-header = field-name ":" [ field-value ]
field-name     = token
field-value    = *( field-content | LWS )

So, we are after field-value.

LWS            = [CRLF] 1*( SP | HT )
CRLF           = CR LF
CR             = <US-ASCII CR, carriage return (13)>
LF             = <US-ASCII LF, linefeed (10)>
SP             = <US-ASCII SP, space (32)>
HT             = <US-ASCII HT, horizontal-tab (9)>

LWS stands for Linear White Space. Essentially, LWS is Space or Tab, but you can break your field-value into multiple lines by starting a new line before a Space or Tab.
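The folding rule can be sketched as a single substitution: a CRLF followed by one or more spaces or tabs is collapsed into one space, which RFC 2616 permits a recipient to do. The name `unfold` is hypothetical. Note that the later RFC 7230 deprecates this kind of line folding, so this is mainly useful for reading old messages.

```python
import re

# CRLF followed by at least one space or tab marks a folded continuation line.
_FOLD = re.compile(rb"\r\n[ \t]+")


def unfold(raw_value: bytes) -> bytes:
    """Replace each folded line break in a field value with a single space."""
    return _FOLD.sub(b" ", raw_value)


if __name__ == "__main__":
    print(unfold(b"text/html;\r\n\tcharset=utf-8"))
```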

Let's simplify it to this:

field-value    = <any field-content or Space or Tab>

Now we are after field-content.

field-content  = <the OCTETs making up the field-value
                 and consisting of either *TEXT or combinations
                 of token, separators, and quoted-string>
OCTET          = <any 8-bit sequence of data>
TEXT           = <any OCTET except CTLs,
                 but including LWS>
CTL            = <any US-ASCII control character
                 (octets 0 - 31) and DEL (127)>
token          = 1*<any CHAR except CTLs or separators>
separators     = "(" | ")" | "<" | ">" | "@"
                 | "," | ";" | ":" | "\" | <">
                 | "/" | "[" | "]" | "?" | "="
                 | "{" | "}" | SP | HT

TEXT is the most general production and includes all the rest, so the others can be ignored here. The US-ASCII charset (= ASCII) covers octets 0 to 127, of which 32 to 126 are printable.

All printable ASCII characters are therefore allowed.
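The grammar above translates fairly directly into two checks, sketched here in Python: one for `token` (used for the field-name) and one for `TEXT` (used for the field-value, assuming any folded lines have already been joined). The function names are invented for this example.

```python
# The separators listed in the grammar, as a set of octet values.
SEPARATORS = set(b'()<>@,;:\\"/[]?={} \t')


def is_token(name: bytes) -> bool:
    """token = 1*<any CHAR except CTLs or separators>"""
    return len(name) > 0 and all(
        0x21 <= octet <= 0x7E and octet not in SEPARATORS for octet in name
    )


def is_text(value: bytes) -> bool:
    """TEXT = any OCTET except CTLs; HT is admitted as part of LWS."""
    return all(
        octet == 0x09 or (octet >= 0x20 and octet != 0x7F) for octet in value
    )


if __name__ == "__main__":
    print(is_token(b"Content-Type"), is_token(b"Content Type"))
    print(is_text(b"text/html; charset=utf-8"), is_text(b"a\x00b"))
```

Under this reading of the grammar, octets 128 to 255 also pass `is_text`, since TEXT only excludes control characters; the grammar alone does not say how a recipient should interpret them, which is why sticking to printable ASCII is the safe choice.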

