
Receiving HTTP headers

For educational purposes I'm writing an HTTP server in C++. When receiving a request, how do I know when the client has finished sending headers? Is there a requirement that all headers be sent in one shot? What if a client sends G, then after 5 seconds E, then T, and so on? Should I wait for a timeout and just close the connection if it takes too long? Should I start parsing as soon as I get the first bytes, so I can tell early whether the request is invalid?

I know there are a lot of libraries for this; I'm just reinventing the wheel to better understand how the Web works at different layers, and I can't find how they handle exactly this question.

RocketR asked Oct 16 '25 14:10



2 Answers

According to the HTTP/1.1 RFC (RFC 2616, section 4.1):

    generic-message = start-line
                      *(message-header CRLF)
                      CRLF
                      [ message-body ]
    start-line      = Request-Line | Status-Line

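After the last header line there is an extra CRLF, i.e. an empty line. So once you encounter the sequence CRLF CRLF ("\r\n\r\n"), the headers are complete and the body (if any) starts. Nothing requires the client to send everything in one shot, so this sequence may arrive split across several reads.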
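
As a rough illustration of scanning for that terminator while data trickles in, here is a minimal sketch. The class name `HeaderAccumulator` and its `feed` method are made up for this example; it assumes you append each chunk returned by `recv()` yourself and it does not enforce a maximum header size, which a real server would want.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from any library): collects raw bytes from
// successive reads and reports where the header section ends.
class HeaderAccumulator {
public:
    HeaderAccumulator() : end_(std::string::npos) {}

    // Append a newly received chunk. Returns the offset of the first byte
    // after "\r\n\r\n" once it has been seen, or std::string::npos if more
    // data is still needed.
    std::size_t feed(const std::string& chunk) {
        if (end_ != std::string::npos) {
            buffer_.append(chunk);  // headers already complete: body bytes
            return end_;
        }
        // The terminator may straddle two chunks, so re-scan starting up to
        // three bytes before the previous end of the buffer.
        std::size_t from = buffer_.size() >= 3 ? buffer_.size() - 3 : 0;
        buffer_.append(chunk);
        std::size_t pos = buffer_.find("\r\n\r\n", from);
        if (pos != std::string::npos) {
            end_ = pos + 4;
        }
        return end_;
    }

    const std::string& buffer() const { return buffer_; }

private:
    std::string buffer_;
    std::size_t end_;
};
```

Re-scanning only the tail keeps the search cheap even if the client sends one byte at a time, and anything after the returned offset is the start of the body.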

Concerning the timeout: you could start parsing as soon as you receive characters (wait for a CRLF so you know a line has been completed), and if the request takes longer than 5 seconds or so, send back a 408 Request Timeout.
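
One way to apply such a deadline is sketched below, assuming a POSIX system and a blocking, already-accepted socket. The function name `read_headers` and the `ReadResult` enum are invented for this example. The budget covers the whole header section rather than each read, so a client sending G, then E, then T cannot keep the connection open indefinitely.

```cpp
#include <cerrno>
#include <chrono>
#include <cstddef>
#include <string>

#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

enum class ReadResult { Complete, Timeout, Closed, Error };

// Hypothetical helper: reads from fd into buffer until the blank line that
// ends the headers arrives, or until the overall time budget is used up.
ReadResult read_headers(int fd, std::string& buffer,
                        std::chrono::milliseconds budget) {
    typedef std::chrono::steady_clock clock;
    const clock::time_point deadline = clock::now() + budget;
    char chunk[4096];

    for (;;) {
        // Time left until the deadline for the whole header section.
        std::chrono::milliseconds remaining =
            std::chrono::duration_cast<std::chrono::milliseconds>(
                deadline - clock::now());
        if (remaining.count() <= 0) return ReadResult::Timeout;

        pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        int ready = poll(&pfd, 1, static_cast<int>(remaining.count()));
        if (ready == 0) return ReadResult::Timeout;
        if (ready < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return ReadResult::Error;
        }

        ssize_t n = recv(fd, chunk, sizeof chunk, 0);
        if (n == 0) return ReadResult::Closed;  // peer closed the connection
        if (n < 0) return ReadResult::Error;

        // Re-scan the tail in case "\r\n\r\n" straddles two reads.
        std::size_t from = buffer.size() >= 3 ? buffer.size() - 3 : 0;
        buffer.append(chunk, static_cast<std::size_t>(n));
        if (buffer.find("\r\n\r\n", from) != std::string::npos) {
            return ReadResult::Complete;
        }
    }
}
```

On `ReadResult::Timeout` the caller would typically write the 408 response and close the socket. A production server would more likely use non-blocking sockets with an event loop, but the deadline logic stays the same.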

Femaref answered Oct 18 '25 08:10



There are two parts to this answer.

Firstly, the issue of delays and timeouts: you should indeed deal with timeouts, as it's generally not possible to detect whether a TCP connection is broken. There is more on this topic in this question: TCP socket in Unix - notify server I am done sending

Secondly, the format of an HTTP request is defined (in RFC 2616, section 5) as follows:

    Request       = Request-Line              ; Section 5.1
                    *(( general-header        ; Section 4.5
                     | request-header         ; Section 5.3
                     | entity-header ) CRLF)  ; Section 7.1
                    CRLF
                    [ message-body ]          ; Section 4.3

Essentially, you get the request line (for example GET /index.html HTTP/1.1), followed by zero or more header lines (without empty lines between them). The list of headers then ends with an empty line. All line endings are represented with CRLF ("\r\n").
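
To make the layout concrete, here is a minimal parsing sketch for the header section once it has been fully received. The `Request` struct and `parse_head` function are names made up for this example. It simplifies on purpose: header names are lower-cased for lookup, a repeated header overwrites the earlier value, and folded (multi-line) header values are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical container for the parsed request head.
struct Request {
    std::string method;
    std::string target;
    std::string version;
    std::map<std::string, std::string> headers;  // names stored lower-cased
};

// Parses everything up to and including the empty line that ends the
// headers. Returns false if the input is malformed or incomplete.
bool parse_head(const std::string& head, Request& out) {
    const std::size_t npos = std::string::npos;

    // Request line: METHOD SP request-target SP HTTP-version
    std::size_t eol = head.find("\r\n");
    if (eol == npos) return false;
    const std::string line = head.substr(0, eol);
    std::size_t sp1 = line.find(' ');
    std::size_t sp2 = line.rfind(' ');
    if (sp1 == npos || sp1 == sp2) return false;
    out.method = line.substr(0, sp1);
    out.target = line.substr(sp1 + 1, sp2 - sp1 - 1);
    out.version = line.substr(sp2 + 1);
    if (out.method.empty() || out.target.empty() ||
        out.version.compare(0, 5, "HTTP/") != 0) {
        return false;
    }

    // Header lines: "Name: value", terminated by an empty line.
    std::size_t pos = eol + 2;
    for (;;) {
        eol = head.find("\r\n", pos);
        if (eol == npos) return false;  // no terminating empty line
        if (eol == pos) return true;    // empty line: end of headers

        std::size_t colon = head.find(':', pos);
        if (colon == npos || colon >= eol || colon == pos) return false;

        std::string name = head.substr(pos, colon - pos);
        for (std::size_t i = 0; i < name.size(); ++i) {
            name[i] = static_cast<char>(
                std::tolower(static_cast<unsigned char>(name[i])));
        }

        // Trim optional spaces and tabs around the value.
        std::size_t vb = colon + 1;
        std::size_t ve = eol;
        while (vb < ve && (head[vb] == ' ' || head[vb] == '\t')) ++vb;
        while (ve > vb && (head[ve - 1] == ' ' || head[ve - 1] == '\t')) --ve;

        out.headers[name] = head.substr(vb, ve - vb);
        pos = eol + 2;
    }
}
```

Because the function returns false as soon as a line is malformed, the same checks could also be run line by line while data is still arriving, which lets the server reject a bad request early.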

In addition to this, some requests also have a body (typically those using POST or PUT). If the request has a message body, its length will be given either by the Content-Length header or using delimiters via chunked transfer encoding.
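
A sketch of that decision, made once the headers are parsed, might look like the following. The names `BodyKind`, `BodyFraming` and `body_framing` are invented here, and the function assumes the header names in the map have already been lower-cased. As a simplification it only recognizes a Transfer-Encoding value of exactly `chunked` and reports anything else as invalid.

```cpp
#include <cctype>
#include <cstddef>
#include <limits>
#include <map>
#include <string>

enum class BodyKind { None, Fixed, Chunked, Invalid };

// Hypothetical result type: how the request body is delimited.
struct BodyFraming {
    BodyKind kind;
    std::size_t length;  // only meaningful when kind == BodyKind::Fixed
};

// Assumes header names are lower-cased keys of the map.
BodyFraming body_framing(const std::map<std::string, std::string>& headers) {
    // Transfer-Encoding takes precedence over Content-Length.
    std::map<std::string, std::string>::const_iterator te =
        headers.find("transfer-encoding");
    if (te != headers.end()) {
        std::string value = te->second;
        for (std::size_t i = 0; i < value.size(); ++i) {
            value[i] = static_cast<char>(
                std::tolower(static_cast<unsigned char>(value[i])));
        }
        if (value == "chunked") return {BodyKind::Chunked, 0};
        return {BodyKind::Invalid, 0};  // simplification: nothing else supported
    }

    std::map<std::string, std::string>::const_iterator cl =
        headers.find("content-length");
    if (cl == headers.end()) return {BodyKind::None, 0};  // no body announced

    // Accept decimal digits only and guard against overflow.
    const std::string& text = cl->second;
    if (text.empty()) return {BodyKind::Invalid, 0};
    const std::size_t max = std::numeric_limits<std::size_t>::max();
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] < '0' || text[i] > '9') return {BodyKind::Invalid, 0};
        std::size_t digit = static_cast<std::size_t>(text[i] - '0');
        if (n > (max - digit) / 10) return {BodyKind::Invalid, 0};
        n = n * 10 + digit;
    }
    return {BodyKind::Fixed, n};
}
```

With `BodyKind::Fixed` the server reads exactly `length` more bytes after the empty line; with `BodyKind::Chunked` it has to decode chunk by chunk until the zero-length chunk.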

Bruno answered Oct 18 '25 10:10



