I have an API for a file upload that expects a multipart form submission. But I have a customer writing a client, and his system can't properly generate a multipart/form-data request. He's asking that I modify my API to accept the file in an application/x-www-form-urlencoded request, with the filename in one key/value pair and the contents of the file, base64 encoded, in another key/value pair.
In principle I can easily do this (tho I need a shower afterwards), but I'm worried about size limits. The files we expect in Production will be fairly large: 5-10MB, sometimes up to 20MB. I can't find anything that tells me about length limitations on individual key/value pair data inside a form POST, either in specs (I've looked at, among others, the HTTP spec and the Forms spec) or in a specific implementation (my API runs on a Java application server, Jetty, with an Apache HTTP server in front of it).
What is the technical and practical limit for an individual value in a key/value pair in a form POST?
There are artificial, configurable limits present on the HttpConfiguration class, both for the maximum number of keys and for the maximum size of the request body content.
In practical terms, this is a really bad idea.
You'll have a String, which uses 2 bytes per character for the Base64 data (on JVMs that store strings as UTF-16 char arrays, i.e. without the compact strings of Java 9+). And you have the typical 33% overhead of Base64 itself.
They'll also have to urlencode (as UTF-8) the Base64 string to escape various special characters. For example, "+" has meaning in Base64 but means a space " " in urlencoded form, so they'll need to encode that "+" as "%2B".
So for a 20MB file you'll have ...
20,971,520 bytes of raw data, represented as roughly 27,892,122 characters in raw Base64 (20,971,520 × 1.33), using (on average) about 29,286,728 characters when urlencoded, which will use 58,573,456 bytes of memory in its String form.
The decoding process on Jetty will take the incoming raw urlencoded bytes and allocate 2x that size in a String before decoding the urlencoded form. So that's a 58,573,456-length java.lang.String (which uses 117,146,912 bytes of heap memory, and don't forget the 29MB of ByteBuffer data being held too!) just to decode that Base64 binary file as a value in an x-www-form-urlencoded form.
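To check the arithmetic above, here is a small sketch that computes the exact Base64 length and measures the urlencoding expansion on a random sample. The class and method names are made up for illustration, and `URLEncoder.encode(String, Charset)` assumes Java 10 or later. The exact padded Base64 length is 4 × ceil(n / 3), which gives 27,962,028 characters for 20,971,520 bytes, slightly more than the 1.33× estimate.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Random;

// Hypothetical helper, not part of Jetty or any library.
class FormSizeEstimate {

    /** Exact length of padded Base64 output for rawBytes input bytes. */
    static long base64Length(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    /** Length of the Base64 text of the sample after form urlencoding. */
    static int urlEncodedBase64Length(byte[] sample) {
        String base64 = Base64.getEncoder().encodeToString(sample);
        // URLEncoder turns '+', '/' and '=' into %2B, %2F and %3D (3 chars each).
        return URLEncoder.encode(base64, StandardCharsets.UTF_8).length();
    }

    public static void main(String[] args) {
        long raw = 20L * 1024 * 1024;      // 20,971,520 bytes
        long base64 = base64Length(raw);   // 27,962,028 characters

        // Estimate the urlencoding expansion from a 3 MB random sample
        // (random data is an assumption; real files may differ).
        byte[] sample = new byte[3 * 1024 * 1024];
        new Random(42).nextBytes(sample);
        double expansion =
                (double) urlEncodedBase64Length(sample) / base64Length(sample.length);

        System.out.printf("raw bytes:         %,d%n", raw);
        System.out.printf("base64 characters: %,d%n", base64);
        System.out.printf("urlencoded (est.): %,d%n", Math.round(base64 * expansion));
    }
}
```

For uniformly random input, about 2 in every 64 Base64 characters are "+" or "/", so the measured expansion should land near 6%, in the same range as the average used above.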
I would push back and force them to use multipart/form-data properly. There are tons of good libraries that generate that form-data correctly.
If they are using Java, tell them to use the httpmime library from the Apache HttpComponents project (they don't have to have/use/install Apache HttpClient to use httpmime; it's a standalone library).
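For reference, this is roughly what such a library produces on the wire. The sketch below builds a single-file multipart/form-data body by hand using only the JDK. It is not the httpmime API; the class name, method names and parameters are hypothetical. It assumes the boundary string never occurs inside the file content and that the filename contains no quotes or line breaks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the wire format; a real client should prefer a library.
class MultipartBody {

    /** Builds a multipart/form-data body containing a single file part. */
    static byte[] build(String boundary, String fieldName, String filename, byte[] content)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + filename + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n"
                + "\r\n";
        out.write(head.getBytes(StandardCharsets.UTF_8));
        out.write(content); // raw bytes: no Base64, no urlencoding
        out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    /** Value for the request's Content-Type header. */
    static String contentType(String boundary) {
        return "multipart/form-data; boundary=" + boundary;
    }
}
```

The framing adds a fixed number of bytes regardless of file size, which is the main contrast with the Base64-in-a-form approach. For 20MB files a real client would stream the parts rather than build the whole body in memory as this sketch does.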
Alternative Approach
There's nothing saying you have to use application/x-www-form-urlencoded or multipart/form-data.
Offer a raw upload option via application/octet-stream.
They use POST, and MUST include the following valid request headers ...
Connection: close
Content-Type: application/octet-stream
Content-Length: <whatever_size_the_content_is>
Connection: close indicates when the HTTP exchange is complete.
Content-Type: application/octet-stream means Jetty will not process that content as request parameters and will not apply charset translations to it.
Content-Length is required to ensure that the entire file is sent/received.
Then they just stream the raw binary bytes to you.
This is just for the file contents. If you have other information that needs to be passed in (such as the filename), consider using either query parameters for that, or a custom request header (e.g. X-Filename: secretsauce.doc).
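On the client side, the raw upload described above could look something like this sketch, which uses only the JDK's `HttpURLConnection`. The class name, the endpoint and the `X-Filename` header are illustrative assumptions, not an agreed API. `HttpURLConnection` normally manages the Connection header itself, so it is worth confirming on the wire that `Connection: close` is really sent.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical client sketch for the raw application/octet-stream upload.
class RawUploadClient {

    /** Configures (but does not yet open) a streaming POST of `size` bytes. */
    static HttpURLConnection prepare(URI endpoint, long size, String filename)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) endpoint.toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // Sends Content-Length: <size> and streams the body instead of buffering it.
        conn.setFixedLengthStreamingMode(size);
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        conn.setRequestProperty("Connection", "close");
        conn.setRequestProperty("X-Filename", filename); // custom header, as suggested above
        return conn;
    }

    /** Streams the file to the endpoint and returns the HTTP status code. */
    static int upload(URI endpoint, Path file) throws IOException {
        HttpURLConnection conn =
                prepare(endpoint, Files.size(file), file.getFileName().toString());
        try (OutputStream out = conn.getOutputStream()) {
            Files.copy(file, out); // raw bytes, no encoding step
        }
        return conn.getResponseCode();
    }
}
```

A call might look like `RawUploadClient.upload(URI.create("http://localhost:8080/upload"), Path.of("secretsauce.doc"))`, where the URL is a placeholder.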
On your servlet, you just use HttpServletRequest.getInputStream() to obtain those bytes, and you use the value of the Content-Length header to verify that you received the entire file.
Optionally, you can make them provide a SHA1 hash in the request headers, like X-Sha1Sum: bed0213d7b167aa9c1734a236f798659395e4e19, which you then use on your side to verify that the entire file was sent/received properly.
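The length and hash checks can be sketched with JDK classes only, so the example stays independent of the servlet API. In a servlet you would presumably pass `request.getInputStream()`, `request.getContentLengthLong()` and `request.getHeader("X-Sha1Sum")` as the first three arguments. The class name and the `sink` parameter (wherever the file is being written) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical server-side helper; not part of Jetty or the servlet API.
class UploadVerifier {

    /**
     * Copies the request body to `sink` in small chunks and returns true only
     * if the byte count and the SHA-1 digest both match what the client declared.
     */
    static boolean receive(InputStream body, long expectedLength, String expectedSha1Hex,
                           OutputStream sink) throws IOException, NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        long total = 0;
        byte[] buffer = new byte[8192];
        // DigestInputStream updates the digest as bytes are read.
        try (DigestInputStream in = new DigestInputStream(body, sha1)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
                total += n;
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        return total == expectedLength
                && hex.toString().equalsIgnoreCase(expectedSha1Hex);
    }
}
```

Because the body is copied through an 8 KB buffer, heap use stays flat no matter how large the file is, unlike the form-encoded variant.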