What would be the recommended maximum size for a job payload?
As a specific example, is an HTML document of 500 KB to 1 MB of text too large to be passed in a job payload?
Since Sidekiq is backed by Redis I'd say 512 MB, but I wonder if there's a limitation on the Sidekiq side of things.
See this article: you should make your job parameters small and simple. Store only simple identifiers, and look up the objects once you actually need them in your perform method.
Because job arguments have to be serialized and deserialized, adding the HTML content to the job carries an extra cost. Instead, save the HTML content as a stored string or in some container, and send only the string ID or container ID to Redis, for efficiency and simplicity.
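A minimal sketch of this pattern in plain Ruby, so it runs without the Sidekiq gem. `HtmlStore` and `ParseHtmlJob` are hypothetical names, and the in-memory hash stands in for a database or other store. In a real application the job class would include Sidekiq's worker module and be enqueued with `perform_async(document_id)`.

```ruby
require "securerandom"

# Hypothetical store standing in for a database table or key-value store.
class HtmlStore
  @documents = {}

  class << self
    # Saves the HTML and returns a small identifier for it.
    def save(html)
      id = SecureRandom.uuid
      @documents[id] = html
      id
    end

    # Looks the HTML up again by its identifier.
    def find(id)
      @documents.fetch(id)
    end
  end
end

# Hypothetical job class. With Sidekiq it would include the worker module.
class ParseHtmlJob
  # The job receives only the identifier, never the document itself.
  def perform(document_id)
    html = HtmlStore.find(document_id)
    html.bytesize # placeholder for the real work
  end
end

html = "<html><body>#{'lorem ipsum ' * 50_000}</body></html>"
document_id = HtmlStore.save(html)

# Only this short string would travel through Redis.
result = ParseHtmlJob.new.perform(document_id)
```

The job argument stays at a few dozen bytes no matter how large the document grows.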
When you call perform_async, Sidekiq makes a Hash of the arguments, worker class, and other information the job will need. Then it serializes the Hash as JSON (collectively the "payload") and pushes it into Redis with lpush. See A Tour of the Sidekiq API by the author of Sidekiq for more.
The only limits are Redis's maximum string size, Ruby's maximum String size, and your Redis memory. For a Ruby String, that's 2 GB on a 32-bit build; on 64-bit you'll run out of memory first.
For Redis, that is 512 MB. Note that's 512 MB after the content is serialized as JSON. If the content is mostly text, serialization adds only a small amount. If it's binary data, for example if you compress the text, the serialized form could be significantly larger than the raw bytes.
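To get a feel for what ends up in Redis, the sketch below builds an approximation of that Hash and serializes it with Ruby's standard `json` library. `build_payload` is a hypothetical helper, and the keys are a simplified subset; the payload Sidekiq really builds carries more fields.

```ruby
require "json"
require "securerandom"

# Approximates the Hash built for a job. This is an illustrative subset of
# fields, not Sidekiq's actual implementation.
def build_payload(worker_class, args)
  {
    "class" => worker_class,
    "args" => args,
    "jid" => SecureRandom.hex(12),
    "retry" => true,
    "created_at" => Time.now.to_f
  }
end

# One job that passes an ID, one that passes roughly 1 MB of HTML.
small = JSON.generate(build_payload("ParseHtmlJob", [42]))
large = JSON.generate(build_payload("ParseHtmlJob", ["<p>text</p>" * 100_000]))

puts "ID payload:   #{small.bytesize} bytes"
puts "HTML payload: #{large.bytesize} bytes"
```

The whole serialized string, arguments included, is what gets pushed onto the queue, so the argument size dominates the payload size.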
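The difference between text and binary can be measured with a small sketch. It assumes binary data would be Base64-encoded before being placed in the arguments, since arbitrary bytes are generally not valid UTF-8 for JSON; that encoding step is an assumption of this example, not something Sidekiq does for you.

```ruby
require "json"
require "zlib"

text = "<p>hello world</p>" * 1_000
compressed = Zlib::Deflate.deflate(text)

# Plain text passes through JSON almost unchanged: here only the
# surrounding quotes and array brackets are added.
json_text = JSON.generate([text])

# Binary data is Base64-encoded first ("m0" is strict Base64, no newlines),
# which grows it by about one third before JSON even sees it.
encoded = [compressed].pack("m0")
json_binary = JSON.generate([encoded])

puts "text:       #{text.bytesize} -> #{json_text.bytesize} bytes as JSON"
puts "compressed: #{compressed.bytesize} -> #{json_binary.bytesize} bytes as JSON"
```

Compression may still be a net win for repetitive text, but the size that counts against the Redis limit is the encoded one, not the raw compressed one.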
What would be the recommended maximum size for a job payload?
As small as possible. Large payloads require expensive JSON serialization and deserialization, and they consume both Redis and worker memory, risking out-of-memory errors.
Instead of sending the content of a file, store the file somewhere the worker can access. This could be a shared disk or an S3 bucket. Send only what is needed for the worker to retrieve the file.
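A rough sketch of the shared-disk variant, using only the Ruby standard library. `SHARED_DIR`, `stash_html`, and `process_html` are hypothetical names, and `Dir.tmpdir` stands in for a volume that both the web process and the worker can reach.

```ruby
require "tmpdir"
require "securerandom"

# Hypothetical shared directory; Dir.tmpdir stands in for a mounted volume.
SHARED_DIR = Dir.tmpdir

# Producer side: write the content out and return only its location.
def stash_html(html)
  path = File.join(SHARED_DIR, "job-#{SecureRandom.hex(8)}.html")
  File.write(path, html)
  path
end

# Worker side: read the content from the path, then clean up the file.
def process_html(path)
  html = File.read(path)
  html.scan(/<p>/).size # placeholder for the real work
ensure
  File.delete(path) if File.exist?(path)
end

# The path is the only thing that would be passed as a job argument.
path = stash_html("<p>one</p><p>two</p>")
paragraphs = process_html(path)
```

With S3 the shape is the same: the job argument would be a bucket and key rather than a local path.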
See Best Practices in the Sidekiq wiki.