
How can I monitor stalled tasks?

I am running a Rust app with Tokio in prod. In the last version I had a bug, and some requests caused my code to go into an infinite loop.

While a task that got into the loop was stuck, all the other tasks continued to work well and process requests. That lasted until the number of stalled tasks was high enough to make my program unresponsive.

My problem is that it took our monitoring systems a long time to identify that something had gone wrong. For example, the task that answers Kubernetes' health checks kept working well, so I wasn't able to tell that I had stalled tasks in my system.

So my question is whether there's a way to identify and alert on such cases.

If I could find a way to define a timeout on a task, so that a task that does not return to the scheduler within X seconds/millis is marked as stalled, that would be a good enough solution for me.

Eyal leshem asked Nov 08 '25 10:11


2 Answers

Using tracing might be an option here: following issue 2655, every tokio task should have a span. Alongside tracing-futures, this means you should get a tracing event every time a task is entered or suspended (see this example). By adding the relevant data (e.g. task id / request id / ...), you should then be able to feed this information to an analysis tool in order to know:

  • that a task is blocked (was resumed then never suspended again)
  • if you add your own spans, that a "userland" span was never exited / closed, which might mean it's stuck in a non-blocking loop (which is also an issue though somewhat less so)

I think that's about the extent of it: as noted by issue 2510, tokio doesn't yet use the tracing information it generates and so provides no "built-in" introspection facilities.
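
To make the "analysis tool" part concrete, here is a minimal sketch of the bookkeeping such a tool could do. It uses only the standard library and does not touch tokio or tracing at all: `PollEvent` and `StallDetector` are hypothetical names invented for this illustration, and it assumes you have already turned your enter/exit tracing events into values carrying a task id. A task that was entered and has not exited for longer than a threshold is reported as stalled.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical event shape; a real tracing pipeline would emit its own format.
#[derive(Debug, Clone, Copy)]
enum PollEvent {
    /// The task started running (its span was entered).
    Entered { task_id: u64, at: Instant },
    /// The task yielded back to the scheduler (its span was exited).
    Exited { task_id: u64 },
}

/// Tracks tasks that were entered and have not exited yet.
#[derive(Default)]
struct StallDetector {
    in_poll: HashMap<u64, Instant>,
}

impl StallDetector {
    /// Feed one enter/exit event into the detector.
    fn record(&mut self, event: PollEvent) {
        match event {
            PollEvent::Entered { task_id, at } => {
                self.in_poll.insert(task_id, at);
            }
            PollEvent::Exited { task_id } => {
                self.in_poll.remove(&task_id);
            }
        }
    }

    /// Ids of tasks that, as of `now`, have been running without yielding
    /// for longer than `threshold`. Sorted for stable output.
    fn stalled(&self, now: Instant, threshold: Duration) -> Vec<u64> {
        let mut ids: Vec<u64> = self
            .in_poll
            .iter()
            .filter(|(_, entered)| now.duration_since(**entered) > threshold)
            .map(|(id, _)| *id)
            .collect();
        ids.sort_unstable();
        ids
    }
}

fn main() {
    let start = Instant::now();
    let mut detector = StallDetector::default();

    // Task 1 is entered and yields; task 2 is entered and never yields.
    detector.record(PollEvent::Entered { task_id: 1, at: start });
    detector.record(PollEvent::Exited { task_id: 1 });
    detector.record(PollEvent::Entered { task_id: 2, at: start });

    // Pretend five seconds have passed and check with a one second threshold.
    let now = start + Duration::from_secs(5);
    let stalled = detector.stalled(now, Duration::from_secs(1));
    println!("stalled tasks: {:?}", stalled); // prints "stalled tasks: [2]"
}
```

The check has to run somewhere that cannot itself be starved by the stuck tasks, for example outside the process or on a dedicated thread, otherwise it suffers from the same problem as the health check in the question.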

Masklinn answered Nov 11 '25 09:11


Tokio Console is a monitoring solution built by the Tokio team. It can be used to monitor for stalled tasks among other things.

In spirit, it is like the top command but specifically for Tokio.

https://github.com/tokio-rs/console

tjb answered Nov 11 '25 07:11


