How to manage multiple instances of a batch job using systemd?

Using systemd, I would like to manage multiple instances of a queue worker with the following properties:

  1. the number of queue workers should be configurable
  2. each queue worker should restart on failure
  3. run a single command to start/stop/restart all instances of the queue workers
  4. using a single command, monitor if all instances of the queue workers are running.

I was able to implement these features, but the solution feels heavy compared to alternatives such as supervisord. Is there a simpler way to manage multi-instance services using systemd alone?

flexponsive asked Dec 06 '25 17:12



1 Answer

Steps: Manage Multiple Instances of a Service with systemd

  1. Create a systemd template unit by adding an "@" to the unit's file name, then instantiate as many units from this template as needed.
  2. Use the Restart=on-failure setting in the [Service] section.
  3. Create a main service that does nothing, and tie the worker services to it using the PartOf, After and WantedBy dependencies.
  4. Query systemctl to see whether any of the worker processes are inactive; the status of the main service alone is not informative.

Set-Up: systemd Unit Files

Create these two files:

/etc/systemd/system/queue_worker@.service:

[Unit]
Description="Queue Worker instance %i"
PartOf=queue_main.service
After=queue_main.service

[Service]
# Pretend that the component is running
ExecStart=/bin/sleep infinity
Restart=on-failure

[Install]
WantedBy=queue_main.service

/etc/systemd/system/queue_main.service:

[Unit]
Description=Queue Main

[Service]
# execute a dummy program, and keep the service active after exit
Type=oneshot
ExecStart=/bin/true
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

Creating Multiple Service Instances using systemd templates

First we must decide on the number of instances we want to have and then "enable" these services through systemctl. This will create a symlink for each instance to the service template file:

# systemctl enable queue_worker\@{1..3}.service
Created symlink /etc/systemd/system/queue_main.service.wants/queue_worker@1.service → /etc/systemd/system/queue_worker@.service.
Created symlink /etc/systemd/system/queue_main.service.wants/queue_worker@2.service → /etc/systemd/system/queue_worker@.service.
Created symlink /etc/systemd/system/queue_main.service.wants/queue_worker@3.service → /etc/systemd/system/queue_worker@.service.
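
Because the number of workers should be configurable, the instance names can also be generated from a number instead of a literal brace range (the shell does not expand a variable inside {1..N}). The following is only a minimal sketch in POSIX shell; the helper name worker_units is hypothetical and not part of systemd:

```shell
# worker_units N: print queue_worker@1.service .. queue_worker@N.service,
# one unit name per line. Hypothetical helper, not part of systemd.
worker_units() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'queue_worker@%s.service\n' "$i"
        i=$((i + 1))
    done
}

# Possible usage (as root), e.g. for five workers:
#   systemctl enable $(worker_units 5)
```

The unquoted command substitution is intentional here: each generated name becomes a separate argument to systemctl.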

Starting and Stopping Multiple Service Instances with systemctl

Because of the way we defined the dependency of the queue worker services on the queue main service, starting the queue_main.service will cause systemd to start each worker service:

# systemctl start queue_main.service # launches all three worker instances successfully.
# systemctl status queue_main.service
● queue_main.service - Queue Main
     Loaded: loaded (/etc/systemd/system/queue_main.service; enabled; vendor preset: enabled)
     Active: active (exited) since Mon 2022-09-19 15:11:46 UTC; 2min 55s ago
    Process: 404801 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
   Main PID: 404801 (code=exited, status=0/SUCCESS)

Sep 19 15:11:46 dev systemd[1]: Starting Queue Main...
Sep 19 15:11:46 dev systemd[1]: Finished Queue Main.

# systemctl status
[...]
├─system-queue_worker.slice
│ ├─queue_worker@1.service
│ │ └─398812 /bin/sleep infinity
│ ├─queue_worker@2.service
│ │ └─398817 /bin/sleep infinity
│ └─queue_worker@3.service
│   └─398815 /bin/sleep infinity

As we see, starting queue_main.service successfully triggered the three worker services. However, the worker services are not listed as dependents of the main service, and I was not able to come up with a way of accomplishing this. Also, the main service has the peculiar activity status "active (exited)".

Stopping and restarting the worker services can also be straightforwardly achieved through the main service:

# systemctl stop queue_main # terminates all queue worker service instances
# systemctl restart queue_main # restarts all queue worker service instances

Monitoring multiple service instances using only systemctl

Unfortunately, the status of our queue_main.service is not informative about the status of the individual workers. To monitor the workers, we need to check their individual status. We can get output amenable to scripting in the following ways:

# kill -HUP 404818 # manually kill one worker to make the output more interesting (systemd treats SIGHUP as a clean exit, so Restart=on-failure does not restart it)

# systemctl list-units "queue_worker@*.service" --all --no-legend # show the status of all instances
queue_worker@1.service loaded active   running "Queue Worker instance 1"
queue_worker@2.service loaded inactive dead    "Queue Worker instance 2"
queue_worker@3.service loaded active   running "Queue Worker instance 3"

# systemctl list-units "queue_worker@*.service" --all --state=inactive --no-legend # show only inactive services
queue_worker@2.service loaded inactive dead "Queue Worker instance 2"
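
To turn this listing into a single pass/fail check, the output can be piped through a small filter. The following is a sketch under assumptions, not a definitive implementation: the function name inactive_workers is hypothetical, and it assumes the column layout shown above, with the ACTIVE state in the third field.

```shell
# inactive_workers: read `systemctl list-units --no-legend --plain` output on stdin
# and print the name of every unit whose ACTIVE column (3rd field) is not "active".
# Exit status: 0 if every listed unit is active, 1 otherwise.
# Hypothetical helper, not part of systemd.
inactive_workers() {
    awk '$3 != "active" { print $1; bad = 1 } END { exit bad + 0 }'
}

# Possible usage:
#   systemctl list-units 'queue_worker@*.service' --all --no-legend --plain | inactive_workers
```

The --plain option is added in the usage line because systemctl otherwise prefixes failed units with a bullet, which would shift the columns the filter relies on.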

This solution was tested on Ubuntu 20.04 with systemd 245.

flexponsive answered Dec 08 '25 23:12
