In slide 25 of this talk by Twitter's Head of Open Source office, the presenter says that Mesos allows one to track and manage even GPU (I assume he meant GPGPU) resources. But I cant find any information on this anywhere else. Can someone please help? Besides Mesos, are there other cluster managers that support GPGPU?
Mesos does not yet provide direct support for (GP)GPUs, but does support custom resource types. If you specify --resources="gpu(*):8" when starting the mesos-slave, then this will become part of the resource offer to frameworks, which can launch tasks that claim to use these resources. Once some of the gpu resources are in use by a task, only the remaining resources will be offered again, until that task completes and the gpu resources become available again. In this way, the Mesos resource allocator can actually schedule the gpu resources you declared, and ensure that only the amount declared are offered/allocated to frameworks.
Mesos does not yet have support for gpu isolation, but with "pluggable isolator modules", you could build your own gpu isolator to enforce gpu resource limits.
Alternately, if you don't want to allocate individual gpu resources, but only want to declare some nodes as having gpus while others do not, you can just use --attributes="hasGpu:true" or something similar to differentiate the nodes that do/do not have gpus. This information is also passed onto the frameworks in resource offers, but these attributes cannot be "consumed" by a running task, so they will always be offered for that node.
For more information, see https://mesos.apache.org/documentation/attributes-resources/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With