Simple multi-GPU inference with huggingface

I have two GPUs.

How can I use them for inference with a huggingface pipeline?

The huggingface documentation seems to say that we can easily use the DataParallel class with a huggingface model, but I haven't seen any example.

For example, with PyTorch it's very easy to just do the following:

import torch  # assuming model and input_var are defined elsewhere

net = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
output = net(input_var)  # input_var can be on any device, including CPU

Is there an equivalent with huggingface?
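
Concretely, something like this is what I have in mind. This is only a minimal sketch of the idea, not something the docs confirm: the checkpoint name, the two-GPU device list, and the return_dict=False trick are all illustrative assumptions.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

net = torch.nn.DataParallel(model, device_ids=[0, 1])
net.to("cuda:0")  # DataParallel keeps the master copy on the first device

inputs = tokenizer(["a test sentence", "another one"],
                   return_tensors="pt", padding=True)
with torch.no_grad():
    # return_dict=False makes the model return a plain tuple of tensors,
    # which DataParallel's gather step handles cleanly
    logits = net(**inputs, return_dict=False)[0]
print(logits.shape)  # the batch was split across both GPUs, gathered on cuda:0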

asked Oct 14 '25 by Charbel-Raphaël Segerie

1 Answer

I found it's not possible with the pipelines, so there are two ways:

  • Use the Trainer object from huggingface, which also supports inference, but it's not optimal (see the first sketch below).
  • Use Queues from the multiprocessing standard library, but this creates a lot of boilerplate code (see the second sketch below).
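
For the Trainer route, here is a hedged sketch. Trainer wraps the model in DataParallel by itself when it detects more than one GPU, so predict() splits the eval batches across both cards. The checkpoint and the toy dataset are illustrative assumptions, not part of the original answer:

import numpy as np
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

ds = Dataset.from_dict({"text": ["first example", "second example"]})
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                 padding="max_length", max_length=32))
ds = ds.remove_columns(["text"])  # the collator only wants tensorizable columns

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_eval_batch_size=8),
)
preds = trainer.predict(ds)  # forward passes run under DataParallel
print(np.argmax(preds.predictions, axis=-1))

It's "not optimal" in the sense that you drag in the whole training machinery (and DataParallel's single-process overhead) just to run forward passes.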
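
For the multiprocessing route, a minimal sketch of the boilerplate: one worker process per GPU, each owning its own pipeline, fed from a shared task queue. The task name, sentinel convention, and queue layout are my own illustrative choices:

import multiprocessing as mp
from transformers import pipeline

def worker(gpu_id, task_queue, result_queue):
    # each process owns one GPU and one pipeline instance
    pipe = pipeline("sentiment-analysis", device=gpu_id)
    while True:
        text = task_queue.get()
        if text is None:  # sentinel: shut this worker down
            break
        result_queue.put(pipe(text))

if __name__ == "__main__":
    mp.set_start_method("spawn")  # safest start method when CUDA is involved
    task_queue, result_queue = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=worker, args=(gpu, task_queue, result_queue))
               for gpu in (0, 1)]
    for w in workers:
        w.start()

    texts = ["I love this", "I hate this", "it is fine", "not great"]
    for text in texts:
        task_queue.put(text)
    for _ in workers:
        task_queue.put(None)  # one sentinel per worker

    results = [result_queue.get() for _ in range(len(texts))]  # completion order
    for w in workers:
        w.join()
    print(results)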
answered Oct 17 '25 by Charbel-Raphaël Segerie


