Any plans for GPU in KVM?

jonathanhatch · 16 June 2020 21:01

I’m getting more and more requests from users for access to the GPUs in our bare-metal racks (Tesla V100s, to be precise) to run workloads. Are there any plans to support GPUs as a resource we can add to a KVM host? Are there any good tutorials for making GPUs available as a resource?

Thank you!

seffyroff · 16 June 2020 22:31

I believe if you orchestrate with Juju/LXD you can, without too much pain, spin up GPU workload-ready containers/metal/k8s.

ltrager · 18 June 2020 20:08

MAAS 2.8 includes support for LXD VM hosts. We plan to eventually support GPUs but do not currently have an ETA.

In the meantime you can manually add GPUs or other resources to a VM MAAS creates.

jonathanhatch · 18 June 2020 20:51

Is there documentation for manually adding GPU or other resources to VMs that MAAS creates? Can you point me in that direction?

billwear · 19 June 2020 16:20

@jonathanhatch, we don’t have any in the MAAS doc yet, but I’m going to go look for KVM/LXD doc on that and see what I can find. If I find it, I’ll let you know and also add it as a reference to the doc.

jonathanhatch · 20 June 2020 01:44

Thank you, Bill. I really appreciate that!

sabdfl · 20 June 2020 10:45

@jonathanhatch the underlying VM provider in MAAS is moving to LXD, which can certainly do PCIe passthrough to containers and, if it doesn’t already, will soon do the same for VMs. All the KVM hosts in a single MAAS rack will become a LXD cluster. I suggest you dig into LXD, and then you’ll be ahead of the curve.

stgraber · 29 June 2020 03:06

lxc config device add NAME-OF-INSTANCE gpu1 gpu pci=ADDR

This was introduced in LXD 4.2 but is being backported to 4.0.x too (it is in 4.0.2, which will hit stable tomorrow).

Note that instead of pci=ADDR you can also select based on productid/vendorid if that’s more convenient.
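For reference, a rough sketch of the vendor/product variant. The `vendorid` and `productid` keys are options of the LXD `gpu` device type, and `10de` is NVIDIA's PCI vendor ID. `PRODUCT-ID` is a placeholder you would read from `lspci -nn` on the host, and the helper name `gpu_add_by_id` is made up for illustration. It only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# On the host, list NVIDIA devices with their [vendor:product] IDs:
#   lspci -nn -d 10de:

# Hypothetical helper: print (not run) the lxc command that attaches
# a GPU to an instance by vendor/product ID instead of PCI address.
gpu_add_by_id() {
    instance=$1; device=$2; vendor=$3; product=$4
    echo "lxc config device add $instance $device gpu vendorid=$vendor productid=$product"
}

# NAME-OF-INSTANCE and PRODUCT-ID are placeholders, not real values.
gpu_add_by_id NAME-OF-INSTANCE gpu1 10de PRODUCT-ID
```

On a host with several identical cards, a vendor/product match may select more than one of them, so `pci=ADDR` is probably the better fit when each VM should get exactly one GPU.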

I have such a VM here that’s managed by MAAS:

root@athos:~# lxc config show maas-vm12
architecture: x86_64
config:
  volatile.eth0.hwaddr: 00:16:3e:73:1f:58
  volatile.eth1.hwaddr: 00:16:3e:73:e3:7a
  volatile.last_state.power: STOPPED
  volatile.vm.uuid: 63f91205-2ee8-47ac-9406-d821210c6f7c
devices:
  gpu1:
    pci: "0000:04:00.0"
    type: gpu
  gpu2:
    pci: "0000:82:00.0"
    type: gpu
ephemeral: false
profiles:
- maas
stateful: false
description: ""
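
To hand out one GPU per VM instead of two to a single VM as above, the same command can be repeated per instance. The following is a minimal dry-run sketch under assumptions: `plan_gpu_assignments` is a hypothetical helper, `maas-vm13` is an invented instance name, and the PCI addresses are simply reused from the config shown above. It prints the commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical helper: read "VM-NAME PCI-ADDRESS" pairs on stdin and
# print one lxc command per pair (one GPU per VM). Nothing is executed.
plan_gpu_assignments() {
    while read -r vm addr; do
        # Skip blank lines in the input
        [ -n "$vm" ] || continue
        echo "lxc config device add $vm gpu1 gpu pci=$addr"
    done
}

# Example mapping; maas-vm13 is a made-up name for illustration.
plan_gpu_assignments <<'EOF'
maas-vm12 0000:04:00.0
maas-vm13 0000:82:00.0
EOF
```

Once the printed commands look right, they can be piped to `sh`. The VM in the example above was in the STOPPED state, which is probably the safest time to add devices.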

db0west · 30 August 2020 05:47

Did you see this? Automatically configure pass through for Tesla GPUs with MAAS. https://ubuntu.com/blog/hardware-discovery-and-kernel-auto-configuration-in-maas

jonathanhatch · 30 August 2020 16:06

OMG. This is exactly what I needed. I’m being asked to turn a Supermicro 4124GS into multiple 1x GPU virtual machines. I think this is the answer. Thank you, db0west, so much!

db0west · 30 August 2020 17:20

That looks like a sweet piece of kit! Good luck!
