KVM node only pinning KVM guests to a single NUMA node


Is this normal?

It seems like every VM is using a single CPU/NUMA node.

@ryan1336, that might be intended behavior; let me look at the spec for a minute…

@ryan1336, I’m not sure what the current behavior is/intended to be, but I know that our goal is to help users deploy high-perf workloads on KVM hosts, so that means aligning the VMs to a single NUMA node. If it’s not normal now, it’s going to be eventually, for performance reasons. Does that help?
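For context, pinning a guest to one NUMA node with libvirt looks roughly like this (a sketch only; `my-guest` is a placeholder domain name, and the node/CPU numbers depend on the host's topology):

```shell
# Restrict the guest's memory allocation to NUMA node 0
# (--live applies immediately, --config persists across restarts):
virsh numatune my-guest --nodeset 0 --live --config

# Pin vCPU 0 of the guest to host CPUs 0-7 (the CPUs of node 0):
virsh vcpupin my-guest 0 0-7 --live --config
```

These commands require a running libvirt host, so treat them as illustration of what "aligning a VM to a NUMA node" means rather than something MAAS runs verbatim.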

It makes sense to pin them to a single NUMA node, but at the moment it seems to be using only the first NUMA node and isn't assigning any VMs to the other.

I would expect some VMs to be pinned to the other node, so that both CPUs can be used effectively.

Nice catch @ryan1336

I noticed this as well.

I also noticed when deploying with Juju that MAAS has no round-robin strategy for KVM machine placement. It will exhaust all resources on a single host before composing elsewhere, instead of composing and placing machines evenly across all registered KVM hosts.

The problem here is that when deploying a fault-tolerant architecture, the majority (or all) of the infrastructure ends up in the same failure domain.

hmmm, something to think about…

Yes, effectively right now my dual-CPU machine is only half-utilised. The other NUMA node seemingly never gets used, which completely wastes resources and means other KVM nodes have to be used to place VMs, because MAAS won't pin anything to the second NUMA node.

Okay, I'm glad to confirm this is actually not the case.
It bothered me so much that I ran an experiment in the lab.
I have a physical machine configured as a KVM pod with 2x Xeon 5630s.
NUMA node 0 = CPUs 0-7, NUMA node 1 = CPUs 8-15
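If you want to double-check that mapping on your own host, Linux exposes the topology under sysfs (a quick sketch; `node*` directories only appear where the kernel reports NUMA information):

```shell
# List the CPU ranges belonging to each NUMA node on the host.
# On the machine above this should print something like
# "node0: 0-7" and "node1: 8-15".
for n in /sys/devices/system/node/node*; do
    echo "$(basename "$n"): $(cat "$n/cpulist")"
done
```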

I created a bunch of different VMs, and from this output we can see that juju-home-lab-0 is a 2-core machine using CPUs 12 and 4, frank-quagga is a single-core machine on CPU 15, and mighty-roughy is consuming CPU 6.
So we're definitely using both NUMA nodes in the physical brick.

root@witty-boa:~# for i in $(virsh list --all | tail +3| awk '{print $2}' ); do echo $i; virsh vcpuinfo $i ; done

VCPU:           0
CPU:            12
State:          running
CPU time:       1104.6s
CPU Affinity:   yyyyyyyyyyyyyyyy

VCPU:           1
CPU:            4
State:          running
CPU time:       1063.0s
CPU Affinity:   yyyyyyyyyyyyyyyy

VCPU:           0
CPU:            15
State:          running
CPU time:       781.8s
CPU Affinity:   yyyyyyyyyyyyyyyy

VCPU:           0
CPU:            6
State:          running
CPU time:       781.9s
CPU Affinity:   yyyyyyyyyyyyyyyy
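To make the conclusion explicit, the observed CPU numbers can be mapped back to NUMA nodes using the topology stated above (CPUs 0-7 = node 0, CPUs 8-15 = node 1); this is just a sketch over the four CPUs seen in the output:

```shell
# Map each observed "CPU: N" value from the virsh output to its NUMA node,
# assuming the 0-7 / 8-15 split described for this host.
printf 'CPU: 12\nCPU: 4\nCPU: 15\nCPU: 6\n' |
awk '/^CPU:/ { node = ($2 < 8) ? 0 : 1; print "CPU " $2 " -> NUMA node " node }'
```

This prints two CPUs on node 0 and two on node 1, i.e. the guests really are spread across both nodes.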

Nice work. I was skeptical, based on the dev conversations we had while this was being coded, but I didn't write it myself (on this team, I'm the tech writer), so I hadn't had time to check or test it yet. Thanks for this!