MAAS should ignore ephemeral network devices it doesn't find on deploy, or should clearly surface that a deployment failed because a network device is missing

Using MAAS snap 2.8.6-8602-g.07cdffcaa.

I just had a node fail to deploy because a network device that was
present during commissioning wasn’t present anymore, making cloud-init
sad. To be precise, the node deployed properly, rebooted, and during the
post-deploy boot, cloud-init got sad with:

RuntimeError: Not all expected physical devices present:
{'be:65:46:cb:58:b7'}

(full stacktrace at https://pastebin.canonical.com/p/9Ycxwk5rRy/)

I was indeed able to find the network device with MAC address
‘be:65:46:cb:58:b7’, and it’s an ephemeral NIC that gets created when
someone logs in to the HTML5 console (this is a Gigabyte server, by the
way). So someone was probably logged in to the HTML5 console when the node
was commissioned.

I deleted this ephemeral device from the node in MAAS, and was then able
to deploy it properly.

These ephemeral NICs appear to have random MAC addresses. I was logged
in to the HTML5 console during the boot logged above, and you can see
there’s a device named “enx5a099ca01d4b” with MAC address
“5a:09:9c:a0:1d:4b” (which doesn’t match a known OUI).
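
For what it's worth, both MAC addresses seen here have the locally administered bit set (the 0x02 bit of the first octet), which is consistent with them being generated rather than vendor-assigned. A minimal Python sketch of that check follows; the function name is made up for illustration, and this is not something MAAS or cloud-init is known to do:

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the MAC has the locally administered (U/L) bit set.

    Hypothetical helper for illustration only. A set U/L bit means the
    address was not assigned from a vendor OUI.
    """
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


# The two ephemeral MACs from this report.
for mac in ("be:65:46:cb:58:b7", "5a:09:9c:a0:1d:4b"):
    print(mac, is_locally_administered(mac))
```

A locally administered MAC is only a hint, since bonds, bridges and VMs use such addresses too, so it probably shouldn't be the sole criterion for ignoring a device.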

This is actually a cdc_ether device:
$ dmesg|grep cdc_ether
[ 29.867170] cdc_ether 1-1.3:2.0 usb0: register 'cdc_ether' at usb-0000:06:00.3-1.3, CDC Ethernet Device, 5a:09:9c:a0:1d:4b
[ 29.867296] usbcore: registered new interface driver cdc_ether
[ 29.958137] cdc_ether 1-1.3:2.0 enx5a099ca01d4b: renamed from usb0
[ 205.908811] cdc_ether 1-1.3:2.0 enx5a099ca01d4b: unregister 'cdc_ether' usb-0000:06:00.3-1.3, CDC Ethernet Device

(the last line is very probably from when I logged off the HTML5 console,
which removes the device).

So I think:

  • MAAS should ignore these devices by default
  • cloud-init shouldn’t die when a cdc_ether device is missing.
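
As a rough illustration of the first suggestion, here is a sketch of how such interfaces could be told apart by kernel driver through sysfs. The function names and the EPHEMERAL_DRIVERS set are hypothetical and not MAAS code; only cdc_ether comes from this report, and whether every cdc_ether device should be treated as ephemeral is an assumption:

```python
import os

# Assumption: drivers whose interfaces may vanish between commissioning
# and deployment. Only cdc_ether is taken from this report.
EPHEMERAL_DRIVERS = {"cdc_ether"}


def interface_driver(name, sysfs_net="/sys/class/net"):
    """Return the kernel driver bound to an interface, or None."""
    link = os.path.join(sysfs_net, name, "device", "driver")
    if not os.path.islink(link):
        # Virtual interfaces (lo, bridges, ...) have no device/driver link.
        return None
    return os.path.basename(os.path.realpath(link))


def split_interfaces(sysfs_net="/sys/class/net"):
    """Split interface names into (stable, ephemeral) lists by driver."""
    stable, ephemeral = [], []
    for name in sorted(os.listdir(sysfs_net)):
        if interface_driver(name, sysfs_net) in EPHEMERAL_DRIVERS:
            ephemeral.append(name)
        else:
            stable.append(name)
    return stable, ephemeral


if __name__ == "__main__" and os.path.isdir("/sys/class/net"):
    stable, ephemeral = split_interfaces()
    print("stable:", stable)
    print("ephemeral:", ephemeral)
```

Something along these lines at commissioning time would let the ephemeral NIC be skipped, or at least flagged, before it ends up in the network config handed to cloud-init.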

Thanks

fyi, sounds like LP: #1936972