For the first dozen HPE DL360/BL460 servers I installed with MAAS (2.9, a few months back), I only needed to configure the machine to PXE boot and enable IPMI in the iLO; MAAS would then automatically add a user account to the iLO, and commissioning would work. At some point this seems to have stopped working, but it’s hard to tell when. In the meantime I made several changes to the subnet configuration in MAAS, I sometimes tried to commission a host without having enabled IPMI in the iLO, and at some point I removed hosts and re-discovered them after memory and disk changes, since that seemed the easiest way to get MAAS to detect the new parameters. Throughout all of this, MAAS kept working for deploying and releasing machines.
But now I have a bunch of new machines. After booting them once manually, they show up in the “Commissioning” state with a generated name, but once the Ubuntu login prompt (maas-enlisting-node) appears, the system shuts down again while MAAS still shows the status “Commissioning”. There is no maas user in the iLO, and no BMC parameters are configured in MAAS for the new machines in the “Commissioning” state. There is only one event for the host (Node changed status - From ‘New’ to ‘Commissioning’), and all items in the commissioning tab are in the “pending” state. The snap logs command shows only the DHCP leases given to the hosts. (PS: I upgraded to MAAS 3.0 from the snap and tried Ubuntu 18.04 instead of 20.04, but the result didn’t change.)
A second (also manually initiated) boot does something very similar: the host boots, this time showing the MAAS-assigned hostname, but shortly afterwards it shuts down again. The state in MAAS moves from “Commissioning” to “Commissioning - loading ephemeral”, but it never reaches the “Ready” state; the host boots and shuts down within minutes. The logs scrolling by seem to show packages being downloaded, cloud-init running, the network config being applied, etc.
I have a video of the output, in case it helps pin down the problem: https://youtu.be/5igJpJhi5B0
The only way to get the hosts to “Ready” is to enter an IPMI configuration in MAAS manually (and, of course, create a matching user in the iLO). Then I can use “Abort” in MAAS, hit “Commission” again, and everything works: all the details of the host are discovered, and I can continue with deploying an operating system.
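For anyone who wants to script that workaround rather than click through the UI, here is a rough Python sketch of the equivalent maas CLI calls. It only builds and prints the commands, it does not run them. The profile name, system ID, BMC address and credentials are made-up placeholders, and the power_parameters_* key names are my understanding of the CLI convention for the ipmi power type, so please check them against `maas $PROFILE machine update --help` on your own version first.

```python
import shlex


def ipmi_workaround_commands(profile, system_id, bmc_address, bmc_user, bmc_password):
    """Build the maas CLI calls for the manual workaround, as argument lists.

    Nothing is executed here; the caller decides whether to run the commands.
    The power_parameters_* key names are an assumption, verify them locally.
    """
    base = ["maas", profile, "machine"]
    return [
        # 1. Store the IPMI settings that match the user created by hand in the iLO.
        base + [
            "update", system_id,
            "power_type=ipmi",
            f"power_parameters_power_address={bmc_address}",
            f"power_parameters_power_user={bmc_user}",
            f"power_parameters_power_pass={bmc_password}",
        ],
        # 2. Abort the commissioning run that is stuck.
        base + ["abort", system_id],
        # 3. Start commissioning again, now with working power control.
        base + ["commission", system_id],
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    for cmd in ipmi_workaround_commands("admin", "abc123", "10.0.0.5", "maas", "secret"):
        print(shlex.join(cmd))
```

The commands are kept as argument lists so they can be passed straight to subprocess.run once the key names have been verified, without any shell quoting problems around the password.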
But how do I fix the automatic commissioning?