Cannot commission nodes with duplicate UUIDs on MAAS 2.8

Running MaaS 2.8.1 (8567-g.c4825ca06) on Ubuntu 20.04 via snap

I have 4 Dell blades that I am trying to commission.

When there are no nodes in MaaS, PXE booting any node will commission it successfully.
When a single node has been commissioned, any other node PXE boots, but fails to commission.

When commissioning with shutdown disabled, I can connect to a node that failed to commission. Looking at its logs, there are no errors.

On the controller, there is no entry for that node under rsyslogs (the successful node has logs).

When I try to commission manually by running
maas-run-remote-scripts --no-download --no-send .
it executes successfully.

I have looked through the scripts, but can’t identify an issue.

EDIT: I am trying to commission with, and deploy, 18.04. I have tried other versions and have the same issue.

EDIT2: [There were other errors here which have been checked and do not apply] - The first machine to be commissioned (out of any I have) will always commission successfully.

Check how big the dynamic range in your DHCP pool is.
I saw this when I reserved only 4 dynamic IPs on a small public block. Nodes 5+ would fail their commissioning / PXE boot because they never got an IP from the DHCP pool.

Thanks for your response!

My DHCP pool is 150 addresses - and IPs are fine, I can ssh into each one during the commissioning process.

The issue seems to be related to UUIDs: it appears that Dell has set the same GUID on all my servers. Shouldn’t MaaS give an actual warning/error on this? I am still not certain, but I am trying to differentiate the UUIDs (not sure how yet).

Is this something that should be raised as a bug / feature request? Why not use the MAC address of the first NIC to determine a UUID - ensuring there are no duplicates but also that a unique UUID is generated?
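
To make the MAC-address idea concrete, here is a minimal sketch of deriving a deterministic UUID from the first NIC's MAC address. This is only an illustration of the suggestion, not how MAAS identifies machines. The function name `uuid_from_mac` and the choice of `uuid.NAMESPACE_OID` as the namespace are assumptions made for the example.

```python
import uuid

# Assumption: any fixed namespace UUID works, as long as it never changes.
NAMESPACE = uuid.NAMESPACE_OID


def uuid_from_mac(mac: str) -> uuid.UUID:
    """Derive a stable, name-based (version 5) UUID from a MAC address."""
    # Normalise case and separators so the same NIC always maps to the same UUID.
    normalized = mac.strip().lower().replace("-", ":")
    return uuid.uuid5(NAMESPACE, normalized)


if __name__ == "__main__":
    # Made-up MAC addresses, for illustration only.
    print(uuid_from_mac("AA:BB:CC:DD:EE:01"))
    print(uuid_from_mac("AA:BB:CC:DD:EE:02"))
```

Because the UUID is computed from the MAC, the same NIC always yields the same value across reboots, while two machines with different first NICs get different values.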

The lack of a warning/information has really ticked me off. I spent at least 12 hours over the weekend - to no avail (yet).

To anyone still following this thread:

I have modified hooks.py to convert the UUID to a number and add the hash of the chassis serial number. I am testing to see whether this solves the issue.
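
For anyone wanting to try something similar, the idea can be sketched roughly as below. This is not the actual change made to hooks.py (that code is not shown in this thread). The function name `disambiguate_uuid`, the use of SHA-256, and the wrap-around at 128 bits are all assumptions for illustration.

```python
import hashlib
import uuid


def disambiguate_uuid(hardware_uuid: str, chassis_serial: str) -> uuid.UUID:
    """Combine a (possibly duplicated) hardware UUID with a chassis serial."""
    # Treat the hardware-reported UUID as a 128-bit integer.
    base = uuid.UUID(hardware_uuid).int
    # Hash the serial and keep the first 16 bytes (128 bits) as an offset.
    digest = hashlib.sha256(chassis_serial.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:16], "big")
    # Add the two and wrap so the result still fits in 128 bits.
    return uuid.UUID(int=(base + offset) % (1 << 128))


if __name__ == "__main__":
    # Made-up UUID and serial numbers, for illustration only.
    shared = "4c4c4544-0000-1000-8000-000000000000"
    print(disambiguate_uuid(shared, "SERIAL-A"))
    print(disambiguate_uuid(shared, "SERIAL-B"))
```

Note that the result no longer carries meaningful version or variant bits, so it is only useful as an opaque unique identifier.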

Unfortunately, it seems the Service Tag cannot be changed, and the Hardware-reported UUID is based on the service tag. This has caused 8 nodes in my rack to report the same UUID.
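
If you want to confirm that you are hitting the same problem, you can collect the UUID each node reports (for example with `dmidecode -s system-uuid` on each machine) and check for repeats. The sketch below assumes you have already gathered the values into a dictionary by hand; the helper name `find_duplicate_uuids`, the hostnames, and the UUID values are made up for the example.

```python
from collections import defaultdict


def find_duplicate_uuids(reported: dict) -> dict:
    """Return {uuid: [hosts]} for every UUID reported by more than one host."""
    by_uuid = defaultdict(list)
    for host, value in reported.items():
        # Compare case-insensitively and ignore stray whitespace.
        by_uuid[value.strip().lower()].append(host)
    return {u: sorted(hosts) for u, hosts in by_uuid.items() if len(hosts) > 1}


if __name__ == "__main__":
    # Hypothetical data: hostname -> UUID as reported by that node.
    reported = {
        "blade1": "4C4C4544-0000-1000-8000-000000000000",
        "blade2": "4c4c4544-0000-1000-8000-000000000000",
        "blade3": "11111111-2222-3333-4444-555555555555",
    }
    for dup, hosts in find_duplicate_uuids(reported).items():
        print(dup, "is shared by", ", ".join(hosts))
```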

I am curious as to why MAAS, when performing the initial PXE boot to enlist, doesn’t flag this node as a duplicate, but takes it to the “New” state, ready for commissioning?

To anyone landing here from a likely search:

The follow-up threads are here (interim solution) and here (request for formal solution).

thanks, @knaledge, for summarizing that – very helpful!

You’re welcome @billwear! We’ve become increasingly aware of your presence on these forums so it’s cool to hear from you.

Looking forward to engaging in the outcome on this situation. While it presented a challenge to encounter, it’s equally fun and encouraging to see this situation (duplicate UUIDs) get some traction both from the community and the contributors.

Thanks for your efforts! :slight_smile:
