As described in "Cannot commision more than one node on MaaS 2.8", MAAS fails to commission a group of my servers because they all report the same UUID. This is because they all have the same service tag (for reasons unknown), and it seems the service tag cannot be modified.
I am trying to modify MAAS to change the UUID by adding a hash of the chassis serial (which is unique), in order to generate unique UUIDs.
In
metadataserver/builtin_scripts/hooks.py
I am modifying the node.hardware_uuid. Once the node has enlisted, I check the database:
maas_db=# select hardware_uuid from maasserver_node;
4c4c4544-004d-3410-94e1-77647ea92daf which is correct, and unique
And when enlisting a second node, it fails. regiond.log shows that the uuid is a duplicate, and the interface shows that commissioning failed.
I am unable to find where the original UUID is stored, or how machine-resources, which is downloaded to the node for commissioning, gets a UUID different from what is stored in the database.
Any help is greatly appreciated! I have been trying to debug this issue for the last 24h straight.
The UUID of a machine comes from the mainboard and should always be unique. UUID support was added to MAAS because some machines, such as IBM Z series mainframes, don’t have consistent MAC addresses. When booting, IBM S390X and BIOS x86 systems identify themselves using the UUID first. On BIOS x86, if the server (MAAS) returns a 404, the system retries with the MAC address it booted with.
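To make the lookup order concrete, here is a minimal sketch of "UUID first, then MAC on a miss". It is not MAAS code: the dictionaries, the `identify` function, and the MAC addresses are hypothetical stand-ins, and a `None` result plays the role of the 404.

```python
from typing import Optional

# Hypothetical in-memory stand-ins for the region's node records.
NODES_BY_UUID = {"4c4c4544-004d-3410-94e1-77647ea92daf": "node-a"}
NODES_BY_MAC = {
    "52:54:00:aa:bb:01": "node-a",
    "52:54:00:aa:bb:02": "node-b",
}


def identify(uuid: Optional[str], mac: str) -> Optional[str]:
    """Try the UUID first; on a miss (the "404"), retry with the boot MAC."""
    if uuid:
        node = NODES_BY_UUID.get(uuid)
        if node is not None:
            return node
    return NODES_BY_MAC.get(mac)


# A second machine that reports the same UUID is matched by UUID first.
same_uuid = identify("4c4c4544-004d-3410-94e1-77647ea92daf", "52:54:00:aa:bb:02")
# With no UUID on record or reported, the MAC address decides.
mac_only = identify(None, "52:54:00:aa:bb:02")
```

Under this simplified model, a machine that reports a duplicated UUID is matched to whichever record already holds that UUID, while a machine with no UUID is matched by its MAC address alone.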
Starting with MAAS 2.7 we gather most hardware information from the commissioning script 50-maas-01-commissioning. This script downloads a binary, runs it, and captures the output. The binary itself is actually just the /resources and / endpoints from LXD. lshw is still run as you can configure MAAS to automatically create tags based on lshw output.
I would consider this a bug in your system’s firmware and report it to your manufacturer. I understand that may take a while to get fixed, so you’re looking for a workaround to get MAAS working now. MAAS still supports booting by MAC address, as it’s still the primary method for UEFI and other firmware, so it can operate without using the UUID if your platform supports it.
Neither of these options would be officially supported but either should fix your problem.
The easiest fix is to modify the MAAS source code so that it simply does not store the hardware_uuid. If you’re using the Debian packages, keep in mind that upgrading will remove this fix. If you’re using the Snap, you’ll have to build the Snap yourself.
diff --git a/src/metadataserver/builtin_scripts/hooks.py b/src/metadataserver/builtin_scripts/hooks.py
index 26c7e1137..fe219081c 100644
--- a/src/metadataserver/builtin_scripts/hooks.py
+++ b/src/metadataserver/builtin_scripts/hooks.py
@@ -415,7 +415,11 @@ def _process_system_information(node, system_data):
uuid = system_data.get("uuid")
# Convert "" to None, so that the unique check isn't triggered.
- node.hardware_uuid = None if uuid == "" else uuid
+ # node.hardware_uuid = None if uuid == "" else uuid
+ # Disable gathering the UUID as this environment contains non-unique
+ # hardware UUID's. Systems will be identified during boot using their
+ # MAC address.
+ node.hardware_uuid = None
# Gather system information. Custom built machines and some Supermicro
# servers do not provide this information.
You could also modify this in the database. Keep in mind that you’ll have to do this after enlisting/commissioning every machine, and machines will have to be added one at a time.
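As a rough illustration of the database route, the sketch below uses an in-memory SQLite table as a stand-in for the real PostgreSQL database. The table and column names `maasserver_node` and `hardware_uuid` come from the query earlier in this thread; the `hostname` column, the simplified schema, and the helper functions are assumptions made for the example. It shows why a second machine with the same UUID is rejected, and why clearing the value to NULL after each machine lets the next one in.

```python
import sqlite3

# In-memory stand-in for the MAAS database (the real one is PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE maasserver_node ("
    "id INTEGER PRIMARY KEY, hostname TEXT, hardware_uuid TEXT UNIQUE)"
)

DUPLICATE = "4c4c4544-004d-3410-94e1-77647ea92daf"


def enlist(hostname, hardware_uuid):
    """Insert a node row, as enlistment would (greatly simplified)."""
    conn.execute(
        "INSERT INTO maasserver_node (hostname, hardware_uuid) VALUES (?, ?)",
        (hostname, hardware_uuid),
    )


def clear_uuid(hostname):
    """Null out the stored UUID; NULLs do not collide under UNIQUE."""
    conn.execute(
        "UPDATE maasserver_node SET hardware_uuid = NULL WHERE hostname = ?",
        (hostname,),
    )


enlist("node-a", DUPLICATE)
try:
    enlist("node-b", DUPLICATE)  # same UUID while node-a still holds it
    second_failed = False
except sqlite3.IntegrityError:
    second_failed = True

clear_uuid("node-a")          # the manual step after each machine
enlist("node-b", DUPLICATE)   # now succeeds
clear_uuid("node-b")

rows = conn.execute(
    "SELECT hostname, hardware_uuid FROM maasserver_node ORDER BY id"
).fetchall()
```

This is also why machines have to be added one at a time: two machines enlisting in parallel would both try to store the same UUID before either had been cleared.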
These servers are not under a service contract (I bought them for training). Dell firmware generates a UUID based on the Service Tag, and the Service Tags are all the same, so the UUIDs are all the same. I tried to modify the Service Tag, but to no avail - it can’t be modified.
The motherboard serial was unique, so I added it to the reported UUID to generate a unique UUID for use.
I went with: (excuse the verbose logging)
@@ -32,7 +34,7 @@ from provisioningserver.refresh.node_info_scripts import (
from provisioningserver.utils import kernel_to_debian_architecture
from provisioningserver.utils.ipaddr import parse_ip_addr
from provisioningserver.utils.lxd import parse_lxd_cpuinfo
-
+from uuid import UUID
logger = logging.getLogger(__name__)
@@ -412,10 +414,35 @@ def _process_system_information(node, system_data):
)
else:
NodeMetadata.objects.filter(node=node, key=key).delete()
+ print("[UUID] Adding Property {} = {}".format(key, value))
- uuid = system_data.get("uuid")
+ read_uuid = system_data.get("uuid")
# Convert "" to None, so that the unique check isn't triggered.
- node.hardware_uuid = None if uuid == "" else uuid
+
+ print("[UUID] Read UUID: {}".format(read_uuid))
+ serialNumber = system_data.get('motherboard').get('serial')
+ print("[UUID] Read Motherboard Serial: {}".format(serialNumber))
+ serialHash = hash(serialNumber)
+ print("[UUID] Hashed Serial: {}".format(serialHash))
+
+ if read_uuid is None or len(read_uuid) < 10:
+ print("[UUID] Read UUID was invalid")
+ print("[UUID] Generating from serial only")
+ uuid = UUID(int=int(int(serialHash)))
+ else:
+ print("[UUID] Generating composite UUID from hardware and serial")
+ uuid_num = UUID(read_uuid).int
+ print("[UUID] hardware uuid integer: {}".format(uuid_num))
+ uuid_comp_num = UUID(int = int(uuid_num + int(serialHash)))
+ print("[UUID] Composite UUID {}".format(uuid_comp_num))
+ #UUID is now derived from HardwareUUID + chassis serial
+ #Convert to string
+ uuid = uuid_comp_num
+ #In any case, stored uuid is uuid
+ print("[UUID] Final UUID: {}".format(uuid))
+ node.hardware_uuid = uuid
+ print("============== UUID END =============================")
+
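One caveat for anyone reusing this approach: Python's built-in `hash()` on a string is salted per process unless `PYTHONHASHSEED` is fixed, and it can return a negative number, which `UUID(int=...)` rejects. A possible alternative, sketched below, derives the value from a SHA-256 digest so that the same inputs always give the same UUID. The function name `composite_uuid` and the example serials are hypothetical, and this is not what the patch above does.

```python
import hashlib
from uuid import UUID


def composite_uuid(read_uuid, serial):
    """Derive a stable UUID string from the reported UUID and board serial."""
    # Treat a missing UUID or serial as an empty string.
    material = "{}|{}".format(read_uuid or "", serial or "").encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # Use the first 16 bytes (128 bits) of the digest as the UUID value.
    return str(UUID(bytes=digest[:16]))


shared = "4c4c4544-004d-3410-94e1-77647ea92daf"
first = composite_uuid(shared, "SERIAL-0001")   # hypothetical serials
second = composite_uuid(shared, "SERIAL-0002")
```

Returning a string rather than a `UUID` object keeps the stored value in the same form as the original `system_data.get("uuid")` result.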
I’m glad that’s over - it took 24h to figure out the code and flow without a debugger!