I would also like to add that it is only enlistment that is failing, while commissioning and deployment work.
One more thing I observed that differs when enlistment fails: the hostname defaults to ubuntu instead of showing as maas-enlisting-node.
Enlistment also works fine when I enable DHCP on MAAS; it fails only when I use the external DHCP server.
I would like to debug why enlistment with external DHCP suddenly stopped working.
Hi @r00ta, original poster from the thread you were referring to here: that looks like the same problem I’ve been facing too. I wouldn’t say there was a solution provided in the other thread really as the service is still broken for me and for the moment I’ve stopped working on it (and may very well need to switch to something other than MAAS). If I find a solution though I’ll post back here.
Unfortunately the solution has to be specific to your env as the setup with an external DHCP server is not something we can generalize in this thread without any additional information.
The post I linked contains some info to enable you to start investigating the issue.
If you get back to working on it and want to share some details with this community, we are willing to listen and hopefully help.
The behavior looks the same: I see no corresponding access-log entries after the point at which the node hangs waiting for the metadata crawler and then errors out.
One difference in my scenario is that the external DHCP server I had configured was working perfectly fine until something changed and it stopped working.
As part of the debugging process, I repeated the steps I originally followed to configure the external DHCP server:
Copy the exact dhcpd.conf that MAAS uses when DHCP is enabled in MAAS to an external DHCP server and try enlisting. When I tried this about a year ago, it worked perfectly: MAAS was able to enlist and recognize the server.
I then removed options from that dhcpd.conf one at a time until I found the bare minimum needed. This configuration worked for many months, until it broke a few days ago.
I suspect it could be related to the instance-id metadata, as suggested by this statement:
“One possibility could be that your DHCP server is not correctly set up to provide the necessary boot options for MAAS. For MAAS to work with an external DHCP server, the DHCP server needs to be configured with specific boot options that tell the machines where to get their PXE boot images and metadata. The MAAS documentation provides more information about PXE and DHCP for MAAS.”
May I ask which boot options these are, whether they are included in the dhcpd.conf that MAAS uses, and whether MAAS dynamically adds anything to its dhcpd.conf or leases file that the external DHCP server might be missing when I make a copy?
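One way to check what a trimmed-down copy has lost is to diff the statements in the two files. The following is only a rough sketch, not a real dhcpd.conf parser: it assumes comments start with `#`, splits on `;`, `{` and `}`, and ignores nesting, so a statement counts as present no matter which block it sits in. The function names are made up for illustration.

```python
import re


def statements(conf_text):
    """Return the set of whitespace-normalized statements in a dhcpd.conf text.

    Rough approximation: drops '#' comments (even inside quoted strings)
    and splits on ';', '{' and '}' without tracking block nesting.
    """
    stripped = " ".join(line.split("#", 1)[0] for line in conf_text.splitlines())
    parts = re.split(r"[;{}]", stripped)
    return {" ".join(part.split()) for part in parts if part.strip()}


def missing_statements(maas_conf, external_conf):
    """List statements found in the MAAS-generated config but not in the external one."""
    return sorted(statements(maas_conf) - statements(external_conf))
```

Calling `missing_statements` with the contents of the MAAS-generated file and the external server's file lists every `option`, `next-server`, `filename` or conditional line that was dropped along the way. Because nesting is ignored, an entry that was moved into the wrong block will not show up, so the result is a starting point rather than proof that the two files are equivalent.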
I can see that the client is able to get the PXE images, but I cannot confirm whether it is able to get the metadata, or which DHCP boot options would help with that.
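To narrow down where the metadata request goes, it may help to read the kernel command line of the booted ephemeral image (`/proc/cmdline`) and extract the URL that cloud-init is told to fetch. The sketch below assumes that URL is passed as a `cloud-config-url=` kernel parameter; that key name is an assumption and should be checked against what the node actually shows. The helper names are made up.

```python
import shlex
from urllib.parse import urlsplit


def kernel_params(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def metadata_endpoint(cmdline, key="cloud-config-url"):
    """Return (host, port, path) of the metadata URL, or None if the key is absent.

    The default key name is an assumption, not something confirmed in this thread.
    """
    url = kernel_params(cmdline).get(key)
    if not url:
        return None
    parts = urlsplit(url)
    return parts.hostname, parts.port, parts.path
```

If the host and port returned here are not reachable from the subnet served by the external DHCP server, or if the parameter is missing altogether, that would point at the PXE configuration or routing rather than at a specific DHCP option.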
Also, I see that the regiond logs are not in the same time zone as the rackd controller's. I did configure NTP and set MAAS to use only that NTP server, but will that not make regiond use the same NTP server? I am wondering whether metadata retrieval is somehow failing because of this.
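On the time zone question: NTP synchronizes the underlying clock and does not set a host's time zone, so two logs can show different local times while the clocks agree. A quick way to tell a display difference from real skew is to compare two log timestamps for the same event once both are expressed with an explicit UTC offset. This is a minimal sketch; the function name and the sample timestamps in the usage note are made up.

```python
from datetime import datetime


def clock_skew_seconds(stamp_a, stamp_b):
    """Return stamp_b minus stamp_a in seconds, for ISO 8601 stamps with UTC offsets."""
    a = datetime.fromisoformat(stamp_a)
    b = datetime.fromisoformat(stamp_b)
    if a.tzinfo is None or b.tzinfo is None:
        raise ValueError("both timestamps need an explicit UTC offset")
    # Subtracting two offset-aware datetimes compares the actual instants.
    return (b - a).total_seconds()
```

For example, `clock_skew_seconds("2024-01-10T12:00:00+00:00", "2024-01-10T17:30:02+05:30")` compares a UTC timestamp with one written at a +05:30 offset. A result close to zero suggests the differing time zones are only a matter of how each service formats its logs, while a result of minutes or hours would indicate that the clocks themselves disagree.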