Enlistment times out, fails with external DHCP server

To provide a little more detail: when I'm using an external DHCP server, the client hangs during cloud-init before enlistment happens. I see no new data showing up in /var/snap/maas/common/log/rsyslog/maas-enlisting-node/$ip_addr like I do when MAAS DHCP management is enabled.
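For reference, here's how I'm checking for that activity (see the commands below); the per-IP subdirectory layout under maas-enlisting-node is just what I see on my snap install, so adjust the paths if yours differs:

# Watch for new per-client enlistment log directories appearing
sudo watch -n 5 'ls -lR /var/snap/maas/common/log/rsyslog/maas-enlisting-node/'

# Once a client's directory shows up, follow its syslog stream
sudo tail -F /var/snap/maas/common/log/rsyslog/maas-enlisting-node/$ip_addr/*/messages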

Comparing a client on the MAAS-managed DHCP subnet against one on the externally managed subnet, I see the following in /var/snap/maas/common/log/httpd/access.log when it's working:

10.2.191.255 - - [16/May/2023:10:10:59 -0700] "GET /ipxe.cfg HTTP/1.1" 200 238 "-" "iPXE/1.20.1+ (g4bd0)"
10.2.191.255 - - [16/May/2023:10:10:59 -0700] "GET /ipxe.cfg-3a%3A13%3A7b%3A4d%3A1a%3Acb HTTP/1.1" 404 5 "-" "iPXE/1.20.1+ (g4bd0)"
10.2.191.255 - - [16/May/2023:10:10:59 -0700] "GET /ipxe.cfg-default-amd64 HTTP/1.1" 200 593 "-" "iPXE/1.20.1+ (g4bd0)"
10.2.191.255 - - [16/May/2023:10:10:59 -0700] "GET /images/ubuntu/amd64/ga-22.04/jammy/stable/boot-kernel HTTP/1.1" 200 11570216 "-" "iPXE/1.20.1+ (g4bd0)"
10.2.191.255 - - [16/May/2023:10:11:00 -0700] "GET /images/ubuntu/amd64/ga-22.04/jammy/stable/boot-initrd HTTP/1.1" 200 116315156 "-" "iPXE/1.20.1+ (g4bd0)"
10.2.191.255 - - [16/May/2023:10:11:15 -0700] "GET /images/ubuntu/amd64/ga-22.04/jammy/stable/squashfs HTTP/1.1" 200 464191488 "-" "Wget"
111.111.8.42 - - [16/May/2023:10:11:28 -0700] "GET /MAAS/rpc/ HTTP/1.1" 200 297 "-" "provisioningserver.rpc.clusterservice.ClusterClientService"
111.111.8.42 - - [16/May/2023:10:11:29 -0700] "GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed HTTP/1.1" 200 1333 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
10.2.191.255 - - [16/May/2023:10:11:29 -0700] "GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed HTTP/1.1" 200 1333 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
111.111.8.42 - - [16/May/2023:10:11:29 -0700] "GET /MAAS/metadata/2012-03-01/meta-data/instance-id HTTP/1.1" 200 17 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
10.2.191.255 - - [16/May/2023:10:11:29 -0700] "GET /MAAS/metadata/2012-03-01/meta-data/instance-id HTTP/1.1" 200 17 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
111.111.8.42 - - [16/May/2023:10:11:29 -0700] "GET /MAAS/metadata/2012-03-01/meta-data/instance-id HTTP/1.1" 200 17 "-" "python-requests/2.25.1"
10.2.191.255 - - [16/May/2023:10:11:29 -0700] "GET /MAAS/metadata/2012-03-01/meta-data/instance-id HTTP/1.1" 200 17 "-" "python-requests/2.25.1"

But I see this when it's not working (the client fetches the enlist preseed twice and never requests meta-data/instance-id):

10.110.0.9 - - [16/May/2023:10:46:30 -0700] "GET /ipxe.cfg-3a%3A13%3A7b%3A4d%3A1a%3Acb HTTP/1.1" 404 5 "-" "iPXE/1.20.1+ (g4bd0)"
10.110.0.9 - - [16/May/2023:10:46:30 -0700] "GET /ipxe.cfg-default-amd64 HTTP/1.1" 200 575 "-" "iPXE/1.20.1+ (g4bd0)"
10.110.0.9 - - [16/May/2023:10:46:30 -0700] "GET /images/ubuntu/amd64/ga-22.04/jammy/stable/boot-kernel HTTP/1.1" 200 11570216 "-" "iPXE/1.20.1+ (g4bd0)"
10.110.0.9 - - [16/May/2023:10:46:31 -0700] "GET /images/ubuntu/amd64/ga-22.04/jammy/stable/boot-initrd HTTP/1.1" 200 116315156 "-" "iPXE/1.20.1+ (g4bd0)"
10.110.0.9 - - [16/May/2023:10:46:46 -0700] "GET /images/ubuntu/amd64/ga-22.04/jammy/stable/squashfs HTTP/1.1" 200 464191488 "-" "Wget"
111.111.8.42 - - [16/May/2023:10:46:58 -0700] "GET /MAAS/rpc/ HTTP/1.1" 200 297 "-" "provisioningserver.rpc.clusterservice.ClusterClientService"
111.111.8.42 - - [16/May/2023:10:46:58 -0700] "GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed HTTP/1.1" 200 1288 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
10.110.0.9 - - [16/May/2023:10:46:58 -0700] "GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed HTTP/1.1" 200 1288 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
111.111.8.42 - - [16/May/2023:10:47:00 -0700] "GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed HTTP/1.1" 200 1288 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
10.110.0.9 - - [16/May/2023:10:47:00 -0700] "GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed HTTP/1.1" 200 1288 "-" "Cloud-Init/23.1.1-0ubuntu0~22.04.1"
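To make the difference easier to see, I stripped the two excerpts down to just the request paths (working.log and broken.log are my local copies of the access.log lines above):

# Pull out the request path (7th whitespace-separated field of the combined log format)
awk '{print $7}' working.log
awk '{print $7}' broken.log

# Or diff the two sequences directly (bash process substitution)
diff <(awk '{print $7}' working.log) <(awk '{print $7}' broken.log)

The failing client repeats GET /MAAS/metadata/latest/enlist-preseed/?op=get_enlist_preseed and never issues the meta-data/instance-id requests that the working client makes next.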

One thing I've noticed in the enlistment logs of working clients is that cloud-init reports the client receiving an IPv6 address. I don't currently hand out IPv6 addresses from my external DHCP server, but I'm not sure whether that's relevant to the failures I'm seeing.
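In case it helps with diagnosis, this is what I'm capturing from the console of a stuck client; these are standard cloud-init and iproute2 commands, nothing MAAS-specific:

# Check which stage cloud-init is blocked in
cloud-init status --long

# Bundle up the cloud-init logs (writes cloud-init.tar.gz in the current directory)
sudo cloud-init collect-logs

# See whether the client picked up any IPv6 addresses despite my external server not offering them
ip -6 addr show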