Cloud-init "no datasource found" during commissioning

I recently started getting an error just like the one in this post: "Unable to comission, cloud-init: Can not apply stage final, no datasource found!"

When I add a machine and it starts commissioning, it PXE boots and the IPMI control is working, but then I get the error shown above.

I was running 3.2.7, I believe. Since getting this error, I decided to upgrade Ubuntu to 22.04 and MaaS to 3.3.4. On 3.3.4, I’m having the same issue.

Are there any logs I should be looking at to dig deeper?

Are you using an external DHCP server? I ran into the same issue when I was doing that. I had to rearrange the whole network and create a separate subnet for MAAS so it could control DHCP, and then it started working. Well, assuming that it was the first networking device, anyway.

I am using an external DHCP server, but it has been working fine this way for years. I don’t directly manage the DHCP server, so I’m not sure how easy it would be for us to move the subnets around and have MaaS manage DHCP.

I’d love to get to the bottom of this if possible.

Interesting, good to know it was not just me.

I 100% agree, I SOOO want to be able to use another DHCP server and not be stuck with MAAS controlling it. We are set up using DHCP static leases right now, and it works great given how much hardware is moved around.

Trying to make the MAAS-controlled DHCP work is really adding a lot of complexity.

Yeah… I don’t want all of my machines to be inaccessible if the MaaS machine goes down.

Yes, that is a concern I have as well. I am going to virtualize it in Proxmox with high availability to help with that, but I would still much prefer to have DHCP handled by the router.

I didn’t see a way to edit my original post, but I ran tcpdump port 69 on the MaaS machine to view TFTP traffic, and found this after starting commissioning for a machine:

14:11:59.401800 IP aa.aa.aa.aa.2070 > bb.bb.bb.bb.tftp: TFTP, length 27, RRQ "pxelinux.0" octet tsize 0
14:11:59.407666 IP aa.aa.aa.aa.2071 > bb.bb.bb.bb.tftp: TFTP, length 32, RRQ "pxelinux.0" octet blksize 1456
14:11:59.655678 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 41, RRQ "ldlinux.c32" octet tsize 0 blksize 1408
14:11:59.696875 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 79, RRQ "pxelinux.cfg/44454c4c-4e00-1037-8044-b3c04f484b32" octet tsize 0 blksize 1408
14:11:59.706043 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.706196 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 63, RRQ "pxelinux.cfg/01-80-18-44-dd-9a-2c" octet tsize 0 blksize 1408
14:11:59.716057 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.716211 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 51, RRQ "pxelinux.cfg/0AF44341" octet tsize 0 blksize 1408
14:11:59.717194 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.717343 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 50, RRQ "pxelinux.cfg/0AF4434" octet tsize 0 blksize 1408
14:11:59.718851 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.718997 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 49, RRQ "pxelinux.cfg/0AF443" octet tsize 0 blksize 1408
14:11:59.720288 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.720440 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 48, RRQ "pxelinux.cfg/0AF44" octet tsize 0 blksize 1408
14:11:59.721707 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.721855 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 47, RRQ "pxelinux.cfg/0AF4" octet tsize 0 blksize 1408
14:11:59.723473 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.723598 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 46, RRQ "pxelinux.cfg/0AF" octet tsize 0 blksize 1408
14:11:59.725009 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.725133 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 45, RRQ "pxelinux.cfg/0A" octet tsize 0 blksize 1408
14:11:59.726523 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.726649 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 44, RRQ "pxelinux.cfg/0" octet tsize 0 blksize 1408
14:11:59.728291 IP bb.bb.bb.bb.tftp > aa.aa.aa.aa.49153: TFTP, length 19, ERROR ENOTFOUND "File not found"
14:11:59.728415 IP aa.aa.aa.aa.49153 > bb.bb.bb.bb.tftp: TFTP, length 50, RRQ "pxelinux.cfg/default" octet tsize 0 blksize 1408

Note that the aa.aa.aa.aa IP is the machine being commissioned, and the bb.bb.bb.bb IP is the machine running MaaS.
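For anyone reading a similar capture, a small filter can condense it to one line per requested file. This is only a sketch: the function name tftp_summary is made up for illustration, and it assumes the tcpdump text format shown above, with requests answered one at a time.

```shell
# Condense "tcpdump port 69" text output to one line per requested file.
# "not-found" means an ENOTFOUND error followed the request; "requested"
# only means no such error appeared in the capture for that file.
tftp_summary() {
  awk '
    function emit() { if (name != "") print status, name }
    / RRQ "/ {
      emit()                  # report the previous request, if any
      split($0, parts, "\"")  # parts[2] is the quoted file name
      name = parts[2]
      status = "requested"
    }
    / ERROR ENOTFOUND / { status = "not-found" }
    END { emit() }
  '
}
```

With the capture saved to a file (capture.txt is a placeholder name), it can be run as tftp_summary < capture.txt. Because the filter only sees port 69, "requested" says nothing about whether the transfer itself completed.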

Are these file not found errors expected as part of the DHCP/PXE boot process?
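For context on that question: PXELINUX is documented to look for its configuration under pxelinux.cfg/ by trying the system UUID, then the ARP hardware type plus MAC address, then the client's IPv4 address in uppercase hex with one digit dropped per attempt, and finally default. The requests in the capture follow exactly that sequence, so the run of lookups is normal client behaviour; whether MAAS is expected to answer any particular one of them is a separate question. A minimal sketch that reproduces the sequence, where the function name pxelinux_search_order is hypothetical and 10.244.67.65 is simply the decoded form of the 0AF44341 file name in the capture:

```shell
# Print the config file names PXELINUX tries, in order.
# Arguments: system UUID, MAC address (colon-separated), IPv4 address.
# The body runs in a subshell so its variables do not leak to the caller.
pxelinux_search_order() (
  uuid=$1
  mac=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | tr ':' '-')
  ip=$3

  # 1. The system UUID, when the firmware provides one.
  echo "pxelinux.cfg/${uuid}"

  # 2. ARP hardware type (01 = Ethernet) followed by the MAC address.
  echo "pxelinux.cfg/01-${mac}"

  # 3. The IPv4 address as 8 uppercase hex digits, one digit shorter each try.
  IFS=.
  set -- $ip
  hex=$(printf '%02X%02X%02X%02X' "$1" "$2" "$3" "$4")
  while [ -n "$hex" ]; do
    echo "pxelinux.cfg/${hex}"
    hex=${hex%?}
  done

  # 4. The catch-all file.
  echo "pxelinux.cfg/default"
)

pxelinux_search_order 44454c4c-4e00-1037-8044-b3c04f484b32 80:18:44:dd:9a:2c 10.244.67.65
```

The eleven names printed match the eleven pxelinux.cfg/ requests in the capture, from the UUID down to default.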

Hello @wespiard

Your behaviour seems similar to "Cloud-init fails to fetch MAAS datasource from metadata_url missing port".

That's indeed a bug, and we are already working on a fix.
You can track the progress at https://bugs.launchpad.net/maas/+bug/2022926

You can also try applying this change before we make a new build.

Thank you, that does indeed look exactly like my problem.

I’m not quite sure how to apply the fix. Do I modify rackd.nginx.conf.template? I have installed MaaS 3.3.4-13189-g.f88272d1e via snap, so the file is readonly here: /snap/maas/28521/lib/python3.10/site-packages/provisioningserver/templates/http/rackd.nginx.conf.template.

Do I need to install via PPA instead to temporarily try this fix?

You can always fall back to the PPA, but even with a snap version of MAAS it is possible to apply certain changes, repack the snap, and try things out.

root@maas:/var/lib/snapd/snaps$ snap install maas
root@maas:/var/lib/snapd/snaps$ unsquashfs maas_xxx.snap
# do the changes to rackd.nginx.conf.template
root@maas:/var/lib/snapd/snaps$ snap pack ./squashfs-root
root@maas:/var/lib/snapd/snaps$ sudo snap install --dangerous maas_xxx.snap
root@maas:/var/lib/snapd/snaps$ snap connections maas | awk '$1 != "content" && $3 == "-" {print $2}' | xargs -r -n1 sudo snap connect
root@maas:/var/lib/snapd/snaps$ sudo maas init region+rack --database-uri maas-test-db:///

Here is a related post:

I re-installed MAAS using the 3.4/latest channel (which now includes the commit that fixed this bug), and I am now past this step.

I’m now running into a new issue that I’ll probably need to make a new post for if I don’t find anything from searching around.

Basically, I can start commissioning a machine, but it has some network issue where cloud-init can’t run “netplan apply”, and then any commissioning scripts that require network access (installing packages, for example) fail with 403 Forbidden errors.
