PXE boot failing on LXD VMs

Using the LXD server installed by MAAS (with LXD upgraded to version 5.21), the VMs are not able to find the DHCP server.

Part of the VM console log from LXD:

lxc console worthy-marmot  --project maas 
To detach from the console, press: <ctrl>+a q
: Server response timeout.
BdsDxe: failed to load Boot0002 "UEFI PXEv4 (MAC:00163EF9CF9C)" from PciRoot(0x0)/Pci(0x1,0x4)/Pci(0x0,0x0)/MAC(00163EF9CF9C,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0): Not Found

>>Start PXE over IPv6
  PXE-E21: Remote boot cancelled.
BdsDxe: failed to load Boot0003 "UEFI PXEv6 (MAC:00163EF9CF9C)" from PciRoot(0x0)/Pci(0x1,0x4)/Pci(0x0,0x0)/MAC(00163EF9CF9C,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000): Not Found

>>Start HTTP Boot over IPv4.

So, any idea?

Hi @vasartori
What version of MAAS are you using?
Is the DHCP service up and running?

Could it also be a network issue?

hi @troyanov
We are using the latest version of MaaS, 3.4.1.

Yeah, the DHCP Server is OK, because we can deploy physical machines using maas.
BTW, the LXD server was deployed by maas.
The only thing I changed on the server was the LXD snap, which I updated because of a bug in version 5.20. I updated it to version 5.21/candidate.

Hi @vasartori ,

Verify that the networking is functional and that the VM's DHCP request can reach the network where the DHCP server is located (tcpdump is your friend).
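As a rough sketch of that check: the helper below wraps a tcpdump capture of DHCP traffic (UDP ports 67 and 68). The function name `capture_dhcp`, the `DRY_RUN` switch, and the default interface `br0` are all assumptions for illustration; substitute whichever interface the VMs are actually attached to.

```shell
# Hypothetical helper: capture DHCP traffic on one interface.
# Assumption: the VMs are attached to a host bridge called br0.
capture_dhcp() {
    iface="${1:-br0}"
    # -n: no name resolution, -e: show MAC addresses, -vv: decode DHCP options
    set -- tcpdump -i "$iface" -n -e -vv 'udp port 67 or udp port 68'
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print the command (the filter is shown without its quotes)
        echo "$*"
    else
        sudo "$@"
    fi
}
```

Running `capture_dhcp br0` on the LXD host while a VM attempts PXE boot should show DHCP Discover packets from the VM's MAC (00:16:3e:f9:cf:9c in the log above). Running the same capture on the MAAS rack controller's interface would then show whether those requests arrive there and whether an offer goes back.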

Anyway, in my experience, an LXD deployed and configured via MAAS has never worked, whereas a manual LXD installation hooked up to MAAS did. I'm not sure where or why the automatic setup failed. My gut instinct is that the networking was messed up somehow.

In my case, I also use MaaS to deploy the host (22.04), but then a manual installation of LXD is needed.

BR/Patrik

Hi @pal-arlos

Good to know (that the automatic install never worked).
I’ll try to do a manual installation of LXD and see what happens :)

When I have some spare time, I’ll test it out.

@pal-arlos To save time on research and avoid unreliable installation guides, do you have an installation guide for the LXD server that you can share?

Hej,

I think it was dead simple:
snap install lxd

Then follow the instructions in the MaaS UI, “Add LXD”.

Br/Patrik

Hi @pal-arlos,

When you say snap install lxd, do you mean you removed the default LXD that came with the MAAS-configured server and just re-installed it? Also, did you do a simple snap remove, or was it snap remove --purge? Did you also create a new bridge (br0) on your LXD server and configure LXD to use it instead of lxdbr0 during lxc init?

The reason I ask is that I have the same issue. I tried removing LXD from the server multiple times, but the issue persists.

Hej @skaliamu,

AFAIR, I tried to have MAAS do the deployment (I selected register as LXD, or similar). It did not work (the deployment never completed).

So my procedure was to use MAAS to deploy a standard Ubuntu 22.04, then install bridge-utils and create a bridge. You can probably use the bridge created by LXD, but the VMs need to be on the same network that MAAS used to deploy the host (or hooked to another network where MAAS handles the PXE booting). In my case, the use case is that the VMs are ‘identical’ to the physical node.

As I did not let MaaS install LXD, I did not remove any snaps. I just installed what I needed on a clean canvas.

BR/Patrik

Hi @pal-arlos ,

I did the following. I created 2 VMs in a private subnet whose DHCP is managed by MAAS. Both VMs ran the default Ubuntu 22.04 LTS image provided by MAAS. I did not register either of them as a KVM host. Through the MAAS UI, I converted the default eth0 (provided by MAAS on the private network) to a bridge, br0. The Ubuntu 22.04 image provided by MAAS already had LXD installed. When I initialized LXD and told it to use br0, the VMs I created hung with the same error as above. I even tried removing the default LXD and re-installing it, but I still faced the same issue.
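One way to confirm which bridge a VM's NIC actually ends up on is to dump the instance's expanded configuration, which includes devices inherited from profiles. This is only a sketch: the helper name `vm_nic_config`, the `DRY_RUN` switch, and the default project `maas` (taken from the console command at the top of the thread) are assumptions.

```shell
# Hypothetical helper: show an instance's full (expanded) configuration.
# Assumption: the instance lives in the "maas" project unless told otherwise.
vm_nic_config() {
    instance="$1"
    project="${2:-maas}"
    # --expanded includes devices inherited from profiles, not just local ones
    set -- lxc config show "$instance" --project "$project" --expanded
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print the command instead of running it
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `vm_nic_config worthy-marmot` prints the configuration of the VM from the first post. In the `devices:` section, the NIC entry's `parent` or `network` value names the bridge in use; if it points at lxdbr0 rather than br0, the VM's DHCP requests would stay on the LXD-managed network and never reach MAAS.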