ARM server cannot PXE boot under MAAS 3.5.1 direction

The Gigabyte R292-P92 server cannot boot Ubuntu Jammy 22.04 via PXE under MAAS control. During PXE boot I can see that an IP address is assigned and the server receives two files, bootaa64.efi and grubaa64.efi, via TFTP. After that the boot process stops and the console shows that the server has dropped to the GRUB shell.

The Gigabyte team is involved in troubleshooting, and they were able to PXE boot the same server using the Fedora grubaa64.efi. Can someone suggest how to troubleshoot this PXE boot issue from the MAAS side?

One observation: after the BIOS upgrade from 31r to 33j, the server PXE booted successfully the very first time. After the next restart and network boot attempt, the same problem came back, and it is now reproducible every time I try to boot from the network. So there is definitely something wrong in the HW, because it was able to PXE boot immediately after the BIOS upgrade and then failed again.

Can you upload the maas-rackd logs? Try to deploy the machine and then extract the logs with

journalctl --since "1 hour ago" -t maas-rackd

assuming you are using 3.5.1
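
If the output is large, the same journal query can be piped through grep to keep only the lines about the boot files mentioned above. This is only a minimal sketch: the function name is made up, and it assumes the rackd log lines contain the requested file names, which this thread does not confirm, so adjust the pattern to whatever your logs actually show.

```shell
#!/bin/sh
# Keep only log lines that mention the bootloader files or a GRUB config.
# Assumption: rackd logs the requested file names; the pattern is a guess.
filter_boot_lines() {
    grep -iE 'bootaa64\.efi|grubaa64\.efi|grub\.cfg'
}

# Usage (on the rack controller):
#   journalctl --since "1 hour ago" -t maas-rackd | filter_boot_lines
```

Uploading the full, unfiltered log is still preferable; the filter only helps when reading it yourself.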

Hi, please find maas logs here.
Last PXE boot was unsuccessful:

Based on what I understood from various reading, dropping to the grub> shell means the system cannot find or execute the grub.cfg file.

This reminds me of Arm and UEFI clients dont get boot from MaaS - #20 by marosg

Not sure about the solution in the above thread: "in my DHCP I had to change the file path of the arm architecture. "
Do I need to set this somewhere in MAAS?

Ok, this one. I’m in contact with the Gigabyte team and their reply is roughly:
" The R282-P91 website recommends using version 20.04. It is advised that the customer downloads the complete ISO file directly (without manually modifying the boot-initrd files).

https://cdimage.ubuntu.com/releases/20.04/release/

https://cdimage.ubuntu.com/releases/20.04/release/ubuntu-20.04.5-live-server-arm64.iso"

Is it possible to test this using MAAS?

If you are hitting the bug I linked above the problem is in the bootloader, not in the image. You might try to use an older version of the bootloaders by

Let me know if they work

With the older version of the bootloaders I’m able to PXE boot both 20.04 and 22.04.

This is becoming a critical issue for us. The workaround of setting the IPv6 PXE boot priority before IPv4 on the same interface works sometimes, but not consistently. When I deploy Ubuntu 22.04 for the first time I have to do a lot of manual work:

  1. I connect to the remote session and watch the first PXE boot drop to the GRUB 2.06 menu.

  2. I restart the server. It starts with IPv6, switches to IPv4 after around 30 seconds, gets an IPv4 address, then boots and deploys the OS successfully. Note the “Booting under MAAS direction” message.

  3. Next, the server is restarted and the BIOS changes the network boot priority, putting IPv4 at the top of the list before IPv6. With this configuration it cannot boot from the disk. I see an “Out of memory” error:

and the server is unable to boot from the disk:

When I manually set IPv6 before IPv4, it successfully boots from the disk into 22.04.

However, when I release the node and repeat the Ubuntu 22.04 deployment, it consistently fails with the “out of memory” error and is unable to boot from the disk, even if I set the PXE boot priorities properly (IPv6 before IPv4). I tried various workarounds without success: whenever MAAS instructs the node to boot from the local disk, “Out of memory” appears and the node does not boot, failing with “unable to mount root fs”.

Are there any suggestions or recommendations I can try to resolve the above issue while using MAAS for PXE booting?

Is this related to the issue you originally posted?

This is the continuation of the story. Initially the server did not boot using the 2.06 bootloader and GRUB; it dropped to the GRUB menu, as I described at the very beginning. I used the older 2.04 version that you proposed and was able to move on (enlist, commission and deploy 22.04). In the meantime the Gigabyte team suggested the workaround of putting network boot over IPv6 before IPv4, and with that I was able to proceed with 2.06. It is not consistent, though, and “out of memory” errors appear quite often, although sometimes I’m able to successfully deploy 22.04 HWE. Based on my observation, “out of memory” happens with both versions, 2.04 and 2.06.

This seems not to be a MAAS issue but rather something like Bug #1842320 “Can't boot: “error: out of memory.” immediately af...” : Bugs : OEM Priority Project. It is possibly related to the screen resolution during boot.

Well, it could be, but how can I change the screen resolution with a MAAS-controlled boot? BTW, where can I find the latest bootaa64.efi and grubaa64.efi for ARM? I want to test them with the 22.04 HWE kernel.

You should be able to get the latest version by

  1. downloading the latest deb package from here: grub2-signed package : Ubuntu
  2. opening the deb as an archive
  3. extracting data.tar.zst
  4. extracting grubaa64.efi.signed from the /usr/lib/grub/arm64-efi-signed/ path in the archive
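
The steps above can be sketched as a small shell function. This is only an illustration: the function name and argument order are made up, it assumes dpkg-deb and tar are installed on the machine where you unpack the package, and the .deb still has to be downloaded by hand first. Using dpkg-deb --fsys-tarfile avoids unpacking data.tar.zst manually.

```shell
#!/bin/sh
# Extract the signed arm64 GRUB binary from a downloaded grub2-signed .deb.
# Assumptions: dpkg-deb and tar are available; the function name and its
# arguments (package file, output directory) are illustrative only.
extract_grub_efi() {
    deb="$1"
    outdir="$2"
    member="./usr/lib/grub/arm64-efi-signed/grubaa64.efi.signed"

    if [ ! -f "$deb" ]; then
        echo "no such package file: $deb" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1

    # --fsys-tarfile streams the package's filesystem tree as a plain tar,
    # so the compressed data.tar layer does not have to be handled by hand.
    dpkg-deb --fsys-tarfile "$deb" | tar -xf - -C "$outdir" "$member" || return 1

    # Print where the extracted binary ended up.
    echo "$outdir/usr/lib/grub/arm64-efi-signed/grubaa64.efi.signed"
}

# Usage:
#   extract_grub_efi ./grub2-signed_<version>_arm64.deb ./out
```

The function prints the path of the extracted file on success and returns non-zero if the package is missing or does not contain the expected member.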

This did not help in my case.

You mean you tried the latest build and the issue is still there?

Yeah, sorry for not being specific.
I used the bootloader from 1.209.1, booted the ARM machine (Grace Hopper devkit),
and landed in the GRUB CLI.
The boot menu only contains “local boot”, which does nothing.
Manually running configfile (hd5,gpt1)/efi/ubuntu/grub.cfg gets me the local menu and I can boot.