After I set up MAAS according to the documentation, I encountered this problem:
“Rack import error - Unable to find a copy of bootx64.efi, grubx64.efi in the SimpleStream and the packages shim-signed, and grub-efi-amd64-signed are not installed. The uefi_ebc_tftp bootloader type may not work.”
<Imagine Screenshot Here>
Hey @billwear,
I see you're pretty active on here, so hopefully you'll be able to help out.
We're also experiencing this issue. I'll try to provide as much context as possible here for you (and anyone who finds this in the future!).
I’ve broken it into three sections since I included some screenshots.
(Sadly, new users can only include one screenshot per post.)
Infrastructure
2x Region+Rack Controller VMs
1x PostgresDB VM
Network
MAAS manages its own subnet as the DHCP server.
Systems to be deployed by MAAS are connected to this subnet.
Both controllers have a second interface connected to our internal network so they are reachable from it.
The PostgresDB VM is connected only to the MAAS servers, not to the subnet managed by MAAS DHCP.
Images Page
As you requested of Tangshibo, I thought this would be good to include:
Post-install, the logs for both Region+Rack controllers show the following error:
Rack import error - Unable to find a copy of bootx64.efi, grubx64.efi in the SimpleStream and the packages shim-signed, and grub-efi-amd64-signed are not installed. The uefi_ebc_tftp bootloader type may not work.
<Imagine Screenshot Here>
As a result, when booting a system via PXE, the system is detected by MAAS, gets assigned an IP, and starts PXELinux, but stalls before it can continue.
<Imagine Screenshot Here> PXELinux stalls; after approximately 5 minutes it reports that it failed to load.
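For anyone debugging the same stall: the error suggests the rack controller never received the signed UEFI bootloaders at all. A rough way to check is the sketch below; the paths are assumptions on my part, as they differ between snap and deb installs and across MAAS versions.

```bash
# Where the boot resources should have landed (paths are assumptions --
# they differ between snap and deb installs and across MAAS versions)
ls -l /var/snap/maas/common/maas/image-storage/   # snap install
ls -l /var/lib/maas/boot-resources/current/       # deb install

# On a deb install, the fallback packages named in the error can be
# installed manually; MAAS can then extract bootx64.efi / grubx64.efi
# from them if the SimpleStream copy is missing.
sudo apt install shim-signed grub-efi-amd64-signed
```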
Strangely, this did not occur during our testing. We built a PoC system with a similar setup, and for a short time it was functioning. When we moved to a production deployment, this error occurred. We cannot find any major differences between our PoC and production deployments, yet for some reason the error is now consistent on every install.
Attempted Solutions
Consider each of the below as its own step, with nothing carrying over from the previous one; after each attempt we reverted to the setup described under Infrastructure above.
Re-installed via Ansible Playbook
Reverted to Ubuntu 20.04
Reverted to version 3.2.7 via Snap
Manually installing MAAS as a Region+Rack controller
Completely wiping and re-imaging VMs
Returned to PoC System for temporary live use
Removing all firewall policies from subnet during installation
Changing the base commissioning image to Ubuntu 22.04
Removing all base images and re-syncing
Manually setting the Image Sync Destination
Changing DNS servers (a CLI sketch of the image/DNS steps follows this list)
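For reference, the image and DNS steps above can also be driven from the MAAS CLI rather than the UI. A sketch, assuming an already-logged-in profile named `admin` (anything profile-specific here is an assumption):

```bash
# Assumes a logged-in CLI profile named "admin"
# (created via: maas login admin <MAAS_URL> <API_KEY>)

maas admin boot-sources read        # the SimpleStreams URL images come from
maas admin boot-resources read      # what has actually been imported so far
maas admin boot-resources import    # force a fresh import of boot resources

# The "Changing DNS servers" step above, as a config change:
maas admin maas set-config name=upstream_dns value=8.8.8.8
```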
All in all, incredibly annoying.
Thanks in advance to anyone who helps.
MAAS is a wonderful thing and we want to use it!
I was able to resolve this by doing the following (CLI equivalents are sketched after the list):
Changed the primary DNS to 8.8.8.8 (it was set to an internal DNS)
Downloaded the image through the new DNS and forced a sync with the rack
Changed the commissioning image to Ubuntu 22.04
Set the minimum kernel level to Jammy GA 22.04
Rebooted the rack + region controllers
Changed the primary DNS back to the internal DNS
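For anyone who wants to script this, here is a rough CLI equivalent of those steps. Again this assumes a profile named `admin` and a snap install for the restart; `10.0.0.53` is just a placeholder for your internal DNS, and the exact value formats are assumptions.

```bash
# 1. Switch upstream DNS to 8.8.8.8
maas admin maas set-config name=upstream_dns value=8.8.8.8

# 2. Re-download the images and force a sync to the racks
maas admin boot-resources import

# 3. Set the default commissioning release to 22.04 (jammy)
maas admin maas set-config name=commissioning_distro_series value=jammy

# 4. Set the minimum kernel to the Jammy GA kernel (value format assumed)
maas admin maas set-config name=default_min_hwe_kernel value=ga-22.04

# 5. Restart the controllers (snap install; deb installs restart differently)
sudo snap restart maas

# 6. Switch back to the internal DNS (10.0.0.53 is a placeholder)
maas admin maas set-config name=upstream_dns value=10.0.0.53
```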
To be honest, I'm not sure which step resolved it. It could be that the controllers simply need to be restarted after installing MAAS, as we did not have that step in our Ansible playbook.
My current theory:
Once restarted, the system used 8.8.8.8 to sync an image and stored it locally. Then, when I removed 8.8.8.8 and changed back to our internal DNS, the system continued using the local copy, and it keeps working. Some core part of the commissioning image (likely the bootloaders the error complains about) just wasn't downloaded at first, for some strange reason.
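If anyone wants to test that theory, one way would be to confirm the resources report as complete locally and then watch whether anything is re-downloaded after switching the DNS back. A sketch (snap install assumed, same `admin` profile as above):

```bash
# Check that every boot resource reports as fully synced on the region
maas admin boot-resources read

# Force another import with the internal DNS restored, and tail the logs
# to see whether anything is actually re-downloaded
maas admin boot-resources import
sudo snap logs maas -n 100
```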