Supermicro is telling me that with the latest BIOS update on my blade server, I can only boot from NVMe drives if I set the BIOS to “UEFI” boot mode and then install Ubuntu in UEFI boot mode.
I made the BIOS change, but have no idea how to tell MAAS to configure Ubuntu to boot in UEFI mode. I was hoping it would detect that my server only supported UEFI and automatically handle this properly. The deploy looks like it’s working, but then when it reboots at the end, it just dumps into a “UEFI Interactive Shell” and the deploy fails:
Hi @spockdude we are using SuperMicro with UEFI, too. What is happening is that your mainboard doesn’t find anything to boot from and then in the end drops into the EFI shell. This is your computer’s last resort to do something. A number of things might be happening:
Most likely your boot order for PXE boot is broken. You should pick UEFI PXE boot as the first option.
Probably a good idea to pick UEFI HDD / SSD boot as the second one. This ensures that if you lose network connectivity to the MAAS server, at least the system will boot up.
I would probably disable all other options beyond these two, unless you strongly feel that you need the others.
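Once a machine has booted into Linux in UEFI mode, you can also inspect and adjust the firmware boot order from the OS with `efibootmgr` instead of going through the BIOS screens each time. A rough sketch — the entry numbers below are examples only; read the real ones from your own machine's output first:

```shell
# List the current UEFI boot entries and boot order (needs an EFI-booted system)
efibootmgr -v

# Example only: put entry 0003 (say, UEFI PXE) first and 0001 (UEFI HDD) second.
# Your entry numbers WILL differ -- take them from the `efibootmgr -v` output above.
efibootmgr -o 0003,0001
```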
Some SuperMicro BIOS versions let you choose the boot order on the NICs and the HDDs — but not all versions, and not all hardware (yes, I cannot find any rhyme or reason in which ones do or don’t).
Also - make sure that your NIC is up to date. If you’re using Mellanox use mlxup and then use their firmware tools to allow for PXE boot.
Unfortunately all of this info above was obtained the hard way ;-).
I don’t seem to have a “UEFI PXE” boot option. So, I tried setting the first option to the closest thing I could find to that, which I think would be “UEFI Network”:
That said, I’m unclear how this would help since the problem is that it’s not finding the NVMe to boot from after the install is done.
It still boots to the UEFI shell after installing Ubuntu via MAAS.
Available boot options:
One big clue here is that Supermicro support says the detected drive should show up looking like this:
(But, as you can see above, my system just has a “UEFI Hard Disk” option without any mention of the actual Samsung NVMe drive.)
They mentioned that after installing this option would appear under “UEFI Hard Disk Drive BBS Priorities” on the boot page in BIOS, but “UEFI Hard Disk Drive BBS Priorities” does not appear as an option on my boot page at all.
Yes, “UEFI Network” is the category you need to select. Depending on your BIOS you can also select the Network BBS priorities; this is where you choose which of your NICs to boot from.
The reason the specific HDD isn’t showing up yet is that there’s no OS installed on it so far (MAAS needs to do that). That said, “UEFI HDD” should be smart enough to detect a bootable partition. A good way to check whether one exists is to make sure your filesystem layout includes a /boot/efi (or similar) partition. And yes, MAAS gets this wrong when you choose to make a drive bootable but don’t follow its standard layout: you will need to manually create that 512 MB–1 GB partition with a Windows-compatible format such as FAT32.
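Outside of MAAS, a manual version of that layout would look roughly like this. This is only a sketch — the device name `/dev/nvme0n1` and the partition sizes are examples, and running these commands will destroy whatever is on the disk:

```shell
# Sketch: GPT label, a ~512 MB EFI System Partition, and a root partition
# on an example NVMe device. Adjust the device name for your hardware.
parted --script /dev/nvme0n1 \
  mklabel gpt \
  mkpart ESP fat32 1MiB 513MiB \
  set 1 esp on \
  mkpart root ext4 513MiB 100%

# The ESP must be a FAT filesystem so the firmware can read it.
mkfs.vfat -F 32 /dev/nvme0n1p1
mkfs.ext4 /dev/nvme0n1p2
```

In MAAS itself you would express the same thing in the machine’s storage configuration: a small FAT32 partition mounted at /boot/efi, then the rest of the disk for the OS.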
The easiest way to debug all this mess is to watch the board boot up using console.
Thanks for the reply, @smolix . This definitely helped me get going in the right direction. I still have been unsuccessful getting MAAS to deploy Ubuntu using UEFI, but as a test I did try a manual install of Ubuntu and it worked fine. So, I just need to find some simple step-by-step instructions on how to lay out the filesystem so that MAAS does the right thing.
As an added complication, I want to do RAID1. I tried creating a FAT32 partition on both drives and then putting the second partition of each drive into a RAID1 array, but that didn’t work. I tried again laying it out on just one of the drives as simple FAT32 and ext4 partitions, and it still failed.
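For what it’s worth, the usual trick for UEFI + software RAID1 is to keep the ESP either outside the array entirely (one per drive), or in a RAID1 with mdadm metadata version 1.0, which puts the metadata at the end of the partition so the firmware still sees a plain FAT32 filesystem. A manual sketch, with example device names, done outside MAAS:

```shell
# p1 on each drive: the ESP. Mirroring it is optional; if you do,
# metadata 1.0 keeps it readable as plain FAT32 by the firmware.
mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=1.0 \
  /dev/nvme0n1p1 /dev/nvme1n1p1

# p2 on each drive: the root filesystem array (default metadata is fine here).
mdadm --create /dev/md1 --level=1 --raid-devices=2 \
  /dev/nvme0n1p2 /dev/nvme1n1p2

mkfs.vfat -F 32 /dev/md0   # mounted at /boot/efi
mkfs.ext4 /dev/md1         # mounted at /
```

Whether MAAS’s storage editor will reproduce this exact arrangement is another question — but it explains why a naive “put both partitions in RAID1” attempt fails: the firmware cannot read an ESP hidden behind default (1.2) mdadm metadata.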
I am using the console, but everything flies by too fast for me to see any clues as to what is going wrong.
OK, here’s where you can find out what is going on with your node: look at /var/log/maas/rsyslog for info about all of your nodes. The new ones are grouped in the maas-enlisting-node directory. The only bad news is that the updates are not always in real time and sometimes when a node hangs it’s not updated.
For me, the logs were located at /var/snap/maas/common/log/rsyslog/maas-enlisting-node. But yeah, it seems that because my deploys never finished it didn’t log anything on the failed deployments.
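To save others some hunting, the paths differ between the deb and snap installs of MAAS. A quick way to watch them (the hostname directory below is an example — substitute your machine’s name):

```shell
# deb install:
sudo tail -f /var/log/maas/rsyslog/maas-enlisting-node/*/messages

# snap install:
sudo tail -f /var/snap/maas/common/log/rsyslog/maas-enlisting-node/*/messages
```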
I wasn’t aware of an “automatic” vs “UEFI” setting. My MAAS is pretty much a default install. I tried version 3.2 and the latest stable release of 3.3. I’ve done tons of searches on MAAS and UEFI and haven’t run into any mention of this setting. Where is it?
It sounds like MAAS is installing Ubuntu on the machine fine, but the resulting OS cannot boot, which is typically a sign that the wrong boot material/location was used.
In my experience, if the machine was originally commissioned under legacy BIOS settings, it will need to be recommissioned with the UEFI BIOS settings enabled in order for MAAS to recognize the change. You should be able to retain storage and network configurations if desired, though the subsequent deployment may need specific partitioning for UEFI (such as the small fat32 partition for /boot/efi).
Under each machine’s config page there is “Power boot type” option
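If I’m reading the MAAS docs right, the same setting is exposed as the `power_boot_type` parameter of the IPMI power driver, so it should also be settable from the CLI. A sketch — `$PROFILE` and `$SYSTEM_ID` are placeholders for your CLI login and the machine’s system ID, and the flattened `power_parameters_...` key name is my assumption about the CLI syntax:

```shell
# Force the machine to be power-cycled in EFI boot mode
# (values are roughly: auto / legacy / efi).
maas $PROFILE machine update $SYSTEM_ID \
  power_parameters_power_boot_type=efi
```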
Oh my, I never saw that option before! Unfortunately, I didn’t receive a notification from this forum that you replied and I ended up doing a manual install of the server and it’s already in production now. So, it may be a while before I’m able to confirm whether the solution could be as simple as switching this option from “Automatic” to “EFI Boot”…
Thanks… I bet this will solve it!
As a side note, I tested this with MAAS 3.2, and MAAS 3.3 and both failed at detecting that the machine was only capable of UEFI boot.
Thanks for your reply. Yeah, that was my understanding too and so I was careful to make sure that the BIOS was set to UEFI boot only before commissioning. But, MAAS failed to detect UEFI booting still.
I’m anxious to try the solution that @jeremy-mordkoff suggested on my next server deployment (probably will be a few months from now).
(Legacy booting would actually be preferable, but Supermicro has dropped support for Legacy booting from NVMe drives in its latest BIOS update on its MicroBlade platform. Kind of a bummer, since it’s a bit of an undocumented mess to set up a redundant UEFI partition in a RAID1 environment with Ubuntu.)
I also have a recollection that some very old Intel BIOS builds were returning boot IDs with the wrong casing on the hexadecimal digits, e.g. the spec said use 00a and the BIOS returned 00A (or vice versa). Do you have more than 10 bootable devices? This only affected me on machines with 6 dual-port 10 Gbps NICs and this very old BIOS.
Thanks for the reply. I haven’t needed to set up any more Supermicro MicroBlades since I started this post, but I don’t think there were more than 10 bootable devices. Also, this was working on my older MicroBlade servers; it was actually a new server that I had this issue with last November, so it doesn’t seem like it was caused by the BIOS being too old.