We have a new Supermicro server that failed in the commissioning script “30-maas-01-bmc-config”. Up until now we had only Dell PowerEdge servers, with no issues.
modprobe: ERROR: could not insert 'ipmi_si': No such device
Unable to get Number of Users
ERROR: Unable to add BMC user!
INFO: Loading IPMI kernel modules...
INFO: Checking for HP Moonshot...
INFO: Checking for Redfish...
ERROR: Redfish write() argument must be str, not None
Traceback (most recent call last):
File "/tmp/user_data.sh.RRV4Px/scripts/commissioning/30-maas-01-bmc-config", line 1132, in detect_and_configure
if bmc.detected():
File "/tmp/user_data.sh.RRV4Px/scripts/commissioning/30-maas-01-bmc-config", line 1065, in detected
self._detect()
File "/tmp/user_data.sh.RRV4Px/scripts/commissioning/30-maas-01-bmc-config", line 1050, in _detect
self._configure_network(iface, data)
File "/tmp/user_data.sh.RRV4Px/scripts/commissioning/30-maas-01-bmc-config", line 1023, in _configure_network
netplan.write(netplan_config)
TypeError: write() argument must be str, not None
INFO: Checking for IPMI...
INFO: IPMI detected!
INFO: Reading current IPMI BMC values...
INFO: Configuring IPMI Lan_Channel...
INFO: Configuring IPMI Lan_Channel_Auth...
INFO: Lan_Channel_Auth settings unavailable!
WARNING: No K_g BMC key found or configured, communication with BMC will not use a session key!
INFO: Configuring IPMI Serial_Channel...
INFO: Serial_Channel settings unavailable!
INFO: Configuring IPMI SOL_Conf...
INFO: SOL_Conf settings unavailable!
INFO: Configuring IPMI BMC user "maas"...
INFO: IPMI user number - None
INFO: IPMI user privilege level - Administrator
I’m starting to see the same problem (Redfish write() argument) on random nodes out of a batch. Some blades get this, others don’t. I cannot figure out what is or isn’t happening.
Downloading the logs has been uninformative.
---------------------------- 30-maas-01-bmc-config ----------------------------
ipmi_cmd_set_user_access: invalid parameters
INFO: Loading IPMI kernel modules...
INFO: Checking for HP Moonshot...
INFO: Checking for Redfish...
ERROR: Redfish write() argument must be str, not None
Traceback (most recent call last):
File "/tmp/user_data.sh.sKiyi2/scripts/commissioning/30-maas-01-bmc-config", line 1134, in detect_and_configure
if bmc.detected():
File "/tmp/user_data.sh.sKiyi2/scripts/commissioning/30-maas-01-bmc-config", line 1067, in detected
self._detect()
File "/tmp/user_data.sh.sKiyi2/scripts/commissioning/30-maas-01-bmc-config", line 1052, in _detect
self._configure_network(iface, data)
File "/tmp/user_data.sh.sKiyi2/scripts/commissioning/30-maas-01-bmc-config", line 1025, in _configure_network
netplan.write(netplan_config)
TypeError: write() argument must be str, not None
INFO: Checking for IPMI...
INFO: IPMI detected!
INFO: Reading current IPMI BMC values...
INFO: Configuring IPMI Lan_Channel...
INFO: Configuring IPMI Lan_Channel_Auth...
INFO: Lan_Channel_Auth settings unavailable!
WARNING: No K_g BMC key found or configured, communication with BMC will not use a session key!
INFO: Configuring IPMI Serial_Channel...
INFO: Configuring IPMI SOL_Conf...
INFO: Found existing IPMI user "maas"!
INFO: Configuring IPMI BMC user "maas"...
INFO: IPMI user number - User3
INFO: IPMI user privilege level - Administrator
WARNING: Unable to set User3:Serial_Enable_Link_Auth to Yes!
INFO: IPMI Version - LAN_2_0
INFO: IPMI boot type - efi
Please note that the log level is misleading here: this is not really an ERROR. As you can see, your machine still gets configured with IPMI. So it’s not fatal; it only means that configuring Redfish failed.
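To illustrate what the traceback is complaining about, here is a minimal sketch of the failure mode, not the actual code from 30-maas-01-bmc-config. The helper names (`render_netplan`, `write_netplan`, `unguarded_write_fails`) and the YAML layout are hypothetical. The sketch only assumes that whatever generates the netplan config returned None, which is then passed to a text-mode `write()`.

```python
import os
import tempfile


def render_netplan(iface, address):
    """Hypothetical stand-in for the script's netplan generator.

    Returns None when there is no address to configure, which is the
    kind of value that would trigger the TypeError seen in the log.
    """
    if not address:
        return None
    return (
        "network:\n"
        "  version: 2\n"
        "  ethernets:\n"
        f"    {iface}:\n"
        f"      addresses: [{address}]\n"
    )


def write_netplan(path, iface, address):
    """Write the config, or return False instead of passing None to write()."""
    config = render_netplan(iface, address)
    if config is None:
        return False
    with open(path, "w") as netplan:
        netplan.write(config)
    return True


def unguarded_write_fails(path):
    """Show that a text-mode write(None) raises TypeError."""
    try:
        with open(path, "w") as netplan:
            netplan.write(None)
    except TypeError:
        return True
    return False
```

With a guard like this, the Redfish step would be skipped quietly and the script would move on to the IPMI check, which is effectively what happens in the logs above after the traceback.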
My next questions are:
Has MAAS ever been able to use Redfish on the same machine? Did you upgrade the firmware or make any changes?
Has MAAS ever been able to use Redfish on the same machine?
We have more than five dozen of these blades in use via MAAS and had never seen this before today.
Did you upgrade the firmware or make any changes?
These are entirely new machines. The first two I tried worked just fine, then we started seeing individual blades fail with these errors. Firmware is identical on the blades that failed versus the blade that succeeded.
So I finally recorded a video and slowed it down; what I see is in the screenshot. It flashes by far too fast to read normally.
I can confirm from the server logs that it does send a POST and get a 401 response, but then it immediately (same second!) sends another POST, which gets a 200 response.
If we wait 30 minutes for provisioning to fail, we get the following logs. And yes, the node hasn’t been heard from, because it powered off!
Thu, 20 Feb. 2025 17:22:12 Failed to query node's BMC - (admin) - No rack controllers can access the BMC of node x36
Thu, 20 Feb. 2025 17:21:52 Failed to query node's BMC - (admin) - No rack controllers can access the BMC of node x36
Thu, 20 Feb. 2025 17:21:52 User powering down node - Node stopped because SSH is disabled
Thu, 20 Feb. 2025 17:21:52 Node changed status - From 'Commissioning' to 'Failed commissioning'
Thu, 20 Feb. 2025 17:21:52 Marking node failed - Node has not been heard from for the last 30 minutes
Thu, 20 Feb. 2025 16:51:49 Script result - 30-maas-01-bmc-config changed status from 'Pending' to 'Passed'
It was the BIOS time. Even though NTP is set up during boot, if the BIOS date and time are too far off, the machines fail to authenticate to MAAS to fetch the user scripts.
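For anyone who wants to check for this before commissioning, here is a rough sketch that compares the local clock with the `Date` header of an HTTP response from the MAAS server. The function names and the 5-minute limit are assumptions for illustration only; I don’t know the exact tolerance MAAS applies.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def clock_skew(date_header, local_now=None):
    """Return local time minus the server time from an HTTP Date header."""
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return local_now - server_time


def skew_too_large(date_header, local_now=None, limit=timedelta(minutes=5)):
    """True if the clocks differ by more than `limit` (assumed value)."""
    return abs(clock_skew(date_header, local_now)) > limit
```

Pass in the `Date` header from any response from the region or rack controller (for example `"Thu, 20 Feb 2025 16:51:49 GMT"`). If `skew_too_large` returns True, fix the BIOS clock before retrying.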