Currently, I have 2 machines added with the power configuration set to IPMI, and I commissioned them successfully.
I was able to power off the system from MAAS, but when I try to power it on, it fails with the following error in regiond.log:
maasserver.websockets.handlers.machine: [error] Bulk action (on) for <random> failed: on action is not available for this node.
I have installed maas 3.5.0~alpha1-14430-g.86b188ccd 29199 latest/edge canonical✓ and am still facing the same problem.
The environment is configured as a single rack+region controller.
Adding additional information: after setting up maas 3.5.0~alpha1-14430-g.86b188ccd, the status is as follows:
>> maas status
Service    Startup   Current   Since
apiserver  enabled   active    today at 10:21 UTC
bind9      disabled  active    today at 10:21 UTC
dhcpd      disabled  active    today at 10:21 UTC
dhcpd6     disabled  inactive  -
http       disabled  active    today at 10:21 UTC
ntp        disabled  active    today at 10:21 UTC
proxy      disabled  inactive  -
rackd      enabled   active    today at 10:21 UTC
regiond    enabled   active    today at 10:21 UTC
syslog     disabled  active    today at 10:21 UTC
Also, I can’t find the rackd/regiond logs under /var/snap/maas/common/log; only maas.log is there.
I observed the following logs after switching on the server through its BMC:
2023-08-08 23:11:27 maasserver.regiondservices.active_discovery: [info] Active network discovery: Active scanning is not enabled on any subnet. Skipping periodic scan.
2023-08-08 23:11:41 maasserver.websockets.handlers.machine: [critical] Failed to update power state of machine.
Traceback (most recent call last):
--- <exception caught here> ---
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "/snap/maas/29927/lib/python3.10/site-packages/maasserver/websockets/handlers/machine.py", line 1259, in eb_unknown
failure.trap(UnknownPowerType, NotImplementedError)
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
self.raiseException()
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
raise self.value.with_traceback(self.tb)
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/protocols/amp.py", line 1946, in _massageError
error.trap(RemoteAmpError)
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
self.raiseException()
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
raise self.value.with_traceback(self.tb)
twisted.internet.defer.CancelledError:
2023-08-08 23:12:09 maasserver.websockets.handlers.machine: [critical] Failed to update power state of machine.
Traceback (most recent call last):
--- <exception caught here> ---
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "/snap/maas/29927/lib/python3.10/site-packages/maasserver/websockets/handlers/machine.py", line 1259, in eb_unknown
failure.trap(UnknownPowerType, NotImplementedError)
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
self.raiseException()
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
raise self.value.with_traceback(self.tb)
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/protocols/amp.py", line 1946, in _massageError
error.trap(RemoteAmpError)
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
self.raiseException()
File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
raise self.value.with_traceback(self.tb)
twisted.internet.defer.CancelledError:
It looks like there is an issue with power management for this IPMI-enabled machine in MAAS. Here are some suggestions on how I’d troubleshoot it, with the caveat that I’m just a technical author who uses MAAS a lot, not a MAAS developer:
Check that the IPMI credentials set in MAAS match what you use with ipmi-power. Make sure the username, password, and address are all correct.
Verify that IPMI traffic can flow between the MAAS server and the BMC IP address. Double-check firewall rules and networking.
Inspect rack log entries around the time a power action is attempted for any errors talking to the BMC or IPMI issues.
Try resetting or re-adding the machine in MAAS to force it to redetect the power parameters.
Consider downgrading to an older MAAS version like 2.8 if it worked more reliably there; this could be a regression.
Try toggling IPMI power driver settings like ‘power_off_mode’ or ‘power_reset_mode’, if your MAAS version exposes them, to work around any quirks with that hardware.
As a last resort, see if another IPMI tool like ipmitool can control power. If not, that may indicate a BMC/firmware issue.
If all else fails, file a bug against MAAS with full logs showing the power failure.
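To make the ipmitool check from the list above repeatable, here is a minimal Python sketch that wraps it. Treat it as an illustration, not as anything MAAS itself runs: the function names are mine, and it assumes ipmitool is installed on the MAAS host, that the BMC speaks IPMI 2.0 over LAN (the lanplus interface), and that the host, user, and password you pass are the same ones configured for the machine in MAAS.

```python
import subprocess

# Chassis power subcommands accepted by ipmitool.
VALID_ACTIONS = {"status", "on", "off", "cycle", "reset", "soft"}


def build_ipmitool_cmd(host, user, password, action="status"):
    """Return the argv list for an `ipmitool chassis power <action>` call."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-P", password,
        "chassis", "power", action,
    ]


def parse_power_state(output):
    """Map ipmitool's 'Chassis Power is on/off' line to 'on', 'off' or 'unknown'."""
    text = output.strip().lower()
    if text.endswith("is on"):
        return "on"
    if text.endswith("is off"):
        return "off"
    return "unknown"


def query_power_state(host, user, password, timeout=15):
    """Ask the BMC for its power state; raise if ipmitool reports an error."""
    result = subprocess.run(
        build_ipmitool_cmd(host, user, password, "status"),
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "ipmitool failed")
    return parse_power_state(result.stdout)
```

Calling query_power_state("<BMC address>", "<user>", "<password>") from the MAAS host should return "on" or "off". If it raises or times out while the same credentials work from another machine, that points at networking or firewalling between MAAS and the BMC rather than at MAAS itself.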
Let me know if you discover anything else when trying to power on the node.