MAAS - Trouble powering on a machine

I am currently using the following version of MAAS, installed via snap and configured as a single rack+region controller.

maas 3.3.4-13189-g.f88272d1e 28521 3.3/stable canonical✓ -

I have added two machines with the power type set to IPMI and commissioned them successfully.
I can power off a machine from MAAS, but when I try to power it on, the action fails with the following error in regiond.log:

maasserver.websockets.handlers.machine: [error] Bulk action (on) for <random>  failed: on action is not available for this node.

I am looking for guidance on how to resolve this.

I run a similar configuration on MAAS 2.8, and in that environment I don’t see any power on/power off problems.

Hello @shan2234

Can you please check rackd.log for any other errors?

MAAS relies on the ipmi-power utility from the freeipmi-tools package for the IPMI power driver. Are you able to power on your machine using ipmi-power directly?
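
A manual check along those lines might look like the sketch below. The host, user, and password are placeholders (assumptions, not values from this thread), and the optional IPMI_DRIVER variable is only there in case the BMC needs a specific driver type such as LAN_2_0. The script prints the commands by default and only contacts the BMC when DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch only: BMC_HOST, BMC_USER and BMC_PASS are placeholder values,
# not taken from this thread. Set them in the environment before a real run.
BMC_HOST="${BMC_HOST:-192.0.2.10}"
BMC_USER="${BMC_USER:-ADMIN}"
BMC_PASS="${BMC_PASS:-changeme}"

# Print the ipmi-power command line for one action: stat, on, or off.
# If IPMI_DRIVER is set (for example LAN_2_0), pass it with -D.
ipmi_power_cmd() {
    local action="$1"
    case "$action" in
        stat|on|off) ;;
        *) echo "unsupported action: $action" >&2; return 1 ;;
    esac
    local cmd="ipmi-power -h $BMC_HOST -u $BMC_USER -p $BMC_PASS"
    if [ -n "${IPMI_DRIVER:-}" ]; then
        cmd="$cmd -D $IPMI_DRIVER"
    fi
    echo "$cmd --$action"
}

# Dry run by default: show what would be executed.
# Set DRY_RUN=0 to actually contact the BMC (values must not contain spaces).
run_ipmi_power() {
    local cmd
    cmd="$(ipmi_power_cmd "$1")" || return 1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

run_ipmi_power stat   # query the current power state
run_ipmi_power on     # request power on
```

If the manual commands succeed while MAAS still fails, the problem is more likely in how MAAS stores or uses the power parameters than in the BMC itself.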

@troyanov sorry for the late reply. Yes, I am able to switch the machine on and off with the ipmi-power command from freeipmi-tools.

I have since installed maas 3.5.0~alpha1-14430-g.86b188ccd 29199 latest/edge canonical✓ and still face the same problem.
The environment is again configured as a single rack+region controller.

However, ipmi-power -h <IP> -u ADMIN -p <pass> -n works fine.

Some additional information: after setting up maas 3.5.0~alpha1-14430-g.86b188ccd, the service status is as follows.

 >> maas status
Service    Startup   Current   Since
apiserver  enabled   active    today at 10:21 UTC
bind9      disabled  active    today at 10:21 UTC
dhcpd      disabled  active    today at 10:21 UTC
dhcpd6     disabled  inactive  -
http       disabled  active    today at 10:21 UTC
ntp        disabled  active    today at 10:21 UTC
proxy      disabled  inactive  -
rackd      enabled   active    today at 10:21 UTC
regiond    enabled   active    today at 10:21 UTC
syslog     disabled  active    today at 10:21 UTC

Also, I can’t find the rackd/regiond logs under /var/snap/maas/common/log; only maas.log is there.

I would appreciate any input on resolving this issue.

In MAAS 3.5 (which is the current edge) we are moving to Pebble. That migration changed how logs can be accessed.

Can you please try journalctl -feu snap.maas.pebble.service -g "\[rackd\]"

Or -g '\[(regiond|rackd)\]' for both region and rack.
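
To narrow that output down to power-related entries, something like the following sketch could help. It assumes the same snap.maas.pebble.service unit; the 15-minute window and the power/ipmi keyword filter are arbitrary choices for illustration, not anything MAAS defines.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the snap.maas.pebble.service unit named above.

# Keep only lines that mention power handling or IPMI (case-insensitive).
filter_power_lines() {
    grep -iE 'power|ipmi'
}

# Print region and rack log lines from the last 15 minutes that relate to power.
recent_power_log() {
    journalctl -u snap.maas.pebble.service --since "15 minutes ago" --no-pager \
        -g '\[(regiond|rackd)\]' | filter_power_lines
}

# Only query the journal when explicitly asked to, e.g. ./power-log.sh --live
if [ "${1:-}" = "--live" ]; then
    recent_power_log
fi
```

Running it right after a failed power action keeps the output short enough to paste into a reply.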

Thanks for sharing the log-checking commands.
When I try to switch on a server from MAAS, the following error is displayed:

maasserver.websockets.handlers.machine: [error] Bulk action (on) for 7fkewq failed: on action is not available for this node.

@troyanov,

I observed the following logs after switching on the server through its BMC.

 2023-08-08 23:11:27 maasserver.regiondservices.active_discovery: [info] Active network discovery: Active scanning is not enabled on any subnet. Skipping periodic scan.
 2023-08-08 23:11:41 maasserver.websockets.handlers.machine: [critical] Failed to update power state of machine.
         Traceback (most recent call last):
         --- <exception caught here> ---
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
             current.result = callback(  # type: ignore[misc]
           File "/snap/maas/29927/lib/python3.10/site-packages/maasserver/websockets/handlers/machine.py", line 1259, in eb_unknown
             failure.trap(UnknownPowerType, NotImplementedError)
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
             self.raiseException()
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
             raise self.value.with_traceback(self.tb)
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
             current.result = callback(  # type: ignore[misc]
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/protocols/amp.py", line 1946, in _massageError
             error.trap(RemoteAmpError)
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
             self.raiseException()
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
             raise self.value.with_traceback(self.tb)
         twisted.internet.defer.CancelledError:

 2023-08-08 23:12:09 maasserver.websockets.handlers.machine: [critical] Failed to update power state of machine.
         Traceback (most recent call last):
         --- <exception caught here> ---
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
             current.result = callback(  # type: ignore[misc]
           File "/snap/maas/29927/lib/python3.10/site-packages/maasserver/websockets/handlers/machine.py", line 1259, in eb_unknown
             failure.trap(UnknownPowerType, NotImplementedError)
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
             self.raiseException()
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
             raise self.value.with_traceback(self.tb)
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
             current.result = callback(  # type: ignore[misc]
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/protocols/amp.py", line 1946, in _massageError
             error.trap(RemoteAmpError)
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
             self.raiseException()
           File "/snap/maas/29927/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
             raise self.value.with_traceback(self.tb)
         twisted.internet.defer.CancelledError:

@shan2234,

It looks like there is an issue with power management for this IPMI-enabled machine in MAAS. Here are some suggestions on how I’d troubleshoot it, with the caveat that I’m just a technical author who uses MAAS a lot, not a MAAS developer:

  • Check the IPMI credentials set in MAAS match what you use with ipmi-power. Make sure the username, password, and address are all correct.
  • Verify IPMI traffic can flow between the MAAS server and the BMC IP address. Double check firewall rules and networking.
  • Inspect rack log entries around the time a power action is attempted for any errors talking to the BMC or IPMI issues.
  • Try resetting or re-adding the machine in MAAS to force it to redetect the power parameters.
  • Consider downgrading to an older MAAS version like 2.8 if it worked more reliably there. Could be a regression.
  • Try toggling IPMI power driver settings like ‘power_off_mode’ or ‘power_reset_mode’ to work around any quirks with that hardware.
  • As a last resort, see if another IPMI tool like ipmitool can control power. If not, that may indicate a BMC/firmware issue.
  • If all else fails, file a bug against MAAS with full logs showing the power failure.
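
The first few checks in that list could be run from the MAAS CLI roughly as sketched here. PROFILE is a hypothetical CLI profile name, and SYSTEM_ID defaults to the machine ID that appears in the error message earlier in the thread. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: PROFILE is a hypothetical MAAS CLI profile name; SYSTEM_ID
# defaults to the machine ID seen in the error message in this thread.
PROFILE="${PROFILE:-admin}"
SYSTEM_ID="${SYSTEM_ID:-7fkewq}"

# Print the MAAS CLI commands for a basic power check, in order.
maas_power_checks() {
    # 1. Show the power parameters MAAS has stored for the machine.
    echo "maas $PROFILE machine power-parameters $SYSTEM_ID"
    # 2. Ask MAAS to query the BMC for the current power state.
    echo "maas $PROFILE machine query-power-state $SYSTEM_ID"
    # 3. Request power on through the API instead of the UI.
    echo "maas $PROFILE machine power-on $SYSTEM_ID"
}

# Dry run by default: print the commands. Set DRY_RUN=0 to execute them,
# stopping at the first one that fails.
if [ "${DRY_RUN:-1}" = "1" ]; then
    maas_power_checks
else
    maas_power_checks | while IFS= read -r cmd; do
        $cmd || break
    done
fi
```

Comparing the stored power parameters against the values that work with ipmi-power is usually the quickest way to spot a credential or address mismatch.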

Let me know if you discover anything else when trying to power on the node.

Thanks for your reply @billwear,

  • Sure, I will downgrade the environment to 2.8 and post the results soon.
  • As mentioned earlier, I was able to power the systems on and off remotely through ipmitool.

I reinstalled MAAS 3.3.4, and the power issue is now resolved.
