Experiencing MAAS Power Control Issues with UCS C225 M8S Servers

Hi team,

Is anyone here using MAAS to manage Cisco UCS C225 M8S servers? We’ve got a couple of these machines, and I’m running into trouble getting MAAS to control their power.

I’ve manually added the servers to MAAS, but every time I try to “Check Power,” I get a “Power error.” I’ve already tested both the IPMI and Redfish power drivers, but neither seems to work.

Has anyone successfully integrated these specific UCS models with MAAS, and if so, what power configuration did you use?

Thanks!

I got it working with Redfish. For UCS C225 M8 machines, the user needs to be explicitly enabled to work with CIMC.

IPMI still fails, though, with the following error in the logs:

May 28 20:05:49 hostname regiond[48698]: maasserver.websockets.handlers.machine: [critical] Failed to update power state of machine.
May 28 20:05:49 hostname regiond[48698]: #011Traceback (most recent call last):
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/internet/asyncioreactor.py", line 271, in _onTimer
May 28 20:05:49 hostname regiond[48698]: #011    self.runUntilCurrent()
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/internet/base.py", line 991, in runUntilCurrent
May 28 20:05:49 hostname regiond[48698]: #011    call.func(*call.args, **call.kw)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 700, in errback
May 28 20:05:49 hostname regiond[48698]: #011    self._startRunCallbacks(fail)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 763, in _startRunCallbacks
May 28 20:05:49 hostname regiond[48698]: #011    self._runCallbacks()
May 28 20:05:49 hostname regiond[48698]: #011--- <exception caught here> ---
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
May 28 20:05:49 hostname regiond[48698]: #011    current.result = callback(  # type: ignore[misc]
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/maasserver/websockets/handlers/machine.py", line 1256, in eb_unknown
May 28 20:05:49 hostname regiond[48698]: #011    failure.trap(UnknownPowerType, NotImplementedError)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
May 28 20:05:49 hostname regiond[48698]: #011    self.raiseException()
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
May 28 20:05:49 hostname regiond[48698]: #011    raise self.value.with_traceback(self.tb)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 244, in inContext
May 28 20:05:49 hostname regiond[48698]: #011    result = inContext.theWork()  # type: ignore[attr-defined]
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 260, in <lambda>
May 28 20:05:49 hostname regiond[48698]: #011    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 117, in callWithContext
May 28 20:05:49 hostname regiond[48698]: #011    return self.currentContext().callWithContext(ctx, func, *args, **kw)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 82, in callWithContext
May 28 20:05:49 hostname regiond[48698]: #011    return func(*args, **kw)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 856, in callInContext
May 28 20:05:49 hostname regiond[48698]: #011    return func(*args, **kwargs)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 203, in wrapper
May 28 20:05:49 hostname regiond[48698]: #011    result = func(*args, **kwargs)
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/maasserver/workflow/power.py", line 332, in convert_power_action_to_power_workflow
May 28 20:05:49 hostname regiond[48698]: #011    task_queue=get_temporal_task_queue_for_bmc(machine),
May 28 20:05:49 hostname regiond[48698]: #011  File "/usr/lib/python3/dist-packages/maasserver/workflow/power.py", line 275, in get_temporal_task_queue_for_bmc
May 28 20:05:49 hostname regiond[48698]: #011    raise UnroutablePowerWorkflowException(
May 28 20:05:49 hostname regiond[48698]: #011maasserver.workflow.power.UnroutablePowerWorkflowException: Error determining BMC task queue for machine xfx8t4
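
The `#011` sequences in the log above are how rsyslog escapes tab characters (octal 011) by default. A minimal sketch for making such a traceback easier to read, assuming the lines arrive on stdin in the same format as above; the function name `clean_regiond_log` is made up for illustration:

```shell
#!/bin/sh
# clean_regiond_log: strip the syslog prefix up to "regiond[PID]: " and
# turn each literal "#011" escape back into a real tab character.
clean_regiond_log() {
    tab=$(printf '\t')
    sed -e 's/^.*regiond\[[0-9]*]: //' -e "s/#011/${tab}/g"
}
```

For example, `grep regiond /var/log/syslog | clean_regiond_log` would print the traceback with normal indentation (the log path is an assumption and depends on how logging is set up on the host).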

Ours is an all-in-one setup of MAAS 3.5.6 installed via apt packages, so connectivity between the rack and regiond (as mentioned here) is not a problem.

Is this server’s management port on the same L2 of the rack?

It is not, but routing is set up. I am only facing issues with the UCS C225 M8S machines. The same regiond manages hundreds of UCS C240 M5S machines (with IPMI as the power driver) without any issues.

To add, I can control the machine using the ipmitool command. A tcpdump on the regiond host also shows traffic flowing to and from the BMC on UDP port 623. To me it doesn’t look like a network issue; instead, I suspect the way MAAS handles the task.
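
A manual check like the one described might look roughly like the sketch below. The BMC address, the user name, and the choice of the `lanplus` interface are placeholder assumptions, not values from this thread. The `run` helper is only there so the commands can be printed without being executed when `DRY_RUN` is set.

```shell
#!/bin/sh
# Placeholder values (assumptions, not taken from the thread).
BMC_HOST="${BMC_HOST:-192.0.2.10}"
BMC_USER="${BMC_USER:-maas}"

# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Query the chassis power state over IPMI-over-LAN (UDP port 623).
# -E makes ipmitool read the password from the IPMI_PASSWORD variable.
ipmi_power_status() {
    run ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E chassis power status
}

# Watch IPMI traffic to and from the BMC while MAAS checks power.
watch_ipmi_traffic() {
    run sudo tcpdump -ni any host "$BMC_HOST" and udp port 623
}
```

Running `ipmi_power_status` in one terminal and `watch_ipmi_traffic` in another, then repeating the power check from MAAS, makes it possible to compare what reaches the BMC in each case.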

Something strange I noticed on this machine: for Redfish to work, I also had to enable IPMI. Is there any dependency between the Redfish and IPMI power drivers?
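
One way to check the Redfish side on its own, outside MAAS, is to query the BMC directly and read the `PowerState` property that the Redfish ComputerSystem resource exposes. This is a rough sketch under assumptions: the address and user name are placeholders, the password is expected in a `BMC_PASS` environment variable, and the helper names are made up for illustration.

```shell
#!/bin/sh
# Placeholder connection details (assumptions, not from the thread).
BMC_HOST="${BMC_HOST:-192.0.2.10}"
BMC_USER="${BMC_USER:-maas}"

# redfish_get PATH: fetch a Redfish resource as JSON.
# -k skips TLS verification, which self-signed BMC certificates often need.
# The password is read from the BMC_PASS environment variable.
redfish_get() {
    curl -sk -u "${BMC_USER}:${BMC_PASS}" "https://${BMC_HOST}$1"
}

# extract_power_state: print the first "PowerState" value found on stdin.
# This is a crude text match for a quick check, not a real JSON parser.
extract_power_state() {
    sed -n 's/.*"PowerState"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

`redfish_get /redfish/v1/Systems` lists the system members; piping one member, as in `redfish_get /redfish/v1/Systems/<id> | extract_power_state`, should print a value such as `On` or `Off`. Repeating this with IPMI enabled and disabled on the BMC would show whether the Redfish service itself is affected by that setting.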

If you run ipmitool from the rack can you talk to the BMC?

Yes I can.

Any chance you could share an sos report?

sudo snap install --classic sosreport

and

sudo sos report -o maas --all-logs

Please note that the archive might contain confidential information from your database. If that is a concern, you could create a new, simple environment to reproduce the issue and take the sos report from there.

Sure thing, I will get back with the logs.

@r00ta, does MAAS have a recommended preference between Redfish and IPMI for BMC communication?

The bmc_config script appears to prioritise Redfish over IPMI. Does this mean there is a preference for Redfish?

Thanks!

Does MAAS have a recommended preference between Redfish and IPMI for BMC communication?

Both are supposed to work fine in MAAS. However, given that Redfish is generally better than IPMI, it makes sense to prefer it.