Hi all,
I am building my new homelab with MAAS after running a successful homelab on ARM64-based boards.
For this one I have Dell Optiplex 5070 Small Form Factor nodes with Intel AMT enabled on them.
I am trying the latest MAAS 3.6 version from Snap and having issues with deployment.
When I try to deploy, the following happens:
- Machine is powered on
- The machine starts booting over PXE
- At some random time (between pxe boot and installation complete) the machine will reboot.
- The machine will start installation again over PXE.
I was going crazy trying to figure out whether the cause was:
- Kernel crashing
- BIOS Watchdog Timer
- something else entirely.
I decided to try powering the node myself using the “meshcmd” tool from MeshCommander, with the MAAS power control set to Manual. This made things work as expected, with no unexpected reboots between deployment start and finish:
meshcmd AmtPower --host 192.168.20.101 --pass MyPAAS!123 --poweron --bootdevice pxe
To narrow down the issue, I set up a test instance of maaspower (maaspower 1.0.1.dev2+g5923ff0) in front of the node and set the MAAS power configuration to webhook.
The maaspower config file used:
name: my maas power control webhooks
ip_address: 0.0.0.0
port: 5000
username: maas_user
password: maas_pass
devices:
  - type: CommandLine
    name: '192.168.\d{1,3}.\d{1,3}'
    on: 'meshcmd AmtPower --host \g<0> --pass MyPAAS!123 --poweron --bootdevice pxe'
    off: 'meshcmd AmtPower --host \g<0> --pass MyPAAS!123 --poweroff'
    query: 'meshcmd AmtPower --host \g<0> --pass MyPAAS!123'
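For anyone unfamiliar with the `\g<0>` placeholder in the config: it stands for the whole text matched by the `name` regex, so one device entry covers every node in the subnet. Below is a minimal sketch of that substitution using Python's `re` module. The `build_command` helper is hypothetical and only illustrates the idea; I have not checked how maaspower implements it internally.

```python
import re
from typing import Optional

# Pattern and template copied from the config above.
NAME_PATTERN = r'192.168.\d{1,3}.\d{1,3}'
ON_TEMPLATE = r'meshcmd AmtPower --host \g<0> --pass MyPAAS!123 --poweron --bootdevice pxe'


def build_command(device: str, pattern: str, template: str) -> Optional[str]:
    """Hypothetical helper: expand `template` for `device`, or return None
    if the device name does not match `pattern`."""
    match = re.fullmatch(pattern, device)
    if match is None:
        return None
    # \g<0> in the template is replaced by the entire matched text.
    return match.expand(template)


on_command = build_command("192.168.20.101", NAME_PATTERN, ON_TEMPLATE)
print(on_command)
```

With the device name taken from the webhook URL (for example `/maaspower/192.168.20.101/on`), this produces the same `meshcmd` line that shows up in the maaspower logs.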
This also worked as expected, with no reboots during deployment. While watching the maaspower logs, I saw that MAAS actually tries to power on the node more than once when the deployment starts:
device: 192.168.20.101 command: query
EXECUTE command line: meshcmd AmtPower --host 192.168.20.101 --pass MyPAAS!123
Current power state: Soft off
response: status : stopped
192.168.20.2 - - [14/Jun/2025 10:40:12] "GET /maaspower/192.168.20.101/query HTTP/1.1" 200 -
device: 192.168.20.101 command: on
EXECUTE command line: meshcmd AmtPower --host 192.168.20.101 --pass MyPAAS!123 --poweron --bootdevice pxe
SUCCESS
response: None
192.168.20.2 - - [14/Jun/2025 10:40:25] "POST /maaspower/192.168.20.101/on HTTP/1.1" 200 -
device: 192.168.20.101 command: query
EXECUTE command line: meshcmd AmtPower --host 192.168.20.101 --pass MyPAAS!123
Current power state: Power on
response: status : running
192.168.20.2 - - [14/Jun/2025 10:40:46] "GET /maaspower/192.168.20.101/query HTTP/1.1" 200 -
device: 192.168.20.101 command: on
EXECUTE command line: meshcmd AmtPower --host 192.168.20.101 --pass MyPAAS!123 --poweron --bootdevice pxe
SUCCESS
response: None
192.168.20.2 - - [14/Jun/2025 10:40:56] "POST /maaspower/192.168.20.101/on HTTP/1.1" 200 -
device: 192.168.20.101 command: query
EXECUTE command line: meshcmd AmtPower --host 192.168.20.101 --pass MyPAAS!123
Current power state: Power on
response: status : running
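To make the duplicate request easier to spot in a longer log, the commands can be counted per device. This is a rough sketch that assumes the `device: <ip> command: <name>` line format seen above; the `count_commands` function and the trimmed `LOG_EXCERPT` are just for illustration.

```python
import re
from collections import Counter

# Only the "device: ... command: ..." lines from the log above.
LOG_EXCERPT = """\
device: 192.168.20.101 command: query
device: 192.168.20.101 command: on
device: 192.168.20.101 command: query
device: 192.168.20.101 command: on
device: 192.168.20.101 command: query
"""

LINE_RE = re.compile(r"^device: (\S+) command: (\w+)$")


def count_commands(log_text):
    """Count how often each (device, command) pair appears in the log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


counts = count_commands(LOG_EXCERPT)
print(counts[("192.168.20.101", "on")])  # prints 2
```

Two `on` commands for a single deployment is the part that looks wrong.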
This isn’t right, but I still wasn’t sure why deployment works through meshcmd while the machine keeps restarting when MAAS drives AMT directly.
Looking at the MAAS power driver for AMT, things made sense:
if (
    self.wsman_query_state(
        ip_address, power_user, power_pass, port
    )
    == "on"
):
    self.wsman_power_on(
        ip_address, power_user, power_pass, port, restart=True
    )
else:
    self.wsman_power_on(ip_address, power_user, power_pass, port)
The AMT power driver checks whether the machine is already powered on and, if it is, tells it to restart instead.
Combine the two issues and you get a machine that keeps restarting during deployment:
- MAAS tries to power on the machine twice when a deployment is started.
- The AMT power driver restarts the machine when MAAS asks it to power on and the machine is already powered on.
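The interaction can be reproduced with a small simulation. This is only a sketch: `FakeAmtNode` and `driver_power_on` are made-up names that imitate the branch quoted from the driver, not MAAS code.

```python
class FakeAmtNode:
    """Hypothetical stand-in for an AMT-managed machine."""

    def __init__(self):
        self.state = "off"
        self.restarts = 0

    def query_state(self):
        return self.state

    def power_on(self, restart=False):
        # A restart request resets a machine that is already running.
        if restart:
            self.restarts += 1
        self.state = "on"


def driver_power_on(node):
    """Imitates the driver branch: restart if already on, else power on."""
    if node.query_state() == "on":
        node.power_on(restart=True)
    else:
        node.power_on()


node = FakeAmtNode()
driver_power_on(node)  # first request: machine is off, so it boots
driver_power_on(node)  # second request: machine is on, so it is reset
print(node.state, node.restarts)  # prints: on 1
```

The second power-on request is what resets the machine in the middle of the PXE boot or installation. With meshcmd, a repeated `--poweron` did not cause a reboot in my tests, which is why the webhook setup hides the problem.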
Looking at other power drivers, it seems the expectation is that a power driver restarts the machine if it’s already powered on, so I take it that the issue is MAAS trying to power on the machine twice.
Please let me know if I am doing something incorrect.
Thanks