Questions about IPMI driver workflow and retry intervals

Hello @r00ta and @troyanov,

Quick question about the IPMI power driver and how it uses ipmipower (from freeipmi-tools). Just want to make sure I have the exact workflow straight:

1. Temporal grabs the task, figures out the right worker group, and routes the power-on action to the Go Agent worker queue for that node’s VLAN.

2. The Go code handles this inside PowerOn(), maps out the driver arguments (swapping underscores for dashes, like --power-address), and kicks off the shell command.

We’re hitting an issue where some of our servers fail to boot after 2 or 3 successful deployments. The logs show the error: “unable to control BMC. Please check credential”.

Because of this, I wanted to ask:

1. How many times does this sequence actually get called?

2. Is there any wait time between retries if a server is slow to spin up?

We’re currently running MAAS 3.5.4.

Are you deploying the machine, or just running simple power operations (power-on, power-off)?

Deploying the machine, so I think it’s the power-cycle option, with --cycle --on-if-off?

There are multiple retries implemented in different places. 3.5 is pretty old, and the logic to power on the machine during deployment changed slightly in 3.6. I’d suggest upgrading to a newer version.