Problem with MaaS/Masakari Charm

Hi,

I’m not sure this is the right place to ask, so I’m posting here and I’ll post the same request on the Juju Discourse.

I have a strange issue with the latest MaaS release and/or the OpenStack charms (probably Masakari, since it is the only one that interacts with MaaS).
Here is the case: I deployed a working OpenStack cluster with Masakari enabled.
Before, when I was testing Masakari, I just initiated a “reboot” on a compute node and the node simply restarted.
Now, when I do the same thing, the node stays stuck powered off (these are physical nodes with the MaaS provider).
When I try to start them through MaaS (so through IPMI), the node starts, then as soon as it tries to PXE boot, it shuts itself down. It doesn’t even send a DHCP/PXE request; it shuts down before the BIOS finishes its initialization phase.

I tried to start those nodes many times through MaaS and they always do the same thing (the nodes are Dell PowerEdge R630s).
The only way to start those nodes is from the iDRAC console.
It might not be a charm-related issue, since there were also new MaaS releases recently, but since the Masakari charm interacts with MaaS, I don’t really know where to route this issue.
So any help is gladly appreciated.

Could the R630 machine itself have a fault?

I deployed OpenStack base on four R610 machines, and one of them can never be controlled by MaaS when I deploy an OS on it.

Hi,

I really don’t think so. Until very recently I never had this issue, and I never had (and still don’t have) any difficulty deploying machines with MaaS. I can’t pinpoint the root cause because there were major charm releases and MaaS updates within the same time frame.
I really think this is related to the interaction between Masakari, IPMI and MaaS, but I don’t know exactly how, or how to debug it.

It is possible that STONITH is killing your node through the MaaS API.
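One way to check that theory is to look at Pacemaker's fencing history and compare it with the power state MaaS reports. Here is a minimal Python sketch that only collects read-only commands and runs whichever ones are installed on the current host. It assumes the cluster is managed by Pacemaker with the `crm` shell and `stonith_admin` available on the Masakari/hacluster unit, and that the `maas` CLI is already logged in somewhere; the function names, the profile name and the system ID are hypothetical placeholders, not values from this thread.

```python
import shutil
import subprocess


def diagnostic_commands(maas_profile, system_id):
    """Return read-only commands that may show who powered the node off.

    maas_profile and system_id are placeholders supplied by the caller.
    """
    return [
        # Pacemaker cluster state, including any STONITH resources
        ["crm", "status"],
        # Full cluster configuration (look for stonith primitives)
        ["crm", "configure", "show"],
        # Fencing history recorded by Pacemaker for all nodes
        ["stonith_admin", "--history", "*"],
        # Power state as MaaS sees it through the BMC
        ["maas", maas_profile, "machine", "query-power-state", system_id],
    ]


def run_all(commands):
    """Run each command if its binary exists on this host, else skip it."""
    for cmd in commands:
        printable = " ".join(cmd)
        if shutil.which(cmd[0]) is None:
            print(f"skipped (not installed here): {printable}")
            continue
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f"$ {printable}")
        print(result.stdout + result.stderr)
```

For example, `run_all(diagnostic_commands("admin", "abc123"))` on the hacluster unit would run the Pacemaker commands and skip the `maas` one if that CLI is not installed there. If the fencing history shows a power-off for the compute node around the time it shuts down, that would point at STONITH rather than at the hardware.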

@Hybrid512, did you ever find the answer to this one?

I don’t really know. This was in our lab and we didn’t have the opportunity to work on it again.
We signed with Canonical and now have a working, supported cluster with a working Masakari setup, but it is pretty fresh and we haven’t had time to battle-test it yet. I’ll get back here when I have more insights.


Cool. I’m going to leave this one open for you.