Monitor the connection status of every rack to every region

Hi guys, I’m using the /metrics endpoint that MAAS provides on regions and racks, and I have all of its metrics in my Grafana.

I’m looking for a metric that confirms RPC communication between racks and regions is healthy. For example, when a rack reports a failure, is the problem on the rack side or the region side? I want to see the region-rack communication so I can resolve cases like this.

I assume you want to know in general whether the rack is “connected” to the region, meaning that it is receiving/sending commands regardless of their execution status (fail or success).
With https://github.com/maas/maas/commit/748793e8eed73916a10e05daece31a003aade6d1 you’ll get service availability in the metrics (including rack “status”). It will be shipped with 3.5.

Otherwise, on every rack, you can monitor maas_rack_region_rpc_call_latency_count{call="Ping"}, which counts a periodic ping sent from the rack to the region every 30 seconds. If the counter stops increasing, treat that as a warning sign.
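
As a rough illustration of the "no increase means warning" check, here is a minimal Python sketch. It assumes you already have two text payloads scraped from a rack's /metrics endpoint some time apart (at least a couple of ping intervals). The helper names `ping_count` and `ping_stalled` are hypothetical, and the sketch assumes the metric is exposed in the standard Prometheus text format, possibly with extra labels besides `call`.

```python
import re

# Metric name taken from the post above; helper names are hypothetical.
METRIC = "maas_rack_region_rpc_call_latency_count"

# Matches a Prometheus text-format sample line: name{labels} value [timestamp]
_SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)


def ping_count(metrics_text):
    """Sum the Ping call counter over all matching samples in one payload.

    Returns None if the payload contains no matching sample.
    """
    total = None
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None or match.group("name") != METRIC:
            continue
        labels = match.group("labels") or ""
        if 'call="Ping"' not in labels:
            continue
        total = (total or 0.0) + float(match.group("value"))
    return total


def ping_stalled(previous, current):
    """Return True when the counter did not increase between two scrapes.

    A missing sample is treated as stalled. A counter that went down
    (for example after a rack restart) is also flagged, which is a
    deliberately conservative choice in this sketch.
    """
    if previous is None or current is None:
        return True
    return current <= previous
```

If the metrics are already in Prometheus, the same idea can probably be expressed with the `increase()` function over a window of a few minutes on that counter instead of custom code.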

With this metric I cannot identify which region my rack is talking to, so I can’t tell where the communication failure occurs. For example, I have 3 regions and 3 racks, and one of my racks failed to communicate with a region, but which region?

AFAIK this is not available at the moment. I’m marking this as a community feature request.
