I have a MAAS 3.7.2 deployment with separate region and rack controllers. The rack controller (10.x.y.51) cannot connect to the Temporal service on the region controller (10.x.y.50), causing the maas-agent service to crash in a restart loop.
As a result, httpproxy.sock is never created, which causes 502 errors when machines attempt to PXE boot.
- MAAS Version: 3.7.2-17972-g.35e297c4d (both)
- Temporal Port: 5271 (gRPC frontend)
On rack controller:
maas-agent Temporal connection failures:
maas-agent[46264]: ERR Temporal client error error="failed reaching server: context deadline exceeded while waiting for connections to become ready"
maas.pebble[45464]: Service "agent" stopped unexpectedly with code 1
maas.pebble[45464]: Service "agent" on-failure action is "restart", waiting ~500ms before restart (backoff 1)
PXE boot 502 errors (machine 10.x.z.18 requesting bootx64.efi):
maas-rackd[45499]: provisioningserver.rackdservices.tftp: [info] bootx64.efi requested by 10.x.z.18
maas-http[45608]: [crit] connect() to unix:/run/snap.maas/httpproxy.sock failed (2: No such file or directory)
maas-http[45612]: 127.0.0.1 - - "GET /images/bootx64.efi HTTP/1.1" 502 166
On region controller:
Temporal is running and healthy
$ sudo netstat -tlnp | grep temporal
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN temporal-serv
tcp6 0 0 :::5271 :::* LISTEN temporal-serv
tcp6 0 0 :::5272 :::* LISTEN temporal-serv
tcp6 0 0 :::5273 :::* LISTEN temporal-serv
I tried opening all TCP and UDP ports in the firewall between the region controller and the rack controller, but the problem persisted.
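To separate a network problem from a TLS or configuration problem, plain TCP reachability of the Temporal ports can be checked from the rack controller. The following is a minimal Python sketch, not anything shipped with MAAS: `can_connect` is a hypothetical helper, and `REGION_CONTROLLER_IP` is a placeholder for the region controller's real address. A successful connect only shows that the port is reachable; it says nothing about whether the TLS handshake or client authentication will succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder: substitute the region controller's real address.
    region_host = "REGION_CONTROLLER_IP"
    # Ports taken from the netstat output on the region controller.
    for port in (5271, 5272, 5273):
        state = "reachable" if can_connect(region_host, port) else "unreachable"
        print(f"{region_host}:{port} {state}")
```

If the ports are reachable from the rack controller, the firewall is probably not the cause and the TLS and endpoint questions become the more likely suspects.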
1. The Temporal config shows rpcAddress: "localhost:7233" but the server listens on port 5271. Is the rack agent trying to connect to the wrong endpoint?
2. How does the rack controller learn the Temporal endpoint from the region? Is there a configuration file I should check/edit?
3. The Temporal frontend requires TLS client auth (requireClientAuth: true). Could certificate trust be preventing the connection?
4. What’s the correct way to configure the rack’s agent to connect to the region’s Temporal service?
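For question 3, a TLS handshake attempted from the rack controller could show whether certificate trust is the failing step. Below is a rough sketch using only Python's standard `ssl` module. The function name `probe_tls` is hypothetical, and the CA, certificate, and key paths are arguments the caller must supply; they are not known MAAS file locations. With TLS 1.3, a server's rejection of the client certificate may only surface after the handshake has returned, so an "ok" result here is suggestive rather than conclusive.

```python
import socket
import ssl
from typing import Optional


def probe_tls(host: str, port: int, cafile: Optional[str] = None,
              certfile: Optional[str] = None, keyfile: Optional[str] = None,
              timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and report which stage failed, if any."""
    # Verify the server against cafile if given, else the system trust store.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=cafile)
    if certfile:
        # Present a client certificate, as requireClientAuth would demand.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    ctx.set_alpn_protocols(["h2"])  # gRPC runs over HTTP/2
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return f"tcp-failed: {exc}"
    try:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return f"ok: {tls.version()}"
    except OSError as exc:
        # ssl.SSLError and handshake timeouts are both OSError subclasses.
        return f"tls-failed: {exc}"
    finally:
        sock.close()
```

A result starting with `tcp-failed` points at routing or firewalling, while `tls-failed` with a certificate verification message points at trust between the rack and region controllers.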
More info:
maas --version
MAAS 3.7.2 (3.7.2-17972-g.35e297c4d)
Postgres 16
Any suggestion / guidance is greatly appreciated!