Question about native TLS and rack controllers

Hi all,

I am attempting to extend our current MAAS setup with dedicated rack controllers.
The region controller is running v3.4 with native TLS enabled on port 443 and a self signed certificate.

On the new rack controller servers I have run the following command:
sudo maas init rack --maas-url https://lab-maas01/MAAS --secret !!!

They are however unable to connect as they are unable to verify the certificate of the region controller server. Is there documentation somewhere on how to add the certificate to a rack controller running in snap?

I have imported the certificate into the machine running the rack controller and am able to login to the MAAS server api and access the API.

Thank you for any assistance you may be able to provide.

1 Like

Some updates here:

I have verified that the snap running the rack controller can read /etc/ssl/certs/ca-certificates.crt which contains the certificate used by the region controller.
But for some reason the python program doesn’t use it with attempting to connect.

Here is also the message that is spamming the log /var/snap/maas/common/log/rackd.log

Blockquote
2024-03-14 06:48:28 provisioningserver.rpc.clusterservice: [critical] Failed to contact region. (While requesting RPC info at https://lab-maas01/MAAS).
/!-- logs truncated --!/
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [(‘STORE routines’, ‘’, ‘unregistered scheme’), (‘system library’, ‘’, ‘’), (‘STORE routines’, ‘’, ‘unregistered scheme’), (‘system library’, ‘’, ‘’), (‘SSL routines’, ‘’, ‘certificate verify failed’)]>]

1 Like

I am getting very similar errors.
It’s so wonderful to see you haven’t received any replies or support in 2 weeks.

I’ve been beating my head against this all day and couldn’t be more frustrated.
When the rack controller won’t connect - nothing else works, and it’s the last thing any new user thinks to investigate. I’ve spent days troubleshooting in all the wrong places.

1 Like

so far… looks to me like it might be a python related errror:

2024-06-07 21:50:10 provisioningserver.rpc.clusterservice: [critical] Failed to contact region. (While requesting RPC info at https://10.0.2.3:5240/MAAS).
errors[0].raiseException()
File “/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py”, line 475, in raiseException
File “/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py”, line 489, in throwExceptionIntoGenerator
errors[0].raiseException()
File “/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py”, line 475, in raiseException
File “/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py”, line 489, in throwExceptionIntoGenerator
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [(‘SSL routines’, ‘’, ‘certificate verify failed’)]>]

1 Like

Hi @kamag, thanks for bringing this up and sorry for the delay! Do you see your self-signed certificate listed in /etc/ssl/certs as a symbolic link on the rack machine? You may see something like below if you ran sudo update-ca-certificates:

ls -l /etc/ssl/certs
...
f081611a.0 -> region_self_signed.pem
region_self_signed.pem -> /usr/local/share/ca-certificates/region_self_signed.crt

If you see your certificate here, can you replace the second symbolic link (the one pointing to somewhere within /usr) with the actual file, then try initializing the rack again? You may need to keep the same filename, however:

sudo cp /usr/local/share/ca-certificates/region_self_signed.crt /etc/ssl/certs/region_self_signed.pem

You may be having this issue because the snap cannot read the endpoint of that symbolic link

For me, the issue was related to multicast net reservations.
I was trying to install MAAS on the same system running Portainer with Docker Swarm (foolish I know). But everything looked like it was working, the web UI and Region Controller came up. There was just no indication why it wasn’t both “Region + Rack controller” - and for a new user, this left me horribly confounded.

Confronted with a relatively new lexicon & not knowing exactly what to expect had me piling through the docs & feeling the fury of inadequate definitions and explanations. I went ahead and and got everything setup on a new system, and then crawled one layer deeper into “Commissioning and Deploying” machines, only to find the docs get even weaker and more cryptic. It is a far cry from the promise of “Easily PXE boot & customize new systems”.

I’m about to throw in my hat on MAAS. It is a beautiful interface with great promise, but just not ready for the maasses (punIntended).

I’m glad you figured out the issue, but sorry about your frustrations with the documentation. They are constantly being improved, so if you would like to raise any specific concerns regarding it, please consider filing a bug against it

I’d rather not, i’ve already burned too much time with this. I have about 300 pages of error logs and troubleshooting attempts in ChatGPT.

The short story is; if MAAS has multicast conflicts - it should say so. It doesn’t, it just appears to be working (Region Controller only), and when one tries to troubleshoot why the Rack Controller isn’t also showing up - it’s not clear at all. Python is throwing errors about invalid certificates & the error log is chock full of stuff that’s irrelevant to a simple network conflict.

Regarding commissioning and deploying new machines - again, what a nightmare. First of all, you should put some big fat red warnings on the Commissioning options that will wipe an active partition. Secondly, the “cloud init” option to “paste your own script” should come with a warning, “Be prepared to learn a whole new language”. The docs for this section are little more than pointers for those that are already familiar with this territory.

I read somewhere it was possible to initiate an ‘autoinstall.yaml’ there, tried a dozen times - no joy. I have another forum post about it somewhere…

Alas, the time I’d hoped MAAS would save me wound up costing me far more than I expected. I put my MAAS machine on the shelf and ‘manually’ spun up 20 machines in a day (and didn’t wipe out any Windows partitions while doing it). I can live with that.

How can I take this statement seriously without a reference?