I am attempting to extend our current MAAS setup with dedicated rack controllers.
The region controller is running v3.4 with native TLS enabled on port 443 and a self-signed certificate.
On the new rack controller servers I have run the following command:
sudo maas init rack --maas-url https://lab-maas01/MAAS --secret !!!
However, they cannot connect because they are unable to verify the region controller's certificate. Is there documentation somewhere on how to add the certificate to a rack controller running from the snap?
I have imported the certificate into the machine running the rack controller, and from that machine I am able to log in to the MAAS server and access the API.
Thank you for any assistance you may be able to provide.
I have verified that the snap running the rack controller can read /etc/ssl/certs/ca-certificates.crt which contains the certificate used by the region controller.
But for some reason the Python program doesn't use it when attempting to connect.
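For anyone wanting to repeat the check described above, here is a minimal sketch that tests whether a given certificate is actually present in a CA bundle. It only compares the decoded certificate bytes; it does not validate the chain. The function names and the region_self_signed.crt path in the usage note are assumptions for illustration, not anything MAAS provides.

```python
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def split_pem_bundle(text):
    """Return every PEM certificate block found in text, in order."""
    certs = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == PEM_HEADER:
            current = [line]
        elif line == PEM_FOOTER and current is not None:
            current.append(line)
            certs.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(line)
    return certs


def bundle_contains(bundle_text, cert_pem):
    """True if the first certificate in cert_pem also appears in bundle_text.

    Certificates are compared by their DER bytes, so differences in
    line wrapping or surrounding comments do not matter.
    """
    target = ssl.PEM_cert_to_DER_cert(split_pem_bundle(cert_pem)[0])
    return any(
        ssl.PEM_cert_to_DER_cert(block) == target
        for block in split_pem_bundle(bundle_text)
    )
```

Usage would look something like `bundle_contains(open("/etc/ssl/certs/ca-certificates.crt").read(), open("/usr/local/share/ca-certificates/region_self_signed.crt").read())`. Note that this runs with the system Python, so a True result only shows the bundle on the host is correct, not that the process inside the snap is using it.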
Here is also the message that is spamming the log /var/snap/maas/common/log/rackd.log
I am getting very similar errors.
It's so wonderful to see you haven't received any replies or support in 2 weeks.
I've been beating my head against this all day and couldn't be more frustrated.
When the rack controller won't connect - nothing else works, and it's the last thing any new user thinks to investigate. I've spent days troubleshooting in all the wrong places.
So far... looks to me like it might be a Python-related error:
2024-06-07 21:50:10 provisioningserver.rpc.clusterservice: [critical] Failed to contact region. (While requesting RPC info at https://10.0.2.3:5240/MAAS).
errors[0].raiseException()
File "/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
File "/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py", line 489, in throwExceptionIntoGenerator
errors[0].raiseException()
File "/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
File "/snap/maas/35359/usr/lib/python3/dist-packages/twisted/python/failure.py", line 489, in throwExceptionIntoGenerator
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'certificate verify failed')]>]
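To separate "the certificate is not trusted" from "the region is not reachable", a quick probe from the rack machine can help. This is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it is not how rackd itself connects (rackd uses Twisted and pyOpenSSL inside the snap), so treat the result as a hint rather than proof.

```python
import socket
import ssl


def build_context(cafile=None):
    """Client-side TLS context that verifies the certificate and hostname.

    With cafile=None the system default trust store is used.
    """
    return ssl.create_default_context(cafile=cafile)


def probe(host, port, cafile=None, timeout=5.0):
    """Attempt a verified TLS handshake and describe the outcome as a string."""
    ctx = build_context(cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain or the hostname does not verify.
        return f"verify failed: {exc.verify_message}"
    except OSError as exc:
        # Refused connections, timeouts, DNS failures, other TLS errors.
        return f"connection error: {exc}"
```

For example, `probe("lab-maas01", 443, cafile="/etc/ssl/certs/ca-certificates.crt")` uses the hostname and bundle mentioned earlier in this thread. Keep in mind that a certificate issued for a hostname will generally not verify when the region is addressed by IP, as in the URL shown in the log above, unless the certificate lists that IP as a subject alternative name.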
Hi @kamag, thanks for bringing this up and sorry for the delay! Do you see your self-signed certificate listed in /etc/ssl/certs as a symbolic link on the rack machine? You may see something like below if you ran sudo update-ca-certificates:
ls -l /etc/ssl/certs
...
f081611a.0 -> region_self_signed.pem
region_self_signed.pem -> /usr/local/share/ca-certificates/region_self_signed.crt
If you see your certificate here, can you replace the second symbolic link (the one pointing to somewhere within /usr) with the actual file, then try initializing the rack again? You may need to keep the same filename, however:
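The exact command from the original reply is not preserved here. As a stand-in, this is a minimal sketch of the replacement described: it swaps a symbolic link for a regular-file copy of whatever the link resolves to, keeping the same filename. The function name is hypothetical, and the assumption behind the advice (that the rack snap cannot follow the link out to /usr/local) is not confirmed anywhere in this thread.

```python
import os
import shutil


def materialize_symlink(link_path):
    """Replace a symlink with a regular-file copy of its target.

    Returns True if a link was replaced, False if link_path was not a
    symlink (for example, it is already a regular file).
    """
    if not os.path.islink(link_path):
        return False
    target = os.path.realpath(link_path)  # fully resolve the link chain
    tmp_path = link_path + ".tmp"
    shutil.copyfile(target, tmp_path)     # copy the certificate contents
    os.replace(tmp_path, link_path)       # swap the copy in under the same name
    return True
```

With the listing above, that would be `materialize_symlink("/etc/ssl/certs/region_self_signed.pem")`, run as root. The hashed link (f081611a.0 in the example) points at that filename, so it keeps working. Running sudo update-ca-certificates again later may recreate the symlink.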
For me, the issue was related to multicast net reservations.
I was trying to install MAAS on the same system running Portainer with Docker Swarm (foolish, I know). But everything looked like it was working: the web UI and Region Controller came up. There was just no indication why it wasn't both "Region + Rack controller" - and for a new user, this left me horribly confounded.
Confronted with a relatively new lexicon and not knowing exactly what to expect, I found myself piling through the docs and feeling the fury of inadequate definitions and explanations. I went ahead and got everything set up on a new system, and then crawled one layer deeper into "Commissioning and Deploying" machines, only to find the docs get even weaker and more cryptic. It is a far cry from the promise of "Easily PXE boot & customize new systems".
I'm about to throw in my hat on MAAS. It is a beautiful interface with great promise, but just not ready for the maasses (punIntended).
I'm glad you figured out the issue, but sorry about your frustrations with the documentation. The docs are constantly being improved, so if you would like to raise any specific concerns about them, please consider filing a bug.
I'd rather not, I've already burned too much time with this. I have about 300 pages of error logs and troubleshooting attempts in ChatGPT.
The short story is: if MAAS has multicast conflicts - it should say so. It doesn't, it just appears to be working (Region Controller only), and when one tries to troubleshoot why the Rack Controller isn't also showing up - it's not clear at all. Python is throwing errors about invalid certificates, and the error log is chock full of stuff that's irrelevant to a simple network conflict.
Regarding commissioning and deploying new machines - again, what a nightmare. First of all, you should put some big fat red warnings on the Commissioning options that will wipe an active partition. Secondly, the "cloud init" option to "paste your own script" should come with a warning, "Be prepared to learn a whole new language". The docs for this section are little more than pointers for those who are already familiar with this territory.
I read somewhere it was possible to initiate an "autoinstall.yaml" there, tried a dozen times - no joy. I have another forum post about it somewhere...
Alas, the time I'd hoped MAAS would save me wound up costing me far more than I expected. I put my MAAS machine on the shelf and "manually" spun up 20 machines in a day (and didn't wipe out any Windows partitions while doing it). I can live with that.