Hey Team,
Could really use some help here. Our upgrade from Ubuntu 20 / MAAS 3.1 / PG11 to Ubuntu 22 / MAAS 3.3 / PG15 has had some issues that have made the environment unstable. Our setup uses a separate standalone DB rather than having Postgres running on the same server as the MAAS installation.
The first issue is that DNS completely breaks after the upgrade (no change in config). Our update flow is as follows:
- Stop the services on the MAAS region and rack controllers
- Upgrade DB to PG15
- Upgrade MAAS region to Ubuntu 22.04
- Upgrade MAAS to 3.3
- DNS is now broken; we need to add nameserver entries to /etc/resolv.conf as a workaround
- maas-regiond tries to start and fails on:
```
PermissionError: [Errno 1] Operation not permitted: '/var/lib/maas/.secret.tl6e4y00.tmp'
2023-06-26 13:56:17 maasserver.start_up: [error] Error during start-up.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/maasserver/start_up.py", line 135, in start_up
    yield deferToDatabase(inner_start_up, master=master)
  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork() # type: ignore[attr-defined]
  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call( # type: ignore[attr-defined]
  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 857, in callInContext
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 203, in wrapper
    result = func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 726, in call_with_connection
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/maasserver/utils/__init__.py", line 177, in call_with_lock
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 771, in call_within_transaction
    return func_outside_txn(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 574, in retrier
    return func(*args, **kwargs)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/lib/python3/dist-packages/maasserver/start_up.py", line 198, in inner_start_up
    MAAS_SHARED_SECRET.set(security.to_hex(secret))
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/env.py", line 80, in set
    atomic_write(value.encode("ascii"), self.path)
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/fs.py", line 157, in atomic_write
    raise error
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/fs.py", line 147, in atomic_write
    chown(
```
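To make the failing step easier to reason about, here is a minimal sketch of the generic "write a temp file, chown it, rename it into place" pattern that the last frames of the traceback (`atomic_write`, then `chown`) seem to point at. This is not the MAAS implementation: `atomic_write_sketch` and its parameters are hypothetical names, and the clean-up-on-failure behaviour is an assumption on my part. If MAAS does something similar, it would explain both the `Operation not permitted` error (an unprivileged process cannot chown a file to another user) and why the `.secret.*.tmp` file is gone by the time anyone looks for it.

```python
import os
import tempfile


def atomic_write_sketch(content: bytes, path: str, uid: int = -1, gid: int = -1) -> None:
    """Write content to path via a temp file in the same directory.

    Hypothetical helper for illustration only; not MAAS's atomic_write.
    uid/gid of -1 mean "leave ownership unchanged" (os.chown semantics).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Produces names shaped like ".secret.XXXXXXXX.tmp" next to the target.
    prefix = "." + os.path.basename(path) + "."
    fd, tmp_path = tempfile.mkstemp(prefix=prefix, suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
        # Raises PermissionError (EPERM) if an unprivileged process tries to
        # hand the file to a different user.
        os.chown(tmp_path, uid, gid)
        # Atomic rename over the target on the same filesystem.
        os.replace(tmp_path, path)
    except OSError:
        # Assumed clean-up: remove the temp file so nothing is left behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With a pattern like this, a failing chown leaves neither a new target file nor a temp file behind, which would match an error message naming a path that no longer exists.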
When I check /var/lib/maas, there is no .secret.tl6e4y00.tmp file like the one named in the error above. There IS a secret file, and its contents match the secret stored in the MAAS DB in the maasserver.secret table. If I delete that secret file from the region controller, MAAS starts working: it regenerates the secret and seems to work for a while, until server start-up is initiated again. Then the box goes back into this error loop, trying to access .secret.*.tmp files that don't exist, until I remove the secret file again, after which it works for a time.
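In case it helps anyone compare against a working install, this is a small sketch for reporting who owns the directory and the secret file. The paths are taken from the error and from what I see on disk (`/var/lib/maas` and a file named `secret` inside it); `describe_owner` is just a hypothetical helper name, and the script only reads metadata.

```python
import grp
import os
import pwd
import stat


def describe_owner(path: str) -> str:
    """Return 'path: user:group mode 0oNNN' for the given path."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)  # uid with no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid with no group entry
    return f"{path}: {user}:{group} mode {oct(stat.S_IMODE(st.st_mode))}"


if __name__ == "__main__":
    # Paths assumed from the traceback and the description in this post.
    for candidate in ("/var/lib/maas", "/var/lib/maas/secret"):
        if os.path.exists(candidate):
            print(describe_owner(candidate))
        else:
            print(f"{candidate}: not present")
```

If the owner of the secret file differs from the user that regiond runs as, that might be worth comparing against the chown failure above.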
I'm not exactly sure how to troubleshoot the DNS issue. I've poked around to see whether any config files have been altered, and everything seems to be the same as it was before the upgrade. Any help or suggestions would be greatly appreciated!
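For the DNS side, this is a rough sketch of the check I can run before and after adding the workaround entries: it shows where /etc/resolv.conf really points and which nameservers it lists. `read_nameservers` is a hypothetical helper name. On a default Ubuntu 22.04 install I would expect the file to be a symlink managed by systemd-resolved and to list 127.0.0.53, but I am treating that as an assumption about our boxes rather than a given.

```python
import os


def read_nameservers(resolv_conf: str = "/etc/resolv.conf") -> list[str]:
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    with open(resolv_conf, encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines and comments ('#' or ';' at line start).
            if not stripped or stripped[0] in "#;":
                continue
            fields = stripped.split()
            if len(fields) >= 2 and fields[0] == "nameserver":
                servers.append(fields[1])
    return servers


if __name__ == "__main__":
    path = "/etc/resolv.conf"
    if os.path.exists(path):
        # realpath reveals whether the file is a symlink to a managed copy.
        print("resolves to:", os.path.realpath(path))
        print("nameservers:", read_nameservers(path))
    else:
        print(f"{path}: not present")
```

Comparing this output between the upgraded region controller and a host that still resolves correctly should at least show whether the resolver configuration itself changed.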
Thanks
Anthony