Hello everyone,
I have experienced several critical failures since MAAS 3.3, including failed commissioning and deployment. Since those nodes are always wiped and clean, the MAAS binaries are installed unmodified from the Snap, and the network is just a simple L2 switch, the cause must be something unusual, such as a data inconsistency inside the database. We need some tool or built-in function to validate the running configuration of MAAS.
Thank you.
Hi @maasuser1
Can you please provide more details about the failures you are seeing?
Was there anything interesting in the rackd and regiond logs?
“Since those nodes are always wiped and clean”
In case of a failed deployment, you can use Rescue mode to inspect the state of your machine.
Hi @troyanov, thanks for your reply.
“Since those nodes are always wiped and clean”
I meant that all my machine nodes are wiped to a clean state by the “Release” process, and the MAAS binaries are the latest stable version from upstream. If anything is causing these bugs only for me, it must be a data inconsistency in our database rather than MAAS itself.
Sorry for the delay.
Regarding LP:2097302, I think it should be fixed now, as it was very likely caused by a cloud-init bug.
As for the database-integrity check: I don’t think that is possible at the moment, but we are working on improving state handling in general.
Thank you @troyanov !