HA DHCP configuration broke MAAS

Hey there,

I enabled high-availability (HA) DHCP across my two controllers: a region+rack controller at 192.168.20.2 and a rack controller at 192.168.5.2.
Since then, for reasons I have not been able to work out, my DHCP services no longer work and fail to restart.

I have tried running sudo snap restart maas.supervisor, sudo maas init region+rack... and sudo maas init rack...
None of these worked. I have attached the log files below; please let me know if I am missing any crucial information.

CloudOperator.maas - 192.168.20.2 - region+rack controller
/var/snap/maas/common/log/maas.log
/var/snap/maas/common/log/dhcpd.log
/var/snap/maas/common/log/rackd.log
/var/snap/maas/common/log/regiond.log
/var/log/syslog


Ubuntu18S3.maas - 192.168.5.2 - rack controller
/var/snap/maas/common/log/maas.log
/var/snap/maas/common/log/dhcpd.log
/var/snap/maas/common/log/rackd.log
/var/log/syslog

As I understand it, the rack controller (Ubuntu18S3.maas) should be on VLAN 5, since its IP address is 192.168.5.2, but the MAAS UI shows it on VLAN 20. I suspect the HA DHCP configuration caused this, though I may be wrong.
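The expectation above can be sketched with Python's ipaddress module. This is only a sketch: it assumes both subnets are /24 and that VLAN 5 and VLAN 20 carry 192.168.5.0/24 and 192.168.20.0/24 respectively. Neither assumption is confirmed by anything shown in the MAAS UI.

```python
import ipaddress

# Assumption: each VLAN carries a /24 matching its number in the third octet.
# The real prefix lengths and VLAN-to-subnet mapping may differ.
SUBNETS = {
    "VLAN 5": ipaddress.ip_network("192.168.5.0/24"),
    "VLAN 20": ipaddress.ip_network("192.168.20.0/24"),
}


def vlan_for(address):
    """Return the name of the assumed VLAN whose subnet contains the address."""
    ip = ipaddress.ip_address(address)
    for name, network in SUBNETS.items():
        if ip in network:
            return name
    return None


# Controller addresses taken from this post.
for host, address in [
    ("CloudOperator.maas", "192.168.20.2"),
    ("Ubuntu18S3.maas", "192.168.5.2"),
]:
    print(host, address, vlan_for(address))
```

If the UI disagrees with this mapping, the difference is more likely in how the interface was assigned to a VLAN than in the address itself.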

I have pasted the subnets and VLANs here as well:

I am not sure why so many IPv6 subnets show up. The only thing I did was mark the VLANs as tagged so that MAAS-provided DHCP would work.

I am unable to commission any of my devices because none of them currently receive a DHCP offer. I would appreciate any insight from members of this community. Thank you for your time and attention.

I attempted to manually modify the dhcpd.conf file since that is what the error logs point to:

Internet Systems Consortium DHCP Server 4.4.1
Copyright 2004-2018 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
/var/snap/maas/common/maas/dhcpd.conf line 35: semicolon expected.
peer address fd8a:
^
/var/snap/maas/common/maas/dhcpd.conf line 120: failover peer failover-vlan-5006: not found
failover peer "failover-vlan-5006"
^
Configuration file errors encountered -- exiting

However, the file appears to be in active use: every change I make is overwritten and the file reverts to its original, unmodified state.
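Since hand edits do not stick, a read-only check of the generated file may be more useful than editing it. Below is a minimal sketch that looks for the two patterns the errors hint at: a "peer address" value containing a colon (possibly an IPv6 address), and a "failover peer" reference with no matching declaration. The function name, the regular expressions, and the sample config are all made up for illustration. The sample uses a documentation IPv6 address and a hypothetical peer name, not values from my real file.

```python
import re

# Declaration of a failover peer, e.g.: failover peer "name" {
DECLARATION = re.compile(r'^\s*failover\s+peer\s+"([^"]+)"\s*\{')
# Reference to a failover peer inside a pool, e.g.: failover peer "name";
REFERENCE = re.compile(r'^\s*failover\s+peer\s+"([^"]+)"\s*;')
# The remote peer's address line, e.g.: peer address 192.0.2.1;
PEER_ADDRESS = re.compile(r'^\s*peer\s+address\s+([^;\s]+)')

# Hypothetical sample, NOT the real generated file.
SAMPLE = '''\
failover peer "failover-vlan-5006" {
    primary;
    address 192.168.20.2;
    peer address 2001:db8::2;
}
subnet 192.168.5.0 netmask 255.255.255.0 {
    pool {
        failover peer "failover-vlan-9999";
    }
}
'''


def check_dhcpd_conf(text):
    """Return sorted (line_number, message) findings for the config text."""
    declared = set()
    references = []
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        declaration = DECLARATION.match(line)
        reference = REFERENCE.match(line)
        peer_address = PEER_ADDRESS.match(line)
        if declaration:
            declared.add(declaration.group(1))
        elif reference:
            references.append((number, reference.group(1)))
        elif peer_address and ":" in peer_address.group(1):
            value = peer_address.group(1)
            findings.append((number, f"peer address looks like IPv6: {value}"))
    # Report references only after the whole file has been scanned,
    # so a declaration that appears later in the file still counts.
    for number, name in references:
        if name not in declared:
            findings.append((number, f"failover peer {name!r} is referenced but not declared"))
    return sorted(findings)


for number, message in check_dhcpd_conf(SAMPLE):
    print(f"line {number}: {message}")
```

To run it against the real file, the text could be read with pathlib.Path("/var/snap/maas/common/maas/dhcpd.conf").read_text() and passed to check_dhcpd_conf. Note that the "not found" error at line 120 may simply be a consequence of the parse failure at line 35, in which case this script would report only the peer address.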

I’m not sure whether I should stop the maas.supervisor snap service, or do something similar, in order to reset the configuration file. I would prefer not to reconfigure my MAAS cluster from scratch just to get past this error. Any tips on how to investigate this further would be greatly appreciated.