How We Use MAAS to Manage 5k+ Machines Around the World

Last year, I encountered several bugs in MAAS, and after discussions with r00ta, I’d like to share how we utilize MAAS in our environment.

We currently manage over 5,000 bare-metal servers distributed across more than 200 data centers worldwide. While 5,000 servers may not sound large-scale on its own, their global distribution significantly increases operational complexity.

We initially adopted MAAS version 2.9 in 2019, upgraded to 3.1 in 2021, and are currently migrating to version 3.5. Due to our highly distributed infrastructure—where some data centers host as few as 10 servers, while others handle upwards of 500—we operate approximately six independent MAAS regions globally. For instance, Frankfurt manages servers across EMEA, Singapore oversees Asia-Pacific, and Phoenix handles the Americas. In certain geographical areas, we even maintain two distinct MAAS regions to separately manage different server clusters.

Each MAAS region uses a dedicated /16 private IP block (e.g., 10.0.0.0/16), with the first /24 subnet (10.0.0.0/24) specifically reserved for the control plane.
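As an illustration of this layout (the profile name and addresses are placeholders, and these are not our exact commands), the MAAS CLI equivalent looks roughly like this:

```
# Placeholder profile "admin" and example addresses; a rough sketch of the layout above.
# Register the control-plane /24 of the region's /16:
maas admin subnets create cidr=10.0.0.0/24 name=control-plane
# Reserve the whole range so MAAS never assigns these addresses to machines:
maas admin ipranges create type=reserved start_ip=10.0.0.1 end_ip=10.0.0.254 \
    comment="region/rack controllers and VPN endpoints"
```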

Every top-of-rack (TOR) switch, corresponding to a rack controller, has its own exclusive /24 subnet (e.g., 10.0.1.0/24) dedicated to BMC access. Typically, these racks use IP addresses such as 10.0.1.1 or 10.0.1.2 to reach their BMCs. Most servers use a “sharelink” (shared-NIC) BMC setup, which introduces a management challenge since a server cannot directly manage its own BMC. To address this, we implemented High Availability (HA) rack controllers. We maintain a custom patch that solves this issue in MAAS 3.1; however, migrating to MAAS 3.5 reintroduced complexities due to the adoption of Temporal workflows, and a definitive fix is still under active development.
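For context, enlisting a machine whose BMC sits in the rack's dedicated /24 is a standard MAAS CLI call; the MAC address, BMC address, and credentials below are placeholders:

```
# Placeholder values throughout; a sketch of enlisting a machine whose BMC
# lives in the rack's BMC subnet (here 10.0.1.0/24).
maas admin machines create \
    architecture=amd64/generic \
    mac_addresses=52:54:00:12:34:56 \
    power_type=ipmi \
    power_parameters_power_driver=LAN_2_0 \
    power_parameters_power_address=10.0.1.21 \
    power_parameters_power_user=maas \
    power_parameters_power_pass=secret
```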

Previously, almost every server in our environment had a public IP address, with the region API exposed directly to the Internet, which gave rack controllers (rackd) straightforward connectivity. With MAAS 3.5, however, we transitioned to OpenVPN combined with FRRouting (FRR) to handle network connectivity. BGP is used to advertise a /32 route for each server’s BMC IP, effectively resolving the “sharelink” management issue. Within our OpenVPN setup, it is crucial that rack controllers keep consistent IP addresses when connecting to region controllers, so we use the ifconfig-pool-persist setting to achieve IP stability. An important caveat is that enabling both duplicate-cn and ifconfig-pool-persist simultaneously is problematic: with duplicate-cn enabled, network instability can cause a rack’s VPN IP address to change, yet the region keeps using the outdated IP, and the DHCP service stops working entirely until the region daemon (regiond) is manually restarted. This is the problem I described in MAAS DHCP Server Issue and Bug Report #2089222.
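To make the two relevant configuration pieces concrete, here is a minimal sketch; the VPN pool, AS number, and BMC address are placeholders, not our production values:

```
# /etc/openvpn/server.conf (excerpt, placeholder addresses)
dev tun
topology subnet
# VPN address pool handed out to rack controllers
server 10.254.0.0 255.255.0.0
# Keep each rack's VPN IP stable across reconnects
ifconfig-pool-persist /etc/openvpn/ipp.txt
# duplicate-cn is deliberately left out: with it enabled, a flapping rack can
# reconnect with a different VPN IP even though ifconfig-pool-persist is set.
```

```
! /etc/frr/frr.conf (excerpt, placeholder ASN and prefix)
! Advertise a BMC address as a /32 over BGP so the region can always reach it.
router bgp 64512
 neighbor 10.254.0.1 remote-as 64512
 address-family ipv4 unicast
  network 10.0.1.21/32
 exit-address-family
```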

We deploy rackd, OpenVPN, and FRR within a single privileged LXC container, offering several operational benefits:

  • Provides MAAS with a “bare-metal-like” environment, greatly simplifying upgrades and migrations.
  • Enables direct manipulation of the host’s routing table and iptables rules.
  • Avoids noisy kernel logs typically associated with Snap packaging and AppArmor restrictions.

Additionally, while we maintain our own Chrony configurations for time synchronization, MAAS has a tendency to overwrite these settings. By containing MAAS components within LXC containers, we isolate these configuration changes, thus preserving host settings.
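For anyone curious how such a container can be built, a minimal LXD sketch looks roughly like this; the container name, Ubuntu release, and package set are placeholders rather than our exact setup:

```
# Placeholder container name and release; a rough sketch, not our exact setup.
lxc launch ubuntu:22.04 rackd01 -c security.privileged=true
# OpenVPN inside the container needs the TUN device from the host:
lxc config device add rackd01 tun unix-char path=/dev/net/tun
# Install the rack controller, OpenVPN, and FRR inside the container:
lxc exec rackd01 -- apt-get update
lxc exec rackd01 -- apt-get install -y maas-rack-controller openvpn frr
```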

We maintain numerous custom patches tailored to our operational needs. These include Single Sign-On (SSO) integration, removing the PXE boot command normally sent just before machine power-on (certain older BMC models may unexpectedly reset if they receive a PXE boot command and a power-on command within a short interval), and many other adjustments to ensure smooth operation.
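To make the BMC-reset issue concrete, these are the two IPMI operations involved when sent back to back; the BMC address and credentials are placeholders, and this is only an illustration, not how MAAS issues them internally:

```
# Illustration only; placeholder BMC address and credentials.
# Setting the next boot device to PXE and powering on within a short interval
# is the sequence that can reset some older BMCs.
ipmitool -I lanplus -H 10.0.1.21 -U admin -P 'secret' chassis bootdev pxe
ipmitool -I lanplus -H 10.0.1.21 -U admin -P 'secret' chassis power on
```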

Finally, MAAS alone isn’t sufficient for managing our extensive infrastructure. To enhance operational efficiency, we’ve developed additional tools that help identify the physical location of servers, and facilitate operations such as OS reinstalls or system reboots.

This is just a brief overview of how we use MAAS in our environment. If you have any questions or want to discuss further, please feel free to reach out.


Thank you so much for sharing!

“we operate approximately six independent MAAS regions globally.”

Do you mean you have 6 independent MAAS regions, or 6 regions in HA mode?

If you can share, do you have some statistics about the number of concurrent deployments you have on the MAAS installation that is managing most of the servers?

Yes, they are six independent MAAS regions.

Before we started the upgrade to MAAS 3.5, we maintained six independent MAAS regions to manage all of our production servers (excluding staging and testing environments). Now, as we migrate to version 3.5, we expect to eventually have between five and eight independent MAAS regions, depending on how we decide to segment them geographically and by business unit. For servers originally deployed with MAAS 3.1, we utilize the import feature to migrate them smoothly into the new environment.

Typically, we deploy around 30 machines simultaneously, though occasionally this number rises to about 70. Thanks to Squid caching, network traffic between regions and racks remains moderate—generally around 30-50 Mbps.
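MAAS manages its own proxy configuration, so the following is only an illustrative sketch of the kind of Squid tuning that matters for image-heavy traffic; the cache size, memory, and path are placeholders:

```
# /etc/squid/squid.conf (excerpt, illustrative values only)
# Large on-disk cache so boot images and packages cross the WAN only once:
cache_dir ufs /var/spool/squid 100000 16 256
# Boot images are large; raise the default object-size ceiling:
maximum_object_size 4096 MB
cache_mem 1024 MB
```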

Additionally, since we rely on a cloud-hosted PostgreSQL service, and because MAAS always installs a local PostgreSQL instance when deploying regiond, we’ve repackaged MAAS from source to remove PostgreSQL as a dependency.
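For reference, pointing regiond at an external database only involves /etc/maas/regiond.conf; the host, credentials, and URL below are placeholders (the repackaging removes the package dependency, it does not change this file):

```
# /etc/maas/regiond.conf (placeholder values)
database_host: pg.internal.example
database_port: 5432
database_name: maasdb
database_user: maas
database_pass: secret
maas_url: http://10.0.0.2:5240/MAAS
```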