Hi team,
I’ve been experimenting with MAAS to evaluate whether it fits our use case.
We currently run a single-DC deployment with ~100 leased servers, but we plan to move to a multi-DC/multi-AZ architecture, eventually managing around 200 servers across 3–4 data centers operated by different vendors.
1. Power Management
In our current DC, IPMI access is restricted to a separate VPN. While this is manageable for individual servers, it’s cumbersome at scale. The DC vendor offers a power control API, but it’s not directly compatible with MAAS’s Webhook power driver.
From what I understand, this would require creating a custom power driver.
Questions:
- Is there any documentation or guidance on writing a custom power driver for MAAS?
- Would this require maintaining a fork, or is there a cleaner, supported way to extend MAAS with custom drivers?
- What’s the recommended approach in this situation?
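One alternative I've been sketching, in case a custom driver turns out to be heavy, is a small shim that sits between MAAS's Webhook driver and the vendor API. Below is only the translation logic, not a working service. The vendor base URL, endpoint paths, request body, and vendor state names are all made up for illustration. I'm also assuming the Webhook driver matches the power query response against its configured on/off regexes, which as far as I can tell default to something like `status: running` / `status: stopped`.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical vendor API base URL; the real one would come from the DC vendor.
VENDOR_BASE = "https://vendor.example.invalid/api/v1"

# Hypothetical vendor power states, mapped to the words the Webhook driver's
# on/off regexes are assumed to look for.
VENDOR_TO_MAAS_STATUS = {"ON": "running", "OFF": "stopped"}


@dataclass(frozen=True)
class VendorRequest:
    method: str
    url: str
    body: Optional[str] = None


def translate(action: str, server_id: str) -> VendorRequest:
    """Map a shim action (on/off/query) to a hypothetical vendor API call."""
    url = f"{VENDOR_BASE}/servers/{server_id}/power"
    if action == "query":
        return VendorRequest("GET", url)
    if action in ("on", "off"):
        return VendorRequest("POST", url, json.dumps({"state": action.upper()}))
    raise ValueError(f"unsupported power action: {action!r}")


def query_response(vendor_state: str) -> str:
    """Render a vendor power state as the body handed back to MAAS's query."""
    status = VENDOR_TO_MAAS_STATUS.get(vendor_state.upper(), "unknown")
    return json.dumps({"status": status})
```

The idea would be to point each machine's Webhook power on/off/query URIs at the shim, so MAAS itself stays unmodified. Whether that is preferable to a proper custom driver is exactly what I'm unsure about.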
2. Single-DC Setup
For now, let’s focus on a single DC. Since we don’t own the servers, switches, or other hardware, I want to confirm whether I’m even on the right track. Here’s what we’re trying to achieve:
- Day 1: Automate provisioning of bare-metal servers
- Day 2: Automate updates (OS patches, configuration drift correction)
- Then: Use the Cluster API provider for MAAS to provision a Kubernetes cluster on those servers
- Finally: Deploy our product and its third-party dependencies via Kubernetes
I’m currently evaluating MAAS only for the Day 1 provisioning aspect. My assumptions are:
- MAAS can be used if it can power-cycle the servers (via the custom driver)
- MAAS can PXE-boot the servers
Are these assumptions sound? Would you recommend a different approach given that we don’t own the hardware?
3. Multi-DC Architecture
From what I gather, MAAS isn’t explicitly designed for multi-DC operations, but I’ve seen some community members use a single MAAS installation with separate regions per DC.
- Is this the recommended pattern for multi-DC management with MAAS?
- Are there known limitations or gotchas in doing this?
- Would you instead recommend a separate MAAS deployment per DC?
Some context: we rarely provision new servers. Our scaling strategy is to add new “availability zones” (AZs). Each AZ comprises one or more racks within a DC and hosts our product independently. A DC can have multiple AZs.
Our goals with this are:
- Enable canary-style upgrades by isolating AZs
- Eliminate single points of failure
- Move toward full Infrastructure-as-Code, which we currently lack
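To make the AZ idea concrete, here is a rough sketch of how I picture the DC/AZ/rack layout and a canary-first upgrade order. Every name in it (`AvailabilityZone`, `canary_order`, the DC and rack labels) is invented for illustration and is not something MAAS provides. The "at most one AZ per DC per wave" rule is just one possible policy, not a decision we've made.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AvailabilityZone:
    name: str
    dc: str                   # data center hosting this AZ
    racks: Tuple[str, ...]    # one or more racks within that DC
    canary: bool = False      # upgraded first, before any other AZ


def canary_order(azs: List[AvailabilityZone]) -> List[List[str]]:
    """Return upgrade waves: canary AZs first, then at most one AZ per DC per wave."""
    waves = [[az.name for az in azs if az.canary]]
    remaining = [az for az in azs if not az.canary]
    while remaining:
        seen_dcs, wave, rest = set(), [], []
        for az in remaining:
            if az.dc in seen_dcs:
                rest.append(az)       # this DC already has an AZ in the wave
            else:
                seen_dcs.add(az.dc)
                wave.append(az.name)
        waves.append(wave)
        remaining = rest
    return [w for w in waves if w]    # drop the canary wave if it is empty


azs = [
    AvailabilityZone("dc1-az1", "dc1", ("rack-a",), canary=True),
    AvailabilityZone("dc1-az2", "dc1", ("rack-b", "rack-c")),
    AvailabilityZone("dc1-az3", "dc1", ("rack-d",)),
    AvailabilityZone("dc2-az1", "dc2", ("rack-a",)),
]
print(canary_order(azs))
# [['dc1-az1'], ['dc1-az2', 'dc2-az1'], ['dc1-az3']]
```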
To clarify: we’re not a data center provider, and we don’t provision machines for end users. Our focus is internal platform stability and operational automation.
I’ll pause here. Any insights or suggestions would be very welcome!
Thanks in advance.