Serving multiple DHCP subnets from MAAS

I’ve got two different VLANs (“production” and “testing”) which I would like to use with MAAS. MAAS itself sits in a third VLAN, “infrastructure”, and an L3 switch routes between “production”, “testing” and “infrastructure”.

The L3 switch also provides DHCP forwarding (which I think is also called DHCP helper in some products) to forward the DHCP broadcasts from “production” to “infrastructure” and from “testing” to “infrastructure”.

We’ve got a single Region+Rack setup.

This works for production. The requests are served, and the machine is able to contact MAAS for commissioning. However, when testing makes a request it gets issued an IP address in production, and therefore commissioning fails.

I believe this has been raised before: https://bugs.launchpad.net/maas/+bug/1640298, but apparently this functionality should be possible based on: https://bugs.launchpad.net/maas/+bug/1254807

Can anyone please advise on what the configuration should be?

Did you ever get a solution for this? We are facing the same problem!

Hi, I’m curious how your DHCP relay is configured. If you check this article, it states that the source address should come from the subnet the request is being relayed from (in your case, for testing, it should be 192.168.52.x, which is the IP on the DHCP relay and which I assume is the default gateway for that subnet).

That is the information that a DHCP server (in this case MAAS) would need to see to know which DHCP pool/scope to use to reply.
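To make the idea concrete, here is a minimal sketch of how a DHCP server could map the relay address to a pool. It is not MAAS's actual implementation. The pool names come from this thread, but the prefixes are assumptions: only 192.168.52.x was mentioned, the /24 mask and the production range are placeholders, and `pick_subnet` is a hypothetical helper.

```python
import ipaddress

# Assumed pools. Only 192.168.52.x appears in the thread; the /24 masks
# and the production range are placeholders for illustration.
SUBNETS = {
    "production": ipaddress.ip_network("192.168.51.0/24"),
    "testing": ipaddress.ip_network("192.168.52.0/24"),
}


def pick_subnet(giaddr: str, local_subnet: str = "infrastructure") -> str:
    """Return the name of the pool a request should be answered from."""
    addr = ipaddress.ip_address(giaddr)
    # A giaddr of 0.0.0.0 means the request was not relayed, so the
    # client is on the server's own segment.
    if addr.is_unspecified:
        return local_subnet
    # Otherwise the relay address identifies the client's subnet.
    for name, network in SUBNETS.items():
        if addr in network:
            return name
    raise LookupError(f"no pool configured for relay address {giaddr}")
```

If the relay stamped an address from the wrong subnet, this lookup would pick the wrong pool, which would match the symptom described in the first post.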

If you’re not sure what the relay is using for a source address, you could do a tcpdump on the MAAS server receiving the DHCP requests from the relay and check.
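If you save the raw UDP payload of one relayed request from such a capture, the relay address can be read straight out of the fixed BOOTP header, where giaddr occupies bytes 24 to 27. A rough sketch follows; `read_giaddr` is a hypothetical helper, and it assumes you already have the DHCP payload without the Ethernet, IP and UDP headers.

```python
import ipaddress

BOOTP_FIXED_LEN = 236  # fixed BOOTP header length, before the options
GIADDR_OFFSET = 24     # after op/htype/hlen/hops, xid, secs, flags,
                       # ciaddr, yiaddr and siaddr


def read_giaddr(payload: bytes) -> ipaddress.IPv4Address:
    """Extract the relay agent address (giaddr) from a DHCP payload."""
    if len(payload) < BOOTP_FIXED_LEN:
        raise ValueError("payload is shorter than a BOOTP header")
    return ipaddress.IPv4Address(payload[GIADDR_OFFSET:GIADDR_OFFSET + 4])
```

For a request relayed from the testing VLAN you would expect this to return an address in 192.168.52.x, not an address in the production or infrastructure ranges.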

This link explains it fairly well: https://community.spiceworks.com/topic/444712-how-does-a-dhcp-server-know-what-subnet-to-assign

I was never able to find a solution to this. In the end we disposed of the DHCP forwarding on the switches and gave MAAS a trunk connection to the network instead (and a subinterface in every VLAN that we needed to do provisioning in).

That’s probably a more elegant solution anyway, but obviously it won’t suit everyone. Sorry that I never found a proper solution.

Hi Anton,

I can’t speak for @seitz-a, but it looks like we were having the same issue so I’ll describe how mine was set up.

  • Core switch configured with multiple VLANs
  • Core switch has an L3 interface in each VLAN, providing inter-VLAN routing
  • MAAS (with DHCP server) is in one of the VLANs (VLAN X)
  • Other VLANs (e.g. VLAN Y) have no DHCP server available, but on the switch the logical VLAN interface specifies a relay (on our HPE switch it is dhcp select relay and dhcp relay server-address x.x.x.x)

Therefore, when a device in VLAN Y makes a DHCP broadcast, the switch hears it and forwards it out on VLAN X to the specified server (MAAS), with the GIADDR field set to its own address in VLAN Y. MAAS’s DHCP therefore replies with an offer for an address in VLAN Y, and the PXE boot works. So the problem is not with the DHCP helper/forwarding features, as far as I can see.
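For anyone unfamiliar with what the relay does to the packet, the step described above can be sketched roughly as follows. This illustrates the general relay behaviour, not the HPE switch's code. `relay_stamp` is a hypothetical name, and the ingress address passed in stands for the switch's own interface address in VLAN Y.

```python
import ipaddress

HOPS_OFFSET = 3     # single-byte hop counter in the BOOTP header
GIADDR_OFFSET = 24  # 4-byte relay agent address
BOOTP_FIXED_LEN = 236


def relay_stamp(packet: bytes, ingress_ip: str) -> bytes:
    """Return a copy of a client request as a relay would forward it."""
    if len(packet) < BOOTP_FIXED_LEN:
        raise ValueError("packet is shorter than a BOOTP header")
    out = bytearray(packet)
    giaddr = slice(GIADDR_OFFSET, GIADDR_OFFSET + 4)
    # Only the first relay fills in giaddr; an address that is already
    # set is left untouched.
    if out[giaddr] == b"\x00\x00\x00\x00":
        out[giaddr] = ipaddress.IPv4Address(ingress_ip).packed
    # Each relay increments the hop count (capped at the field maximum).
    out[HOPS_OFFSET] = min(out[HOPS_OFFSET] + 1, 255)
    return bytes(out)
```

The forwarded packet is then sent as unicast to the configured server address, and the server uses the stamped giaddr to choose the pool.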

The problem comes when the booted device gets its networking configuration from MAAS (which I believe does not come from DHCP; instead a static IP address is provided via a web service call). It is given an IP address in VLAN X, not VLAN Y, and we end up with a running machine with no network connectivity.

Hi Forky,

OK, I understand your setup, but we need more info. Can you share how one of your test machines is configured in MAAS (after deployment)? What does its netplan file look like?

Also, how are your subnets and VLANs configured in MAAS? I assume you have DHCP enabled on each subnet that you want to allocate from.

It’s likely that we should get you to file a bug and provide all this info. Could you do so on Launchpad? https://bugs.launchpad.net/maas

If you do, please provide as much info as you can. We need to see what happened during commissioning and also during/after deployment, along with the full configuration of your VLANs and subnets in MAAS and how you configured your machines before deployment.

Regards,
Anton

Hi @anton5mith,

I understand your need for more concrete information, such as logs and detailed configuration, but I abandoned the problem a long time ago. I’d be willing to do the work at some point, but I’m too busy just now to get back into the issue. Perhaps @seitz-a, who is currently going through the issue, is in a better position to provide evidence?

Hi @forky2, understood. This is an important feature for us to support, so we appreciate any input we can get. @seitz-a, let me know if you file a bug report.