DNS - Best practice query when integrating into existing networks

I’ve been playing around with MaaS 3.5 for a little while now (snap install), and keep running into quirky network-related issues. Unfortunately, I’m not a network guy, so I have every confidence that this issue may make me look a little silly.

MaaS has been deployed into an existing network segment, complete with its own DNS and DHCP services. To that end, DHCP services in MaaS were turned off, although reserved ranges were added (via webgui) as it still seems to want to assign its own IPs during deployment.

And that’s the problem - MaaS is wanting to handle this stuff itself when I’d prefer it rely on existing services. When I turn off “managed allocation”, it then assigns an IP address outside of the DHCP scope of the network. If I allow managed allocation, it picks a valid IP, but then my DNS server has a DHCP allocated address in its table that doesn’t match what MaaS has. I’ve made various changes with mixed results. I can’t seem to fix things like the AMT power address being lost after deployment, being unable to find a host using its hostname, etc.

I’m not sure what the best set up is here, and I’m really just hacking away at things quite badly. I’d think this is a common scenario, so was curious if there is a reference to a best practice way of integrating MaaS into an existing network?

Hi @scottblackburn,

If I understand correctly, you are using MAAS with an external DHCP. Could you take a look at the troubleshooting documentation? Specifically this section:

https://maas.io/docs/how-to-troubleshoot-common-issues#external-dhcp-configuration

In addition to that, you might also find this video useful. It was published not long ago here on Discourse and covers how to use external DHCP with MAAS:

Thanks javier-fs. I’ve already run into those resources, but have taken a second look anyhow. I’m aware that each network can be very specific and tuned for each environment, but I feel I’m missing something a little more fundamental here.

So … I’ll dump some information and perhaps to someone else it will stand out like a sore thumb (it’s overly long - sorry!).

I’ve also made changes since this post was made and am separating things out into vlans using SDN under a hypervisor host. Consequently, the MaaS server now has two interfaces. Funnily enough, I still have the same problem…

To explain …

MaaS server has been placed on an “infrastructure” vlan (10.0.100.0/24). This vlan is supplying DHCP, DNS and Gateway services through 10.0.100.1. MaaS server itself has a reserved IP 10.0.100.225, which is assigned via DHCP mac filtering. It has internet access and can supply snap and apt updates as required.

I have another vlan, which we could consider a “user vlan” with IP range 192.168.10.0/24. This vlan is also being supplied with DHCP, DNS and Gateway services, pointing to IP 192.168.10.1. This vlan also has internet access and everything tests out OK.

MaaS server is virtualized, and has two tagged interfaces - one for infrastructure and one for user, enabled via netplan. Network priority is set using dhcp4-overrides | route-metric values, with the priority route set to the infrastructure interface. The IP address for the user vlan on the second MaaS interface is set to 192.168.10.225 and is applied via DHCP reservation using mac address as well.

In MaaS, there is a default fabric-0. This is linked to a subnet range of 10.0.10.0/24, with gateway and dns options set in sync with the external network service for that vlan. I have also created a new fabric-1 which is linked to subnet 192.168.10.0/24. Details reflect external network service for that vlan as well.

Options across both subnets:

  • Active Discovery = enabled
  • Proxy access = allowed
  • Allow DNS resolution = Disallowed
  • Managed allocation = Disabled
  • DHCP = No DHCP

Finally, back to the external network provider: for the “user vlan”, network boot is enabled, TFTP and next server settings are set to 192.168.10.225, and default BIOS filename is set to bootx64.efi.

Additionally, I note the following options across both the external provider and MaaS:

  • Infrastructure vlan has access to everything
  • User vlan is restricted to its own subnet
  • Network discovery in MaaS is ENABLED and set to poll every 10 mins.
  • For the MaaS controller, under Network, it shows both interfaces in a connected state and linked to the appropriate fabric for each.
  • For the subnet of fabric-1, I added a reserved DHCP range from 192.168.10.220 to 192.168.10.254, as well as an entry for 192.168.10.1. I understand DHCP is turned off for this subnet, but it still seems to want to know about it. This is all in alignment with the external network provider.
  • I should also mention that I have created a separate domain in MaaS as well, and set it as the default. I cannot remove the maas domain, but records are populating under the new one. It is not set to Authoritative.

Now, if I create a new VM with a tagged virtual interface and pxe-boot, it picks up a 192.168.10.0/24 ranged IP, finds MaaS and runs through the enlisting process until it reaches a status of ready. The host is multi-homed with the second interface connected to a dumb switch and bound to the user vlan. If I connect a physical device and set it to boot from network, it enlists the same way without issue. From there, I can configure the power driver for both - for the physical device, that would be Intel AMT. MaaS can then connect and demonstrate control over the machine. I then commission and await a status of ready.

AMT Issues: As a pre-requisite for this test, I reset all network configuration in AMT back to defaults (unconfigure / configure)

Following this, I enlisted the device, which picked up IP address 192.168.10.17. Via the webgui I updated the hostname to “physical-pc” instead of leaving it as the default random name. Under the external network service, I can see that the device has picked up two IPs - a maas-listing-node (192.168.10.17) and maas-enlist (192.168.10.18). Once I configured the AMT power driver (IP 192.168.10.17), MaaS then listed both IPs under Machines in the webgui. In MaaS DNS (again - webgui), it showed the physical-pc with an IP address of 192.168.10.17 and a second entry of enp031f6.physical-pc with an IP of 192.168.10.18.

At this point, numbers across everything line up. The device was commissioned and, when ready, I allocated it and then deployed a version of Ubuntu. Immediately the IP address switched to 192.168.10.226. This is outside of the DHCP scope for both the external network service as well as MaaS. After deployment, the IP address remained the same, and the power driver showed an error. I updated the configuration, and put in the only IP address listed in MaaS (192.168.10.226). Again, it failed. With neither the external provider nor the webgui showing an IP address of 192.168.10.18, I entered it into the power driver config and ran a test. Working. I then went to release the machine. Running through its processes, it eventually failed. I then updated the power driver IP again, this time to 192.168.10.17. Working. I re-ran the release process, this time completing without further issue, and I’m back to a status of ready.

I’ve mucked around and run a lot of different tests. IPs will vary but the general gist is the same - maas allocates an IP outside of the range I’ve set and the range that the external network provider will give. It loses the power IP of the device and I have to manually make changes based on historical knowledge to get control of the machine back.

I don’t know where it’s all falling down …

What’s the IP mode of the interfaces of the deployed machine? If you use AUTO IP, then MAAS will pick an IP from the subnet outside the dynamic and the reserved ranges. If you want to let your DHCP server assign an IP to that machine when it’s deployed, then change the IP mode to DHCP.
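
To see where a particular address sits relative to the ranges mentioned in this thread, a small check like the one below may help. It is only a sketch using Python's standard `ipaddress` module: the subnet, the reserved range and the two sample addresses are copied from the earlier post, the helper name `in_range` is made up for illustration, and it says nothing about how MAAS itself selects addresses.

```python
import ipaddress


def in_range(addr: str, start: str, end: str) -> bool:
    """Return True if addr lies between start and end (inclusive)."""
    ip = ipaddress.ip_address(addr)
    return ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)


# Values taken from the posts above; adjust them for your own network.
subnet = ipaddress.ip_network("192.168.10.0/24")
reserved_start, reserved_end = "192.168.10.220", "192.168.10.254"

for addr in ("192.168.10.17", "192.168.10.226"):
    print(
        addr,
        "| in subnet:", ipaddress.ip_address(addr) in subnet,
        "| in reserved range:", in_range(addr, reserved_start, reserved_end),
    )
```

Running the same check against the scope configured on the external DHCP server (not listed in this thread) would show which side each address belongs to.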

Thanks r00ta - Looking at my enlisted machine, I can see that for network, DHCP is set to “No DHCP”, but IP ADDRESS is set to “Auto assign”; is this the mode you’re referring to?

Poking about, I can’t see any obvious way to change it (via webgui), so will this need to be a CLI change, and would it need to be done for each enlisted device? Could it be changed to a default instead?

UPDATE 01: I see where it can be changed now (ACTIONS → Edit Physical). I’ll run a test now …

I guess the final query remains however - how could this be set as a default for any enlisted machine?

UPDATE 02: Testing complete. Things seem better - IP is in range, but AMT power IP was lost for this one. Device shows as having 2x IPs (192.168.10.17 & .18), where 192.168.10.17 was used for control during deployment process, and it now needs to be updated to 192.168.10.18 once the device has been deployed.

Release process looks to fail in a similar way, where I need to flip the AMT IP config the other way for it to work. How … odd.

If you are using DHCP to assign an IP to your BMC, this is expected:

“where 192.168.10.17 was used for control during deployment process, and it now needs to be updated to 192.168.10.18 once the device has been deployed.”

Your external DHCP server should then inform MAAS about the IP leases.

So - I’m still having trouble here and can’t seem to escape it.

As I was having a DNS sync-fight with MaaS, I gave up and reconfigured the network, turning off external DHCP and DNS services to let MaaS do what it wants. However, there are still some issues.

A quick summary for context (where “my” replaces organisation identifiers):

Hostname: my-maas-server
Interfaces (static addresses):

  • enp6s18 - 10.0.100.225 (10.0.100.0/24) | MaaS fabric-0
    [infra network. MaaS server lives here. Internet enabled via gateway 10.0.100.1]
  • enp6s19 - 192.168.10.225 (192.168.10.0/24) | MaaS fabric-1
    [isolated user network; can only access its own subnet and internet via gateway 192.168.10.1]
  • enp6s20 - 192.168.20.225 (192.168.20.0/24) | MaaS fabric-2
    [isolated user network; can only access its own subnet and internet via gateway 192.168.20.1]
  • enp6s21 - 192.168.1.225 (192.168.1.0/24) | MaaS fabric-3
    [not used at the moment. Just “there” for now]

I can run nslookups from my-maas-server quickly and without issue. However, I have deployed two machines to the isolated networks (fabric-1, fabric-2), and they are problematic (one Ubuntu, one Windows). Both are able to ping external IPs, but neither can resolve DNS. Looking at it more closely, both have a primary name server of 10.0.100.225, while the secondary server is set correctly for both.

I would have thought that would simply give a delayed response (something I could live with), but it seems that DNS cannot be served at all. For the windows box, I manually set 192.168.20.225 for the name server and re-tested, which gave the following reply:

Server: enp6s20.my-maas-server.my.lan
Address: 192.168.20.225

DNS request timed out.
    timeout was 2 seconds.
*** enp6s20.my-maas-server.my.lan can't find www.google.com: server failed.
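
On the name server addresses mentioned above: a quick way to check whether a given server address lies inside a machine's own subnet is sketched below with Python's standard `ipaddress` module. The addresses come from the summary in this post; the helper name `same_subnet` is illustrative only, and the check says nothing about what the gateway or any firewall actually permits.

```python
import ipaddress


def same_subnet(server: str, subnet: str) -> bool:
    """Return True if the server address belongs to the given subnet."""
    return ipaddress.ip_address(server) in ipaddress.ip_network(subnet)


# Addresses from the interface summary above.
user_subnets = ["192.168.10.0/24", "192.168.20.0/24"]
primary_dns = "10.0.100.225"

for net in user_subnets:
    print(f"{primary_dns} inside {net}: {same_subnet(primary_dns, net)}")

# The per-subnet MaaS address that was set manually on the Windows machine.
print(
    "192.168.20.225 inside 192.168.20.0/24:",
    same_subnet("192.168.20.225", "192.168.20.0/24"),
)
```

A name server outside the machine's subnet can only be reached through the gateway, so whether queries to it succeed depends on the rules applied there.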

I’ve now blown the ubuntu machine away, and attempted pxe booting from scratch, and it can’t do so.

I think my routing is stuffed, and this means that my netplan isn’t right. The active plan is:

network:
  version: 2
  ethernets:
    enp6s18:
      dhcp4: false
      addresses:
        - 10.0.100.225/24
      nameservers:
        search: [my.lan]
        addresses: [10.0.100.225]
      routes:
        - to: default
          via: 10.0.100.1
        - to: 10.0.100.0/24
          via: 10.0.100.1
          table: 1
      routing-policy:
        - from: 10.0.100.0/24
          table: 1
    enp6s19:
      dhcp4: false
      addresses:
        - 192.168.10.225/24
      nameservers:
        search: [my.lan]
        addresses: [192.168.10.225]
      routes:
        - to: 192.168.10.0/24
          via: 192.168.10.1
          table: 2
      routing-policy:
        - from: 192.168.10.0/24
          table: 2
    enp6s20:
      dhcp4: false
      addresses:
        - 192.168.20.225/24
      nameservers:
        search: [my.lan]
        addresses: [192.168.20.225]
      routes:
        - to: 192.168.20.0/24
          via: 192.168.20.1
          table: 3
      routing-policy:
        - from: 192.168.20.0/24
          table: 3
    enp6s21:
      dhcp4: true
      dhcp4-overrides:
        route-metric: 400
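
One detail in a plan like this that may be worth a second look is any route whose destination is the interface's own subnet but which is sent via a gateway. The sketch below flags such entries. It takes the netplan as an already-parsed Python dict (only the relevant keys from the plan above are reproduced by hand), and the helper name `onlink_routes_via_gateway` is hypothetical. Whether a flagged route is actually a problem depends on the setup, so treat this as a prompt for checking rather than a diagnosis.

```python
import ipaddress


def onlink_routes_via_gateway(ethernets: dict) -> list:
    """List (interface, destination, gateway) for routes that target the
    interface's own connected subnet yet specify a gateway."""
    flagged = []
    for name, cfg in ethernets.items():
        # Subnets the interface is directly attached to.
        connected = {
            ipaddress.ip_interface(a).network for a in cfg.get("addresses", [])
        }
        for route in cfg.get("routes", []):
            dest = route.get("to")
            if dest == "default" or "via" not in route:
                continue
            if ipaddress.ip_network(dest, strict=False) in connected:
                flagged.append((name, dest, route["via"]))
    return flagged


# Relevant parts of the plan above, written out as a dict.
plan = {
    "enp6s18": {
        "addresses": ["10.0.100.225/24"],
        "routes": [
            {"to": "default", "via": "10.0.100.1"},
            {"to": "10.0.100.0/24", "via": "10.0.100.1", "table": 1},
        ],
    },
    "enp6s19": {
        "addresses": ["192.168.10.225/24"],
        "routes": [{"to": "192.168.10.0/24", "via": "192.168.10.1", "table": 2}],
    },
    "enp6s20": {
        "addresses": ["192.168.20.225/24"],
        "routes": [{"to": "192.168.20.0/24", "via": "192.168.20.1", "table": 3}],
    },
}

for item in onlink_routes_via_gateway(plan):
    print(item)
```

When a table holds only such a route, traffic that matches the policy rule but is destined elsewhere finds no match in that table and falls through to later rules, typically the main table.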

Perhaps my whole approach is wrong, or I just need a CCNA. Either way… if anyone can help bash it into shape, I’d be greatly appreciative …

[also, so it makes sense as to why I’m going about it this way - this is a virtualised box that two teams of people will use. Each team will access their machine resources via a VPN tunnel into the isolated network that they’ve been allocated]

Here’s my try:

network:
version: 2
ethernets:
enp6s18:
dhcp4: false
addresses:
- 10.0.100.225/24
nameservers:
search: [my.lan]
addresses: [10.0.100.225]
routes:
- to: default
via: 10.0.100.1
-# I think the default route will use routing table 254 (rtb254) for the default MAAS interface
-# the above 2 lines is enough to do the job
-# - to: 10.0.100.0/24
-# via: 10.0.100.1
-# table: 1
-# routing-policy:
-# - from: 10.0.100.0/24
-# table: 1
enp6s19:
dhcp4: false
addresses:
- 192.168.10.225/24
nameservers:
search: [my.lan]
addresses: [192.168.10.225]
routes:
- to: 192.168.10.0/24 #rtb2: definition for default route interface enp6s19
via: 192.168.10.1 #
table: 2 #
routing-policy:
- from: 192.168.10.0/24 #rtb2: whatever traffic came from net 192.168.10.0/24
table: 2 #rtb2: resolves vith rtb2
- from: 192.168.10.0/24 #rtb254: net 192.168.10.0/24 resolves with rtb254
table: 254 #rtb254
to: 192.168.10.0/24 #rtb254
- from: 0.0.0.0/0 #rtb2: whatever net from interface with rtb2
to: 192.168.10.0/24 #rtb2: resolves via rtb2 default route interface above
table: 2 #rtb2
enp6s20:
dhcp4: false
addresses:
- 192.168.20.225/24
nameservers:
search: [my.lan]
addresses: [192.168.20.225]
routes:
- to: 192.168.20.0/24
via: 192.168.20.1
table: 3
routing-policy:
- from: 192.168.20.0/24
table: 3
- from: 192.168.20.0/24 #rtb254: net 192.168.10.0/24 resolves with rtb254
table: 254 #rtb254
to: 192.168.20.0/24 #rtb254
- from: 0.0.0.0/0 #rtb3: whatever net from interface with rtb2
to: 192.168.20.0/24 #rtb3: resolves via rtb2 default route interface above
table: 3 #rtb3
enp6s21:
dhcp4: true
dhcp4-overrides:
route-metric: 400

— Expected result would be something like below:
$ ip r
default via 10.0.100.1 dev enp6s18 proto static
default via <dhcpd_gw_addr> dev enp6s21 proto dhcp src <ip_leased_from_dhcpd_4_MAAS_SRV> metric 400 # ← if enp6s21 is used
192.168.10.0/24 dev enp6s19 proto kernel scope link src 192.168.10.225
192.168.20.0/24 dev enp6s20 proto kernel scope link src 192.168.20.225

PS: I only joined the MAAS Discourse recently, so sorry for the unformatted text block, @scottblackburn.

thanks for jumping in @manuel-paula and welcome to MAAS discourse!

Hello @manuel-paula - and thanks for the feedback. As an fyi - You can add code blocks using 3x back-ticks (`) before and after text.

For the netplan that you’ve listed above, I get the following once it’s formatted. Just want to double check that this looks as intended …

network:
  version: 2
  ethernets:
    enp6s18:
      dhcp4: false
      addresses:
        - 10.0.100.225/24
      nameservers:
        search: [my.lan]
        addresses: [10.0.100.225]
      routes:
        - to: default
          via: 10.0.100.1
      # Default route will use routing table 254 (rtb254) for the default MAAS interface
    enp6s19:
      dhcp4: false
      addresses:
        - 192.168.10.225/24
      nameservers:
        search: [my.lan]
        addresses: [192.168.10.225]
      routes:
        - to: 192.168.10.0/24 #rtb2: definition for default route interface enp6s19
          via: 192.168.10.1 #
          table: 2 #
      routing-policy:
        - from: 192.168.10.0/24 #rtb2: whatever traffic came from net 192.168.10.0/24
          table: 2 #rtb2: resolves with rtb2
        - from: 192.168.10.0/24 #rtb254: net 192.168.10.0/24 resolves with rtb254
          table: 254 #rtb254
          to: 192.168.10.0/24 #rtb254
        - from: 0.0.0.0/0 #rtb2: whatever net from interface with rtb2
          to: 192.168.10.0/24 #rtb2: resolves via rtb2 default route interface above
          table: 2 #rtb2
    enp6s20:
      dhcp4: false
      addresses:
        - 192.168.20.225/24
      nameservers:
        search: [my.lan]
        addresses: [192.168.20.225]
      routes:
        - to: 192.168.20.0/24
          via: 192.168.20.1
          table: 3
      routing-policy:
        - from: 192.168.20.0/24
          table: 3
        - from: 192.168.20.0/24 #rtb254: net 192.168.20.0/24 resolves with rtb254
          table: 254 #rtb254
          to: 192.168.20.0/24 #rtb254
        - from: 0.0.0.0/0 #rtb3: whatever net from interface with rtb3
          to: 192.168.20.0/24 #rtb3: resolves via rtb3 default route interface above
          table: 3 #rtb3
    enp6s21:
      dhcp4: true
      dhcp4-overrides:
        route-metric: 400

Yes, this is the intended content. Sorry, @scottblackburn, for the trouble caused by my formatting issue.

Hope this helps with the problem in your original post.