Thanks javier-fs. I’ve already run into those resources, but have taken a second look anyhow. I’m aware that each network can be very specific and tuned for each environment, but I feel I’m missing something a little more fundamental here.
So … I’ll dump some information and perhaps to someone else it will stand out like a sore thumb (it’s overly long - sorry!).
I’ve also made changes since this post was made and am separating things out into vlans using SDN under a hypervisor host. Consequently, the MaaS server now has two interfaces. Funnily enough, I still have the same problem…
To explain …
MaaS server has been placed on an “infrastructure” vlan (10.0.100.0/24). This vlan is supplying DHCP, DNS and Gateway services through 10.0.100.1. MaaS server itself has a reserved IP 10.0.100.225, which is assigned via DHCP mac filtering. It has internet access and can supply snap and apt updates as required.
I have another vlan, which we could consider a “user vlan” with IP range 192.168.10.0/24. This vlan is also being supplied with DHCP, DNS and Gateway services, pointing to IP 192.168.10.1. This vlan also has internet access and everything tests out OK.
MaaS server is virtualized, and has two tagged interfaces - one for infrastructure and one for user, enabled via netplan. Network priority is set using dhcp4-overrides | route-metric values, with the priority route set to the infrastructure interface. The IP address for the user vlan on the second MaaS interface is set to 192.168.10.225 and is also applied via DHCP reservation using the mac address.
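To spell out what I expect the route-metric setting to do, here is a minimal Python sketch of the selection rule (on Linux, the default route with the lowest metric wins). The interface names and metric values are made up for illustration and are not copied from my netplan file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultRoute:
    interface: str  # hypothetical interface name
    gateway: str    # gateway handed out by DHCP on that vlan
    metric: int     # value set via dhcp4-overrides route-metric (assumed)


def preferred_route(routes):
    """Return the default route with the lowest metric (lower wins)."""
    return min(routes, key=lambda r: r.metric)


routes = [
    DefaultRoute("infra0", "10.0.100.1", 100),   # infrastructure vlan
    DefaultRoute("user0", "192.168.10.1", 200),  # user vlan
]

best = preferred_route(routes)
print(best.interface, best.gateway)
```

With these example values the infrastructure gateway is chosen, which is the behaviour I’m aiming for on the MaaS server.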
In MaaS, there is a default fabric-0. This is linked to a subnet range of 10.0.10.0/24, with gateway and dns options set in sync with the external network service for that vlan. I have also created a new fabric-1 which is linked to subnet 192.168.10.0/24. Details reflect external network service for that vlan as well.
Options across both subnets:
- Active Discovery = enabled
- Proxy access = allowed
- Allow DNS resolution = Disallowed
- Managed allocation = Disabled
- DHCP = No DHCP
Finally, back to the external network provider: for the “user vlan”, network boot is enabled, TFTP and next server settings are set to 192.168.10.225, and default BIOS filename is set to bootx64.efi.
Additionally, I note the following options across both the external provider and MaaS:
- Infrastructure vlan has access to everything
- User vlan is restricted to its own subnet
- Network discovery in MaaS is ENABLED and set to poll every 10 mins.
- For the MaaS controller, under Network, it shows both interfaces in a connected state and linked to the appropriate fabric for each.
- For the subnet of fabric-1, I added a reserved DHCP range from 192.168.10.220 to 192.168.10.254, as well as an entry for 192.168.10.1. I understand DHCP is turned off for this subnet, but it still seems to want to know about it. This is all in alignment with the external network provider.
- I should also mention that I have created a separate domain in MaaS as well, and set it as the default. I cannot remove the maas domain, but records are populating under the new one. It is not set to Authoritative.
Now, if I create a new VM with a tagged virtual interface and pxe-boot, it picks up a 192.168.10.0/24 ranged IP, finds MaaS and runs through the enlisting process until it reaches a status of ready. The host is multi-homed with the second interface connected to a dumb switch and bound to the user vlan. If I connect a physical device and set it to boot from the network, it enlists the same way without issue. From there, I can configure the power driver for both - for the physical device, that would be Intel AMT. MaaS can then connect and demonstrate control over the machine. I then commission and await a status of ready.
AMT Issues: As a pre-requisite for this test, I reset all network configuration in AMT back to defaults (unconfigure / configure)
Following this, I enlisted the device, which picked up IP address 192.168.10.17. Via the webgui I updated the hostname to “physical-pc” instead of leaving it as the default random name. Under the external network service, I can see that the device has picked up two IPs - a maas-listing-node (192.168.10.17) and maas-enlist (192.168.10.18). Once I configured the AMT power driver (IP 192.168.10.17), MaaS then listed both IPs under Machines in the webgui. In MaaS DNS (again - webgui), it showed physical-pc with an IP address of 192.168.10.17 and a second entry of enp031f6.physical-pc with an IP of 192.168.10.18.
At this point, the numbers across everything line up. The device was commissioned and, when ready, I allocated it and then deployed a version of ubuntu. Immediately the IP address switched to 192.168.10.226. This is outside of the DHCP scope for both the external network service as well as MaaS. After deployment, the IP address remained the same, and the power driver showed an error. I updated the configuration, and put in the only IP address listed in MaaS (192.168.10.226). Again, it failed. With neither the external provider nor the webgui showing an IP address of 192.168.10.18, I entered it into the power driver config and ran a test. Working. I then went to release the machine. Running through its processes, it eventually failed. I then updated the power driver IP again, this time to 192.168.10.17. Working. I re-ran the release process, this time completing without further issue, and I’m back to a status of ready.
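To keep the addresses straight, here is a small Python sketch that checks each IP from this test against the 192.168.10.220 to 192.168.10.254 range I entered in MaaS for the fabric-1 subnet. It only does the arithmetic on the figures quoted in this post; treating the range bounds as inclusive is my assumption, and the labels are just my own shorthand.

```python
import ipaddress

# Reserved range as entered in MaaS for the fabric-1 subnet (from this post).
RESERVED_FIRST = ipaddress.ip_address("192.168.10.220")
RESERVED_LAST = ipaddress.ip_address("192.168.10.254")
SUBNET = ipaddress.ip_network("192.168.10.0/24")


def in_reserved_range(addr: str) -> bool:
    """True if addr lies in the reserved range (bounds assumed inclusive)."""
    ip = ipaddress.ip_address(addr)
    return RESERVED_FIRST <= ip <= RESERVED_LAST


# Addresses seen during the test; the labels are my own shorthand.
observed = {
    "enlist / AMT": "192.168.10.17",
    "second enlist lease": "192.168.10.18",
    "MaaS controller": "192.168.10.225",
    "after deploy": "192.168.10.226",
}

for label, addr in observed.items():
    assert ipaddress.ip_address(addr) in SUBNET  # all sit in the user vlan
    where = "inside" if in_reserved_range(addr) else "outside"
    print(f"{label:20} {addr:15} {where} reserved range")
```

Run as-is, this reports .17 and .18 as outside the 220-254 range, and .225 and .226 as inside it.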
I’ve mucked around and run a lot of different tests. IPs will vary but the general gist is the same - maas allocates an IP outside of the range I’ve set and the range that the external network provider will give. It loses the power IP of the device and I have to manually make changes based on historical knowledge to get control of the machine back.
I don’t know where it’s all falling down …