I have been wrestling with this problem and am struggling.
I have a machine (Minisforum MS-01) that can be discovered but fails to commission. The failure is odd. I have attached a screenshot from the remote KVM: it seems to download the scripts just fine, but it appears to time out during the 20-maas-02-dhcp-unconfigured-ifaces script. I have tried many network combinations to see whether a particular config is causing the problem. When I browse to that URL myself, I get the response back with no problem.
To be honest, I am not sure where to start troubleshooting this, as I am not sure I can log into the machine via the console during that process to view the network stack.
It actually has 4 NICs in total: 2x 2.5G and 2x 10G SFP+.
My hope was to be able to bond the 2.5G ports.
The 2.5G ports are both on the same native VLAN (vlan18), which is the same VLAN/subnet MAAS is on. The 2x 10G interfaces are on a non-routed, unmanaged switch that I use for my storage network.
My router services DHCP for vlan1 (a different subnet).
vlan18 is dedicated to MAAS for both DNS and DHCP. MAAS itself, of course, has a statically assigned IP.
Interestingly enough, when I look at the ports during discovery and commissioning, it appears that the non-PXE NIC never comes up all the way. It stays in a FastEthernet state (I believe that means 100M), while the other one registers at the full 2.5G.
More color around these ports: they are vPro (AMT) capable. So while the machine is “off”, these ports remain on in a low-power state to allow AMT remote access. It feels to me like the second port is not fully coming “UP” during the PXE process.
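For anyone who can get a shell on the node while it is in this state (I am not sure that is possible during commissioning, so treat that as an assumption), here is a minimal sketch for checking what the kernel itself thinks the link state and negotiated speed are. It reads the standard Linux sysfs files `operstate` and `speed` under `/sys/class/net`. The function name `read_link_state` is just something made up for illustration.

```python
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Return (operstate, speed_mbps) for iface; speed is None if unavailable."""
    base = Path(sysfs_root) / iface
    operstate = (base / "operstate").read_text().strip()
    try:
        speed = int((base / "speed").read_text().strip())
    except (OSError, ValueError):
        # Reading "speed" can fail outright when the link is down
        speed = None
    # Some drivers report -1 when the speed is unknown
    if speed is not None and speed < 0:
        speed = None
    return operstate, speed


if __name__ == "__main__":
    for path in sorted(Path("/sys/class/net").iterdir()):
        if path.name == "lo":
            continue  # skip loopback
        state, speed = read_link_state(path.name)
        print(f"{path.name}: state={state} speed={speed} Mb/s")
```

A port stuck at 100 while its twin shows 2500 would match what I am seeing from the outside.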
Well, I am seeing the same behavior with an old Beelink mini PC I have lying around. When I provision to LXD running on the MAAS server, I do not see this.
Welp, jumbo frames appear to be the cause of all of these problems. A little packet snooping showed a lot of retransmits and malformed packets. While I would like jumbo frames to be enabled, that is a problem for another day.
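For when I do come back to it, here is a rough sketch of how an MTU mismatch along a path could be probed without a packet capture. It assumes a Linux host with the iputils `ping`, whose `-M do` option forbids fragmentation, and it assumes the sending interface is already configured for the MTU being tested. The ICMP payload size is the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. The helper names and the target address `192.0.2.10` are placeholders, not anything from my setup.

```python
import subprocess

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def build_ping_command(host, mtu, count=3):
    """Build an iputils ping command that sets DF and fills the given MTU."""
    size = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", str(count), "-s", str(size), host]


def path_carries_mtu(host, mtu):
    """Return True if full-size, unfragmented pings to host get replies."""
    result = subprocess.run(
        build_ping_command(host, mtu), capture_output=True, text=True
    )
    return result.returncode == 0


if __name__ == "__main__":
    host = "192.0.2.10"  # placeholder address, replace with a real target
    for mtu in (1500, 9000):
        ok = path_carries_mtu(host, mtu)
        print(f"MTU {mtu}: {'ok' if ok else 'FAILED'}")
```

If 1500 passes and 9000 fails, something between the two endpoints is not passing jumbo frames.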