This appendix lists frequent, high-impact failure patterns (the “known bads”), including quick signatures, fast confirmation steps, and practical fixes. Use it as a first-pass triage aid before deep dives.
KB-01: Rogue DHCP on the VLAN
Signature
- Two DHCP offers from different MACs or with conflicting options.
- Nodes pick unexpected gateway or DNS.
Fast Confirm
tcpdump -n -e -vv -i <iface> '(port 67 or 68)' -c 200
Look for multiple OFFERs with different siaddr or option 66/67.
Fix
- Disable libvirt or LXD dnsmasq on hosts attached to the production VLAN.
- Shut down consumer routers accidentally bridged into the fabric.
- Keep only one authoritative DHCP per broadcast domain.
KB-02: DHCP Relay Mispointed or Missing
Signature
- No offers when clients and rack are on different subnets.
- Offers appear when client is moved to the rack subnet.
Fast Confirm
- Check switch or router
ip helpertargets. - Capture on rack interface to see no inbound relayed packets.
Fix
- Add or correct
helper-addressto point at rack IP. - Repeat per VRF if VRFs are in use.
KB-03: STP Blocks Port During PXE
Signature
- Client link up, but PXE times out within 5 to 10 seconds.
- Switch shows listening/learning state during boot.
Fast Confirm
show spanning-tree interface <port> detail
Fix
- Enable PortFast or Edge on the access port.
- Avoid loop guard or root guard on server access ports unless required.
KB-04: MTU Mismatch Across Overlays
Signature
tracepathshows PMTU reductions or “too big” ICMP.- HTTP image fetches stall partway.
Fast Confirm
tracepath <rack-ip>
Fix
- Align MTU end to end; add 50–100 bytes headroom for VXLAN/GRE/MACsec.
- Set the same MTU on NIC, bond, bridge, and uplink.
KB-05: SSL Interception Breaks Commissioning
Signature
aptorcurlfails with X.509 errors in ephemeral OS.- Proxy works on desktops but not during commissioning.
Fast Confirm
curl -v https://archive.ubuntu.com/ 2>&1 | head -n 20
Fix
- Install corporate root CA in ephemeral and deployed images.
- Bypass SSL bumping for MAAS metadata and Ubuntu mirrors if possible.
KB-06: Libvirt or LXD dnsmasq Conflict
Signature
- DHCP offers originate from 192.168.122.1 or
lxdbr0host. - Two offers with different lease times/options.
Fast Confirm
ps aux | grep -E 'dnsmasq|libvirtd|lxd'
Fix
- Disable default libvirt dnsmasq and LXD dnsmasq on hosts bridged to production.
- Isolate lab bridges from MAAS fabrics.
KB-07: DHCP Snooping or IP Source Guard Drops Replies
Signature
- Discoveries seen, but offers never reach clients.
- Works when rack moved to a different switchport.
Fast Confirm
- Check switch
show dhcp snoopingfor bindings and trusted ports. - Packet capture on rack shows offers leaving but not arriving at client.
Fix
- Mark rack port as trusted.
- Verify snooping database and source guard policies allow rack replies.
KB-08: Option 82 Expectations Mismatch
Signature
- DHCP server ignores relayed requests, or client ignores offers.
- Only fails behind specific relay devices.
Fast Confirm
- Check relay policy for option 82 insertion or stripping.
- Inspect offers in capture for relay agent options.
Fix
- Standardize consistent option 82 handling.
- If using external DHCP, ensure next-server and bootfile are set correctly.
KB-09: Wrong VLAN or Native VLAN on Access Port
Signature
- No DHCP seen, or DHCP from an unexpected subnet.
- LLDP shows the wrong VLAN name/ID.
Fast Confirm
lldpcli show neighbors
tcpdump -i <iface> '(port 67 or 68)'
Fix
- Set the intended untagged/native VLAN on the access port.
- Verify trunk allowed-VLANs for hypervisors or bonded hosts.
KB-10: Proxy Auth or Category Filter Blocks APT
Signature
- 407 Proxy Auth Required, or 403 Blocked by Category.
- Commissioning reaches metadata but Curtin fails.
Fast Confirm
apt-config dump | grep -i proxy
curl -I http://<rack-ip>:3128/
Fix
- Inject APT proxy credentials via cloud-init.
- Allowlist Ubuntu mirrors and Snap endpoints.
KB-11: DNS Split-Horizon Confusion
Signature
- Name resolves differently from ephemeral vs deployed node.
- Intermittent “Temporary failure in name resolution”.
Fast Confirm
resolvectl status
dig @<maas-dns-ip> <node>.maas A +noall +answer +ttlid
Fix
- Set forwarders in MAAS for corporate zones.
- Ensure deployed nodes use the intended stub or recursive resolver.
KB-12: UEFI HTTP Boot vs Legacy PXE Mismatch
Signature
- Firmware attempts HTTP boot but TFTP is configured (or vice versa).
- DHCP options 66/67 correct but firmware ignores them.
Fast Confirm
- Read firmware boot entries and mode (UEFI vs Legacy).
- Observe client requests in capture: HTTP vs TFTP.
Fix
- Align firmware mode with MAAS boot method.
- Prefer UEFI HTTP boot where supported.
KB-13: Predictable Interface Name Mismatch in Netplan
Signature
- Deployed node returns without network.
- Netplan references
eno1but hardware exposesenp3s0.
Fast Confirm
ip link
netplan get
journalctl -u systemd-networkd
Fix
- Update MAAS interface mapping or disable renaming if appropriate.
- Re-render netplan and apply.
KB-14: Region and Rack Out of Sync or Unreachable
Signature
- Rack shows offline in UI.
- Machines fail to fetch metadata despite DHCP success.
Fast Confirm
ss -ltnup | grep 5240
ip route get <region-ip>
journalctl -u snap.maas.rackd
Fix
- Open required ports and correct routing between rack and region.
- Restart services after link recovery; verify time sync.
KB-15: MAC Limits and Port Security
Signature
- Node fails on first boot after cable moves.
- Switch shows security violation on the port.
Fast Confirm
show port-security interface <port>
show mac address-table interface <port>
Fix
- Increase max MACs on access ports for provisioning.
- Clear sticky MAC entries or disable sticky on lab ports.
KB-16: NTP Unreachable, Time Skew
Signature
- Token or TLS failures during commissioning.
- Logs show clock skew warnings.
Fast Confirm
chronyc tracking
Fix
- Permit UDP 123 to NTP servers.
- Configure NTP sources reachable from ephemeral networks.
KB-17: HTTP Keep-Alive Disabled on Proxy
Signature
- Many short HTTP requests are slow.
- Proxy shows high connection churn.
Fast Confirm
- Examine proxy config and access logs for reuse.
- Measure with
curl -vand look for “Connection: close”.
Fix
- Enable keep-alive and tune connection pooling on the rack proxy.
- Upgrade or scale out proxies under load.
KB-18: IPv6 RAs or DHCPv6 Interfering
Signature
- Clients pick IPv6 route and fail to reach IPv4-only endpoints.
- Logs show DHCPv6 attempts in mixed networks.
Fast Confirm
tcpdump -i <iface> 'icmp6 or port 546 or 547'
Fix
- Ensure dual-stack reachability end to end, or limit to IPv4 during provisioning.
- Set correct router advertisements for desired behavior.
KB-19: Content Filter Blocks Metadata Path
Signature
- Curl to /MAAS/ returns 403 through middleboxes.
- Works on admin subnet but not on commissioning VLAN.
Fast Confirm
curl -I http://<region-ip>:5240/MAAS/
Fix
- Allowlist region metadata endpoints.
- Remove path-based filtering for commissioning subnets.
KB-20: TFTP Bound to Wrong Interface
Signature
- DHCP succeeds, TFTP times out, rack has multiple NICs.
Fast Confirm
ss -uanp | grep ':69 '
Fix
- Bind TFTP to the VLAN interface serving PXE clients.
- Prefer HTTP boot if firmware supports it.
Quick Triage Loop
- Reproduce and capture just long enough to confirm the signature.
- Match to a known bad pattern above.
- Apply the minimal fix and retest.
- If still failing, escalate to the symptom-driven playbooks and tools catalog.
Next Steps
Wrap up with Appendix E: Lab Patterns and Cookbook for Multipass, LXD, and KVM environments that coexist peacefully with MAAS.