When a MAAS workflow stalls, the symptoms you see may not reveal the real cause. This document provides playbooks you can follow step by step, organized by common failure points. Each playbook lists the symptoms you see, fast triage steps, useful tools, and common fixes.
Cannot PXE or iPXE boot at all
Symptoms:
- No boot prompt, no PXE banner, machine falls back to disk or errors immediately.
Fast triage:
- Check link lights.
- Confirm VLAN tagging on switchport.
- Ensure STP is not blocking the port.
Tools:
ip link
ethtool <iface>
lldpcli show neighbors
Fixes:
- Enable portfast/edge on access switchports.
- Configure the correct untagged/native VLAN.
- Allow expected VLANs on trunks.
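The link check above can be scripted. This is a minimal sketch that reads the kernel's view of the link from sysfs; `link_report` and the `SYSFS_NET` override are hypothetical names introduced here, not part of MAAS.

```shell
# Sketch: summarize link state for one interface from sysfs.
# SYSFS_NET is a hypothetical override (not a standard variable) so the
# function can be pointed at a fake tree; it defaults to /sys/class/net.
link_report() {
    iface=$1
    base=${SYSFS_NET:-/sys/class/net}/$iface
    if [ ! -d "$base" ]; then
        echo "$iface: no such interface"
        return 2
    fi
    # Reading carrier can fail while the interface is administratively down.
    state=$(cat "$base/operstate" 2>/dev/null || echo unknown)
    carrier=$(cat "$base/carrier" 2>/dev/null || echo unknown)
    echo "$iface: operstate=$state carrier=$carrier"
    # Succeed only when the kernel reports a physical link.
    [ "$carrier" = "1" ]
}

# Example: link_report eno1 || echo "check cable, switchport, and STP state"
```

A carrier of 0 points at cabling or the switchport; a carrier of 1 with no PXE banner points further up the stack, at VLAN or STP configuration.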
Client gets no DHCP offer
Symptoms:
- PXE starts but times out with “No DHCP or proxyDHCP offers received.”
Fast triage:
- Capture DHCP traffic.
Tools:
tcpdump -n -e -vv -i <iface> port 67 or port 68
Fixes:
- Configure DHCP relay to point to the MAAS rack controller.
- Disable rogue DHCP servers (e.g. libvirt, Wi‑Fi routers).
- Confirm MAAS DHCP is enabled on the VLAN.
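To spot a rogue DHCP server in a capture, it helps to list which servers identify themselves. The sketch below assumes that `tcpdump -vv` prints DHCP option 54 on a line containing `Server-ID` with the address as the last field; the exact wording differs between tcpdump versions, and `dhcp_servers` is a name made up for this example.

```shell
# Sketch: list the distinct DHCP server identifiers seen in a capture.
# Assumption: each option 54 line contains "Server-ID" and ends with the
# server address. Reads tcpdump -vv text on stdin.
dhcp_servers() {
    awk '/Server-ID/ { print $NF }' | sort -u
}

# Example (needs root, stops after 20 packets):
#   tcpdump -n -vv -c 20 -i <iface> port 67 or port 68 2>/dev/null | dhcp_servers
```

More than one address in the output suggests that more than one server is answering on the VLAN. No output at all suggests offers never reach the client segment, which points at the relay.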
DHCP works but TFTP times out
Symptoms:
- DHCP succeeds, but the PXE client retries and reports “TFTP open timeout.”
Fast triage:
- Filter for port 69 traffic.
Tools:
tcpdump -n -i <iface> udp port 69
atftp --get -r pxelinux.0 <tftp-ip>
Fixes:
- Open UDP 69 on firewalls.
- Ensure TFTP is bound to the correct interface on the rack controller.
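To confirm which address TFTP is bound to on the rack controller, you can filter the UDP listener table. This is a sketch that assumes the usual `ss -lun` layout, where the local address and port appear as one `address:port` field; `tftp_listeners` is a hypothetical helper name.

```shell
# Sketch: print every local socket address bound to UDP port 69.
# Reads `ss -lun` output on stdin and scans each field, so it does not
# depend on the exact column position.
tftp_listeners() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /:69$/) print $i }'
}

# Example: ss -lun | tftp_listeners
```

An address such as `0.0.0.0:69` or `[::]:69` means all interfaces. A specific IP means only that interface answers, which explains timeouts from other subnets. No output means nothing is listening on port 69.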
Bootloader loads, kernel fetch fails
Symptoms:
- PXE banner shows, but kernel/initrd download never starts.
Fast triage:
- Test HTTP from the client subnet.
Tools:
curl -I http://<maas-host>/images/...
Fixes:
- Permit HTTP through firewalls.
- Verify proxy configuration.
- Confirm rack controller is serving the correct image.
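When you run the HTTP test from several subnets, a small classifier makes the results easier to compare. This is a sketch only: `classify_http` is a hypothetical helper, and the grouping of status codes is a triage convenience, not something MAAS defines.

```shell
# Sketch: turn a curl status code into a triage hint.
# curl reports 000 when it received no HTTP response at all.
classify_http() {
    case $1 in
        2??|3??) echo "reachable" ;;
        000)     echo "no HTTP response" ;;
        407)     echo "proxy authentication required" ;;
        4??|5??) echo "server answered with an error" ;;
        *)       echo "unexpected status: $1" ;;
    esac
}

# Example, using the same URL as the curl command above:
#   code=$(curl -s -o /dev/null -I -w '%{http_code}' "$url")
#   classify_http "$code"
```

"No HTTP response" points at firewalls or routing. An error status means the request arrived, so look at the proxy configuration or the image path instead.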
Kernel boots, commissioning fails
Symptoms:
- Machine boots ephemeral OS but fails during commissioning.
Fast triage:
- Test metadata and DNS.
Tools:
curl -I http://<maas-host>:5240/MAAS/
dig rackd.maas
Fixes:
- Allow metadata traffic through ACLs.
- Confirm DNS resolution of the MAAS region host.
- Check rack logs for commissioning errors.
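The metadata and DNS tests can be run together from the ephemeral environment and reported side by side. This is a minimal sketch; `run_check` is a hypothetical wrapper, and `MAAS_URL` and `MAAS_HOST` are placeholder variables you would set yourself.

```shell
# Sketch: run one command quietly and print PASS or FAIL with a label.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS $label"
    else
        echo "FAIL $label"
    fi
}

# Example (MAAS_URL and MAAS_HOST are placeholders, not MAAS settings):
#   run_check "metadata HTTP" curl -fsS -o /dev/null "$MAAS_URL"
#   run_check "region DNS"    getent hosts "$MAAS_HOST"
```

`getent hosts` is used for the DNS check because it exits non-zero when the name does not resolve, whereas `dig` exits zero even when the answer is empty.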
Commissioning finishes but deploy fails at curtin/apt
Symptoms:
- Curtin installer errors on package download.
Fast triage:
- Test proxy and mirror reachability.
Tools:
curl -I http://archive.ubuntu.com/
apt-config dump | grep -i proxy
Fixes:
- Configure MAAS proxy correctly.
- Allow apt mirrors and proxy ports through firewalls.
- Add root certificates if SSL interception is in place.
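To see exactly which proxy apt will use, the `apt-config dump` output can be reduced to name and value pairs. The sketch below relies on the `Key "value";` line format that `apt-config dump` prints; `apt_proxies` is a hypothetical helper name, and the addresses in any sample output are illustrative only.

```shell
# Sketch: print each apt proxy setting as "key value".
# Reads `apt-config dump` output on stdin.
apt_proxies() {
    grep -i 'proxy' | sed -n 's/^\([^ ]*\) "\(.*\)";$/\1 \2/p'
}

# Example: apt-config dump | apt_proxies
```

Compare the reported URL with the proxy MAAS is expected to hand out. A value of `DIRECT` or a missing entry means apt bypasses the proxy for that scheme.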
Deployed host has no network after reboot
Symptoms:
- Deployment completes, machine reboots, but has no IP.
Fast triage:
- Check netplan render.
Tools:
netplan get
journalctl -u systemd-networkd
Fixes:
- Correct VLAN/bond configuration in MAAS.
- Ensure the predictable interface names on the deployed host match the names in the netplan configuration MAAS generated.
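A quick way to catch a name mismatch is to compare the interface names in the rendered netplan with the names the kernel actually created. This is a sketch; `missing_ifaces` is a hypothetical helper, and the expected names are supplied by hand from the netplan output.

```shell
# Sketch: print every expected interface name that is absent on the host.
# $1: whitespace-separated names taken from the netplan configuration
# $2: whitespace-separated names present on the host
missing_ifaces() {
    expected=$1
    # Collapse newlines and repeated blanks so names match as " name ".
    actual=" $(echo $2) "
    for name in $expected; do
        case $actual in
            *" $name "*) ;;
            *) echo "$name" ;;
        esac
    done
}

# Example: missing_ifaces "ens3 ens4" "$(ls /sys/class/net)"
```

Any name printed exists in the configuration but not on the host, so networkd has nothing to apply the address to.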
Slow image downloads or installs
Symptoms:
- PXE/curtin steps crawl or stall.
Fast triage:
- Test throughput and MTU.
Tools:
iperf3 -c <server>
tracepath <maas-ip>
Fixes:
- Fix MTU mismatches across bridges and tunnels.
- Disable TCP offloads if the NIC driver is buggy.
- Balance load on MAAS proxy.
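An MTU mismatch can be confirmed by sending a packet that exactly fills the reported path MTU with fragmentation forbidden. The sketch below assumes IPv4 and the `Resume: pmtu N hops ...` summary line that `tracepath` prints last; both function names are hypothetical.

```shell
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Extract the path MTU from tracepath's final "Resume:" line (stdin).
path_mtu() {
    awk '/Resume:/ { for (i = 1; i < NF; i++) if ($i == "pmtu") print $(i + 1) }'
}

# Example:
#   mtu=$(tracepath -n <maas-ip> | path_mtu)
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu "$mtu")" <maas-ip>
```

If the ping fails at the interface MTU but succeeds with a smaller payload, a bridge or tunnel on the path has a lower MTU than the endpoints.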
Intermittent DNS failures
Symptoms:
- Nodes sometimes resolve names, sometimes fail.
Fast triage:
- Inspect systemd‑resolved state.
Tools:
resolvectl status
dig +trace <hostname>
Fixes:
- Configure correct search domains.
- Avoid conflicting DNS services on the host.
- Clear stale DHCP leases that leave outdated records in MAAS DNS.
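Because the failure is intermittent, a single lookup proves little. This sketch repeats a lookup command and counts how often it fails; `count_failures` is a hypothetical helper, and any command that exits non-zero on failure can be substituted.

```shell
# Sketch: run a command N times and report how many runs failed.
# $1: number of attempts; remaining arguments: the command to run.
count_failures() {
    tries=$1
    shift
    fails=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" >/dev/null 2>&1 || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails/$tries failed"
}

# Example: count_failures 20 getent hosts <hostname>
```

A failure rate close to one in two or one in three often matches the number of configured DNS servers, which suggests one of them gives different answers from the others. Local caching can mask failures, so treat a clean run as weak evidence.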
Region and rack lost contact
Symptoms:
- MAAS UI shows rack offline.
Fast triage:
- Check rack ↔ region reachability.
Tools:
journalctl -u 'snap.maas.*'
ss -plant | grep 5240
ip route get <region-ip>
Fixes:
- Open required ports between rack and region.
- Correct routing if multiple VRFs are in use.
- Restart MAAS services after link recovery.
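The `ss` check can be reduced to a single number: how many established TCP sessions involve port 5240. This is a sketch that assumes the `ss -plant` column layout (state first, local address fourth, peer address fifth); `region_conns` is a hypothetical helper name.

```shell
# Sketch: count established TCP connections with port 5240 on either end.
# Reads `ss -plant` output on stdin.
region_conns() {
    awk '$1 == "ESTAB" && ($4 ~ /:5240$/ || $5 ~ /:5240$/) { n++ }
         END { print n + 0 }'
}

# Example: ss -plant | region_conns
```

A count of 0 on the rack while the region is listening points at a firewall or routing problem between them rather than a stopped service.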
Next steps
These playbooks cover the most visible failures. For a deeper look at each tool and how to apply it, see the Tools catalog in the next document.