Network troubleshooting: Security and compliance

Enterprise controls can quietly break PXE, DHCP, DNS, TFTP, HTTP boot, metadata, and APT traffic. This document maps common security features to MAAS workflows, shows how to detect breakage, and offers fixes that preserve compliance without blocking provisioning.


Threat Model and Blast Radius

  • PXE and DHCP are unauthenticated by default. A rogue DHCP or TFTP server can hijack boots.
  • HTTP boot and image fetches may cross security zones; TLS interception and ACLs are common.
  • Packet captures can expose credentials and metadata; capture responsibly and redact artifacts.
  • Region and rack controllers are critical infrastructure; restrict access and monitor aggressively.

Baseline Allow Rules for MAAS Traffic

Open these ports between nodes, racks, and regions, including on any intermediate firewalls. Tune the rules to your IP ranges and interfaces.

  • DHCP: UDP 67, 68
  • TFTP: UDP 69
  • HTTP boot and images: TCP 80 (and 443 if you serve over HTTPS)
  • MAAS API and metadata: TCP 5240 to region controller
  • DNS: UDP/TCP 53
  • NTP: UDP 123
  • Proxy: TCP 3128 (or your configured port)
  • Rack ↔ Region internal ports: Allow TCP from rack to region for API, images, and websockets if used (commonly 5240 plus the RPC range 5250-5270; confirm against your MAAS version)

Example nftables sketch:

nft add rule inet filter forward tcp dport '{ 80, 443, 3128, 5240 }' counter accept
nft add rule inet filter forward udp dport '{ 53, 67, 68, 69, 123 }' counter accept
# Be stricter in production: scope src/dst subnets, interfaces, and rack/region IPs
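
As one way to follow the advice about scoping, the sketch below prints source- and destination-scoped rules for review instead of applying them. The function name maas_nft_rules and the example addresses are hypothetical, and the inet filter table with a forward chain is assumed to exist already, as in the commands above.

```shell
# Print scoped nftables commands for review; nothing is applied.
# Usage: maas_nft_rules <node-subnet> <region-ip>
maas_nft_rules() {
    nodes=$1    # provisioning subnet, e.g. 10.20.0.0/24 (assumption)
    region=$2   # region controller IP (assumption)
    cat <<EOF
nft add rule inet filter forward ip saddr $nodes ip daddr $region tcp dport 5240 counter accept
nft add rule inet filter forward ip saddr $nodes tcp dport '{ 80, 443, 3128 }' counter accept
nft add rule inet filter forward ip saddr $nodes udp dport '{ 53, 69, 123 }' counter accept
nft add rule inet filter forward udp dport '{ 67, 68 }' counter accept
EOF
}
```

DHCP stays scoped by port only in this sketch, because a client that has no lease yet sends from 0.0.0.0 and would not match a subnet filter. Review the printed lines, then paste or pipe them into a root shell once they match your topology.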

TLS Interception (SSL Bump) Pitfalls

Symptoms

  • Commissioning reaches metadata, but curl or apt fails with X.509 errors.
  • Cloud-init or Curtin logs show certificate verify failed.
  • Browsers work on corporate desktops, but ephemeral OS cannot validate the proxy CA.
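
To triage these symptoms quickly, a log can be scanned for the usual untrusted-CA messages. This is a minimal sketch: classify_tls_log is a hypothetical helper, and the matched strings are typical curl, OpenSSL, apt, and Python messages that may differ between versions.

```shell
# Read log text on stdin and print a coarse verdict.
classify_tls_log() {
    if grep -Eiq 'certificate verify failed|unable to get local issuer certificate|self[- ]signed certificate in certificate chain|certificate verification failed'; then
        echo "untrusted-ca"
    else
        echo "no-x509-error"
    fi
}
```

For example, `classify_tls_log < /var/log/cloud-init-output.log` on an affected node; a verdict of untrusted-ca points toward the CA fixes rather than a routing or ACL problem.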

Quick Checks

curl -v http://<rack-ip>:3128/
curl -v https://images.maas.io/ 2>&1 | sed -n '1,20p'
openssl s_client -connect <mirror>:443 -showcerts </dev/null | head -n 40

Fixes

  • Install the corporate root CA into the ephemeral and deployed environments:
    In MAAS settings → Commissioning and Deployment → add custom scripts or cloud-init snippets to drop the CA into /usr/local/share/ca-certificates and run update-ca-certificates.
  • If possible, bypass TLS interception for rack and region egress to known Ubuntu mirrors and MAAS metadata endpoints.
  • Prefer HTTP boot for speed, but ensure the final OS trusts the CA before APT updates.
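
For deployed machines, one possible way to deliver the CA is cloud-init's ca_certs module. The sketch below wraps a PEM file in a cloud-config document. The helper name ca_cloud_config is hypothetical. Older cloud-init releases spell the key ca-certs, so check the version in your images. The ephemeral commissioning environment may need a separate mechanism.

```shell
# Usage: ca_cloud_config /path/to/corp-root.pem > ca-user-data.yaml
ca_cloud_config() {
    pem=$1
    [ -r "$pem" ] || { echo "cannot read $pem" >&2; return 1; }
    printf '%s\n' '#cloud-config' 'ca_certs:' '  trusted:' '    - |'
    # Indent the PEM lines so they sit inside the YAML block scalar.
    sed 's/^/      /' "$pem"
}
```

The resulting file can then be passed at deploy time, for instance base64-encoded in the user_data parameter of the machine deploy call.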

Verification

update-ca-certificates -v
curl https://archive.ubuntu.com/ubuntu/dists/ --head
apt-get update -o Acquire::https::Verify-Peer=true

DHCP Snooping, Option 82, and IP Source Guard

What It Does

  • DHCP snooping marks ports as trusted or untrusted and builds bindings.
  • Option 82 inserts relay information; some servers require it.
  • IP source guard enforces bindings and can drop replies from racks.

MAAS Impact

  • Rack replies may be dropped if the rack's switch port is not marked as trusted.
  • Relay devices may strip or require Option 82 in ways MAAS does not expect.

Checks

# On switches (Cisco IOS syntax; other vendors differ)
show ip dhcp snooping binding
show ip source binding
# On the rack controller
tcpdump -n -e -vv -i <rack-if> 'port 67 or port 68'
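
A capture with many Discover messages and no Offer is the typical signature of snooping dropping rack replies. The sketch below tallies message types from verbose tcpdump text. dhcp_summary is a hypothetical helper, and it assumes tcpdump prints lines ending in the message type, such as "DHCP-Message (53), length 1: Discover", which can vary by tcpdump version.

```shell
# Count DHCP message types in `tcpdump -vv` text read from stdin.
dhcp_summary() {
    awk '
        /DHCP-Message/ { n[$NF]++ }   # last field is the message type
        END {
            printf "discover=%d offer=%d request=%d ack=%d\n",
                n["Discover"], n["Offer"], n["Request"], n["ACK"]
        }'
}
```

Typical use would be `tcpdump -n -vv -r boot.pcap 'port 67 or port 68' | dhcp_summary`, run once on the rack interface and once on the client VLAN to see where Offers disappear.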

Fixes

  • Mark rack controller ports as trusted DHCP.
  • Ensure the relay forwards DISCOVER to the rack and relays the OFFER back to the client VLAN.
  • If external DHCP is used, set next-server and bootfile correctly.

ARP Inspection, ND Inspection, and MAC Limits

Symptoms

  • Node never acquires IP or loses connectivity after boot.
  • Logs show gratuitous ARP or DAD failures.

Fixes

  • Allow expected ARP/ND frames from nodes and racks.
  • Increase per-port MAC limits to account for ephemeral MACs during provisioning.
  • Clear stale MAC table entries on access ports.

802.1X and MACsec Edges

  • 802.1X can block PXE because firmware boots before any supplicant starts. Exempt PXE VLANs or use MAC authentication bypass (MAB) for provisioning ports.
  • MACsec adds up to 32 bytes of per-frame overhead, which reduces the usable MTU; align MTU across bridges and tunnels.

Checks

journalctl -k | grep -i 'mtu.*exceed'
tracepath <region-ip>
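
The MTU arithmetic is easy to get wrong under pressure, so here is a small sketch of it. The function names are hypothetical. The 28-byte figure is the IPv4 header (20) plus the ICMP header (8), and 32 bytes is the worst-case MACsec overhead assumed here (16-byte SecTAG with SCI plus 16-byte ICV).

```shell
# Largest ICMP echo payload that fits an IPv4 path MTU without fragmentation.
icmp_payload_for_mtu() { echo $(( $1 - 28 )); }

# Usable MTU once worst-case MACsec overhead is subtracted.
mtu_after_macsec() { echo $(( $1 - 32 )); }
```

On Linux, `ping -M do -c 3 -s "$(icmp_payload_for_mtu 1468)" <region-ip>` then confirms whether a 1468-byte path MTU actually holds end to end.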

Proxies, Mirrors, and Content Filters

Symptoms

  • APT reports HTTP 407 (Proxy Authentication Required) or 403 (blocked by content category).
  • Metadata reachable, but Curtin fails on package fetch.

Checks

apt-config dump | grep -i proxy
curl -I http://<rack-ip>:3128/
curl -I http://archive.ubuntu.com/

Fixes

  • Configure MAAS proxy correctly and scope by space.
  • Allowlist Ubuntu mirrors, Snap endpoints, and MAAS metadata.
  • For authenticated proxies, inject APT auth configuration via cloud-init.
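
For an authenticated proxy, one common approach is an apt.conf.d snippet that embeds the credentials in the proxy URL. The sketch below only generates the text. apt_proxy_conf and the example host are hypothetical, and special characters in the user name or password would need percent-encoding, which this sketch does not do.

```shell
# Usage: apt_proxy_conf <host> <port> [user] [password]
apt_proxy_conf() {
    host=$1 port=$2 user=${3:-} pass=${4:-}
    if [ -n "$user" ]; then cred="$user:$pass@"; else cred=""; fi
    printf 'Acquire::http::Proxy "http://%s%s:%s/";\n' "$cred" "$host" "$port"
    printf 'Acquire::https::Proxy "http://%s%s:%s/";\n' "$cred" "$host" "$port"
}
```

Because the output contains a secret, write it to a root-owned file with mode 0600 (for example under /etc/apt/apt.conf.d/) rather than echoing it into logs.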

DNS Policies and Split-Horizon

  • Corporate DNS may serve different answers inside vs outside.
  • systemd-resolved on servers can conflict with MAAS-provided DNS.

Checks

resolvectl status
dig @<maas-dns-ip> <node>.maas A +noall +answer +ttlid
dig +trace archive.ubuntu.com

Fixes

  • Define forwarders in MAAS for corporate zones.
  • Raise TTL for stability during mass deployments.
  • Ensure deployed nodes use the intended stub or upstream resolver.
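
Split-horizon problems usually show up as two resolvers giving different answers for the same name. The sketch below compares them. compare_resolvers is a hypothetical helper, and the DIG variable exists only so the lookup command can be swapped out for testing.

```shell
DIG=${DIG:-dig}

# Usage: compare_resolvers <name> <resolver-1> <resolver-2>
compare_resolvers() {
    name=$1 r1=$2 r2=$3
    a1=$("$DIG" "@$r1" "$name" A +short | sort)
    a2=$("$DIG" "@$r2" "$name" A +short | sort)
    if [ "$a1" = "$a2" ]; then
        echo "MATCH $name"
    else
        # Unquoted echo flattens multi-line answers onto one line.
        echo "MISMATCH $name: $r1=[$(echo $a1)] $r2=[$(echo $a2)]"
    fi
}
```

Comparing the MAAS DNS address against the corporate resolver for both a node name and archive.ubuntu.com shows quickly which side serves the unexpected view.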

Hardening MAAS Hosts

  • Restrict SSH and API to admin subnets.
  • Apply least privilege on snap services and log file permissions.
  • Enable system and application auditing for rack and region.
  • Back up MAAS configs and database with encryption at rest.

Quick Lens

ss -ltnup | grep -E ':(5240|3128|67|69|53)[[:space:]]'
nft list ruleset | sed -n '1,120p'
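
The listener check can also be inverted to report which expected ports are absent. This is a sketch: missing_ports is a hypothetical helper, and it assumes ss-style output in which the local address and port are followed by whitespace.

```shell
# Read `ss -ltnuH` output on stdin; print each expected port with no listener.
# Usage: ss -ltnuH | missing_ports 5240 53 69
missing_ports() {
    listing=$(cat)
    for p in "$@"; do
        printf '%s\n' "$listing" | grep -Eq ":$p[[:space:]]" || echo "$p"
    done
}
```

Empty output means every listed port has a listener. Note that this says nothing about whether a firewall lets traffic reach it.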

Safe Packet Capture and Redaction

Do

  • Capture the minimum duration and scope needed.
  • Store pcaps in a restricted path with timestamps.
  • Redact or anonymize IP/MAC addresses where policy requires.

Avoid

  • Long captures on busy production VLANs without approval.
  • Exporting raw pcaps to third parties.

Redaction Helpers

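editcap -s 128 in.pcap out.pcap   # truncate packets to 128 bytes to drop most payload
tcprewrite --seed=42 --infile=in.pcap --outfile=anon.pcap   # randomize IPs (tcpreplay suite)
# On Wireshark releases before 3.0, use bootp and bootp.option.dhcp instead
tshark -r in.pcap -Y dhcp -T fields -e frame.time -e eth.src -e ip.src -e dhcp.option.dhcp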
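
When only text output needs to leave the team, addresses can be masked with sed. The sketch below keeps the vendor prefix of each MAC and the first three octets of each IPv4 address. redact_addrs is a hypothetical helper, and the IPv4 pattern is deliberately loose, so it may also touch dotted version strings.

```shell
# Mask the host part of MAC and IPv4 addresses in text read from stdin.
redact_addrs() {
    sed -E \
        -e 's/([0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){2})(:[0-9A-Fa-f]{2}){3}/\1:xx:xx:xx/g' \
        -e 's/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.[0-9]{1,3}/\1.x/g'
}
```

Piping tshark field output or tcpdump text through redact_addrs keeps enough structure to reason about vendors and subnets while hiding individual hosts.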

Evidence for Audits and Incidents

Collect

  • MAAS events: maas $PROFILE events query level=WARNING limit=500
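  • Service journals: journalctl -u snap.maas.pebble (snap installs of MAAS 3.5 and later), -u snap.maas.supervisor (earlier snaps), or -u maas-regiond -u maas-rackd (deb installs)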
  • Config snapshots: netplan get, nft list ruleset, bridge vlan
  • Short, targeted pcaps per symptom

Package

tar czf maas-net-evidence-$(date +%F).tar.gz /var/snap/maas/common/log/ /var/log/cloud-init*.log /var/log/curtin/install.log /tmp/*.pcap
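
The tar command above exits non-zero when any listed path is missing, which is common on hosts that never ran Curtin. A possible workaround is to filter the list first. existing_paths is a hypothetical helper, and the pipeline in the comment assumes GNU tar, where -T - reads file names from stdin.

```shell
# Print only the arguments that exist on this host, one per line.
existing_paths() {
    for p in "$@"; do
        [ -e "$p" ] && printf '%s\n' "$p"
    done
    return 0
}

# Example (GNU tar assumed):
#   umask 077
#   existing_paths /var/snap/maas/common/log /var/log/curtin/install.log \
#       | tar czf "maas-net-evidence-$(date +%F).tar.gz" -T -
```

Setting umask 077 before creating the archive keeps the bundle readable only by its owner, in line with the capture-handling guidance earlier.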

Runbooks

RB-1: SSL Intercept Breaks Commissioning

  1. Reproduce with curl -v https://archive.ubuntu.com/ from ephemeral network.
  2. Export chain with openssl s_client -showcerts.
  3. Install corporate CA via cloud-init snippet.
  4. Verify update-ca-certificates and rerun commissioning.

RB-2: DHCP Snooping Drops Rack Offers

  1. Check switch snooping bindings.
  2. Mark rack port as trusted.
  3. Verify with tcpdump 'port 67 or 68' during a boot attempt.

RB-3: Proxy Auth Blocks APT

  1. Confirm apt-config dump shows expected proxy.
  2. Add APT auth file via cloud-init.
  3. Test apt-get update and watch proxy logs.

Next Steps

Next up: checklists, runbooks, and an evidence bundle script you can drop into any MAAS deployment to standardize troubleshooting and escalation.