Machine stuck at PXE boot during commissioning

I was trying to deploy clusters using MAAS (3.1) + Juju. The VMs are getting stuck during the boot stage of commissioning.

After that, the machine ends up timing out. The boot issue does not occur when we cancel the deployment and redeploy. Every time the issue occurs, it is at the same stage. Please let us know of any configuration mismatch, or anything else we can do to overcome this issue.

Details of the stage at which it gets stuck:

Fri, 11 Feb. 2022 09:53:30 HTTP Request - /images/ubuntu/amd64/ga-20.04/focal/stable/boot-initrd
Fri, 11 Feb. 2022 09:53:29 HTTP Request - /images/ubuntu/amd64/ga-20.04/focal/stable/boot-kernel
Fri, 11 Feb. 2022 09:53:29 TFTP Request - /grub/grub.cfg-00:16:3e:cd:01:f8
Fri, 11 Feb. 2022 09:53:29 Performing PXE boot
Fri, 11 Feb. 2022 09:53:29 PXE Request - installation
Fri, 11 Feb. 2022 09:53:29 TFTP Request - /grub/x86_64-efi/terminal.lst
Fri, 11 Feb. 2022 09:53:29 TFTP Request - /grub/x86_64-efi/fs.lst
Fri, 11 Feb. 2022 09:53:29 TFTP Request - /grub/grub.cfg
Fri, 11 Feb. 2022 09:53:29 TFTP Request - /grub/x86_64-efi/crypto.lst
Fri, 11 Feb. 2022 09:53:29 TFTP Request - /grub/x86_64-efi/command.lst
Fri, 11 Feb. 2022 09:53:27 TFTP Request - grubx64.efi
Fri, 11 Feb. 2022 09:53:27 TFTP Request - bootx64.efi
Fri, 11 Feb. 2022 09:53:27 TFTP Request - bootx64.efi
Fri, 11 Feb. 2022 09:53:17 Node powered on
Fri, 11 Feb. 2022 09:53:16 Powering on
Fri, 11 Feb. 2022 09:53:12 Deploying
Fri, 11 Feb. 2022 09:53:12 Node changed status - From ‘Allocated’ to ‘Deploying’
Fri, 11 Feb. 2022 09:53:12 User starting deployment - (veera)
Fri, 11 Feb. 2022 09:53:09 Node changed status - From ‘Ready’ to ‘Allocated’ (to veera)
Fri, 11 Feb. 2022 09:53:09 User acquiring node - (veera)
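
MAAS lists events newest first, so the stall point is easier to see when the log is read oldest first. Below is a minimal sketch, assuming the log has been pasted as plain text in the format shown above and that the month is an English abbreviation with an optional trailing dot. `parse_events` is a hypothetical helper, not part of MAAS.

```python
import re
from datetime import datetime

# A few lines copied from the event log above (newest first, as MAAS shows them).
LOG = """\
Fri, 11 Feb. 2022 09:53:30 HTTP Request - /images/ubuntu/amd64/ga-20.04/focal/stable/boot-initrd
Fri, 11 Feb. 2022 09:53:29 HTTP Request - /images/ubuntu/amd64/ga-20.04/focal/stable/boot-kernel
Fri, 11 Feb. 2022 09:53:29 Performing PXE boot
Fri, 11 Feb. 2022 09:53:27 TFTP Request - grubx64.efi
Fri, 11 Feb. 2022 09:53:17 Node powered on
"""

# Assumed line layout: "<Day>, <DD> <Mon>[.] <YYYY> <HH:MM:SS> <message>"
LINE = re.compile(r"^(\w{3}, \d{1,2} \w{3})\.? (\d{4} \d{2}:\d{2}:\d{2}) (.+)$")


def parse_events(text):
    """Return (timestamp, message) tuples sorted oldest first."""
    events = []
    # Reverse first so that events sharing a second keep their original order
    # after the stable sort below.
    for line in reversed(text.strip().splitlines()):
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not an event line
        stamp = datetime.strptime(
            f"{match.group(1)} {match.group(2)}", "%a, %d %b %Y %H:%M:%S"
        )
        events.append((stamp, match.group(3)))
    return sorted(events, key=lambda event: event[0])


events = parse_events(LOG)
stamp, message = events[-1]
print(f"Last event before the stall: {stamp:%H:%M:%S} {message}")
```

In this log the last recorded event is the request for `boot-initrd`. That suggests the machine got through the GRUB, kernel and initrd requests and stalled at some point after the initrd was requested.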

Output of the bundle deployment:

Added ‘kube-veera-23’ model on rnd-maas-cloud/default with credential ‘rnd-veera’ for user ‘admin’
rndhpv2-control:admin/kube-veera-23 (no change)
Located charm “calico” in charm-store, revision 860
Located charm “containerd” in charm-store, revision 200
Located charm “easyrsa” in charm-store, revision 441
Located charm “etcd” in charm-store, revision 655
Located charm “kubernetes-master” in charm-store, revision 1106
Located charm “kubernetes-worker” in charm-store, revision 838
Executing changes:

  • upload charm calico from charm-store for series focal with architecture=amd64
  • deploy application calico from charm-store on focal
    added resource calico
    added resource calico-arm64
    added resource calico-node-image
    added resource calico-upgrade
    added resource calico-upgrade-arm64
  • set annotations for calico
  • upload charm containerd from charm-store for series focal with architecture=amd64
  • deploy application containerd from charm-store on focal
  • set annotations for containerd
  • upload charm easyrsa from charm-store for series focal with architecture=amd64
  • deploy application easyrsa from charm-store on focal
    added resource easyrsa
  • set annotations for easyrsa
  • upload charm etcd from charm-store for series focal with architecture=amd64
  • deploy application etcd from charm-store on focal
    added resource core
    added resource etcd
    added resource snapshot
  • set annotations for etcd
  • upload charm kubernetes-master from charm-store for series focal with architecture=amd64
  • deploy application kubernetes-master from charm-store on focal
    added resource cdk-addons
    added resource cni-amd64
    added resource cni-arm64
    added resource cni-s390x
    added resource core
    added resource kube-apiserver
    added resource kube-controller-manager
    added resource kube-proxy
    added resource kube-scheduler
    added resource kubectl
  • expose all endpoints of kubernetes-master and allow access from CIDRs 0.0.0.0/0 and ::/0
  • set annotations for kubernetes-master
  • upload charm kubernetes-worker from charm-store for series focal with architecture=amd64
  • deploy application kubernetes-worker from charm-store on focal
    added resource cni-amd64
    added resource cni-arm64
    added resource cni-s390x
    added resource core
    added resource kube-proxy
    added resource kubectl
    added resource kubelet
  • expose all endpoints of kubernetes-worker and allow access from CIDRs 0.0.0.0/0 and ::/0
  • set annotations for kubernetes-worker
  • add new machine 0
  • add new machine 1
  • add relation containerd:containerd - kubernetes-worker:container-runtime
  • add relation containerd:containerd - kubernetes-master:container-runtime
  • add relation kubernetes-master:kube-api-endpoint - kubernetes-worker:kube-api-endpoint
  • add relation kubernetes-master:kube-control - kubernetes-worker:kube-control
  • add relation kubernetes-master:certificates - easyrsa:client
  • add relation kubernetes-master:etcd - etcd:db
  • add relation kubernetes-worker:certificates - easyrsa:client
  • add relation etcd:certificates - easyrsa:client
  • add relation calico:etcd - etcd:db
  • add relation calico:cni - kubernetes-master:cni
  • add relation calico:cni - kubernetes-worker:cni
  • add unit etcd/0 to new machine 0
  • add unit kubernetes-master/0 to new machine 0
  • add unit kubernetes-worker/0 to new machine 1
  • add lxd container 0/lxd/0 on new machine 0
  • add unit easyrsa/0 to 0/lxd/0
    Deploy of bundle completed.

Does anyone have input on this issue? We often get stuck in PXE boot, and the machine times out after some time.

hi, @veeraraghavan1,

it might be a long shot, but can you check and make sure that you don’t have overlapping subnets in MAAS? this can sometimes cause the error you’re seeing.

Hi @billwear,

I think both subnets are in the same VLAN, and this creates the issue. How can we stop this overlapping?

@veeraraghavan1, ideally, the subnets themselves shouldn’t have overlapping IP ranges, i would think. is that what’s happening?
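
One quick way to check is to copy the subnet CIDRs out of MAAS (from the Subnets page in the UI, or from `maas $PROFILE subnets read`) and compare them pairwise. Below is a minimal sketch using Python's standard `ipaddress` module. The CIDRs are hypothetical placeholders, not values from this deployment, and `find_overlaps` is an illustrative helper rather than anything MAAS provides.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs):
    """Return pairs of CIDR strings whose address ranges overlap."""
    nets = [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(nets, 2)
        # Only compare networks of the same IP version.
        if a.version == b.version and a.overlaps(b)
    ]


# Hypothetical subnets; replace with the CIDRs defined in your MAAS.
subnets = ["10.0.0.0/24", "10.0.0.128/25", "192.168.1.0/24"]

for a, b in find_overlaps(subnets):
    print(f"{a} overlaps {b}")
```

Two subnets sharing a VLAN is not flagged by this check. It only reports pairs whose IP ranges actually intersect, which is the condition described above.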