Commissioning VM in LXD host fails

@codingfreak

Sorry I forgot to mention which command to use: lxc launch ubuntu:jammy dummy --vm --storage default --console

Do you mean under the project “maas-project”, which is created by MAAS? Yes, it failed as shown below:

# sudo lxc launch ubuntu:jammy dummy --vm --project maas-project 
Creating dummy
Error: Failed instance creation: Failed creating instance record: Failed initialising instance: Failed getting root disk: No root device could be found

But it worked fine under the project “default”, as shown below:

# sudo lxc launch ubuntu:jammy dummy --vm --project default 
Creating dummy
Starting dummy
# lxc ls --project default 
+-------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
| NAME  |  STATE  |          IPV4          |                      IPV6                       |      TYPE       | SNAPSHOTS |
+-------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
| dummy | RUNNING | 10.231.47.151 (enp5s0) | fd42:a037:1fd4:913b:216:3eff:fee0:5b8f (enp5s0) | VIRTUAL-MACHINE | 0         |
+-------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
| mm01  | RUNNING | 10.231.47.29 (enp5s0)  | fd42:a037:1fd4:913b:216:3eff:fec2:c06f (enp5s0) | VIRTUAL-MACHINE | 0         |
+-------+---------+------------------------+-------------------------------------------------+-----------------+-----------+

You should also add --project maas-project

@troyanov

# lxc info --show-log testVM --project maas-project 
Name: testVM
Status: STOPPED
Type: virtual-machine
Architecture: x86_64
Created: 2023/10/10 12:19 PDT
Error: open /var/snap/lxd/common/lxd/logs/maas-project_testVM/qemu.log: no such file or directory

@codingfreak from what I see, MAAS is able to talk to the LXD API; however, something might be wrong with the LXD configuration itself.

Can you please collect some logs and output of what happens when you try to create a VM via lxc?

I mean a command like lxc launch ubuntu:jammy dummy --vm --project maas-project should work, and the error you posted earlier makes me wonder whether LXD was configured correctly.

No root device could be found

It might be that the storage pool was not initialized properly.

@troyanov

As I mentioned in my previous replies, I suspected that the default profile of maas-project is incomplete and that this might be causing the error. As the logs below show, other projects such as default and client1-iso-project work fine.

If maas-project is the LXD project that MAAS creates automatically on the server, does the user need to modify it explicitly to make it work?

# lxc project ls 
+-------------------------------+--------+----------+-----------------+-----------------+----------+---------------+-------------------------+---------+
|             NAME              | IMAGES | PROFILES | STORAGE VOLUMES | STORAGE BUCKETS | NETWORKS | NETWORK ZONES |       DESCRIPTION       | USED BY |
+-------------------------------+--------+----------+-----------------+-----------------+----------+---------------+-------------------------+---------+
| client1-iso-project (current) | YES    | YES      | YES             | YES             | NO       | NO            |                         | 9       |
+-------------------------------+--------+----------+-----------------+-----------------+----------+---------------+-------------------------+---------+
| default                       | YES    | YES      | YES             | YES             | YES      | YES           | Default LXD project     | 3       |
+-------------------------------+--------+----------+-----------------+-----------------+----------+---------------+-------------------------+---------+
| maas-project                  | YES    | YES      | YES             | YES             | NO       | NO            | Project managed by MAAS | 4       |
+-------------------------------+--------+----------+-----------------+-----------------+----------+---------------+-------------------------+---------+
# 
# lxc ls --project client1-iso-project
+---------------+---------+-----------------------+-------------------------------------------------+-----------------+-----------+
|     NAME      |  STATE  |         IPV4          |                      IPV6                       |      TYPE       | SNAPSHOTS |
+---------------+---------+-----------------------+-------------------------------------------------+-----------------+-----------+
| debian12      | RUNNING | 10.231.47.64 (eth0)   | fd42:a037:1fd4:913b:216:3eff:fe11:8897 (eth0)   | CONTAINER       | 0         |
+---------------+---------+-----------------------+-------------------------------------------------+-----------------+-----------+
| rocky9        | RUNNING | 10.231.47.23 (enp5s0) | fd42:a037:1fd4:913b:d51d:894:e79b:5a21 (enp5s0) | VIRTUAL-MACHINE | 0         |
+---------------+---------+-----------------------+-------------------------------------------------+-----------------+-----------+
| ubuntulobster | RUNNING | 10.231.47.81 (enp5s0) | fd42:a037:1fd4:913b:216:3eff:fea7:86bb (enp5s0) | VIRTUAL-MACHINE | 0         |
+---------------+---------+-----------------------+-------------------------------------------------+-----------------+-----------+
# 
# lxc ls --project default
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+
root@mltr01:/home/codingfreak# 
root@mltr01:/home/codingfreak# lxc ls --project maas-project
+--------+---------+------+------+-----------------+-----------+
|  NAME  |  STATE  | IPV4 | IPV6 |      TYPE       | SNAPSHOTS |
+--------+---------+------+------+-----------------+-----------+
| test01 | STOPPED |      |      | VIRTUAL-MACHINE | 0         |
+--------+---------+------+------+-----------------+-----------+
| testVM | STOPPED |      |      | VIRTUAL-MACHINE | 0         |
+--------+---------+------+------+-----------------+-----------+
# 
# lxc profile show default --project maas-project
config: {}
description: Default LXD profile for project maas-project
devices: {}
name: default
used_by: []
# 
# lxc profile show default --project client1-iso-project
config: {}
description: Default LXD profile for project client1-iso-project
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: lxd-storage
    type: disk
name: default
used_by:
- /1.0/instances/ubuntulobster?project=client1-iso-project
- /1.0/instances/rocky9?project=client1-iso-project
- /1.0/instances/debian12?project=client1-iso-project
# 
# lxc profile show default --project default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: lxd-storage
    type: disk
name: default
used_by: []
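
The difference between these profiles can also be checked mechanically. Below is a minimal sketch, assuming LXD treats a disk device with path / as the root device; the function name find_root_disk is hypothetical, and the two dictionaries are transcribed from the lxc profile show output above.

```python
from typing import Any, Dict, Optional


def find_root_disk(profile: Dict[str, Any]) -> Optional[str]:
    """Return the name of the device that provides the root disk, or None."""
    devices = profile.get("devices") or {}
    for name, device in devices.items():
        # Assumption: the root device is a disk device mounted at "/".
        if device.get("type") == "disk" and device.get("path") == "/":
            return name
    return None


# Transcribed from `lxc profile show default --project maas-project`.
maas_project_default = {"config": {}, "devices": {}, "name": "default"}

# Transcribed from `lxc profile show default --project client1-iso-project`.
client1_default = {
    "config": {},
    "devices": {
        "eth0": {"name": "eth0", "network": "lxdbr0", "type": "nic"},
        "root": {"path": "/", "pool": "lxd-storage", "type": "disk"},
    },
    "name": "default",
}

for label, profile in [
    ("maas-project", maas_project_default),
    ("client1-iso-project", client1_default),
]:
    print(label, "root device:", find_root_disk(profile))
```

A profile for which this returns None gives a manually launched instance no root disk unless one is supplied another way, which would be consistent with the “No root device could be found” error.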

So I modified the default profile under maas-project accordingly, as shown below, and I am now able to launch new instances manually:

# lxc launch ubuntu:22.04 webserver --project maas-project 
Creating webserver
Starting webserver
#
# lxc ls --project maas-project 
+-----------+---------+---------------------+-----------------------------------------------+-----------+-----------+
|   NAME    |  STATE  |        IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+-----------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| webserver | RUNNING | 10.231.47.89 (eth0) | fd42:a037:1fd4:913b:216:3eff:fe8d:a93a (eth0) | CONTAINER | 0         |
+-----------+---------+---------------------+-----------------------------------------------+-----------+-----------+
#
# lxc profile show default --project maas-project 
config: {}
description: Default LXD profile for project maas-project
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: lxd-storage
    type: disk
name: default
used_by:
- /1.0/instances/webserver?project=maas-project
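
The thread does not show how the profile was edited. One possible way, sketched here under that caveat, is to generate lxc profile device add commands for every device that a reference profile has and the target profile lacks. The helper name device_add_commands is hypothetical; the commands are only printed, not executed.

```python
import shlex
from typing import Dict, List

Devices = Dict[str, Dict[str, str]]


def device_add_commands(
    reference: Devices, target: Devices, profile: str, project: str
) -> List[List[str]]:
    """Build `lxc profile device add` argv lists for devices missing in target."""
    commands = []
    for name, device in reference.items():
        if name in target:
            continue  # device already present, nothing to add
        # Every key except "type" is passed as a key=value option.
        options = [f"{k}={v}" for k, v in sorted(device.items()) if k != "type"]
        commands.append(
            ["lxc", "profile", "device", "add", profile, name, device["type"]]
            + options
            + ["--project", project]
        )
    return commands


# Devices of the default profile in the "default" project, as shown earlier.
reference_devices = {
    "eth0": {"name": "eth0", "network": "lxdbr0", "type": "nic"},
    "root": {"path": "/", "pool": "lxd-storage", "type": "disk"},
}

commands = device_add_commands(reference_devices, {}, "default", "maas-project")
for argv in commands:
    print(shlex.join(argv))
```

Printing rather than running the commands keeps the sketch safe to try; review each line before pasting it into a shell.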

Now if I try to create a new VM from MAAS, it fails: the VM is created with an empty MAC address and ends up in a broken state.

So did you use the “Install KVM” option during machine deployment, or did you install LXD manually and then add it as an existing KVM?

I tried both options on 3.3.4 earlier today and they worked well. However, I was not using Ubuntu Core.

@troyanov

Well, I installed LXD manually on the physical server, which MAAS deployed with ubuntu-core-22. Then in MAAS I used the “Add LXD host” option to add it to MAAS.

Did you take all the steps mentioned in the docs?

I am out of ideas about what else could be wrong.
To me it feels like LXD init was executed after the project was created, so the project is not picking up the correct values.

That might be because the default profile used for VM creation is missing network devices, and it could be related to the IOMMU group issue you mentioned earlier.

Hi @troyanov

As explained in my previous reply, I have modified the default profile of maas-project to match the default profile in the default project. Now I am able to launch a VM manually in maas-project, but not from MAAS.

I tried deleting the LXD host from MAAS and adding it back, mapped to the same project, i.e. maas-project. If I now create a VM using MAAS, it still fails with an empty MAC address.

Is there a show command in LXD that can dump the configuration set during lxd init? It might help in figuring out what caused the issue.

Hi @codingfreak

To be fair, I don’t know if that’s possible; that’s why I am trying to understand exactly how you configured LXD, and whether you followed the documentation or did something else.

Some more ideas:

  1. Maybe something interesting pops up while running lxc monitor --pretty?
  2. What is being returned by lxc config show $broken_vm --project maas-project and lxc network show $your_bridge_network?

That’s what I have in my env, just for comparison:

❯ lxc profile show default --project maas-kvm
config: {}
description: Default LXD profile for project maas-kvm
devices: {}
name: default
used_by: []

❯ lxc config show great-boar --project maas-kvm
architecture: x86_64
config:
  limits.cpu: "1"
  limits.memory: "2147483648"
  limits.memory.hugepages: "false"
  security.secureboot: "false"
  volatile.cloud-init.instance-id: 0ef585c0-32d5-425f-8baf-ec9a21567f1a
  volatile.eth0.hwaddr: 00:16:3e:6f:f9:1f
  volatile.last_state.power: STOPPED
  volatile.last_state.ready: "false"
  volatile.uuid: 4eedc51f-bc27-4b72-a0cb-c4e8e2ba57b1
  volatile.uuid.generation: 4eedc51f-bc27-4b72-a0cb-c4e8e2ba57b1
  volatile.vsock_id: "2067760905"
devices:
  eth0:
    boot.priority: "1"
    name: eth0
    nictype: bridged
    parent: maas-net
    type: nic
  root:
    boot.priority: "0"
    path: /
    pool: default
    size: "8000000000"
    type: disk
ephemeral: false
profiles: []
stateful: false
description: ""
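
For comparing a broken VM against a working one like this, a small check can flag NIC devices that have no MAC address. This is a sketch, assuming a NIC's MAC lives either in the device's own hwaddr key or in the instance's volatile.<device>.hwaddr key (as in the output above); the function name nics_without_mac is hypothetical, and the dictionary is a trimmed transcription of great-boar.

```python
from typing import Any, Dict, List


def nics_without_mac(instance: Dict[str, Any]) -> List[str]:
    """List NIC device names that have neither a static nor a volatile MAC."""
    config = instance.get("config") or {}
    devices = instance.get("devices") or {}
    missing = []
    for name, device in devices.items():
        if device.get("type") != "nic":
            continue
        static_mac = device.get("hwaddr")
        volatile_mac = config.get(f"volatile.{name}.hwaddr")
        if not (static_mac or volatile_mac):
            missing.append(name)
    return missing


# Trimmed from `lxc config show great-boar --project maas-kvm`.
great_boar = {
    "config": {"volatile.eth0.hwaddr": "00:16:3e:6f:f9:1f"},
    "devices": {
        "eth0": {"name": "eth0", "nictype": "bridged", "parent": "maas-net", "type": "nic"},
        "root": {"path": "/", "pool": "default", "type": "disk"},
    },
    "profiles": [],
}

print(nics_without_mac(great_boar))
```

Running the same check on the parsed lxc config show output of the broken VM would show whether it has a NIC device at all, and whether that NIC was ever assigned a MAC.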

The reason why MAAS fails to create or start the VM might be hidden in the LXD driver.

  1. You can try adding some debug print statements to the LXD driver code responsible for VM creation.
    https://git.launchpad.net/maas/tree/src/provisioningserver/drivers/pod/lxd.py?h=3.3#n422
  2. Follow the logic of how interfaces are picked (there is a certain order of preference).
    It might be that you don’t have a bridge (one should be created), and the IOMMU error looks like it comes from SR-IOV, which is the next preferred type.
        attach_preference = [
            InterfaceAttachType.BRIDGE,
            InterfaceAttachType.SRIOV,
            InterfaceAttachType.NETWORK,
            InterfaceAttachType.MACVLAN,
        ]
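
To illustrate how such a preference list plays out, here is a simplified sketch. It is not the MAAS driver code: the enum values and the pick_attach_type helper are hypothetical stand-ins, and only the ordering is taken from the snippet above.

```python
from enum import Enum
from typing import Iterable, Optional


class InterfaceAttachType(Enum):
    # Hypothetical stand-in for the MAAS enum; the values are assumptions.
    BRIDGE = "bridge"
    SRIOV = "sriov"
    NETWORK = "network"
    MACVLAN = "macvlan"


# Same order as the attach_preference list quoted above.
ATTACH_PREFERENCE = [
    InterfaceAttachType.BRIDGE,
    InterfaceAttachType.SRIOV,
    InterfaceAttachType.NETWORK,
    InterfaceAttachType.MACVLAN,
]


def pick_attach_type(
    available: Iterable[InterfaceAttachType],
) -> Optional[InterfaceAttachType]:
    """Return the most preferred attach type among those available."""
    available = set(available)
    for attach_type in ATTACH_PREFERENCE:
        if attach_type in available:
            return attach_type
    return None


# With no usable bridge on the host, SR-IOV would win over macvlan.
print(pick_attach_type({InterfaceAttachType.MACVLAN, InterfaceAttachType.SRIOV}))
```

Under this reading, a host without a bridge that MAAS can use would fall through to SR-IOV, which fits the guess that the IOMMU error comes from an SR-IOV attempt.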

Hi @troyanov

I tried on a newly deployed server running ubuntu-server 22.04, deployed using MAAS.
I ran lxd init with the inputs below:

# lxd init 
Would you like to use LXD clustering? (yes/no) [default=no]: 
Do you want to configure a new storage pool? (yes/no) [default=yes]: 
Name of the new storage pool [default=default]: lxd-storage
Name of the storage backend to use (cephobject, dir, lvm, zfs, btrfs, ceph) [default=zfs]: 
Create a new ZFS pool? (yes/no) [default=yes]: 
Would you like to use an existing empty block device (e.g. a disk or partition)? (yes/no) [default=no]: 
Size in GiB of the new loop device (1GiB minimum) [default=30GiB]: 500GiB
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: 
What should the new bridge be called? [default=lxdbr0]: 
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
Would you like the LXD server to be available over the network? (yes/no) [default=no]: yes
Address to bind LXD to (not including port) [default=all]: 
Port to bind LXD to [default=8443]: 
Trust password for new clients: 
Again: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]: 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 

Then I added it as an LXD host in MAAS, creating maas-project from the MAAS UI.
As shown below, the default profile in maas-project is empty:

# lxc project ls
+-------------------+--------+----------+-----------------+----------+-------------------------+---------+
|       NAME        | IMAGES | PROFILES | STORAGE VOLUMES | NETWORKS |       DESCRIPTION       | USED BY |
+-------------------+--------+----------+-----------------+----------+-------------------------+---------+
| default (current) | YES    | YES      | YES             | YES      | Default LXD project     | 2       |
+-------------------+--------+----------+-----------------+----------+-------------------------+---------+
| maas-project      | YES    | YES      | YES             | NO       | Project managed by MAAS | 2       |
+-------------------+--------+----------+-----------------+----------+-------------------------+---------+
root@op2:/home/ubuntu# 
root@op2:/home/ubuntu# lxc profile show default --project maas-project 
config: {}
description: Default LXD profile for project maas-project
devices: {}
name: default
used_by: []

Now I tried creating a VM from MAAS; it is created properly but gets stuck in the commissioning stage, as shown below.

On the server I can see that the VM is in the running state:

# lxc ls --project maas-project 
+------+---------+------+------+-----------------+-----------+
| NAME |  STATE  | IPV4 | IPV6 |      TYPE       | SNAPSHOTS |
+------+---------+------+------+-----------------+-----------+
| vm01 | RUNNING |      |      | VIRTUAL-MACHINE | 0         |
+------+---------+------+------+-----------------+-----------+

Well, this is better than the previous scenario with ubuntu-core-22, but it is still failing.

It takes some time to commission. How long has it been stuck?
Is there anything under the logs tab?

@troyanov

Well, it’s been about 10 minutes, I guess, and there is nothing much under logs other than “powered on”.

@troyanov

Well, on my server2 (ubuntu-server 22.04), if I try to launch a container manually in maas-project, it fails as shown below:

# lxc launch ubuntu:22.04 vmtest2 --project maas-project 
Creating vmtest2
Error: Failed instance creation: Failed creating instance record: Failed initialising instance: Failed getting root disk: No root device could be found

Logs from lxc monitor

DEBUG  [2023-10-13T20:11:15Z] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2023-10-13T20:11:15Z] Handling API request                          ip=@ method=GET protocol=unix url="/1.0/events?project=maas-project" username=root
DEBUG  [2023-10-13T20:11:15Z] Event listener server handler started         id=3690f3c9-d41c-4515-9db5-333a83c476a7 local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2023-10-13T20:11:15Z] Handling API request                          ip=@ method=POST protocol=unix url="/1.0/instances?project=maas-project" username=root
DEBUG  [2023-10-13T20:11:15Z] Responding to instance create                
DEBUG  [2023-10-13T20:11:15Z] Started operation                             class=task description="Creating instance" operation=8cbca1cd-b0d6-4774-999c-039f11f7e205 project=maas-project
DEBUG  [2023-10-13T20:11:15Z] New operation                                 class=task description="Creating instance" operation=8cbca1cd-b0d6-4774-999c-039f11f7e205 project=maas-project
DEBUG  [2023-10-13T20:11:15Z] Connecting to a remote simplestreams server   URL="https://cloud-images.ubuntu.com/releases"
DEBUG  [2023-10-13T20:11:15Z] Handling API request                          ip=@ method=GET protocol=unix url="/1.0/operations/8cbca1cd-b0d6-4774-999c-039f11f7e205?project=maas-project" username=root
DEBUG  [2023-10-13T20:11:16Z] Acquiring lock for image                      fingerprint=b948dd91cd5a8da89f6dcd4949d7189f064cf6d4dc5bd70b7f9b7aff1883babf
DEBUG  [2023-10-13T20:11:16Z] Lock acquired for image                       fingerprint=b948dd91cd5a8da89f6dcd4949d7189f064cf6d4dc5bd70b7f9b7aff1883babf
DEBUG  [2023-10-13T20:11:16Z] Lock acquired for image                       fingerprint=b948dd91cd5a8da89f6dcd4949d7189f064cf6d4dc5bd70b7f9b7aff1883babf
DEBUG  [2023-10-13T20:11:16Z] Image already exists in the DB                fingerprint=b948dd91cd5a8da89f6dcd4949d7189f064cf6d4dc5bd70b7f9b7aff1883babf
DEBUG  [2023-10-13T20:11:16Z] Acquiring lock for image                      fingerprint=b948dd91cd5a8da89f6dcd4949d7189f064cf6d4dc5bd70b7f9b7aff1883babf
DEBUG  [2023-10-13T20:11:16Z] Instance operation lock created               action=create instance=vmtest2 project=maas-project reusable=false
ERROR  [2023-10-13T20:11:16Z] Failed initialising instance                  err="Failed getting root disk: No root device could be found" instance=vmtest2 project=maas-project type=container
INFO   [2023-10-13T20:11:16Z] Creating instance                             ephemeral=false instance=vmtest2 instanceType=container project=maas-project
DEBUG  [2023-10-13T20:11:16Z] Failure for operation                         class=task description="Creating instance" err="Failed creating instance record: Failed initialising instance: Failed getting root disk: No root device could be found" operation=8cbca1cd-b0d6-4774-999c-039f11f7e205 project=maas-project
DEBUG  [2023-10-13T20:11:16Z] Instance operation lock finished              action=create err="Failed getting root disk: No root device could be found" instance=vmtest2 project=maas-project reusable=false
DEBUG  [2023-10-13T20:11:16Z] Event listener server handler stopped         listener=3690f3c9-d41c-4515-9db5-333a83c476a7 local=/var/snap/lxd/common/lxd/unix.socket remote=@
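
When a monitor dump like this gets long, it can help to filter it down to the failures. The following is a rough sketch that assumes each line has the shape LEVEL [timestamp] message key=value ..., as in the output above; the regular expressions and the parse_monitor_line helper are hypothetical and have only been checked against the lines in this thread.

```python
import re
from typing import Any, Dict, Optional

LINE_RE = re.compile(r"^(?P<level>[A-Z]+)\s+\[(?P<time>[^\]]+)\]\s+(?P<rest>.*)$")
# key=value pairs; a value is either a quoted string or a run of non-spaces.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_monitor_line(line: str) -> Optional[Dict[str, Any]]:
    """Split one monitor line into level, time, message and key=value fields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    rest = match.group("rest")
    fields = {}
    first_field = len(rest)
    for field in FIELD_RE.finditer(rest):
        first_field = min(first_field, field.start())
        fields[field.group(1)] = field.group(2).strip('"')
    return {
        "level": match.group("level"),
        "time": match.group("time"),
        "message": rest[:first_field].strip(),
        "fields": fields,
    }


# The ERROR line from the log above (column padding shortened).
sample = (
    'ERROR  [2023-10-13T20:11:16Z] Failed initialising instance   '
    'err="Failed getting root disk: No root device could be found" '
    'instance=vmtest2 project=maas-project type=container'
)

entry = parse_monitor_line(sample)
if entry and entry["level"] == "ERROR":
    print(entry["message"], "->", entry["fields"].get("err"))
```

Here the interesting field is type=container: the failing request created a container, not a VM.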

Default profile in maas-project

# lxc profile show default --project maas-project 
config: {}
description: Default LXD profile for project maas-project
devices: {}
name: default
used_by: []

# lxc ls --project maas-project 
+------+---------+------+------+-----------------+-----------+
| NAME |  STATE  | IPV4 | IPV6 |      TYPE       | SNAPSHOTS |
+------+---------+------+------+-----------------+-----------+
| vm01 | RUNNING |      |      | VIRTUAL-MACHINE | 0         |
+------+---------+------+------+-----------------+-----------+

Commissioning failed

These look good.

Then it is stuck somewhere and fails to boot, I guess.

That launches a container. You are missing the --vm flag to start a VM.

Did you check the bridge part from the doc?

The behaviour you observe looks similar to the one mentioned in the doc:

The bridge LXD creates is isolated and not managed by MAAS. If this bridge is used, you would be able to add the LXD VM host and compose virtual machines, but commissioning, deploying, and any other MAAS action which uses the network will fail – so yes is the correct answer here.

I am confused.

The bridge LXD creates is isolated and not managed by MAAS. If this bridge is used, you would be able to add the LXD VM host and compose virtual machines, but commissioning, deploying, and any other MAAS action which uses the network will fail – so yes is the correct answer here.

So should I disable creation of lxdbr0?

It looks like lxdbr0 is doing NAT, which will not help MAAS perform certain tasks. If I need VMs that are directly connected to my MAAS-controlled DHCP server, which networking option should I use?

If I try to create the VM manually in maas-project, it fails, saying it can’t download the image because of a name resolution issue. But in the default project it just works.

root@op2:/home/ubuntu# lxc launch ubuntu:22.04 vmtest2 --vm --project maas-project 
Creating vmtest2
Error: Failed instance creation: Get "https://cloud-images.ubuntu.com/releases/server/releases/jammy/release-20231010/ubuntu-22.04-server-cloudimg-amd64-lxd.tar.xz": lookup cloud-images.ubuntu.com: Temporary failure in name resolution
root@op2:/home/ubuntu# 
root@op2:/home/ubuntu# 
root@op2:/home/ubuntu# 
root@op2:/home/ubuntu# lxc launch ubuntu:22.04 vmtest2 --vm 
Creating vmtest2
Starting vmtest2

Can you share lxc network show lxdbr0?

# lxc network show lxdbr0
config:
  ipv4.address: 10.156.54.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:7a06:141d:df22::1/64
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/vmtest
- /1.0/instances/vmtest1
- /1.0/instances/vmtest2
- /1.0/profiles/default
managed: true
status: Created
locations:
- none
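
The output above can be summarised with a short check of whether the bridge is LXD-managed, NATed, and on a subnet MAAS knows about. This is only a sketch: the helper names are hypothetical, the MAAS subnet 192.168.100.0/24 is a made-up placeholder, and a subnet overlap is a rough hint, not proof that MAAS DHCP/PXE traffic reaches the VM.

```python
import ipaddress
from typing import Any, Dict, Iterable


def describe_bridge(network: Dict[str, Any]) -> Dict[str, Any]:
    """Summarise the properties of an LXD network that matter for MAAS."""
    config = network.get("config") or {}
    address = config.get("ipv4.address")
    subnet = None
    if address and address not in ("none", "auto"):
        subnet = ipaddress.ip_interface(address).network
    return {
        "managed_by_lxd": bool(network.get("managed")),
        "ipv4_nat": config.get("ipv4.nat") == "true",
        "ipv4_subnet": subnet,
    }


def on_maas_subnet(network: Dict[str, Any], maas_subnets: Iterable[str]) -> bool:
    """True if the bridge's IPv4 subnet overlaps any of the given MAAS subnets."""
    subnet = describe_bridge(network)["ipv4_subnet"]
    if subnet is None:
        return False
    return any(subnet.overlaps(ipaddress.ip_network(s)) for s in maas_subnets)


# Transcribed from `lxc network show lxdbr0` above.
lxdbr0 = {
    "config": {
        "ipv4.address": "10.156.54.1/24",
        "ipv4.nat": "true",
        "ipv6.address": "fd42:7a06:141d:df22::1/64",
        "ipv6.nat": "true",
    },
    "name": "lxdbr0",
    "type": "bridge",
    "managed": True,
}

info = describe_bridge(lxdbr0)
print(info)
# Placeholder subnet; replace with the subnets MAAS actually manages.
print(on_maas_subnet(lxdbr0, ["192.168.100.0/24"]))
```

An LXD-managed bridge with ipv4.nat set to "true" and a subnet unknown to MAAS matches the isolated-bridge case described in the quoted documentation.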