Make it possible / easier to use custom immutable Operating Systems like talos

Hi,

I stumbled upon MAAS several days ago and like it! Now that I am trying it out, I have found some limits that make it hard to use for some use cases.

I want to try out Talos - the new cool immutable container OS / solution. So I tried to build my own image for MAAS …

I found a metal disk.raw of Talos … but installing it with MAAS failed, because the curtin/hooks-… was missing. OK, so I modified the raw image and placed a do-nothing hook file in the UEFI partition - and now I am struggling with:

Stdout: start: cmd-install/stage-late/98-validate-custom-image-has-cloud-init/cmd-in-target: curtin command in-target
        Running command ['mount', '--bind', '/dev', '/tmp/tmpbxz4m26h/target/dev'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--bind', '/proc', '/tmp/tmpbxz4m26h/target/proc'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--bind', '/run', '/tmp/tmpbxz4m26h/target/run'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--bind', '/sys', '/tmp/tmpbxz4m26h/target/sys'] with allowed return codes [0] (capture=False)
        Running command ['unshare', '--help'] with allowed return codes [0] (capture=True)
        Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpbxz4m26h/target', 'bash', '-c', 'dpkg-query -s cloud-init || (echo "cloud-init not detected, MAAS will not be able to configure this machine properly" && exit 1)'] with allowed return codes [0] (capture=False)
        chroot: failed to run command ‘bash’: No such file or directory
        Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
        TIMED subp(['udevadm', 'settle']): 0.022
        Running command ['mount', '--make-private', '/tmp/tmpbxz4m26h/target/sys'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpbxz4m26h/target/sys'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--make-private', '/tmp/tmpbxz4m26h/target/run'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpbxz4m26h/target/run'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--make-private', '/tmp/tmpbxz4m26h/target/proc'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpbxz4m26h/target/proc'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--make-private', '/tmp/tmpbxz4m26h/target/dev'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpbxz4m26h/target/dev'] with allowed return codes [0] (capture=False)
        finish: cmd-install/stage-late/98-validate-custom-image-has-cloud-init/cmd-in-target: FAIL: curtin command in-target

So - ok - but Talos will never understand cloud-init. As it’s immutable, there is no need for cloud-init. (But Talos works on EC2, Hetzner and so on - so there should be no reason why it shouldn’t work on MAAS without cloud-init.)
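
Reading the log above, the step fails before cloud-init is even looked up: the check runs bash inside the target, and the image has no bash. A minimal sketch for inspecting a mounted image root before deploying, assuming the root filesystem is mounted somewhere on the host. The function names and the list of directories are my own assumptions, not part of curtin, and the dpkg lookup is only a rough approximation of what `dpkg-query -s cloud-init` does:

```python
import os

# Directories searched for bash inside the target root. This list is an
# assumption; the real chroot resolves 'bash' through PATH inside the target.
SHELL_DIRS = ("bin", "usr/bin", "usr/local/bin", "sbin", "usr/sbin")


def target_has_bash(target_root):
    """Return True if an executable 'bash' exists under target_root."""
    for d in SHELL_DIRS:
        candidate = os.path.join(target_root, d, "bash")
        # Note: absolute symlinks inside the image resolve against the host,
        # so this can misreport for images that rely on them.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return True
    return False


def target_has_cloud_init(target_root):
    """Return True if the dpkg status database lists a cloud-init package."""
    status = os.path.join(target_root, "var/lib/dpkg/status")
    try:
        with open(status, encoding="utf-8", errors="replace") as f:
            return any(line.rstrip("\n") == "Package: cloud-init" for line in f)
    except OSError:
        # No dpkg database at all, as expected on a non-Debian image.
        return False
```

For a Talos image I would expect both functions to return False, which matches the failure shown in the log.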

Is this possible? Are there docs that I didn’t find that describe how to disable the stage-late jobs?

I know that I have to do something like my own Talos runtime that understands the MAAS metadata…

talos/aws.go at 8b2235c3b6de64abb15bf77e9648bf6bebc18e1f · siderolabs/talos (github.com)

but for now, I don’t get to the point where I can launch / reboot the instance…

If I understand it right, this is because of the base_image that is chosen? It looks like the default is ubuntu? So the Ubuntu setup logic is performed?

What if I want to deploy an OS other than the ones in https://github.com/maas/maas/blob/ff897252800a31ba83d3f7fbe4211ba44c1f2bb4/src/maasserver/models/bootresource.py#L44 (ubuntu, centos, rhel)?

Do you have plans to make this work? Or is it out of scope?

I don’t understand why this allows custom/custom … but the boot-resources create command won’t:

https://github.com/maas/maas/blob/36347b29aad4165b04acc7d95f86d9451b8f4650/src/maasserver/tests/test_preseed.py#L427

Why isn’t any of this working?

markussiebert@maas:~$ maas admin boot-resources create name=talos-test architecture=amd64/generic sha256=8f3dacfb715b307ba1102aaf1b3e75aae9980960f8e30d1032ca41f81d3c7a4b content@=/home/markussiebert/disk.raw filetype=ddraw base_image=custom
{"base_image": ["a base image must follow the format: /"]}
markussiebert@maas:~$ maas admin boot-resources create name=talos-test architecture=amd64/generic sha256=8f3dacfb715b307ba1102aaf1b3e75aae9980960f8e30d1032ca41f81d3c7a4b content@=/home/markussiebert/disk.raw filetype=ddraw base_image=custom/custom
{"base_image": ["custom images require a valid non-custom OS type base image"]}
markussiebert@maas:~$ maas admin boot-resources create name=talos-test architecture=amd64/generic sha256=8f3dacfb715b307ba1102aaf1b3e75aae9980960f8e30d1032ca41f81d3c7a4b content@=/home/markussiebert/disk.raw filetype=ddraw base_image=ubuntu/custom
{"base_image": ["please select a valid base image OS and version"]}

The base_image must be specified as os/release and must be one that’s supported by MAAS, as it needs to know how to configure cloud-init/curtin for the right OS.

Hence you can’t use base_image=custom.

For this to work, Talos needs to be a supported OS first.
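
The three error messages in the question line up with three checks on base_image. A rough sketch of that logic follows; it is not the actual MAAS code. The `SUPPORTED` table is hypothetical: the OS names are the ones mentioned in this thread, the release names are placeholders for illustration only, and the `<os>/<release>` wording in the first message is my rendering of the format described above:

```python
# Hypothetical table of supported base images. OS names come from the thread
# (ubuntu, centos, rhel); the releases are illustrative placeholders.
SUPPORTED = {
    "ubuntu": {"focal", "jammy"},
    "centos": {"8"},
    "rhel": {"8"},
}


def validate_base_image(base_image):
    """Return an error message, or None if base_image passes all checks."""
    parts = base_image.split("/")
    # Check 1: the value must have exactly two non-empty parts, os/release.
    if len(parts) != 2 or not all(parts):
        return "a base image must follow the format: <os>/<release>"
    osystem, release = parts
    # Check 2: the base OS itself may not be "custom".
    if osystem == "custom":
        return "custom images require a valid non-custom OS type base image"
    # Check 3: both the OS and the release must be known.
    if release not in SUPPORTED.get(osystem, set()):
        return "please select a valid base image OS and version"
    return None
```

With this sketch, `custom`, `custom/custom` and `ubuntu/custom` each fail a different check, in the same order as the three attempts in the question, while a known pair such as `ubuntu/jammy` passes.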