Custom Rocky Linux image: curtin fails (cloud-init, netplan)

Hello MAAS users,

I am building a Rocky Linux 8 image with the packer-maas Git repository. I have built the image and uploaded it to the controller successfully (or so I believe). During deployment, however, curtin does something I did not intend: it runs a netplan check inside the Rocky Linux target, and when that check fails, the machine ends up in a failed deployment state. I do not believe netplan is needed here.

start: cmd-install/stage-late/98-validate-custom-image-has-cloud-init/cmd-in-target: curtin command in-target
Running command ['mount', '--bind', '/dev', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/proc', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/run', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
Running command ['unshare', '--help'] with allowed return codes [0] (capture=True)
Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpvoo1465l/target', 'bash', '-c', 'cloud-init --version || (echo "cloud-init not detected, MAAS will not be able to configure this machine properly" && exit 1)'] with allowed return codes [0] (capture=False)
/usr/bin/cloud-init 23.4-7.el8_10.3.0.1
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.003
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
finish: cmd-install/stage-late/98-validate-custom-image-has-cloud-init/cmd-in-target: SUCCESS: curtin command in-target
start: cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target: curtin command in-target
Running command ['mount', '--bind', '/dev', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/proc', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/run', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
Running command ['unshare', '--help'] with allowed return codes [0] (capture=True)
Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpvoo1465l/target', 'bash', '-c', 'netplan info || (echo "netplan not detected, MAAS will not be able to configure this machine properly" && exit 1)'] with allowed return codes [0] (capture=False)
bash: netplan: command not found
netplan not detected, MAAS will not be able to configure this machine properly
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.009
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
finish: cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target: FAIL: curtin command in-target
curtin: Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'in-target', '--', 'bash', '-c', 'netplan info || (echo "netplan not detected, MAAS will not be able to configure this machine properly" && exit 1)']
Exit code: 1
Reason: -
Stdout: start: cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target: curtin command in-target
        Running command ['mount', '--bind', '/dev', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--bind', '/proc', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--bind', '/run', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--bind', '/sys', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
        Running command ['unshare', '--help'] with allowed return codes [0] (capture=True)
        Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpvoo1465l/target', 'bash', '-c', 'netplan info || (echo "netplan not detected, MAAS will not be able to configure this machine properly" && exit 1)'] with allowed return codes [0] (capture=False)
        bash: netplan: command not found
        netplan not detected, MAAS will not be able to configure this machine properly
        Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
        TIMED subp(['udevadm', 'settle']): 0.009
        Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpvoo1465l/target/sys'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpvoo1465l/target/run'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpvoo1465l/target/proc'] with allowed return codes [0] (capture=False)
        Running command ['mount', '--make-private', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
        Running command ['umount', '/tmp/tmpvoo1465l/target/dev'] with allowed return codes [0] (capture=False)
        finish: cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target: FAIL: curtin command in-target
        
Stderr: ''
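
For long install logs like the one above, the failing stage can be picked out from the `finish:` lines. The following is a minimal sketch, assuming the log keeps the `finish: <stage>: SUCCESS|FAIL: <description>` shape seen here; the helper name `failed_stages` is made up for illustration and is not part of curtin or MAAS.

```python
import re

# Matches lines such as:
#   finish: cmd-install/stage-late/99-.../cmd-in-target: FAIL: curtin command in-target
# Leading whitespace is allowed because curtin repeats the log, indented, under "Stdout:".
FINISH_RE = re.compile(
    r"^\s*finish: (?P<stage>\S+): (?P<status>SUCCESS|FAIL): (?P<detail>.*)$"
)


def failed_stages(log_text):
    """Return the stages whose 'finish:' line reports FAIL, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = FINISH_RE.match(line)
        if match and match.group("status") == "FAIL":
            stage = match.group("stage")
            if stage not in seen:  # the same stage appears again in the Stdout copy
                seen.append(stage)
    return seen


sample = """\
finish: cmd-install/stage-late/98-validate-custom-image-has-cloud-init/cmd-in-target: SUCCESS: curtin command in-target
finish: cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target: FAIL: curtin command in-target
        finish: cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target: FAIL: curtin command in-target
"""

print(failed_stages(sample))
# ['cmd-install/stage-late/99-validate-custom-image-has-netplan/cmd-in-target']
```

Here only the `99-validate-custom-image-has-netplan` stage is reported, which matches the `netplan: command not found` message in the log.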

I have found previous discussions on this issue but have not been able to implement a correct solution myself. I would greatly appreciate any fundamental solutions to this problem. I am using MAAS version 3.4, and the content of my preseed.py is as follows:

def get_preseed_context(request, osystem="", release="", rack_controller=None):
    """Return the node-independent context dictionary to be used to render
    preseed templates.

    :param osystem: See `get_preseed_filenames`.
    :param release: See `get_preseed_filenames`.
    :param rack_controller: The rack controller used to generate the preseed.
    :return: The context dictionary.
    :rtype: dict.
    """
    region_ip = get_default_region_ip(request)
    server_host = get_maas_facing_server_host(
        rack_controller=rack_controller, default_region_ip=region_ip
    )
    server_url = request.build_absolute_uri(reverse("machines_handler"))
    configs = Config.objects.get_configs(["remote_syslog", "maas_syslog_port"])
    syslog = configs["remote_syslog"]
    http_proxy = get_apt_proxy(request, rack_controller)
    if not syslog:
        syslog_port = configs["maas_syslog_port"]
        if not syslog_port:
            syslog_port = RSYSLOG_PORT
        syslog = "%s:%d" % (server_host, syslog_port)
    return {
        "osystem": osystem,
        "release": release,
        "server_host": server_host,
        "server_url": server_url,
        "syslog_host_port": syslog,
        "http_proxy": http_proxy,
    }

I suspect the issue is related to the osystem variable being passed incorrectly. The failure message itself is generated by the following code:

def get_custom_image_dependency_validation(node, base_osystem):
    if node.get_osystem() != "custom":
        return None

    cmd = {}
    err_msg = "not detected, MAAS will not be able to configure this machine properly"

    deps = DEPS_PER_OS[base_osystem]

    for priority, dep_cmds in enumerate(deps, start=98):
        name = dep_cmds[0]
        executable = " ".join(dep_cmds)
        in_target = f'{executable} || (echo "{name} {err_msg}" && exit 1)'
        cmd[f"{priority}-validate-custom-image-has-{name}"] = [
            "curtin",
            "in-target",
            "--",
            "bash",
            "-c",
            in_target,
        ]
    return cmd
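
To see how this function leads to the two late commands in the log (`98-...-cloud-init` and `99-...-netplan`), here is a standalone sketch of the same loop using plain strings instead of a MAAS node object. The contents of `DEPS_PER_OS` below are an assumption made for illustration: the Ubuntu entry is inferred from the two commands visible in the log, and the `centos`/`rhel` entries are hypothetical. The real table in preseed.py may differ between MAAS versions.

```python
# ASSUMPTION: illustrative table only. The "ubuntu" entry is inferred from the
# log (cloud-init --version, then netplan info); the other entries are hypothetical.
DEPS_PER_OS = {
    "ubuntu": [["cloud-init", "--version"], ["netplan", "info"]],
    "centos": [["cloud-init", "--version"]],
    "rhel": [["cloud-init", "--version"]],
}

ERR_MSG = "not detected, MAAS will not be able to configure this machine properly"


def build_validation_commands(osystem, base_osystem):
    """Same loop as get_custom_image_dependency_validation, on plain strings."""
    if osystem != "custom":
        # Validation is only generated for images registered as "custom".
        return None
    cmds = {}
    for priority, dep_cmd in enumerate(DEPS_PER_OS[base_osystem], start=98):
        name = dep_cmd[0]
        executable = " ".join(dep_cmd)
        in_target = f'{executable} || (echo "{name} {ERR_MSG}" && exit 1)'
        cmds[f"{priority}-validate-custom-image-has-{name}"] = [
            "curtin", "in-target", "--", "bash", "-c", in_target,
        ]
    return cmds


ubuntu_based = build_validation_commands("custom", "ubuntu")
rhel_based = build_validation_commands("custom", "rhel")
print(list(ubuntu_based))
print(list(rhel_based))
```

Under these assumptions, a custom image whose base OS resolves to `ubuntu` gets both the cloud-init and the netplan check, while one whose base OS resolves to `rhel` or `centos` gets only the cloud-init check. The netplan check in the log is therefore consistent with the base OS having been resolved as Ubuntu.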

Although similar discussions have taken place before, they have not provided me with the inspiration I need to resolve this issue.

Thank you in advance for your help.

In this state, the machine shows a “Failed deployment” status, but the deployment has actually succeeded: we can log in to the machine, and its networking works correctly. Because the MAAS UI still shows “Failed deployment”, we are looking for a way to set the status to “Deployed” so the machine can be managed properly.

I’ve resolved the issue. Here is how it works now.

I considered patching preseed.py inside the snap package, but repackaging the snap was too much of a burden.

The solution was simply to set the OS/release to custom/rocky8 when registering the boot resource, instead of centos/rocky8.

When an image is classified as custom, the DEPS_PER_OS lookup seems to be geared toward Ubuntu. I hope the maintainers solve this once and for all :wink:
