There was also some kind of error that flashed by real fast about openbsd shell as well.
I cannot find any login for the PXE environment online, so I'm kind of stuck. Any ideas?
I also noticed that the system never shows up in MAAS as commissioning like I see in other people's videos, but I'm not sure if that is supposed to happen after this part of the process.
It feels that this error might be related to a network/firewall configuration.
MAAS generates a grub.cfg that contains a cloud-config-url parameter, which is then used by cloud-init.
I'd suggest checking what is being placed into the config and whether the machine is able to reach the requested address.
You can do this either by adding some debug print statements here or in the grub boot menu, or by listening to the TFTP traffic between the machine and the MAAS rack controller.
Thanks for the tips, I will look into that. Seems like packet sniffing would be the easiest option. Is there a particular string I am looking for?
I am running both the host and client on VMs for the time being. I have tried both on the same host and on separate hosts, and even booting a bare-metal server, but all have the same results.
All on the same network as well, of course, so that other thread doesn't seem to be the same issue.
That said I am using pfsense to handle dhcp but the pxe booting starts just fine?
So I tried a bunch of stuff and cloud-init still times out during boot of the PXE environment, but I don't get the error anymore once it is booted. Yet the system still never shows up in the MAAS web GUI?
Any ideas?
Is it possible to remove cloud-init from the PXE environment? Although that doesn't appear to be the issue at this point.
I would start by looking for the /grub/grub.cfg file request and then examine the contents of that file.
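To make the "what string to look for" part concrete: a TFTP request is a UDP payload with a 2-byte opcode (1 for a read request) followed by the NUL-terminated file name and transfer mode, so the file name shows up in plain text in the capture. A minimal Python sketch of decoding such a payload, assuming you have already extracted the raw UDP payload bytes from your capture. The function name `parse_tftp_request` and the sample bytes are made up for illustration.

```python
import struct

# TFTP opcodes for requests (RFC 1350): 1 = read request, 2 = write request.
REQUEST_OPCODES = {1: "RRQ", 2: "WRQ"}


def parse_tftp_request(payload):
    """Decode a TFTP read/write request; return None for any other packet."""
    if len(payload) < 4:
        return None
    (opcode,) = struct.unpack("!H", payload[:2])
    if opcode not in REQUEST_OPCODES:
        return None
    # After the opcode come NUL-terminated strings: file name, then mode.
    fields = payload[2:].split(b"\x00")
    if len(fields) < 2:
        return None
    return {
        "op": REQUEST_OPCODES[opcode],
        "filename": fields[0].decode("ascii", "replace"),
        "mode": fields[1].decode("ascii", "replace").lower(),
    }


# Hypothetical payload, shaped like a client asking for /grub/grub.cfg.
sample_payload = b"\x00\x01/grub/grub.cfg\x00octet\x00"
print(parse_tftp_request(sample_payload))
```

In practice a capture filter on the TFTP port (UDP 69) and a search for "grub.cfg" in the packet list gets you to the same place without any code.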
The grub.cfg generated by MAAS should have datasource_list and cloud-config-url pointing to the MAAS installation.
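Once you have the contents of that file, something like the following rough Python sketch can pull out the kernel parameters so you can see which host cloud-init is being told to contact. The sample text below is invented for illustration and is not real MAAS output; the address, port and path in it are placeholders, and `kernel_params` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def kernel_params(grub_cfg_text):
    """Collect key=value kernel arguments from linux/linuxefi lines."""
    params = {}
    for line in grub_cfg_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] not in ("linux", "linuxefi"):
            continue
        # tokens[1] is the kernel path; everything after it is an argument.
        for token in tokens[2:]:
            key, sep, value = token.partition("=")
            if sep:
                params[key] = value
    return params


# Invented example; paste the file contents from your own capture instead.
SAMPLE_GRUB_CFG = """
menuentry 'Ephemeral' {
    linuxefi boot-kernel ro ip=dhcp cloud-config-url=http://192.0.2.10:5248/example-preseed
    initrdefi boot-initrd
}
"""

params = kernel_params(SAMPLE_GRUB_CFG)
url = params.get("cloud-config-url")
print("cloud-config-url:", url)
if url:
    target = urlsplit(url)
    print("host cloud-init must reach:", target.hostname, "port:", target.port)
```

If the host printed here is a name rather than an IP address, the booting machine has to be able to resolve it using whatever DNS server the DHCP server handed out.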
If you are not using the MAAS-provided DHCP for PXE booting, that might be the problem...
cloud-init is required because it runs all the scripts provided by MAAS and then communicates back.
Simplified process of how it works:
1. During PXE, the machine sends a DHCP Discover.
2. The MAAS DHCP server replies with a DHCP Offer (with option 150 and option 67 set).
3. The machine downloads the bootloader and the ephemeral Ubuntu image from MAAS.
4. The machine boots the ephemeral Ubuntu.
5. cloud-init is started and reads the kernel options. It knows that it was asked to talk to a specific datasource (grub.cfg).
6. cloud-init fetches the required metadata from the datasource, runs the scripts, and reports back the status and the information about the machine.
7. MAAS then processes the data from cloud-init and creates a machine.
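When comparing the DHCP Offer from MAAS with the one from another DHCP server, it helps to know that the options field is a sequence of (code, length, value) entries, with code 0 as padding and code 255 as the end marker. Below is a rough Python sketch that decodes such a field and picks out option 67 (boot file name) and option 150 (TFTP server addresses). It assumes you have already cut the options bytes out of a captured packet; the function names and the sample values are placeholders, not what MAAS actually sends.

```python
import ipaddress


def parse_dhcp_options(data):
    """Split a raw DHCP options field into {option_code: value_bytes}."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:  # end marker
            break
        if code == 0:  # padding byte, no length field
            i += 1
            continue
        length = data[i + 1]
        options[code] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options


def describe_pxe_options(options):
    """Pull out the two options mentioned in step 2 of the list above."""
    bootfile = options.get(67)
    tftp = options.get(150, b"")
    usable = len(tftp) - len(tftp) % 4  # option 150 is a list of IPv4 addresses
    return {
        "bootfile": bootfile.decode("ascii", "replace").rstrip("\x00") if bootfile else None,
        "tftp_servers": [
            str(ipaddress.IPv4Address(tftp[j : j + 4])) for j in range(0, usable, 4)
        ],
    }


# Placeholder options field: message type, boot file name, TFTP server, end.
bootfile_name = b"bootx64.efi"
sample = (
    bytes([53, 1, 2])
    + bytes([67, len(bootfile_name)]) + bootfile_name
    + bytes([150, 4, 192, 0, 2, 10])
    + bytes([255])
)
print(describe_pxe_options(parse_dhcp_options(sample)))
```

If the offer from the external DHCP server is missing one of these, or points at a different address than the MAAS rack controller, that would be the first thing to fix.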
Let's try to sniff the traffic and see if there is any difference.
I have personally never tried using MAAS with an external DHCP server for PXE booting, but it seems that, depending on the DHCP server type, certain configuration is required. Here is just one example: https://portegi.es/blog/maas-1
I think you are right, I need to rule out pfsense causing issues so I am in the process of setting up a separate network now that I can use for testing, just got to wait for people to not be online so I can rearrange some things. I will try running maas with the internal dhcp server and see if that improves anything before chasing that rabbit.
In production it would be easier to use our pfsense dhcp server but we would only be using 1x of the 2x NICs on each system, so I could just set up a separate network on the 2nd nic for MAAS.
Thinking about it, that might be the best option anyways, just a bit more cost but I think we have some old 1gb switches left over from when we upgraded to 10gb anyways.
Should get time to mess with this a bit more in the next few days and will report what I learn!
MAAS has so much more information, and the community is better than Foreman so far (the other option I am considering).
Just wanted to report that moving everything to a separate network with maas handling the dhcp did indeed get things working. I will see if I can sort that out later but at least I can start playing with it now to see if it will work for what we need!
Cool, basically we use pfsense for the router for various reasons.
There are a few reasons for using the dhcp on it rather than an external server.
First we use static dhcp leases for most things due to how many changes happen on this network and having it in pfsense allows us to easily keep track of those changes in the same place we deal with the firewall rules etc.
Second, it is a single place to deal with all the routing/networking side of things. We are not at the scale where we need layer 3 routers etc. yet, so we have a pretty basic networking setup and like to keep it that way. pfsense + managed switches are all we need right now.
Third, it is what we know. Like most things in IT, it is just easier to work with what you know. Everyone knows it, which makes it easier to manage. Plus when dealing with 25gb+ internet, there are not a ton of options for routers without spending the BIG bucks.
Technically we could run everything on the maas dhcp but frankly that is a lot of work that I just don't see a reason to mess with lol.
Now that I know it works on a separate network I am rearranging my homelab to run some tests as we would have it in production. Just got to borrow another server from the office.
I assume that MAAS would not have any issues running in proxmox?
I see and yes, that makes perfect sense to use existing DHCP in that case.
It seems that pfSense is using ISC DHCP and is also moving to Kea DHCP according to https://redmine.pfsense.org/issues/6960, and in theory with Kea it is possible to make interesting integrations using hook libraries.
But it should also be possible just to configure pfSense in a way similar to the one in this blog post. I didn't try it myself, but it seems to be legit.
Regarding Proxmox: I don't see any issues here. I have personally been running MAAS in LXD VMs and containers for quite a while.
Yes, I followed a similar guide when I set up pfsense before and it seemed to work fine in that the PXE booting would start without an issue but the cloud-init would fail for some reason.
I think it is possibly an issue with local name resolution or something but when I manually lookup the local name, it resolves to the correct IP address.
Is there a way to override MAAS to use direct IP instead of name resolution for the provisioning environment?
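One way to test the name-resolution theory is to run a small check from another machine on the same network that uses the same DNS server as the PXE clients. The Python sketch below takes a URL, resolves its host name, and tries a plain TCP connection to the port. `check_endpoint` is a hypothetical helper, and the URL you pass in should be whatever cloud-config-url your own grub.cfg contains.

```python
import socket
from urllib.parse import urlsplit


def check_endpoint(url, timeout=3.0):
    """Resolve the host in `url` and try a TCP connection to its port."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or (443 if parts.scheme == "https" else 80)
    result = {"host": host, "port": port, "addresses": [], "reachable": False, "error": None}
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed.
        result["error"] = "resolution failed: %s" % exc
        return result
    result["addresses"] = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        # Resolved fine, but nothing answered on that port (or it was blocked).
        result["error"] = "connection failed: %s" % exc
    return result
```

Calling `check_endpoint("http://<host from cloud-config-url>:<port>/")` and printing the result shows whether the name resolves to the address you expect and whether the port is open. It only tells you about the machine it runs on, so a firewall rule that applies just to the PXE clients could still be in the way.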
But the only error I can find is: Failed to query node's BMC - Request failed with response status code: 401.
This is not a deal breaker, we will be using ipmi for the power control in production, was just wanting to simulate the workflow so having the power control working would be nice here.
Now one issue I am having is getting maas to work on a vlan. It does not seem to want to enable the dhcp server on a vlan, only the untagged network. Once again I can work around this in production but would be a lot easier if it would support vlans.
Anytime I try to enable dhcp for a vlan I get this error: This VLAN is not currently being utilized on any rack controller.
I looked it up online and others said they had the same issue and only untagged would work with dhcp.