Is it possible to customize MAAS to fetch all supported image types via NFS instead of HTTP? HTTP doesn’t perform well under high load and often causes deployment failures. We could try using SquashFS over NFS by customizing curtin_userdata_custom, and we assume it would work. However, that approach may not be flexible enough if other teams use their own custom images.
If this is a question, the answer is no (at least out of the box).
If this is a feature request, it makes sense.
If I may ask, what’s the bottleneck you have when using HTTP? How many machines do you deploy in parallel? How many racks do you have? Do you use a dedicated network for PXE and the data network?
Thank you for the response.
Each team currently manages approximately 40 to 50 pairs of rack controllers, with each pair mapped to a dedicated subnet. We evaluated a consolidation approach that reduces this to ~10 pairs, each serving 5 subnets. However, when one team started a deployment of ~15 machines, parallel deployments from other teams failed because the HTTP service could not handle the requests. Each machine uses a 25 Gbps NIC, which quickly exhausts the available bandwidth during PXE boot and image transfers.
If this could become a feature request in the future, that would be great.
Which version of MAAS do you use? I am curious if you can share additional metrics with us to identify the bottleneck.
Do you have metrics about the inbound/outbound traffic for the racks and the regions? What MAAS version are you using?
We are using version 3.5.4.
When deploying 15 machines in parallel, the network bandwidth reaches its maximum capacity of 25 Gbps.
Do you deploy large custom images?
Yes, we use a custom image to deploy; the size is 8 GB.
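As a rough sanity check on these figures, the sketch below estimates how much data 15 parallel deployments of an 8 GB image have to move and how long that takes at best over a single 25 Gbps link. The single shared link, decimal gigabytes, and zero protocol overhead are assumptions made for illustration; the thread does not confirm the actual topology.

```python
# Back-of-the-envelope estimate; all topology details are assumptions.
IMAGE_SIZE_GB = 8   # custom image size reported in the thread
MACHINES = 15       # machines deployed in parallel
LINK_GBPS = 25      # assumed: one shared 25 Gbps link serves all downloads


def total_transfer_gb(image_gb: float, machines: int) -> float:
    """Total data that has to leave the image server, in gigabytes."""
    return image_gb * machines


def best_case_seconds(total_gb: float, link_gbps: float) -> float:
    """Time to move total_gb over the link at full line rate (8 bits per byte)."""
    return total_gb * 8 / link_gbps


def fair_share_gbps(link_gbps: float, machines: int) -> float:
    """Bandwidth each machine gets if the link is split evenly."""
    return link_gbps / machines


total = total_transfer_gb(IMAGE_SIZE_GB, MACHINES)
print(f"total data: {total} GB")
print(f"best-case transfer time: {best_case_seconds(total, LINK_GBPS):.1f} s")
print(f"per-machine share: {fair_share_gbps(LINK_GBPS, MACHINES):.2f} Gbps")
```

Real transfers will be slower than this lower bound, and any additional deployment started in the same window competes for the same link.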
Thank you for sharing.
Do you also have the installation logs for a machine that failed to deploy due to that?
Sorry for the late reply.
I’ve already released all the machines with failed deployments, so I no longer have access to the logs. However, from what I recall, most of them failed to retrieve the image over HTTP due to a timeout during the download process.
As a workaround, I patched curtin to convert the http:// source to an NFS file:// structure. We haven’t tested it in production yet, but the patch has worked well in our initial tests.
If the NFS fails to mount, we will fall back to using http://.
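The patch itself is not shown in this thread. The following is only a minimal sketch of the idea described above: rewrite an http:// source to a file:// path under an NFS mount, and keep the original URL when the mount or the file is not available. The function name, the mount point `/mnt/maas-images`, and the assumption that the NFS export mirrors the URL path are all hypothetical; this is not curtin's API.

```python
import os
from urllib.parse import urlparse

# Hypothetical mount point where the NFS export with the images is mounted.
NFS_MOUNT = "/mnt/maas-images"


def rewrite_source(source, mount_point=NFS_MOUNT,
                   is_mounted=os.path.ismount, exists=os.path.exists):
    """Return a file:// URL under mount_point, or the original source.

    Assumes the NFS export mirrors the path component of the HTTP URL.
    The is_mounted and exists callables are injectable so the logic can
    be exercised without a real mount.
    """
    parsed = urlparse(source)
    if parsed.scheme not in ("http", "https"):
        return source  # nothing to rewrite
    if not is_mounted(mount_point):
        return source  # NFS not mounted: fall back to HTTP
    local_path = os.path.join(mount_point, parsed.path.lstrip("/"))
    if not exists(local_path):
        return source  # image missing on the export: fall back to HTTP
    return "file://" + local_path


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    url = "http://rack.example/images/custom/root.squashfs"
    print(rewrite_source(url))
```

Checking both the mount and the file before rewriting means a broken or incomplete export degrades to the existing HTTP behaviour instead of failing the deployment.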