Measuring MAAS performance

billwear · 9 February 2023 00:10

The MAAS engineering team actively works to improve the performance of MAAS.

Recent performance measurements

Recently, we improved the API performance of MAAS by testing it under simulated loads. For this testing, we made the following assumptions:

five rack controllers
48 machines per fabric
five VMs per LXD host
three different architectures
six disks per machine, randomly defined as flat, RAID, LVM, and BCACHE disks
five network interfaces per machine
machines in a random status, but mostly Ready or Deployed (which best emulates a real-world scenario)
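
As a rough illustration of how a dataset like this could be generated, here is a minimal Python sketch. It is not the MAAS team's actual generator: the names (Machine, generate_machines), the architecture names, the extra status values, and the status weighting are all assumptions made for the example, and the LXD VM hosts are left out for brevity.

```python
import random
from dataclasses import dataclass, field

# All names and weights below are illustrative assumptions, not MAAS internals.
ARCHITECTURES = ["amd64", "arm64", "ppc64el"]   # three architectures (names assumed)
DISK_LAYOUTS = ["flat", "raid", "lvm", "bcache"]
STATUSES = ["Ready", "Deployed", "New", "Commissioning", "Failed"]
STATUS_WEIGHTS = [40, 40, 8, 7, 5]              # assumed: mostly Ready or Deployed
RACK_CONTROLLERS = 5
MACHINES_PER_FABRIC = 48
DISKS_PER_MACHINE = 6
INTERFACES_PER_MACHINE = 5


@dataclass
class Machine:
    hostname: str
    architecture: str
    status: str
    fabric: int
    rack_controller: int
    disks: list = field(default_factory=list)
    interfaces: list = field(default_factory=list)


def generate_machines(count, seed=0):
    """Return `count` simulated machines; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    machines = []
    for i in range(count):
        machines.append(Machine(
            hostname=f"machine-{i:04d}",
            architecture=rng.choice(ARCHITECTURES),
            status=rng.choices(STATUSES, weights=STATUS_WEIGHTS, k=1)[0],
            fabric=i // MACHINES_PER_FABRIC,        # fill one fabric, then the next
            rack_controller=i % RACK_CONTROLLERS,   # spread across rack controllers
            disks=[rng.choice(DISK_LAYOUTS) for _ in range(DISKS_PER_MACHINE)],
            interfaces=[f"eth{n}" for n in range(INTERFACES_PER_MACHINE)],
        ))
    return machines


# One dataset per size, matching the 10 / 100 / 1000 machine data points.
datasets = {size: generate_machines(size) for size in (10, 100, 1000)}
```

Seeding the random generator means two runs produce the same machines, which makes day-to-day timing comparisons easier to trust.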

To measure performance, we use continuous performance monitoring, arranged like this:

Every day, we generate simulation data based on the assumptions above for 10, 100, and 1000 machines. These three data points give us a sense of how our performance improvements scale. A Jenkins tool exercises both the REST API and the WebSocket API and stores the results in a database, from which we can build a dashboard. The dashboard looks something like this:

Note that we always compare the current stable release with the release in development, so that we can spot issues before they become harder to find and fix. We also capture profiling data that allows us to find bottlenecks, generating a heatmap that shows which parts of the code are causing issues at the moment.
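
The measure-and-store loop described above might look roughly like the following Python sketch. It is an assumption-laden outline rather than the real Jenkins job: fake_list_machines is a hypothetical stand-in for an actual REST or WebSocket request, the table layout is invented for the example, and an in-memory SQLite database stands in for whatever database the team uses.

```python
import sqlite3
import statistics
import time


def time_call(func, repeats=5):
    """Run func several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def record(conn, version, operation, machine_count, seconds):
    """Store one measurement so a dashboard can query it later."""
    conn.execute(
        "INSERT INTO results (version, operation, machine_count, seconds) "
        "VALUES (?, ?, ?, ?)",
        (version, operation, machine_count, seconds),
    )


def fake_list_machines(machine_count):
    """Hypothetical stand-in for a machine-listing API request."""
    return [{"id": i} for i in range(machine_count)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results "
    "(version TEXT, operation TEXT, machine_count INTEGER, seconds REAL)"
)

# Measure the stable release and the development release on the same datasets.
for version in ("stable", "development"):
    for machine_count in (10, 100, 1000):
        seconds = time_call(lambda: fake_list_machines(machine_count))
        record(conn, version, "list_machines", machine_count, seconds)
conn.commit()

rows = conn.execute(
    "SELECT version, machine_count, seconds FROM results "
    "ORDER BY machine_count, version"
).fetchall()
```

Taking the median of several runs is one way to dampen noise from a busy test host; the source does not say which statistic the real pipeline uses.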

For example, comparing MAAS 3.2 to MAAS 3.1, machine listings load, on average, 32% faster for the datasets we’re using.
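
The post does not say how the percentage is computed. One common reading is the relative reduction in response time, averaged over the datasets; the Python sketch below assumes that reading. The function names are hypothetical and the timings in the usage line are made-up placeholders, not MAAS measurements.

```python
def percent_faster(baseline_seconds, candidate_seconds):
    """Relative reduction in time, as a percentage of the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_seconds - candidate_seconds) / baseline_seconds * 100.0


def mean_percent_faster(pairs):
    """Average the per-dataset improvement over (baseline, candidate) pairs."""
    values = [percent_faster(baseline, candidate) for baseline, candidate in pairs]
    return sum(values) / len(values)


# Placeholder timings in seconds, for illustration only.
example = mean_percent_faster([(2.0, 1.5), (4.0, 2.0)])
```

Under this definition, a listing that drops from 2.0 seconds to 1.5 seconds counts as 25% faster.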

Performance efforts to date

Here’s a short history of our performance efforts to date:

This video show-and-tell↗ documents recent efforts to improve MAAS performance, with quantitative results.
Here’s some work done by the UI team↗ to improve the performance of the UI.

Note that this list captures only the larger, sustained efforts. We also work constantly to weed out slowdowns as we come across them.

Collecting your own metrics

It’s possible to collect your own MAAS metrics – and even share them with the MAAS engineering team. We are keen to learn as much as we can about machine counts, network sizes, and MAAS performance in all areas. Please use the Discourse performance forum↗ to share your feedback and observations.

Recent developments

As part of the MAAS 3.2 development effort, we have taken steps to improve the performance of machine listings. To date, we have measured listing a large number of machines (100 to 1000) via the REST API to be 32% faster, on average.

Next steps

We are currently working to improve MAAS performance for other operations, such as search.
