Background: I have a large environment with 1400+ hosts, 3 region controllers (HA with TLS offload via haproxy and a floating VIP via keepalived, all provisioned with terraform and puppet), and 28 rack controllers, which I just rebuilt on 2.9.x.
With MAAS 2.6 it took 3-8 minutes to load the machines page (ouch! Our #1 user complaint). Now with 2.9 it takes about a minute (they’ll complain eventually).
Questions:
Why does it need to load all of them? Search doesn’t work consistently until the FULL LIST is loaded, but in this modern web UI world, what is a JavaScript call to the region to do a search actually going to cost? This master machine list should be cacheable (memcached or similar) so that it’s not so painful to load on big environments. The search IMHO should bypass the cache for queries (sounds crazy, but bear with me): query the DB every time (freshest data) and UPDATE the cache, so that the “main list” of machines is kept current without needing to ask the DB for all of them again any time the tab is reloaded.
Make the number of machines fetched from the DB a configurable setting (it seems to count by 100s; not sure if you’re querying for 100 at a time or not), as well as the number listed per page. Those of us with SSD-backed DBs can probably be more aggressive on some queries if the schema is optimal with proper indexes.
Why does clicking on a machine and then going back force it to RELOAD ALL OF THEM AGAIN, causing yet another thumb-twiddling session? (This should leverage the above-mentioned cache…)
NOTE: Based on your docs for an HA setup, a user is confined to a single region controller, so this cache does not need to be sharded/shared or replicated among other region controllers; it could in effect be a simple memcached process in the snap.
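
To make the idea concrete, here is a rough Python sketch of the behavior I have in mind. It is not MAAS code: `FakeDB`, `MachineListCache`, and `page_size` are hypothetical names, the "DB" is a plain dict standing in for the real database, and the cache is an in-process dict standing in for memcached. The point is only to show the flow: the full list is loaded once, page views are served from the cache, and every search goes to the DB and writes its results back into the cache.

```python
from dataclasses import dataclass


@dataclass
class FakeDB:
    """Hypothetical stand-in for the region database; counts queries."""
    rows: dict            # system_id -> hostname
    full_scans: int = 0   # how many times the whole table was read
    searches: int = 0     # how many targeted searches were run

    def fetch_all(self):
        self.full_scans += 1
        return dict(self.rows)

    def search(self, term):
        self.searches += 1
        return {sid: name for sid, name in self.rows.items() if term in name}


class MachineListCache:
    """Hypothetical per-region-controller cache of the main machine list."""

    def __init__(self, db, page_size=100):
        self.db = db
        self.page_size = page_size  # assumed to be an admin-configurable setting
        self._cache = {}
        self._loaded = False

    def list_page(self, page):
        # Only the first call pays for the full load; later calls
        # (e.g. going back from a machine's detail page) reuse the cache.
        if not self._loaded:
            self._cache = self.db.fetch_all()
            self._loaded = True
        ids = sorted(self._cache)
        start = page * self.page_size
        return [(sid, self._cache[sid]) for sid in ids[start:start + self.page_size]]

    def search(self, term):
        # Always ask the DB so results are fresh, then write them back
        # so the cached main list stays current without a full reload.
        fresh = self.db.search(term)
        self._cache.update(fresh)
        return fresh
```

With something like this, going back to the machine list after viewing one machine would be a cache read rather than a full reload. The sketch deliberately ignores deleted machines and cache expiry, which a real implementation would need to handle.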