PostgreSQL runs at 99% CPU, MAAS UI/API hangs

Hi MAAS community,

I run the following installation:

$ snap list maas
Name Version Rev Tracking Publisher Notes
maas 3.4.0-14321-g.1027c7664 32469 3.4/stable canonical✓ -

A few days ago I upgraded MAAS from 3.3 to 3.4 using the stable channel. Since then MAAS has been slow: both the UI and the API are slow to respond. I noticed the following processes consuming most of the CPU:

top - 14:42:26 up 3:13, 1 user, load average: 3.83, 3.92, 4.05
Tasks: 519 total, 5 running, 513 sleeping, 1 stopped, 0 zombie
%Cpu(s): 12.1 us, 0.3 sy, 0.0 ni, 87.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 96420.2 total, 87971.4 free, 3217.2 used, 5231.5 buff/cache
MiB Swap: 8192.0 total, 8192.0 free, 0.0 used. 92302.7 avail Mem

PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                          

59793 postgres 20 0 233496 162552 148224 R 99.7 0.2 7:36.39 postgres
60247 postgres 20 0 233524 163360 148992 R 98.0 0.2 6:11.75 postgres
60170 postgres 20 0 233256 162188 148480 R 97.7 0.2 6:28.60 postgres
60257 postgres 20 0 233408 162896 148736 R 97.7 0.2 5:40.31 postgres
1600 root 20 0 723076 94516 18432 S 1.3 0.1 4:26.08 python3
62127 stack 20 0 10896 4352 3328 R 1.0 0.0 0:00.14 top

Any idea why this happens, and how to debug or resolve it?

$ ps -aux | grep maas
root        1092  0.1  0.0  30012 22752 ?        Ss   11:29   0:16 python3 /snap/maas/32469/bin/supervisord -d /var/snap/maas/32469/supervisord -c /var/snap/maas/32469/supervisord/supervisord.conf -n
root        1597  0.0  0.0 3249140 29696 ?       Sl   11:29   0:03 /snap/maas/32469/usr/sbin/named -c /var/snap/maas/32469/bind/named.conf -S 524288 -g
root        1600  2.2  0.0 724236 95028 ?        Sl   11:29   4:36 python3 /snap/maas/32469/bin/rackd
root        1601  1.2  0.1 1053336 122980 ?      Sl   11:29   2:24 python3 /snap/maas/32469/bin/regiond
root        1772  3.5  0.2 943848 201496 ?       Sl   11:29   7:09 python3 /snap/maas/32469/bin/regiond
root        1773  4.9  0.2 1028772 224180 ?      Sl   11:29  10:01 python3 /snap/maas/32469/bin/regiond
root        1774  1.8  0.1 928072 175440 ?       Sl   11:29   3:44 python3 /snap/maas/32469/bin/regiond
root        1776  2.2  0.1 935816 185904 ?       Sl   11:29   4:26 python3 /snap/maas/32469/bin/regiond
root        1781  0.0  0.0  68224 60564 ?        S    11:29   0:00 python3 /snap/maas/32469/bin/maas-rack observe-beacons enp4s0f0np0
root        1782  0.0  0.0  68232 60556 ?        S    11:29   0:00 python3 /snap/maas/32469/bin/maas-rack observe-beacons virbr0
root        1802  0.0  0.0  84356  5120 ?        S    11:29   0:00 /snap/maas/32469/usr/sbin/chronyd -u root -d -f /var/snap/maas/32469/etc/chrony/chrony.conf
root        1845  0.0  0.0  17964  9216 ?        S    11:29   0:00 /snap/maas/32469/usr/bin/tcpdump -Z root --interface virbr0 --direction=in --no-promiscuous-mode --packet-buffered --immediate-mode --snapshot-length=16384 -n -w - (udp dst port 5240) or (vlan and udp dst port 5240)
root        1846  0.0  0.0  17964  8960 ?        S    11:29   0:00 /snap/maas/32469/usr/bin/tcpdump -Z root --interface enp4s0f0np0 --direction=in --no-promiscuous-mode --packet-buffered --immediate-mode --snapshot-length=16384 -n -w - (udp dst port 5240) or (vlan and udp dst port 5240)
root        1987  0.0  0.0 382340  5632 ?        Sl   11:29   0:00 /snap/maas/32469/usr/sbin/rsyslogd -n -f /var/snap/maas/32469/syslog/rsyslog.conf -i /var/snap/maas/32469/syslog/rsyslog.pid
root        2084  0.0  0.0 105452 10752 ?        Sl   11:29   0:00 /snap/maas/32469/usr/sbin/dhcpd -f -4 -pf /var/snap/maas/common/maas/dhcp/dhcpd.pid -cf /var/snap/maas/common/maas/dhcpd.conf -lf /var/snap/maas/common/maas/dhcp/dhcpd.leases enp4s0f0np0
root        2104  0.0  0.0  10368  6656 ?        S    11:30   0:00 nginx: master process /snap/maas/32469/usr/sbin/nginx -c /var/snap/maas/32469/http/nginx.conf
root        3209  0.0  0.0  68224 60480 ?        S    11:37   0:00 python3 /snap/maas/32469/bin/maas-rack observe-mdns
root        3211  0.0  0.0   7152  3072 ?        S    11:37   0:00 /snap/maas/32469/usr/bin/avahi-browse --all --resolve --no-db-lookup --parsable --no-fail
root        3212  0.0  0.0 705836  4396 ?        Sl   11:37   0:04 /snap/maas/32469/usr/sbin/maas-netmon enp4s0f0np0
root        3213  0.0  0.0 704428  3072 ?        Sl   11:37   0:00 /snap/maas/32469/usr/sbin/maas-netmon virbr0
postgres   62681  0.0  0.0 222780 41512 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54740) idle
postgres   62689  0.0  0.0 219448 16844 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54746) idle
postgres   62690  0.0  0.0 219452 17100 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54760) idle
postgres   62691  0.0  0.0 219452 17100 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54776) idle
postgres   62692  0.0  0.0 219452 16844 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54790) idle
postgres   62693  0.0  0.0 219452 16844 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54804) idle
root       62707  0.0  0.0   4796  3328 ?        S    14:46   0:00 /bin/bash -e /snap/maas/32469/bin/run-squid
snap_da+   62711  0.0  0.0  75696 25600 ?        Sl   14:46   0:00 squid -N -d 5 -f /var/snap/maas/32469/proxy/maas-proxy.conf
postgres   62768  0.0  0.0 220456 31180 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54874) idle
postgres   62770  8.3  0.1 228472 158508 ?       Ss   14:46   0:17 postgres: 14/main: stack maas_db 127.0.0.1(54900) idle
postgres   62814  0.0  0.0 220344 28620 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36234) idle
postgres   62816  0.1  0.0 225072 55024 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36254) idle
postgres   62829  1.3  0.0 232324 61536 ?        Ss   14:47   0:02 postgres: 14/main: stack maas_db 127.0.0.1(36378) idle
postgres   62830  0.0  0.0 223840 37444 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36392) idle
postgres   62842 10.2  0.1 229716 163104 ?       Ss   14:47   0:18 postgres: 14/main: stack maas_db 127.0.0.1(36496) idle
postgres   62843  0.0  0.0 221544 41932 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36498) idle
postgres   62844  0.0  0.0 220444 30156 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36512) idle
postgres   62846  0.0  0.0 220444 35020 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36530) idle
postgres   62848  0.0  0.0 221072 39616 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36546) idle
postgres   62850  0.0  0.0 223012 47052 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36572) idle
postgres   62948  0.2  0.0 226212 60788 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(41234) idle
postgres   62961  0.0  0.0 222984 46284 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(41338) idle
postgres   62976  0.3  0.0 223288 61464 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(41454) idle
postgres   63100  0.0  0.0 220444 33996 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(42444) idle
postgres   63107  0.0  0.0 223004 48844 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(42512) idle
postgres   63184  0.0  0.0 220296 32204 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(43232) idle
postgres   63195  0.1  0.0 225528 55120 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(43336) idle
postgres   63216  0.3  0.0 223248 71012 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(43508) idle
postgres   63225  0.0  0.0 221564 47308 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(43586) idle
postgres   63230  0.1  0.0 222960 56088 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(43636) idle
postgres   63314  0.0  0.0 223816 54852 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(48922) idle
postgres   63339 11.7  0.1 233120 164988 ?       Ss   14:47   0:18 postgres: 14/main: stack maas_db 127.0.0.1(49186) idle
postgres   63647  0.0  0.0 220324 29388 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(48854) idle
postgres   63649  0.0  0.0 220444 37580 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(48872) idle
postgres   63651  0.0  0.0 220444 29900 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(48880) idle
postgres   63735  0.0  0.0 223316 53328 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(49606) idle
postgres   63855  0.0  0.0 221568 44236 ?        Ss   14:47   0:00 postgres: 14/main: stack maas_db 127.0.0.1(50714) idle
postgres   64970  0.0  0.0 220900 34764 ?        Ss   14:48   0:00 postgres: 14/main: stack maas_db 127.0.0.1(59358) idle
postgres   65018  0.1  0.0 220432 37068 ?        Ss   14:49   0:00 postgres: 14/main: stack maas_db 127.0.0.1(38084) idle
postgres   65019  0.0  0.0 221008 40032 ?        Ss   14:49   0:00 postgres: 14/main: stack maas_db 127.0.0.1(38094) idle
stack      65154  0.0  0.0   6612  2304 pts/0    S+   14:50   0:00 grep --color=auto maas

@noama Yesterday I migrated an instance from 3.3 to 3.4 and observed the same behaviour. Could you please check whether you are also affected by Bug #2049508 “MAAS has orphan ip addresses and dns records that ...”?

For example, run

select subnet_id, count(*) from maasserver_staticipaddress group by subnet_id;

and

select domain_id, count(*) from maasserver_dnsresource group by domain_id;

and report the results here.

You have to switch to the maasdb database first (or whatever name you gave the database).

You can

sudo -u postgres psql

and then

\c maasdb

and

select subnet_id, count(*) from maasserver_staticipaddress group by subnet_id;

or

sudo -u postgres psql -d maasdb -c "the query"
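If you prefer to script the two checks rather than type them, the one-shot `psql -c` form above can be assembled programmatically so that the query is always passed as a single, correctly quoted argument. This is only a minimal sketch: the helper name `psql_argv` is made up for illustration, the database name is whatever you chose at install time, and it assumes that `sudo -u postgres psql` works on your host as shown above. The script only prints the commands; it does not run them.

```python
import shlex

# The two diagnostic queries suggested above.
QUERIES = [
    "select subnet_id, count(*) from maasserver_staticipaddress group by subnet_id;",
    "select domain_id, count(*) from maasserver_dnsresource group by domain_id;",
]


def psql_argv(database, query):
    """Build the argument list for: sudo -u postgres psql -d <database> -c <query>.

    Hypothetical helper: passing the query as one list element avoids
    shell-quoting mistakes such as a stray newline inside the quotes.
    """
    return ["sudo", "-u", "postgres", "psql", "-d", database, "-c", query]


if __name__ == "__main__":
    # "maasdb" is a placeholder; use the name of your own MAAS database.
    for query in QUERIES:
        print(shlex.join(psql_argv("maasdb", query)))
```

The printed lines can be pasted into a shell, or the same argument list can be handed to `subprocess.run` if you want to execute the queries from Python.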

@r00ta

$ sudo -u postgres psql -d maas_db -c "select subnet_id, count(*) from maasserver_staticipaddress group by subnet_id;"
 subnet_id | count 
-----------+-------
         3 |     1
         2 |     1
         4 |     1
         8 |     1
         7 |     1
           |    32
         1 |    72
(7 rows)


$ sudo -u postgres psql -d maas_db -c "select domain_id, count(*) from maasserver_dnsresource group by domain_id;"
 domain_id | count 
-----------+-------
         0 |     1
(1 row)


Then I’d say no, you are not affected by the bug I linked. How many machines do you have, and did you start a large number of operations at the same time?

I have 40 machines. 31 of them were deployed in batches of 3 nodes at a time, but with MAAS 3.3; they were deployed over and over again with no issue.
Now I try to redeploy 3 machines at a time from the deployed machines (release and deploy), and I also use machines from a pool I keep for temporary deployments. Sometimes this works, and sometimes the deployment hangs on a MAAS operation such as rebooting the node or deploying the node OS.

It has over 7k fabrics, which seem like garbage. Is there a way to clean all of this up?

Hmm, hard to say. I would not suggest any database surgery before understanding why all these subnets are there and how they are still linked to the other tables.

Does this look right?
MAAS is installed on a single node.

$ ps -aux | grep maas
root        1092  0.1  0.0  30148 22752 ?        Ss   11:29   0:30 python3 /snap/maas/32469/bin/supervisord -d /var/snap/maas/32469/supervisord -c /var/snap/maas/32469/supervisord/supervisord.conf -n
root        1597  0.0  0.0 4559860 31232 ?       Sl   11:29   0:06 /snap/maas/32469/usr/sbin/named -c /var/snap/maas/32469/bind/named.conf -S 524288 -g
root        1600  2.2  0.0 724236 95540 ?        Sl   11:29   8:52 python3 /snap/maas/32469/bin/rackd
root        1601  1.1  0.1 1053488 125028 ?      Sl   11:29   4:40 python3 /snap/maas/32469/bin/regiond
root        1772  2.7  0.3 1063824 355884 ?      Sl   11:29  10:47 python3 /snap/maas/32469/bin/regiond
root        1773  3.4  0.3 1095100 365304 ?      Sl   11:29  13:42 python3 /snap/maas/32469/bin/regiond
root        1774  2.1  0.3 1008712 385708 ?      Sl   11:29   8:27 python3 /snap/maas/32469/bin/regiond
root        1776  2.4  0.2 958156 278220 ?       Sl   11:29   9:40 python3 /snap/maas/32469/bin/regiond
root        1781  0.0  0.0  68224 60564 ?        S    11:29   0:00 python3 /snap/maas/32469/bin/maas-rack observe-beacons enp4s0f0np0
root        1782  0.0  0.0  68232 60556 ?        S    11:29   0:00 python3 /snap/maas/32469/bin/maas-rack observe-beacons virbr0
root        1802  0.0  0.0  84356  5120 ?        S    11:29   0:00 /snap/maas/32469/usr/sbin/chronyd -u root -d -f /var/snap/maas/32469/etc/chrony/chrony.conf
root        1845  0.0  0.0  17964  9216 ?        S    11:29   0:00 /snap/maas/32469/usr/bin/tcpdump -Z root --interface virbr0 --direction=in --no-promiscuous-mode --packet-buffered --immediate-mode --snapshot-length=16384 -n -w - (udp dst port 5240) or (vlan and udp dst port 5240)
root        1846  0.0  0.0  17964  8960 ?        S    11:29   0:00 /snap/maas/32469/usr/bin/tcpdump -Z root --interface enp4s0f0np0 --direction=in --no-promiscuous-mode --packet-buffered --immediate-mode --snapshot-length=16384 -n -w - (udp dst port 5240) or (vlan and udp dst port 5240)
root        1987  0.0  0.0 382340  5632 ?        Sl   11:29   0:01 /snap/maas/32469/usr/sbin/rsyslogd -n -f /var/snap/maas/32469/syslog/rsyslog.conf -i /var/snap/maas/32469/syslog/rsyslog.pid
root        2084  0.0  0.0 105452 10752 ?        Sl   11:29   0:01 /snap/maas/32469/usr/sbin/dhcpd -f -4 -pf /var/snap/maas/common/maas/dhcp/dhcpd.pid -cf /var/snap/maas/common/maas/dhcpd.conf -lf /var/snap/maas/common/maas/dhcp/dhcpd.leases enp4s0f0np0
root        2104  0.0  0.0  10368  6656 ?        S    11:30   0:00 nginx: master process /snap/maas/32469/usr/sbin/nginx -c /var/snap/maas/32469/http/nginx.conf
root        3209  0.0  0.0  68224 60480 ?        S    11:37   0:00 python3 /snap/maas/32469/bin/maas-rack observe-mdns
root        3211  0.0  0.0   7152  3072 ?        S    11:37   0:00 /snap/maas/32469/usr/bin/avahi-browse --all --resolve --no-db-lookup --parsable --no-fail
root        3212  0.0  0.0 706092  4476 ?        Sl   11:37   0:17 /snap/maas/32469/usr/sbin/maas-netmon enp4s0f0np0
root        3213  0.0  0.0 704428  3072 ?        Sl   11:37   0:00 /snap/maas/32469/usr/sbin/maas-netmon virbr0
postgres   62689  0.0  0.0 219448 16844 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54746) idle
postgres   62690  0.0  0.0 219452 17100 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54760) idle
postgres   62691  0.0  0.0 219452 17100 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54776) idle
postgres   62692  0.0  0.0 219452 16844 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54790) idle
postgres   62693  0.0  0.0 219452 16844 ?        Ss   14:46   0:00 postgres: 14/main: stack maas_db 127.0.0.1(54804) idle
root       62707  0.0  0.0   4796  3328 ?        S    14:46   0:00 /bin/bash -e /snap/maas/32469/bin/run-squid
snap_da+   62711  0.0  0.0 104040 53248 ?        Sl   14:46   0:11 squid -N -d 5 -f /var/snap/maas/32469/proxy/maas-proxy.conf
postgres   62844  0.0  0.0 223456 96372 ?        Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(36512) idle
postgres   62846  0.0  0.1 223480 102232 ?       Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(36530) idle
postgres   62848  0.0  0.1 226000 103648 ?       Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(36546) idle
postgres   63100  0.0  0.1 226040 102460 ?       Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(42444) idle
postgres   63184  0.0  0.1 225992 104936 ?       Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(43232) idle
postgres   63647  0.0  0.0 223652 97212 ?        Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(48854) idle
postgres   63649  0.0  0.1 226008 106712 ?       Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(48872) idle
postgres   63651  0.0  0.1 223504 104008 ?       Ss   14:47   0:01 postgres: 14/main: stack maas_db 127.0.0.1(48880) idle
postgres   65018  0.0  0.0 223472 98088 ?        Ss   14:49   0:01 postgres: 14/main: stack maas_db 127.0.0.1(38084) idle
postgres  105962  0.1  0.0 228128 53212 ?        Ss   18:00   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36898) idle
postgres  105964  0.0  0.0 223608 43688 ?        Ss   18:00   0:00 postgres: 14/main: stack maas_db 127.0.0.1(36914) idle
postgres  105991  0.0  0.0 225948 50512 ?        Ss   18:00   0:00 postgres: 14/main: stack maas_db 127.0.0.1(59484) idle
postgres  105992  0.0  0.0 227912 49504 ?        Ss   18:00   0:00 postgres: 14/main: stack maas_db 127.0.0.1(59498) idle
postgres  105993  0.0  0.0 223248 42700 ?        Ss   18:00   0:00 postgres: 14/main: stack maas_db 127.0.0.1(59500) idle
postgres  106105  0.0  0.0 227744 47112 ?        Ss   18:00   0:00 postgres: 14/main: stack maas_db 127.0.0.1(50362) idle
postgres  106281  0.0  0.0 224860 40908 ?        Ss   18:01   0:00 postgres: 14/main: stack maas_db 127.0.0.1(33734) idle
postgres  106500  0.0  0.0 224476 41164 ?        Ss   18:01   0:00 postgres: 14/main: stack maas_db 127.0.0.1(42184) idle
postgres  106558  0.0  0.0 223356 40140 ?        Ss   18:01   0:00 postgres: 14/main: stack maas_db 127.0.0.1(42198) idle
postgres  106582  0.0  0.0 224360 41676 ?        Ss   18:01   0:00 postgres: 14/main: stack maas_db 127.0.0.1(33402) idle
postgres  106737  0.0  0.0 228228 47216 ?        Ss   18:01   0:00 postgres: 14/main: stack maas_db 127.0.0.1(33410) idle
postgres  106987  0.0  0.0 222260 36812 ?        Ss   18:02   0:00 postgres: 14/main: stack maas_db 127.0.0.1(46448) idle
postgres  107248  0.0  0.0 222792 37580 ?        Ss   18:02   0:00 postgres: 14/main: stack maas_db 127.0.0.1(38090) idle
postgres  107664  0.0  0.0 222916 39116 ?        Ss   18:03   0:00 postgres: 14/main: stack maas_db 127.0.0.1(34162) idle
postgres  108055  0.0  0.0 222936 38604 ?        Ss   18:03   0:00 postgres: 14/main: stack maas_db 127.0.0.1(53180) idle
postgres  108416  0.0  0.0 221004 34252 ?        Ss   18:03   0:00 postgres: 14/main: stack maas_db 127.0.0.1(50856) idle
postgres  108527  0.0  0.0 221212 35532 ?        Ss   18:04   0:00 postgres: 14/main: stack maas_db 127.0.0.1(41322) idle
postgres  108555  0.0  0.0 221776 36660 ?        Ss   18:04   0:00 postgres: 14/main: stack maas_db 127.0.0.1(56818) idle
postgres  108710  0.0  0.0 221648 36660 ?        Ss   18:04   0:00 postgres: 14/main: stack maas_db 127.0.0.1(37020) idle
postgres  108765  0.0  0.0 220416 30668 ?        Ss   18:04   0:00 postgres: 14/main: stack maas_db 127.0.0.1(41336) idle
postgres  109115  0.0  0.0 220904 35276 ?        Ss   18:04   0:00 postgres: 14/main: stack maas_db 127.0.0.1(52406) idle
postgres  109438  0.0  0.0 221076 35788 ?        Ss   18:05   0:00 postgres: 14/main: stack maas_db 127.0.0.1(50442) idle
postgres  109492  0.0  0.0 221648 36404 ?        Ss   18:05   0:00 postgres: 14/main: stack maas_db 127.0.0.1(44804) idle
postgres  109547  0.0  0.0 220808 33484 ?        Ss   18:05   0:00 postgres: 14/main: stack maas_db 127.0.0.1(44816) idle
postgres  109844  0.0  0.0 221004 34508 ?        Ss   18:05   0:00 postgres: 14/main: stack maas_db 127.0.0.1(52856) idle
postgres  110163  0.1  0.0 221100 36556 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(37732) idle
postgres  110191  0.0  0.0 220752 33228 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(47950) idle
postgres  110215  0.0  0.0 220752 33228 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(47966) idle
postgres  110293  0.1  0.0 220752 33228 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(47790) idle
postgres  110323  0.0  0.0 220752 32972 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(47798) idle
postgres  110351  0.1  0.0 220752 33228 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(47814) idle
postgres  110380  0.4  0.0 220752 33228 ?        Ss   18:06   0:00 postgres: 14/main: stack maas_db 127.0.0.1(35486) idle
stack     110411  100  0.1 163868 117472 pts/0   R+   18:06   0:02 python3 /snap/maas/32469/bin/maas admin fabric delete 412
stack     110439  0.0  0.0   6612  2304 pts/1    S+   18:06   0:00 grep --color=auto maas

I'm not a Linux expert, but it seems like any API call spikes like that.

Every 2.0s: ps -aux --sort -pcpu                                                                    co-node-20: Wed Jan 17 18:11:51 2024

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
stack     114220 72.0  0.1 165908 118012 pts/0   R+   18:11   0:02 python3 /snap/maas/32469/bin/maas admin fabric delete 437
root        1773  3.4  0.3 1095100 365304 ?      Sl   11:29  13:45 python3 /snap/maas/32469/bin/regiond
root        1772  2.6  0.3 1063824 355884 ?      Sl   11:29  10:50 python3 /snap/maas/32469/bin/regiond
root        1776  2.4  0.3 995020 344060 ?       Sl   11:29  10:00 python3 /snap/maas/32469/bin/regiond
root        1600  2.2  0.0 724236 9554

/snap/maas/32469/bin/maas admin fabric delete 437 is actually the CLI that is consuming CPU. This is “expected”.
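To see at a glance which kind of process accounts for the CPU in a `ps -aux` listing like the one above, the rows can be grouped by command. The following is a minimal Python sketch and not part of MAAS: the category labels and the `classify` and `cpu_by_category` helpers are made up for illustration, and the sample rows are copied from the listing above. Keep in mind that procps `ps` reports %CPU as CPU time divided by the lifetime of the process, so a CLI call that has been busy since it started two seconds ago shows a high value.

```python
from collections import defaultdict


def classify(command):
    """Map a ps COMMAND string to a coarse category (labels are illustrative)."""
    if command.startswith("postgres:"):
        return "postgres backend"
    if "/bin/regiond" in command:
        return "regiond"
    if "/bin/rackd" in command:
        return "rackd"
    if "/bin/maas " in command:  # trailing space excludes /bin/maas-rack
        return "maas CLI"
    return "other"


def cpu_by_category(ps_output):
    """Sum the %CPU column of `ps -aux` output per category."""
    totals = defaultdict(float)
    for line in ps_output.splitlines():
        # ps -aux prints 11 columns; COMMAND is the last and may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and truncated or empty lines
        try:
            cpu = float(fields[2])
        except ValueError:
            continue
        totals[classify(fields[10])] += cpu
    return dict(totals)


SAMPLE = """\
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
stack     114220 72.0  0.1 165908 118012 pts/0   R+   18:11   0:02 python3 /snap/maas/32469/bin/maas admin fabric delete 437
root        1773  3.4  0.3 1095100 365304 ?      Sl   11:29  13:45 python3 /snap/maas/32469/bin/regiond
root        1772  2.6  0.3 1063824 355884 ?      Sl   11:29  10:50 python3 /snap/maas/32469/bin/regiond
"""

if __name__ == "__main__":
    ranked = sorted(cpu_by_category(SAMPLE).items(), key=lambda kv: -kv[1])
    for category, cpu in ranked:
        print(f"{category:18s} {cpu:6.1f}")
```

On the sample rows the CLI process dominates and the regiond workers stay in the low single digits, which matches the reading that the spike comes from the client-side command rather than from the MAAS services.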

I would suggest setting up monitoring (https://maas.io/docs/monitoring-maas-activities) so that you have evidence of MAAS performance and of possible regressions after an upgrade.

@r00ta what about this?

Every 2.0s: ps -aux --sort -pcpu                                                                    co-node-20: Wed Jan 17 20:00:33 2024

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres  194713 55.2  0.1 233436 163876 ?       Rs   19:52   4:28 postgres: 14/main: stack maas_db 127.0.0.1(58976) SELECT
postgres  196433 52.2  0.1 227132 157228 ?       Ss   20:00   0:16 postgres: 14/main: stack maas_db 127.0.0.1(56766) idle
postgres  192666 36.2  0.1 233772 163644 ?       Ss   19:49   3:52 postgres: 14/main: stack maas_db 127.0.0.1(48158) idle
postgres  193984 33.9  0.1 233480 162960 ?       Ss   19:51   3:02 postgres: 14/main: stack maas_db 127.0.0.1(55246) idle
postgres  190469 31.2  0.1 233636 163932 ?       Rs   19:47   4:12 postgres: 14/main: stack maas_db 127.0.0.1(47984) SELECT
postgres  196295 29.1  0.1 228380 158716 ?       Ss   19:59   0:16 postgres: 14/main: stack maas_db 127.0.0.1(48748) idle
postgres  196296 29.0  0.1 227100 157320 ?       Ss   19:59   0:16 postgres: 14/main: stack maas_db 127.0.0.1(48764) idle
postgres  191153 28.7  0.1 233612 163220 ?       Rs   19:47   3:37 postgres: 14/main: stack maas_db 127.0.0.1(47224) SELECT
postgres  192421 27.9  0.1 233452 163608 ?       Ss   19:49   3:03 postgres: 14/main: stack maas_db 127.0.0.1(53346) idle
postgres  190972 24.0  0.1 233512 162328 ?       Rs   19:47   3:04 postgres: 14/main: stack maas_db 127.0.0.1(55056) SELECT
postgres  196174 20.1  0.1 229824 162192 ?       Ss   19:58   0:21 postgres: 14/main: stack maas_db 127.0.0.1(51384) idle
postgres  190704 18.8  0.1 233604 164160 ?       Ss   19:47   2:29 postgres: 14/main: stack maas_db 127.0.0.1(43186) idle
postgres  196175 16.5  0.1 232540 162516 ?       Ss   19:58   0:17 postgres: 14/main: stack maas_db 127.0.0.1(51396) idle
postgres  193648  8.8  0.1 233024 163396 ?       Ss   19:51   0:50 postgres: 14/main: stack maas_db 127.0.0.1(36768) idle 

Hello @noama

Is there anything interesting in

SELECT pid, datname, usename, query FROM pg_stat_activity;

Also, if you have the pg_stat_statements extension enabled on your PostgreSQL server, you could inspect which statements take the most time with something like this:

SELECT 
 pss.userid,
 pss.dbid,
 pd.datname as db_name,
 round((pss.total_exec_time + pss.total_plan_time)::numeric, 2) as total_time, 
 pss.calls, 
 round((pss.mean_exec_time+pss.mean_plan_time)::numeric, 2) as mean, 
 round((100 * (pss.total_exec_time + pss.total_plan_time) / sum((pss.total_exec_time + pss.total_plan_time)::numeric) OVER ())::numeric, 2) as cpu_portion_pctg,
 substr(pss.query, 1, 200) short_query
FROM pg_stat_statements pss, pg_database pd 
WHERE pd.oid=pss.dbid
ORDER BY (pss.total_exec_time + pss.total_plan_time)
DESC LIMIT 30;
  pid   | datname | usename  |                                                                                                                                                                                     query                                                                                                                                                                                      
--------+---------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   5186 |         |          | 
   5188 |         | postgres | 
  23217 | maas_db | stack    | COMMIT
 145755 | maas_db | stack    | COMMIT
  23227 | maas_db | stack    | LISTEN sys_vault_migration;
  23231 | maas_db | stack    | COMMIT
  23230 | maas_db | stack    | COMMIT
  23232 | maas_db | stack    | COMMIT
  23242 | maas_db | stack    | COMMIT
  23238 | maas_db | stack    | SELECT "maasserver_rdns"."id", "maasserver_rdns"."created", "maasserver_rdns"."updated", "maasserver_rdns"."ip", "maasserver_rdns"."hostname", "maasserver_rdns"."hostnames", "maasserver_rdns"."observer_id" FROM "maasserver_rdns" WHERE ("maasserver_rdns"."ip" = '10.209.86.96'::inet AND "maasserver_rdns"."observer_id" = 1) ORDER BY "maasserver_rdns"."id" ASC LIMIT 1
  23244 | maas_db | stack    | COMMIT
  23243 | maas_db | stack    | SET default_transaction_isolation TO DEFAULT
  23250 | maas_db | stack    | COMMIT
 158126 | maas_db | stack    | COMMIT
 151720 | maas_db | stack    | SELECT "django_session"."session_key" FROM "django_session" WHERE ("django_session"."expire_date" > '2024-01-18T07:28:04.957851'::timestamp AND "django_session"."session_key" IN ('6wywxrw60nwggv8223mathghnva1v88z'))
 156390 | maas_db | stack    | COMMIT
 156707 | maas_db | stack    | COMMIT
 155141 | maas_db | stack    | COMMIT
 157824 | maas_db | stack    | COMMIT
 156599 | maas_db | stack    | COMMIT
  23360 | maas_db | stack    | UNLISTEN sys_dhcp_1;
  23367 | maas_db | stack    | LISTEN sys_dhcp_1;
 155147 | maas_db | stack    | COMMIT
  23376 | maas_db | stack    | LISTEN sys_core_25;
  23383 | maas_db | stack    | UNLISTEN sys_dhcp_1;
 155143 | maas_db | stack    | SELECT "django_session"."session_key" FROM "django_session" WHERE ("django_session"."expire_date" > '2024-01-18T07:27:59.957833'::timestamp AND "django_session"."session_key" IN ('6wywxrw60nwggv8223mathghnva1v88z'))
 156710 | maas_db | stack    | SET default_transaction_isolation TO DEFAULT
 156698 | maas_db | stack    | COMMIT
 156708 | maas_db | stack    | COMMIT
 157085 | maas_db | stack    | COMMIT
 157850 | maas_db | stack    | COMMIT
 156712 | maas_db | stack    | SELECT "django_session"."session_key" FROM "django_session" WHERE ("django_session"."expire_date" > '2024-01-18T07:28:09.957884'::timestamp AND "django_session"."session_key" IN ('6wywxrw60nwggv8223mathghnva1v88z'))
 156003 | maas_db | stack    | SELECT "django_session"."session_key" FROM "django_session" WHERE ("django_session"."expire_date" > '2024-01-18T07:27:49.957553'::timestamp AND "django_session"."session_key" IN ('6wywxrw60nwggv8223mathghnva1v88z'))
 155145 | maas_db | stack    | SELECT "django_session"."session_key" FROM "django_session" WHERE ("django_session"."expire_date" > '2024-01-18T07:27:54.957605'::timestamp AND "django_session"."session_key" IN ('6wywxrw60nwggv8223mathghnva1v88z'))
 156688 | maas_db | stack    | COMMIT
 156389 | maas_db | stack    | SET default_transaction_isolation TO DEFAULT
 156391 | maas_db | stack    | COMMIT
 155893 | maas_db | stack    | COMMIT
 156713 | maas_db | stack    | COMMIT
 156781 | maas_db | stack    | COMMIT
 155682 | maas_db | stack    | COMMIT
 156626 | maas_db | stack    | COMMIT
 156709 | maas_db | stack    | COMMIT
 157742 | maas_db | stack    | COMMIT
 158246 | maas_db | postgres | SELECT pid, datname, usename, query FROM pg_stat_activity;                                                                                                                                                                                                                                                                                                                    +
        |         |          | 
   5184 |         |          | 
   5183 |         |          | 
   5185 |         |          | 
(48 rows)
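A long pg_stat_activity listing like the one above is easier to read once it is tallied by statement type. The sketch below does that for pasted psql output; `tally_queries` is a hypothetical helper and the sample rows are a shortened copy of the listing. Note that for sessions that are not active, the `query` column holds the last statement that ran, so a row showing COMMIT is usually an idle connection rather than running work. Adding the `state` and `query_start` columns to the SELECT would make that distinction explicit.

```python
from collections import Counter


def tally_queries(psql_table):
    """Count pg_stat_activity rows by the first keyword of the query column.

    Expects psql's default aligned output with the columns
    pid | datname | usename | query, as in the listing above.
    """
    counts = Counter()
    for line in psql_table.splitlines():
        cells = [cell.strip() for cell in line.split("|", 3)]
        if len(cells) != 4 or not cells[0].isdigit():
            continue  # header, separator, continuation line, or row-count footer
        query = cells[3]
        keyword = query.split(None, 1)[0].upper() if query else "(empty)"
        counts[keyword] += 1
    return counts


SAMPLE = """\
  pid   | datname | usename  | query
--------+---------+----------+-------
   5186 |         |          |
  23217 | maas_db | stack    | COMMIT
  23227 | maas_db | stack    | LISTEN sys_vault_migration;
  23243 | maas_db | stack    | SET default_transaction_isolation TO DEFAULT
 145755 | maas_db | stack    | COMMIT
(5 rows)
"""

if __name__ == "__main__":
    for keyword, count in tally_queries(SAMPLE).most_common():
        print(f"{keyword:10s} {count}")
```

Rows with an empty query column are typically PostgreSQL's own background processes, which is why they are counted separately here.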


maas_db=# SELECT 
 pss.userid,
 pss.dbid,
 pd.datname as db_name,
 round((pss.total_exec_time + pss.total_plan_time)::numeric, 2) as total_time, 
 pss.calls, 
 round((pss.mean_exec_time+pss.mean_plan_time)::numeric, 2) as mean, 
 round((100 * (pss.total_exec_time + pss.total_plan_time) / sum((pss.total_exec_time + pss.total_plan_time)::numeric) OVER ())::numeric, 2) as cpu_portion_pctg,
 substr(pss.query, 1, 200) short_query
FROM pg_stat_statements pss, pg_database pd 
WHERE pd.oid=pss.dbid
ORDER BY (pss.total_exec_time + pss.total_plan_time)
DESC LIMIT 30;
 userid |  dbid  | db_name | total_time | calls | mean  | cpu_portion_pctg |                                                                                               short_query                                                                                                
--------+--------+---------+------------+-------+-------+------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  16384 | 118108 | maas_db |      21.26 |     2 | 10.63 |            17.92 | UPDATE "maasserver_staticipaddress" SET "created" = $1::timestamp, "updated" = $2::timestamp, "ip" = $3::inet, "alloc_type" = $4, "subnet_id" = $5, "user_id" = $6, "lease_time" = $7, "temp_expires_on"
     10 | 118108 | maas_db |      12.45 |     3 |  4.15 |            10.49 | SELECT                                                                                                                                                                                                  +
        |        |         |            |       |       |                  |  pss.userid,                                                                                                                                                                                            +
        |        |         |            |       |       |                  |  pss.dbid,                                                                                                                                                                                              +
        |        |         |            |       |       |                  |  pd.datname as db_name,                                                                                                                                                                                 +
        |        |         |            |       |       |                  |  round((pss.total_exec_time + pss.total_plan_time)::numeric, $1) as total_time,                                                                                                                         +
        |        |         |            |       |       |                  |  pss.calls,                                                                                                                                                                                             +
        |        |         |            |       |       |                  |  round((pss.mean_exec_time+pss.mean_plan_time)::nu
  16384 | 118108 | maas_db |      10.64 |    90 |  0.12 |             8.97 | SELECT "maasserver_node"."id", "maasserver_node"."created", "maasserver_node"."updated", "maasserver_node"."system_id", "maasserver_node"."hardware_uuid", "maasserver_node"."hostname", "maasserver_nod
  16384 | 118108 | maas_db |      10.23 |     6 |  1.71 |             8.62 | UPDATE "maasserver_neighbour" SET "updated" = $1::timestamp, "time" = $2, "count" = $3 WHERE "maasserver_neighbour"."id" = $4
  16384 | 118108 | maas_db |       7.78 |    39 |  0.20 |             6.56 | SET TIME ZONE 'UTC'
  16384 | 118108 | maas_db |       5.34 |     8 |  0.67 |             4.50 | SELECT DISTINCT "maasserver_node"."id", "maasserver_node"."created", "maasserver_node"."updated", "maasserver_node"."system_id", "maasserver_node"."hardware_uuid", "maasserver_node"."hostname", "maass
  16384 | 118108 | maas_db |       4.47 |     1 |  4.47 |             3.77 | SELECT "maasserver_vlan"."id", ("maasserver_vlan"."primary_rack_id" IS NOT NULL AND "maasserver_vlan"."secondary_rack_id" IS NOT NULL) AS "is_ha" FROM "maasserver_vlan"
  16384 | 118108 | maas_db |       2.51 |     1 |  2.51 |             2.12 | UPDATE "maasserver_controllerinfo" SET "created" = $1::timestamp, "updated" = $2::timestamp, "version" = $3, "update_version" = $4, "update_origin" = $5, "update_first_reported" = $6, "install_type" =
  16384 | 118108 | maas_db |       1.94 |    17 |  0.11 |             1.64 | SELECT "maasserver_node"."id", "maasserver_node"."created", "maasserver_node"."updated", "maasserver_node"."system_id", "maasserver_node"."hardware_uuid", "maasserver_node"."hostname", "maasserver_nod
  16384 | 118108 | maas_db |       1.74 |    72 |  0.02 |             1.47 | SELECT "maasserver_regioncontrollerprocessendpoint"."id", "maasserver_regioncontrollerprocessendpoint"."created", "maasserver_regioncontrollerprocessendpoint"."updated", "maasserver_regioncontrollerpr
  16384 | 118108 | maas_db |       1.71 |    80 |  0.02 |             1.44 | SELECT "maasserver_service"."id", "maasserver_service"."created", "maasserver_service"."updated", "maasserver_service"."node_id", "maasserver_service"."name", "maasserver_service"."status", "maasserve
  16384 | 118108 | maas_db |       1.67 |    16 |  0.10 |             1.41 | SELECT "maasserver_node"."id", "maasserver_node"."created", "maasserver_node"."updated", "maasserver_node"."system_id", "maasserver_node"."hardware_uuid", "maasserver_node"."hostname", "maasserver_nod
  16384 | 118108 | maas_db |       1.53 |     1 |  1.53 |             1.29 | SELECT DISTINCT ON (fqdn, is_boot, family)                                                                                                                                                              +
        |        |         |            |       |       |                  |                 CONCAT(node.hostname, $1, domain.name) AS fqdn,                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.system_id,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.node_type,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 staticip.user
  16384 | 118108 | maas_db |       1.31 |     1 |  1.31 |             1.11 | SELECT DISTINCT ON (fqdn, is_boot, family)                                                                                                                                                              +
        |        |         |            |       |       |                  |                 CONCAT(node.hostname, $1, domain.name) AS fqdn,                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.system_id,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.node_type,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 staticip.user
  16384 | 118108 | maas_db |       1.31 |    28 |  0.05 |             1.10 | SELECT "maasserver_staticipaddress"."id", "maasserver_staticipaddress"."created", "maasserver_staticipaddress"."updated", "maasserver_staticipaddress"."ip", "maasserver_staticipaddress"."alloc_type", 
  16384 | 118108 | maas_db |       1.25 |     1 |  1.25 |             1.05 | SELECT "maasserver_event"."id", "maasserver_event"."created", "maasserver_event"."updated", "maasserver_event"."type_id", "maasserver_event"."node_id", "maasserver_event"."node_system_id", "maasserver
  16384 | 118108 | maas_db |       1.25 |    18 |  0.07 |             1.05 | SELECT "maasserver_node"."id", "maasserver_node"."created", "maasserver_node"."updated", "maasserver_node"."system_id", "maasserver_node"."hardware_uuid", "maasserver_node"."hostname", "maasserver_nod
  16384 | 118108 | maas_db |       1.18 |     1 |  1.18 |             0.99 | SELECT                                                                                                                                                                                                  +
        |        |         |            |       |       |                  |                 COALESCE(dnsrr.fqdn, node.fqdn) AS fqdn,                                                                                                                                                +
        |        |         |            |       |       |                  |                 node.system_id,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.node_type,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 staticip.user_id,                                                                                                                                                                       +
        |        |         |            |       |       |                  |                                                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 COALE
  16384 | 118108 | maas_db |       1.05 |    16 |  0.07 |             0.89 | SELECT "maasserver_service"."id", "maasserver_service"."created", "maasserver_service"."updated", "maasserver_service"."node_id", "maasserver_service"."name", "maasserver_service"."status", "maasserve
  16384 | 118108 | maas_db |       1.02 |     6 |  0.17 |             0.86 | SELECT "maasserver_interface"."id", "maasserver_interface"."created", "maasserver_interface"."updated", "maasserver_interface"."node_config_id", "maasserver_interface"."name", "maasserver_interface"."
  16384 | 118108 | maas_db |       1.01 |     2 |  0.50 |             0.85 | SELECT DISTINCT ON ("maasserver_scriptresult"."script_name", "maasserver_scriptresult"."physical_blockdevice_id", "maasserver_scriptresult"."interface_id", "maasserver_scriptset"."node_id") "maasserve
  16384 | 118108 | maas_db |       1.00 |     1 |  1.00 |             0.84 | SELECT                                                                                                                                                                                                  +
        |        |         |            |       |       |                  |                 CONCAT(node.hostname, $1, domain.name) AS fqdn,                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.system_id,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.node_type,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.owner_id AS user_id,                                                                                                                                                               +
        |        |         |            |       |       |                  |                                                                                                                                                                                                         +
        |        |         |            |       |       |                  |       
  16384 | 118108 | maas_db |       0.94 |    53 |  0.02 |             0.79 | SELECT "maasserver_config"."id", "maasserver_config"."name", "maasserver_config"."value" FROM "maasserver_config" WHERE "maasserver_config"."name" = $1 LIMIT $2
  16384 | 118108 | maas_db |       0.92 |     6 |  0.15 |             0.77 | SELECT "maasserver_interface"."id", "maasserver_interface"."created", "maasserver_interface"."updated", "maasserver_interface"."node_config_id", "maasserver_interface"."name", "maasserver_interface"."
  16384 | 118108 | maas_db |       0.88 |     6 |  0.15 |             0.74 | SELECT "maasserver_neighbour"."id", "maasserver_neighbour"."created", "maasserver_neighbour"."updated", "maasserver_neighbour"."ip", "maasserver_neighbour"."time", "maasserver_neighbour"."vid", "maass
  16384 | 118108 | maas_db |       0.87 |    48 |  0.02 |             0.73 | SELECT "maasserver_regionrackrpcconnection"."id", "maasserver_regionrackrpcconnection"."created", "maasserver_regionrackrpcconnection"."updated", "maasserver_regionrackrpcconnection"."endpoint_id", "m
  16384 | 118108 | maas_db |       0.87 |     1 |  0.87 |             0.73 | SELECT                                                                                                                                                                                                  +
        |        |         |            |       |       |                  |                 CONCAT(node.hostname, $1, domain.name) AS fqdn,                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.system_id,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.node_type,                                                                                                                                                                         +
        |        |         |            |       |       |                  |                 node.owner_id AS user_id,                                                                                                                                                               +
        |        |         |            |       |       |                  |                                                                                                                                                                                                         +
        |        |         |            |       |       |                  |       
  16384 | 118108 | maas_db |       0.85 |     1 |  0.85 |             0.72 | SELECT "maasserver_node"."id", "maasserver_node"."created", "maasserver_node"."updated", "maasserver_node"."system_id", "maasserver_node"."hardware_uuid", "maasserver_node"."hostname", "maasserver_nod
  16384 | 118108 | maas_db |       0.83 |    12 |  0.07 |             0.70 | SELECT COUNT(*) AS "__count" FROM "maasserver_node" WHERE ("maasserver_node"."node_type" IN ($1, $2) AND NOT (EXISTS(SELECT ($3) AS "a" FROM "maasserver_regioncontrollerprocess" U1 WHERE (U1."id" IN (
  16384 | 118108 | maas_db |       0.83 |    12 |  0.07 |             0.70 | UPDATE "maasserver_regioncontrollerprocess" SET "created" = $1::timestamp, "updated" = $2::timestamp, "region_id" = $3, "pid" = $4 WHERE "maasserver_regioncontrollerprocess"."id" = $5
(30 rows)


I’ve discussed this case with Noam, and cleaning up the unused fabrics significantly improved the performance.

When servers are deleted and re-enlisted, NICs that don’t get an IP address are placed in a new fabric, and the old fabrics are left behind. This environment had accumulated over 2000 fabrics.
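For anyone wanting to find cleanup candidates, here is a minimal sketch, not an official MAAS tool. It assumes you have saved the JSON output of `maas $PROFILE fabrics read`, `maas $PROFILE subnets read` and `maas $PROFILE machines read`, and that subnets and interfaces carry a `vlan` object with a `fabric_id` field (please verify against your own output). The function names are made up for illustration. It only looks at the nodes you pass in, so controllers and devices would need to be included too before deleting anything.

```python
import json


def used_fabric_ids(subnets, nodes):
    """Collect the ids of fabrics referenced by any subnet or node interface.

    Assumption: each subnet and each interface has a "vlan" object (possibly
    null) that contains a "fabric_id" field.
    """
    used = set()
    for subnet in subnets:
        vlan = subnet.get("vlan") or {}
        if "fabric_id" in vlan:
            used.add(vlan["fabric_id"])
    for node in nodes:
        for iface in node.get("interface_set") or []:
            vlan = iface.get("vlan") or {}
            if "fabric_id" in vlan:
                used.add(vlan["fabric_id"])
    return used


def unused_fabrics(fabrics, subnets, nodes):
    """Return fabrics that no subnet and no listed interface refers to."""
    used = used_fabric_ids(subnets, nodes)
    return [fabric for fabric in fabrics if fabric["id"] not in used]


def load(path):
    """Load one of the saved `maas ... read` JSON dumps."""
    with open(path) as handle:
        return json.load(handle)
```

Usage would be something like `unused_fabrics(load("fabrics.json"), load("subnets.json"), load("machines.json"))`, then reviewing the list by hand before running `maas $PROFILE fabric delete <id>` for each entry. As far as I know the default fabric cannot be deleted, so expect an error if it shows up in the list.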

Inspecting the UI/API calls, we saw sequential queries requesting additional information for each fabric, one fabric at a time, and with that many fabrics this took a lot of time.
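To illustrate why per-fabric queries hurt at this scale, here is a small self-contained sketch using an in-memory SQLite database. The `fabric` and `vlan` tables are a hypothetical, simplified schema and not MAAS's real one, and this is not the code MAAS runs. It only shows the general pattern: one query per fabric versus a single grouped query.

```python
import sqlite3


def build_db(n_fabrics):
    """Create a toy database with n_fabrics fabrics, one VLAN each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fabric (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE vlan (id INTEGER PRIMARY KEY, fabric_id INTEGER, vid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO fabric VALUES (?, ?)",
        [(i, f"fabric-{i}") for i in range(n_fabrics)],
    )
    conn.executemany(
        "INSERT INTO vlan (fabric_id, vid) VALUES (?, 0)",
        [(i,) for i in range(n_fabrics)],
    )
    return conn


def vlan_counts_sequential(conn):
    """One query to list fabrics, then one more query per fabric."""
    queries = 1
    counts = {}
    for (fabric_id,) in conn.execute("SELECT id FROM fabric").fetchall():
        row = conn.execute(
            "SELECT COUNT(*) FROM vlan WHERE fabric_id = ?", (fabric_id,)
        ).fetchone()
        counts[fabric_id] = row[0]
        queries += 1
    return counts, queries


def vlan_counts_batched(conn):
    """The same information fetched with a single grouped query."""
    rows = conn.execute(
        "SELECT f.id, COUNT(v.id) FROM fabric f "
        "LEFT JOIN vlan v ON v.fabric_id = f.id GROUP BY f.id"
    ).fetchall()
    return dict(rows), 1
```

With 2000 fabrics the sequential version issues 2001 queries and the batched version issues one, with identical results. Each query also costs a round trip to the database, which is why removing the unused fabrics made such a difference here.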

Thank you @jasimioni for reporting the result back to the community.

Thank you @jasimioni