Postgresql fails to start - cloud not serialize access due to concurrent update


#1
2019-10-02 11:01:46.641 PDT [13445] maas@maasdb STATEMENT:  DELETE FROM "maasserver_notification" WHERE "maasserver_notification"."id" IN (124)                                           
2019-10-02 11:01:46.904 PDT [13437] maas@maasdb ERROR:  could not serialize access due to concurrent update                                                                               
2019-10-02 11:01:46.904 PDT [13437] maas@maasdb STATEMENT:  UPDATE "maasserver_interface" SET "updated" = '2019-10-02T11:01:46.903121'::timestamp, "name" = 'enp0s25', "vlan_id" = 5001 WHERE "maasserver_interface"."id" = 1
2019-10-02 11:07:31.156 PDT [13605] maas@maasdb LOG:  could not receive data from client: Connection reset by peer

...


2019-10-02 12:03:30.247 PDT [1356] [unknown]@[unknown] LOG:  incomplete startup packet
2019-10-02 12:03:40.470 PDT [2790] maas@maasdb ERROR:  could not serialize access due to concurrent update
2019-10-02 12:03:40.470 PDT [2790] maas@maasdb STATEMENT:  UPDATE "maasserver_service" SET "updated" = '2019-10-02T12:03:40.469559'::timestamp, "status" = 'degraded', "status_info" = '2 processes running but 4 were expected.' WHERE "maasserver_service"."id" = 4

and maas is now barfing with:

=> regiond.log <==
2019-10-02 12:24:48 provisioningserver.rpc.common: [critical] Unhandled failure dispatching AMP command. This is probably a bug. Please ensure that this error is handled within application code or declared in the signature of the b'UpdateServices' command. [maas-controller-2:pid=7306:cmd=UpdateServices:ask=13]
	Traceback (most recent call last):
	  File "/usr/lib/python3/dist-packages/twisted/internet/asyncioreactor.py", line 267, in run
	    self._asyncioEventloop.run_forever()
	  File "/usr/lib/python3/dist-packages/twisted/internet/asyncioreactor.py", line 290, in run
	    f(*args, **kwargs)
	  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 500, in errback
	    self._startRunCallbacks(fail)
	  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 567, in _startRunCallbacks
	    self._runCallbacks()
	--- <exception caught here> ---
	  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 653, in _runCallbacks
	    current.result = callback(current.result, *args, **kw)
	  File "/usr/lib/python3/dist-packages/twisted/protocols/amp.py", line 1171, in checkKnownErrors
	    key = error.trap(*command.allErrors)
	  File "/usr/lib/python3/dist-packages/twisted/python/failure.py", line 359, in trap
	    self.raiseException()
	  File "/usr/lib/python3/dist-packages/twisted/python/failure.py", line 385, in raiseException
	    raise self.value.with_traceback(self.tb)
	  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 250, in inContext
	    result = inContext.theWork()
	  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 266, in <lambda>
	    inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
	  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 122, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 85, in callWithContext
	    return func(*args,**kw)
	  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 885, in callInContext
	    return func(*args, **kwargs)
	  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 234, in wrapper
	    result = func(*args, **kwargs)
	  File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 756, in call_within_transaction
	    return func_outside_txn(*args, **kwargs)
	  File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 563, in retrier
	    return func(*args, **kwargs)
	  File "/usr/lib/python3.6/contextlib.py", line 51, in inner
	    with self._recreate_cm():
	  File "/usr/lib/python3/dist-packages/django/db/transaction.py", line 184, in __enter__
	    connection.set_autocommit(False, force_begin_transaction_with_broken_autocommit=True)
	  File "/usr/lib/python3/dist-packages/django/db/backends/base/base.py", line 411, in set_autocommit
	    self._set_autocommit(autocommit)
	  File "/usr/lib/python3/dist-packages/django/db/backends/postgresql/base.py", line 236, in _set_autocommit
	    self.connection.autocommit = autocommit
	  File "/usr/lib/python3/dist-packages/django/db/utils.py", line 94, in __exit__
	    six.reraise(dj_exc_type, dj_exc_value, traceback)
	  File "/usr/lib/python3/dist-packages/django/utils/six.py", line 685, in reraise
	    raise value.with_traceback(tb)
	  File "/usr/lib/python3/dist-packages/django/db/backends/postgresql/base.py", line 236, in _set_autocommit
	    self.connection.autocommit = autocommit
	django.db.utils.InterfaceError: connection already closed

Definitely something wrong here. I’m trying to decide if I want to file a bug on this or not only it’s not clear to me what the bug is.

I’m trying to run this down but if anyone can pass some insight on this it would be much appreciated.


#2

After running the two queries and re-starting maas, I continued to have issues in the UI (same tab I was using the day before), which simply showed “Loading machines…” but never completed the operation. I was able to log out and log back in, but anything that was related to machines was unresponsive.

I decided to close and open a new tab. Once I did that, the UI became responsive again. So this is resolved, but I still think it’s a bug.