libpq shipped with PostgreSQL 10 supports multiple hostnames and IP addresses in the connection string and iterates over them until it finds an endpoint it can connect to. It also supports resolving a single hostname backed by multiple address records and iterating over those. However, target_session_attrs needs to be passed in the connection parameters so that libpq always connects to the master instance, not the slaves.
# without target_session_attrs
python -c "import psycopg2; psycopg2.connect('user=postgres password=postgres host=pg1,pg2,pg2andpg3 port=5432,5432,5432 dbname=maas connect_timeout=3')"

# with target_session_attrs
python -c "import psycopg2; psycopg2.connect('user=postgres password=postgres host=pg1,pg2,pg2andpg3 port=5432,5432,5432 dbname=maas connect_timeout=3 target_session_attrs=read-write')"
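The same multi-host parameters can also be expressed as a libpq connection URI (supported since PostgreSQL 10). A minimal sketch, assuming the same placeholder hostnames and credentials as the example above; make_multihost_uri is a hypothetical helper, not part of psycopg2:

```python
def make_multihost_uri(hosts, ports, dbname, user, password,
                       target_session_attrs="read-write"):
    """Build a libpq multi-host connection URI (hypothetical helper)."""
    # libpq accepts a comma-separated list of host:port pairs in the URI.
    hostports = ",".join(f"{h}:{p}" for h, p in zip(hosts, ports))
    return (f"postgresql://{user}:{password}@{hostports}/{dbname}"
            f"?target_session_attrs={target_session_attrs}&connect_timeout=3")

uri = make_multihost_uri(["pg1", "pg2", "pg2andpg3"], [5432, 5432, 5432],
                         "maas", "postgres", "postgres")
# psycopg2.connect(uri) would then let libpq try pg1, pg2, pg2andpg3 in order.
```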
The documentation explains that target_session_attrs=read-write needs to be passed in order to avoid connecting to read-only slaves:
“any”, meaning that any kind of servers can be accepted. This is as well the default value.
“read-write”, to disallow connections to read-only servers, hot standbys for example.
“If a failover happens and a standby is promoted and switches to be a primary, target_session_attrs can be used in read-write mode with the addresses of all the nodes of the cluster to allow the application to connect to a primary for read-write actions or any nodes for read-only actions.”
MAAS relies on Django and its psycopg2 backend to connect to PostgreSQL.
However, MAAS does not allow passing OPTIONS, which could contain “target_session_attrs=read-write”.
The Django documentation describes OPTIONS as: “Extra parameters to use when connecting to the database. Available parameters vary depending on your database backend.”
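If MAAS did expose OPTIONS, the setting would look roughly like the sketch below. Django passes the OPTIONS dict as extra keyword arguments to psycopg2.connect(), which hands unrecognized parameters such as target_session_attrs to libpq; the hostnames and credentials are the same placeholders used earlier:

```python
# Hypothetical Django settings fragment; MAAS does not currently expose this.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "maas",
        "USER": "postgres",
        "PASSWORD": "postgres",
        "HOST": "pg1,pg2,pg2andpg3",
        "PORT": "5432,5432,5432",
        "OPTIONS": {
            # Forwarded to psycopg2.connect(), which passes it to libpq.
            "target_session_attrs": "read-write",
            "connect_timeout": 3,
        },
    }
}
```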
Using the default setting target_session_attrs=any should also work, but if we know that there is only one master in the cluster it is better to be explicit.
“This simplifies the logic at application level: there is no need for it to know exactly which node is the primary and which ones are the standbys. The cost though, is an increase in connection failures when using the read-write mode, but that may be acceptable if the cluster is in a low-latency environment.”
On failover handling:
From the perspective of a VIP-and-gratuitous-ARP type of failover on a single subnet (with Pacemaker managing the VIP and GARP), there does not seem to be a big difference from the multi-endpoint setup:
We do not have any TCP connection state synchronization between PostgreSQL nodes, as LVS does (http://www.linuxvirtualserver.org/docs/sync.html), or any other connection state replication relevant to PostgreSQL; there is only data replication;
on failover, a client re-creates a TCP connection to the same VIP endpoint.
With multiple endpoints, the client library simply tries several of them (when creating a new connection, or during failover handling) before declaring the connection failed.
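The application-side half of this failover behaviour can be sketched as a plain reconnect loop; libpq itself iterates over the listed endpoints within each attempt. This is an illustrative sketch, not MAAS code: connect stands in for a psycopg2.connect call with the multi-host DSN, and connect_with_retry is a hypothetical helper.

```python
import time

def connect_with_retry(connect, retries=5, delay=1.0):
    """Retry a fresh connection attempt after a failure.

    Each call to connect() lets libpq walk the full host list again,
    so a promoted standby becomes reachable on a later attempt.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return connect()
        except Exception as exc:  # psycopg2.OperationalError in practice
            last_error = exc
            time.sleep(delay)
    raise last_error
```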
The bulk of the client-side logic is in libpq's PQconnectPoll, and the rest is handled by the client library (psycopg2, which uses libpq).
https://docs.djangoproject.com/en/2.1/ref/databases/#connection-management (in Django, which MAAS uses, unrecoverable errors seem to affect only a single request).