Commit At Most Once v5
The objective of the Commit At Most Once (CAMO) feature is to prevent the application from committing more than once.
Without CAMO, when a client loses connection after a COMMIT is submitted, the application might not receive a reply from the server and is therefore unsure whether the transaction committed.
The application can't easily decide between the two options of:
- Retrying the transaction with the same data, since this can in some cases cause the data to be entered twice
- Not retrying the transaction and risking that the data doesn't get processed at all
Either of those is a critical error with high-value data.
One way to avoid this situation is to make sure that the transaction includes at least one INSERT into a table with a unique index, but that depends on the application design and requires application-specific error-handling logic, so it isn't effective in all cases.
The CAMO feature in PGD offers a more general solution and doesn't require an INSERT. When activated by bdr.commit_scope, the application receives a message containing the transaction identifier, if already assigned. Otherwise, the first write statement in a transaction sends that information to the client.
If the application sends an explicit COMMIT, the protocol ensures that the application receives the notification of the transaction identifier before the COMMIT is sent. If the server doesn't reply to the COMMIT, the application can handle this error by using the transaction identifier to request the final status of the transaction from another PGD node. If the prior transaction status is known, then the application can safely decide whether to retry the transaction.
CAMO works by creating a pair of partner nodes that are two PGD nodes from the same PGD group. In this operation mode, each node in the pair knows the outcome of any recent transaction executed on the other peer and especially (for our need) knows the outcome of any transaction disconnected during COMMIT. The node that receives the transactions from the application might be referred to as "origin" and the node that confirms these transactions as "partner." However, there's no difference in the CAMO configuration for the nodes in the CAMO pair. The pair is symmetric.
Warning
CAMO requires changes to the user's application to take advantage of the advanced error handling. Enabling a parameter isn't enough to gain protection. Reference client implementations are provided to customers on request.
Requirements
To use CAMO, an application must issue an explicit COMMIT message as a separate request (not as part of a multi-statement request). CAMO can't provide status for transactions issued from procedures or from single-statement transactions that use implicit commits.
Configuration
Configuration of CAMO happens through Commit Scopes.
Failure scenarios
Different failure scenarios occur in different configurations.
Data persistence at receiver side
By default, a PGL writer operates in bdr.synchronous_commit = off mode when applying transactions from remote nodes. This holds true for CAMO as well, meaning that transactions are confirmed to the origin node possibly before reaching the disk of the CAMO partner. In case of a crash or hardware failure, it is possible for a confirmed transaction to be unrecoverable on the CAMO partner by itself. This isn't an issue as long as the CAMO origin node remains operational, as it redistributes the transaction once the CAMO partner node recovers.
This in turn means CAMO can protect against a single-node failure. This holds for local mode as well as in combination with remote write.
To cover an outage of both nodes of a CAMO pair, you can use bdr.synchronous_commit = local to enforce a flush prior to the pre-commit confirmation. This doesn't work with either remote write or local mode and has a performance impact due to I/O requirements on the CAMO partner in the latency-sensitive commit path.
Asynchronous mode
When the DEGRADE ON ... TO ASYNC clause is used in the commit scope, a node detects whether its CAMO partner is ready. If not, it temporarily switches to asynchronous (local) mode. When in this mode, a node commits transactions locally until switching back to CAMO mode.
This doesn't allow COMMIT status to be retrieved, but it does let you choose availability over consistency. This mode can tolerate a single-node failure. In case both nodes of a CAMO pair fail, they might choose incongruent commit decisions to maintain availability, leading to data inconsistencies.
For a CAMO partner to switch to ready, it needs to be connected, and the estimated catchup interval needs to drop below the timeout value of TO ASYNC. The current readiness status of a CAMO partner can be checked with bdr.is_camo_partner_ready, while bdr.node_replication_rates provides the current estimate of the catchup time.
The switch from CAMO-protected to asynchronous mode is only ever triggered by an actual CAMO transaction, either because the commit exceeds the timeout value of TO ASYNC or because the CAMO partner is already known to be disconnected at the time of commit. This switch is independent of the estimated catchup interval. The CAMO pair can be configured to require the current node to be the write lead of a group, as configured through the enable_proxy_routing node group option. See Commit Scopes syntax. This can prevent a split-brain situation by keeping an isolated node from switching to asynchronous mode. If enable_proxy_routing isn't set for the CAMO group, the origin node switches to asynchronous mode immediately.
The switch from asynchronous mode to CAMO mode depends on the CAMO partner node, which initiates the connection. The CAMO partner tries to reconnect at least every 30 seconds. After connectivity is reestablished, it might therefore take up to 30 seconds until the CAMO partner connects back to its origin node. Any lag that accumulated on the CAMO partner further delays the switch back to CAMO protected mode.
Unlike during normal CAMO operation, in asynchronous mode there's no
additional commit overhead. This can be problematic, as it allows the
node to continuously process more transactions than the CAMO
pair can normally process. Even if the CAMO partner eventually
reconnects and applies transactions, its lag only ever increases
in such a situation, preventing reestablishing the CAMO protection.
To artificially throttle transactional throughput, PGD provides the bdr.camo_local_mode_delay setting, which allows you to delay a COMMIT in local mode by an arbitrary amount of time. We recommend measuring commit times in normal CAMO mode during expected workloads and configuring this delay accordingly. The default is 5 ms, which reflects an asynchronous network and a relatively quick CAMO partner response.
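One way to act on this recommendation is to derive the delay from commit latencies sampled while the pair runs in normal CAMO mode. The following is a minimal sketch, not a PGD tool: the function name suggest_local_mode_delay_ms, the sample values, and the choice of the median as the summary statistic are all assumptions made for illustration.

```python
from statistics import median


def suggest_local_mode_delay_ms(commit_latencies_ms):
    """Suggest a value for bdr.camo_local_mode_delay from measured commit times.

    commit_latencies_ms: commit durations in milliseconds, sampled in normal
    CAMO mode under the expected workload. Using the median is an assumption
    of this sketch, not a PGD recommendation.
    """
    if not commit_latencies_ms:
        raise ValueError("at least one measured commit latency is required")
    return median(commit_latencies_ms)


# Hypothetical measurements, in milliseconds.
samples = [4.8, 5.1, 6.0, 5.5, 7.2]
suggested = suggest_local_mode_delay_ms(samples)
```

A higher percentile might be a better fit if the goal is to make sure that throughput in local mode never exceeds what the CAMO pair can sustain.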
Consider the choice of whether to allow asynchronous mode in view of the architecture and the availability requirements. The following examples provide some detail.
Example
This example considers a setup with two PGD nodes that are the CAMO partner of each other.
For this CAMO commit scope to be legal, the number of nodes in the group must equal exactly 2. Using ALL or ANY 2 on a group consisting of several nodes is an error because the unquantified group expression does not resolve to a definite pair of nodes.
With asynchronous mode
If asynchronous mode is allowed, there's no single point of failure. When one node fails:
- The other node can determine the status of all transactions that were disconnected during COMMIT on the failed node.
- New write transactions are allowed:
  - If the second node also fails, then the outcome of those transactions that were being committed at that time is unknown.
Without asynchronous mode
If asynchronous mode isn't allowed, then each node requires the other node for committing transactions, that is, each node is a single point of failure. When one node fails:
- The other node can determine the status of all transactions that were disconnected during COMMIT on the failed node.
- New write transactions are prevented until the node recovers.
Application use
Overview and requirements
CAMO relies on a retry loop and specific error handling on the client side. There are three aspects to it:
- The result of a transaction's COMMIT needs to be checked and, in case of a temporary error, the client must retry the transaction.
- Prior to COMMIT, the client must retrieve a global identifier for the transaction, consisting of a node id and a transaction id (both 32-bit integers).
- If the current server fails while attempting a COMMIT of a transaction, the application must connect to its CAMO partner, retrieve the status of that transaction, and retry depending on the response.
The application must store the global transaction identifier only for the purpose of verifying the transaction status in case of disconnection during COMMIT. In particular, the application doesn't need an additional persistence layer. If the application fails, it needs only the information in the database to restart.
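The global identifier described above only has to live in client memory until the COMMIT is resolved. A minimal sketch of how a client might hold it follows. The class name GlobalTxnId is hypothetical, and treating both values as unsigned 32-bit integers is an assumption of this sketch.

```python
from dataclasses import dataclass

UINT32_MAX = 2**32 - 1  # assumption: both ids are unsigned 32-bit values


@dataclass(frozen=True)
class GlobalTxnId:
    """Global transaction identifier kept in memory until COMMIT is resolved."""

    node_id: int  # id of the origin node
    xid: int      # transaction id on the origin node

    def __post_init__(self):
        # Reject values that can't be a 32-bit identifier.
        for name in ("node_id", "xid"):
            value = getattr(self, name)
            if not 0 <= value <= UINT32_MAX:
                raise ValueError(f"{name} must fit in 32 bits, got {value}")
```

Because the identifier is needed only to verify the status after a disconnection during COMMIT, the sketch deliberately has no persistence logic.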
CAMO partner connection status
The function bdr.is_camo_partner_connected allows checking the connection status of a CAMO partner node configured in pair mode. There currently is no equivalent for CAMO used with Eager Replication.
Synopsis
Return value
A Boolean value indicating whether the CAMO partner is currently connected to a WAL sender process on the local node and therefore can receive transactional data and send back confirmations.
CAMO partner readiness
The function bdr.is_camo_partner_ready allows checking the readiness status of a CAMO partner node configured in pair mode. Underneath, this triggers the switch to and from local mode.
Synopsis
Return value
A Boolean value indicating whether the CAMO partner can reasonably be expected to confirm transactions originating from the local node in a timely manner (before the timeout for TO ASYNC expires).
Note
This function queries the past or current state. A positive return value doesn't indicate whether the CAMO partner can confirm future transactions.
Fetch the CAMO partner
This function shows the local node's CAMO partner (configured by pair mode).
Wait for consumption of the apply queue from the CAMO partner
The function bdr.wait_for_camo_partner_queue is a wrapper of bdr.wait_for_apply_queue that defaults to querying the CAMO partner node. It yields an error if the local node isn't part of a CAMO pair.
Synopsis
Transaction status between CAMO nodes
This function enables a wait for CAMO transactions to be fully resolved.
Transaction status query function
To check the status of a transaction that was being committed when the node failed, the application must use this function:
With CAMO used in pair mode, use this function only on a node that's part of a CAMO pair. Along with Eager replication, you can use it on all nodes.
In both cases, you must call the function within 15 minutes after the commit was issued. The CAMO partner must regularly purge such meta-information and therefore can't provide correct answers for older transactions.
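A client can guard against late status queries by remembering when it issued the COMMIT. This is a minimal sketch of such a guard. The names STATUS_RETENTION and status_query_allowed are hypothetical, and the check uses the client's clock, which is an assumption that ignores clock differences between client and server.

```python
from datetime import datetime, timedelta, timezone

# The 15-minute limit comes from the text above.
STATUS_RETENTION = timedelta(minutes=15)


def status_query_allowed(commit_issued_at, now=None):
    """Return True while the status query is still within the 15-minute window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - commit_issued_at < STATUS_RETENTION
```

If the guard returns False, the client can no longer rely on the answer and has to escalate instead of querying.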
Before querying the status of a transaction, this function waits for the receive queue to be consumed and fully applied. This prevents early negative answers for transactions that were received but not yet applied.
Despite its name, it's not always a read-only operation. If the status is unknown, the CAMO partner decides whether to commit or abort the transaction, storing that decision locally to ensure consistency going forward.
The client must not call this function before attempting to commit on the origin. Otherwise the transaction might be forced to roll back.
Synopsis
Parameters
- node_id: The node id of the PGD node the transaction originates from, usually retrieved by the client before COMMIT from the PQ parameter bdr.local_node_id.
- xid: The transaction id on the origin node, usually retrieved by the client before COMMIT from the PQ parameter transaction_id.
- require_camo_partner: Defaults to true and enables configuration checks. Set to false to disable these checks and query the status of a transaction that was not a CAMO transaction.
Return value
The function returns one of these results:
- 'committed'::TEXT: The transaction was committed, is visible on both nodes of the CAMO pair, and will eventually be replicated to all other PGD nodes. No need for the client to retry it.
- 'aborted'::TEXT: The transaction was aborted and will not be replicated to any other PGD node. The client needs to either retry it or escalate the failure to commit the transaction.
- 'in progress'::TEXT: The transaction is still in progress on this local node and wasn't committed or aborted yet. The transaction might be in the COMMIT phase, waiting for the CAMO partner to confirm or deny the commit. The recommended client reaction is to disconnect from the origin node and reconnect to the CAMO partner to query that instead. With a load balancer or proxy in between, where the client lacks control over which node gets queried, the client can only poll repeatedly until the status switches to either 'committed' or 'aborted'.
  For Eager All-Node Replication, peer nodes yield this result for transactions that aren't yet committed or aborted. This means that even transactions not yet replicated (or not even started on the origin node) might yield an in progress result on a peer PGD node in this case. However, the client must not query the transaction status prior to attempting to commit on the origin.
- 'unknown'::TEXT: The transaction specified is unknown, either because it's in the future, not replicated to that specific node yet, or too far in the past. The status of such a transaction is not yet or no longer known. This return value is a sign of improper use by the client.
The client must be prepared to retry the function call on error.
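The four return values map naturally onto a small set of client reactions. The following sketch shows one possible mapping. The Action enum and the function action_for_status are hypothetical names, and treating 'unknown' as a case to escalate is this sketch's reading of "a sign of improper use by the client."

```python
from enum import Enum


class Action(Enum):
    DONE = "done"              # committed: nothing left to do
    RETRY = "retry"            # aborted: safe to run the transaction again
    POLL_AGAIN = "poll_again"  # in progress: query again, ideally on the partner
    ESCALATE = "escalate"      # unknown: client-side misuse, don't guess


def action_for_status(status):
    """Map a transaction status string to the client reaction described above."""
    actions = {
        "committed": Action.DONE,
        "aborted": Action.RETRY,
        "in progress": Action.POLL_AGAIN,
        "unknown": Action.ESCALATE,
    }
    try:
        return actions[status]
    except KeyError:
        raise ValueError(f"unexpected transaction status: {status!r}") from None
```

Keeping the mapping in one place makes it easier to test the client's error handling separately from its connection logic.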
Connection pools and proxies
The effect of connection pools and proxies needs to be considered when designing a CAMO cluster. A proxy may freely distribute transactions to all nodes in the commit group (i.e. to both nodes of a CAMO pair or to all PGD nodes in case of Eager All Node Replication).
Care needs to be taken to ensure that the application fetches the proper node id: when using session pooling, the client remains connected to the same node, so the node id remains constant for the lifetime of the client session. However, with finer-grained transaction pooling, the client needs to fetch the node id for every transaction (as in the example given below).
A client that is not directly connected to the PGD nodes might not even notice a failover or switchover, but can always use the bdr.local_node_id parameter to determine which node it is currently connected to. In the crucial situation of a disconnect during COMMIT, the proxy must properly forward that disconnect as an error to the client applying the CAMO protocol.
For CAMO in received mode, a proxy that potentially switches between the CAMO pairs must use the bdr.wait_for_camo_partner_queue function to prevent stale reads.
Example
The following example demonstrates what a retry loop of a CAMO-aware client application should look like in C-like pseudo-code. It expects two DSNs, origin_dsn and partner_dsn, providing connection information. These usually are the same DSNs as used for the initial call to bdr.create_node, and can be looked up in bdr.node_summary, column interface_connstr.
This example needs to be extended with proper logic for connecting, including retries and error handling. If using a load balancer (e.g. PgBouncer), re-connecting can be implemented by simply using PQreset. Ensure that the load balancer only ever redirects a client to a CAMO partner and not any other PGD node.
In practice, an upper limit of retries is recommended. Depending on the actions performed in the transaction, other temporary errors may be possible and need to be handled by retrying the transaction depending on the error code, similarly to the best practices on deadlocks or on serialization failures while in SERIALIZABLE isolation mode.
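A bounded retry loop of this kind can be sketched independently of any database driver. In the sketch below, run_transaction and query_status are hypothetical callables supplied by the application, CommitDisconnected is a hypothetical exception carrying the global transaction identifier, and retrying on the former partner after an abort is an assumption of this sketch. Handling of other temporary errors is left out.

```python
class CommitDisconnected(Exception):
    """Raised by run_transaction when the connection drops during COMMIT."""

    def __init__(self, node_id, xid):
        super().__init__(f"disconnected during COMMIT of {node_id}/{xid}")
        self.node_id = node_id
        self.xid = xid


def run_with_camo(run_transaction, query_status, origin_dsn, partner_dsn,
                  max_attempts=5, max_polls=10):
    """Run a transaction at most once; return True once it's known committed.

    run_transaction(dsn) executes and commits the transaction, raising
    CommitDisconnected if the connection is lost during COMMIT.
    query_status(dsn, node_id, xid) returns the transaction status string.
    Both callables are hypothetical and supplied by the caller.
    """
    for attempt in range(max_attempts):
        try:
            run_transaction(origin_dsn)
            return True
        except CommitDisconnected as failure:
            # Ask the CAMO partner what happened to the interrupted commit.
            for poll in range(max_polls):
                status = query_status(partner_dsn, failure.node_id, failure.xid)
                if status != "in progress":
                    break
            else:
                raise RuntimeError("transaction status stayed 'in progress'")
            if status == "committed":
                return True
            if status != "aborted":
                raise RuntimeError(f"unexpected transaction status: {status}")
            # Aborted: retry on the former partner (assumption of this sketch).
            origin_dsn, partner_dsn = partner_dsn, origin_dsn
    return False
```

A real client would add a pause between polls and between attempts. The sketch returns False when the retry limit is reached so that the caller can escalate.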
Interaction with DDL and global locks
Transactions protected by CAMO can contain DDL operations. However, DDL uses global locks, which already provide some synchronization among nodes. See DDL locking details for more information.
Combining CAMO with DDL imposes a higher latency and also increases the chance of global deadlocks. We therefore recommend using a relatively low bdr.global_lock_timeout, which aborts the DDL and therefore resolves a deadlock in a reasonable amount of time.
Nontransactional DDL
The following DDL operations aren't allowed in a transaction block and therefore can't benefit from CAMO protection. For these, CAMO is automatically disabled internally:
- all concurrent index operations (CREATE, DROP, and REINDEX)
- REINDEX DATABASE, REINDEX SCHEMA, and REINDEX SYSTEM
- VACUUM
- CLUSTER without any parameter
- ALTER TABLE DETACH PARTITION CONCURRENTLY
- ALTER TYPE [enum] ADD VALUE
- ALTER SYSTEM
- CREATE and DROP DATABASE
- CREATE and DROP TABLESPACE
- ALTER DATABASE [db] TABLESPACE
CAMO limitations
CAMO is designed to query the results of a recently failed COMMIT on the origin node, so in case of disconnection, code the application to immediately request the transaction status from the CAMO partner. Have as little delay as possible after the failure before requesting the status. Applications must not rely on CAMO decisions being stored for longer than 15 minutes.
If the application forgets the global identifier assigned, for example as a result of a restart, there's no easy way to recover it. Therefore, we recommend that applications wait for outstanding transactions to end before shutting down.
For the client to apply proper checks, a transaction protected by CAMO can't be a single statement with implicit transaction control. You also can't use CAMO with a transaction-controlling procedure or in a DO block that tries to start or end transactions.
CAMO resolves commit status but doesn't yet resolve pending notifications on commit. CAMO and Eager replication options don't allow the NOTIFY SQL command or the pg_notify() function. They also don't allow LISTEN or UNLISTEN.
When replaying changes, CAMO transactions may detect conflicts just the same as other transactions. If timestamp conflict detection is used, the CAMO transaction uses the timestamp of the prepare on the origin node, which is before the transaction becomes visible on the origin node itself.
CAMO is not currently compatible with transaction streaming. Make sure to disable transaction streaming when planning to use CAMO. This can be configured globally or in the PGD node group. See Transaction Streaming Configuration.
Performance implications
CAMO extends the Postgres replication protocol by adding a message roundtrip at commit. Applications have a higher commit latency than with asynchronous replication, mostly determined by the roundtrip time between involved nodes. Increasing the number of concurrent sessions can help to increase parallelism to obtain reasonable transaction throughput.
The CAMO partner confirming transactions must store transaction states. Compared to non-CAMO operation, this might require an additional seek for each transaction applied from the origin.
Client application testing
Proper use of CAMO on the client side isn't trivial. We strongly recommend testing the application behavior with the PGD cluster against failure scenarios such as node crashes or network outages.