Limitations
Take these EDB Postgres Distributed (PGD) design limitations into account when planning your deployment.
Nodes
PGD can run hundreds of nodes, assuming adequate hardware and network. However, for mesh-based deployments, we generally don’t recommend running more than 48 nodes in one cluster. If you need extra read scalability beyond the 48-node limit, you can add subscriber-only nodes without adding connections to the mesh network.
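For illustration, here's a minimal sketch of adding a subscriber-only node, assuming a PGD 5 cluster whose top-level group is named `topgroup`; all node, group, and DSN values are illustrative:

```sql
-- On an existing data node: create a subscriber-only subgroup.
SELECT bdr.create_node_group(
    node_group_name   := 'so_group',
    parent_group_name := 'topgroup',
    join_node_group   := false,
    node_group_type   := 'subscriber-only'
);

-- On the new node: register it, then join it to the subscriber-only group.
-- It receives changes but adds no connections to the mesh.
SELECT bdr.create_node(
    node_name := 'so_node_1',
    local_dsn := 'host=so1 dbname=pgddb'
);
SELECT bdr.join_node_group(
    join_target_dsn := 'host=data1 dbname=pgddb',
    node_group_name := 'so_group'
);
```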
PGD currently has a hard limit of no more than 1000 active nodes, because this is the current maximum number of Raft connections allowed.
The minimum recommended number of nodes in a group is three to provide fault tolerance for PGD's consensus mechanism. With just two nodes, consensus would fail if one of the nodes were unresponsive. Consensus is required for some PGD operations, such as distributed sequence generation. For more information about the consensus mechanism used by EDB Postgres Distributed, see Architectural details.
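Because consensus underpins these operations, it's worth checking Raft health when sizing or troubleshooting a group. A minimal sketch, assuming the PGD monitoring function `bdr.monitor_group_raft()`; the exact result columns depend on your PGD version:

```sql
-- Check Raft consensus health for the node's groups.
-- Returns a status with an explanatory message.
SELECT * FROM bdr.monitor_group_raft();
```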
Multiple databases on single instances
Support for using PGD for multiple databases on the same Postgres instance is deprecated beginning with PGD 5 and will no longer be supported with PGD 6. As the capabilities of the product expand, the operational and functional complexity that a multi-database design introduces is no longer viable.
As a best practice, we recommend that you configure only one database per PGD instance.
The TPA deployment automation and tooling such as the CLI and PGD Proxy already codify that recommendation.
While it's still possible to host up to 10 databases in a single instance, doing so incurs many immediate risks and current limitations:
- If PGD configuration changes are needed, you must execute administrative commands for each database. Doing so increases the risk of potential inconsistencies and errors.
- You must monitor each database separately, adding overhead.
- TPAexec assumes one database. Additional coding is needed by customers or by the EDB Professional Services team in a post-deploy hook to set up replication for more databases.
- PGD Proxy works at the Postgres instance level, not at the database level, meaning the leader node is the same for all databases.
- Each additional database increases the resource requirements on the server. Each one needs its own set of worker processes maintaining replication, for example, logical workers, WAL senders, and WAL receivers. Each one also needs its own set of connections to other instances in the replication cluster. These needs might severely impact the performance of all databases.
- When rebuilding or adding a node, you can use the physical initialization method (`bdr_init_physical`) for only one database on a node. You must initialize all other databases by logical replication, which can be problematic for large databases because of the time it can take.
- Synchronous replication methods, for example, CAMO and Group Commit, won't work as expected. Because the Postgres WAL is shared between the databases, a synchronous commit confirmation can come from any database, not necessarily in the right order of commits.
- CLI and OTEL integration (new with v5) assumes one database.
CAMO
Commit At Most Once (CAMO) is a feature that aims to prevent applications from committing more than once. If you use this feature, take these limitations into account when planning:
- CAMO is designed to query the results of a recently failed COMMIT on the origin node. So, in case of disconnection, code the application to immediately request the transaction status from the CAMO partner (a partner-side query sketch follows this list). Have as little delay as possible after the failure before requesting the status. Applications must not rely on CAMO decisions being stored for longer than 15 minutes.
- If the application forgets the global identifier assigned, for example, as a result of a restart, there's no easy way to recover it. Therefore, we recommend that applications wait for outstanding transactions to end before shutting down.
- For the client to apply proper checks, a transaction protected by CAMO can't be a single statement with implicit transaction control. You also can't use CAMO with a transaction-controlling procedure or in a `DO` block that tries to start or end transactions.
- CAMO resolves commit status but doesn't resolve pending notifications on commit. CAMO and Eager Replication options don't allow the `NOTIFY` SQL command or the `pg_notify()` function. They also don't allow `LISTEN` or `UNLISTEN`.
- When replaying changes, CAMO transactions might detect conflicts just the same as other transactions. If timestamp-conflict detection is used, the CAMO transaction uses the timestamp of the prepare on the origin node, which is before the transaction becomes visible on the origin node itself.
- CAMO isn't currently compatible with transaction streaming. Be sure to disable transaction streaming when planning to use CAMO (a configuration sketch also follows this list). You can configure this option globally or in the PGD node group. See Transaction streaming configuration.
- Not all DDL can run when you use CAMO. If you use unsupported DDL, a warning is logged, and the transaction's commit scope is set to local only. The only supported DDL operations are:
  - Nonconcurrent `CREATE INDEX`
  - Nonconcurrent `DROP INDEX`
  - Nonconcurrent `REINDEX` of an individual table or index
  - `CLUSTER` (of a single relation or index only)
  - `ANALYZE`
  - `TRUNCATE`
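To illustrate the first point in this list, here's a hedged sketch of the partner-side status check. It assumes the application recorded the origin's node ID and transaction ID before the connection failed and that `bdr.logical_transaction_status()` is available on the partner; the literal values are placeholders:

```sql
-- Run on the CAMO partner as soon as the origin connection fails.
-- 1234567890 and 123456 stand in for the node ID and xid the application
-- captured on the origin before issuing COMMIT.
SELECT bdr.logical_transaction_status(
    1234567890,     -- origin node ID (placeholder)
    '123456'::xid   -- in-doubt transaction ID (placeholder)
);
-- Request the result promptly; per the limitation above, don't rely on
-- the decision being stored for longer than 15 minutes.
```

For the transaction streaming point, a configuration sketch, assuming the `streaming_mode` node group option and an illustrative group name:

```sql
-- Disable transaction streaming for the whole node group before using CAMO.
SELECT bdr.alter_node_group_option('topgroup', 'streaming_mode', 'off');
```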
Group Commit
Group Commit enables configurable synchronous commits over nodes in a group. If you use this feature, take the following limitations into account:
- Not all DDL can run when you use Group Commit. If you use unsupported DDL, a warning is logged, and the transaction's commit scope is set to local. The only supported DDL operations are:
  - Nonconcurrent `CREATE INDEX`
  - Nonconcurrent `DROP INDEX`
  - Nonconcurrent `REINDEX` of an individual table or index
  - `CLUSTER` (of a single relation or index only)
  - `ANALYZE`
  - `TRUNCATE`
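As a usage sketch only: a Group Commit rule is attached to a commit scope with `bdr.add_commit_scope()`. The scope name, group name, and rule text here are illustrative assumptions, not values from this page:

```sql
-- Define a Group Commit scope on an existing node group.
SELECT bdr.add_commit_scope(
    commit_scope_name := 'example_group_commit',
    origin_node_group := 'topgroup',
    rule              := 'ANY 2 (topgroup) GROUP COMMIT',
    wait_for_ready    := true
);

-- Opt a single transaction into the scope. Only the DDL listed above is
-- supported; anything else falls back to a local commit scope with a warning.
BEGIN;
SET LOCAL bdr.commit_scope = 'example_group_commit';
-- ... DML, or supported DDL such as a nonconcurrent CREATE INDEX ...
COMMIT;
```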
Eager
Eager is available in Group Commit. It avoids conflicts by eagerly aborting transactions that might clash. It's subject to the same limitations as Group Commit.
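Building on the previous sketch, an Eager scope can be expressed as a Group Commit rule over all nodes with eager conflict resolution. The rule text follows PGD 5's commit scope syntax as we understand it, and all names are assumptions:

```sql
-- Eager variant: involve all nodes in the group and abort transactions
-- that would conflict rather than resolving them later.
SELECT bdr.add_commit_scope(
    commit_scope_name := 'eager_scope',
    origin_node_group := 'topgroup',
    rule              := 'ALL (topgroup) GROUP COMMIT (conflict_resolution = eager, commit_decision = raft) ABORT ON (timeout = 60s)',
    wait_for_ready    := true
);
```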
Other limitations
This noncomprehensive list includes other limitations that are expected and are by design. We don't expect to resolve them in the future. Consider these limitations when planning your deployment:
- Replacing a node with its physical standby doesn't work for nodes that use CAMO/Eager/Group Commit. We don't recommend combining physical standbys and EDB Postgres Distributed, even if it's possible.
- A `galloc` sequence might skip some chunks if you create the sequence in a rolled back transaction and then create it again with the same name. Skipping chunks can also occur if you create and drop the sequence when DDL replication isn't active and then you create it again when DDL replication is active. The impact of the problem is mild because the sequence guarantees aren't violated. The sequence skips only some initial chunks. Also, as a workaround, you can specify the starting value for the sequence as an argument to the `bdr.alter_sequence_set_kind()` function (see the sketch after this list).
- Legacy synchronous replication uses a mechanism for transaction confirmation different from the one used by CAMO, Eager, and Group Commit. The two aren't compatible, so don't use them together.
- Postgres two-phase commit (2PC) transactions (that is, `PREPARE TRANSACTION`) can't be used with CAMO, Group Commit, or Eager Replication because those features use two-phase commit underneath.
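For the `galloc` sequence workaround mentioned in the list, a minimal sketch; the sequence name and start value are illustrative, and it assumes `bdr.alter_sequence_set_kind()` accepts the starting value as an extra argument, as the note above describes:

```sql
-- Convert a recreated sequence to galloc with an explicit starting value,
-- so any skipped initial chunks don't matter (placeholder name and value).
CREATE SEQUENCE public.order_id_seq;
SELECT bdr.alter_sequence_set_kind('public.order_id_seq'::regclass,
                                   'galloc', 100000);
```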