Logical Replication
Logical replication is a Postgres feature that provides a convenient way for moving data between databases, particularly Postgres clusters that are in an active state.
You can set up your PGO managed Postgres clusters to use logical replication. This guide provides an example for how to do so.
Set Up Logical Replication
This example creates two separate Postgres clusters named hippo
and rhino
. We will logically replicate data from rhino
to hippo
. We can create these two Postgres clusters using the manifests below:
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-0
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: rhino
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-0
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
users:
- name: logic
databases:
- zoo
options: "REPLICATION"
The key difference between the two Postgres clusters is this section in the rhino
manifest:
users:
- name: logic
databases:
- zoo
options: "REPLICATION"
This creates a database called zoo
and a user named logic
with REPLICATION
privileges. This will allow for replicating data logically to the hippo
Postgres cluster.
Create these two Postgres clusters. When the rhino
cluster is ready, log into the zoo
database. For convenience, you can use the kubectl exec
method of logging in:
kubectl exec -it -n postgres-operator -c database \
$(kubectl get pods -n postgres-operator --selector='postgres-operator.crunchydata.com/cluster=rhino,postgres-operator.crunchydata.com/role=master' -o name) -- psql zoo
Let’s create a simple table called abc
that contains just integer data. We will also populate this table:
CREATE TABLE abc (id int PRIMARY KEY);
INSERT INTO abc SELECT * FROM generate_series(1,10);
We need to grant SELECT
privileges to the logic
user in order for it to perform an initial data synchronization during logical replication. You can do so with the following command:
GRANT SELECT ON abc TO logic;
Finally, create a publication that allows for the replication of data from abc
:
CREATE PUBLICATION zoo FOR ALL TABLES;
Quit out of the rhino
Postgres cluster.
For the next step, you will need to get the connection information for how to connection as the logic
user to the rhino
Postgres database. You can get the key information from the following commands, which return the hostname, username, and password:
kubectl -n postgres-operator get secrets rhino-pguser-logic -o jsonpath={.data.host} | base64 -d
kubectl -n postgres-operator get secrets rhino-pguser-logic -o jsonpath={.data.user} | base64 -d
kubectl -n postgres-operator get secrets rhino-pguser-logic -o jsonpath={.data.password} | base64 -d
The host will be something like rhino-primary.postgres-operator.svc
and the user will be logic
. Further down, the guide references the password as <LOGIC-PASSWORD>
. You can substitute the actual password there.
Log into the hippo
Postgres cluster. Note that we are logging into the postgres
database within the hippo
cluster:
kubectl exec -it -n postgres-operator -c database \
$(kubectl get pods -n postgres-operator --selector='postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master' -o name) -- psql
Create a table called abc
that is identical to the table in the rhino
database:
CREATE TABLE abc (id int PRIMARY KEY);
Finally, create a subscription that will manage the data replication from rhino
into hippo
:
CREATE SUBSCRIPTION zoo
CONNECTION 'host=rhino-primary.postgres-operator.svc user=logic dbname=zoo password=<LOGIC-PASSWORD>'
PUBLICATION zoo;
In a few moments, you should see the data replicated into your table:
TABLE abc;
which yields:
id
----
1
2
3
4
5
6
7
8
9
10
(10 rows)
You can further test that logical replication is working by modifying the data on rhino
in the abc
table, and the verifying that it is replicated into hippo
.