Logical Replication
Logical replication is a Postgres feature that provides a convenient way for moving data between databases, particularly Postgres clusters that are in an active state. To apply logical replication, we'll first enable the feature in our cluster, then we'll create a publication in one cluster and a subscription to that publication in another cluster. With this pub-sub relationship established, we'll observe data created in one cluster flowing into another.
Before getting started, you may want to create the postgres-operator
namespace if you haven't already, kubectl create ns postgres-operator
. Just as we did in the Quickstart and Tutorials, we're going to create a Postgres cluster named hippo
. You may want to delete the existing hippo
cluster, if you have one left over. Finally, you'll need a running installation of Crunchy Postgres for Kubernetes.
Enable Logical Replication
This example creates two separate Postgres clusters named hippo
and rhino
. We will logically replicate data from rhino
to hippo
. We can create these two Postgres clusters by creating a file called replication-example.yaml
and pasting in the manifests below:
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo
namespace: postgres-operator
spec:
postgresVersion: 15
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: rhino
namespace: postgres-operator
spec:
postgresVersion: 15
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
users:
- name: logic
databases:
- zoo
options: "REPLICATION"
The key difference between the two Postgres clusters is this section in the rhino
manifest:
users:
- name: logic
databases:
- zoo
options: "REPLICATION"
This creates a database called zoo
and a user named logic
with REPLICATION
privileges. This will allow for replicating data logically to the hippo
Postgres cluster.
Create these two Postgres clusters with the command kubectl apply -f replication-example.yaml
.
Create a Publication
For convenience, you can use the kubectl exec
method to log into the zoo database in rhino
:
kubectl exec -it -n postgres-operator -c database $(kubectl get pods -n postgres-operator --selector='postgres-operator.crunchydata.com/cluster=rhino,postgres-operator.crunchydata.com/role=master' -o name) -- psql zoo
Let's create a simple table called abc
that contains just integer data. We will also populate this table:
CREATE TABLE abc (id int PRIMARY KEY);
INSERT INTO abc SELECT * FROM generate_series(1,10);
We need to grant SELECT
privileges to the logic
user in order for it to perform an initial data synchronization during logical replication. You can do so with the following command:
GRANT SELECT ON abc TO logic;
Finally, create a publication that allows for the replication of data from abc
:
CREATE PUBLICATION zoo FOR ALL TABLES;
Quit out of the rhino
Postgres cluster with \q
.
Create a Subscription
For the next step, you will need to get the connection information for how to connection as the logic
user to the rhino
Postgres database. You can get the key information from the following commands, which return the hostname, username, and password:
kubectl -n postgres-operator get secrets rhino-pguser-logic -o go-template='{{.data.host | base64decode}}'
kubectl -n postgres-operator get secrets rhino-pguser-logic -o go-template='{{.data.user | base64decode }}'
kubectl -n postgres-operator get secrets rhino-pguser-logic -o go-template='{{.data.password | base64decode }}'
The host will be something like rhino-primary.postgres-operator.svc
and the user will be logic
. Further down, the guide references the password as $LOGIC_PASSWORD
. You can substitute the actual password there.
Log into the hippo
Postgres cluster. Note that we are logging into the postgres
database within the hippo
cluster:
kubectl exec -it -n postgres-operator -c database $(kubectl get pods -n postgres-operator --selector='postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master' -o name) -- psql
Create a table called abc
that is identical to the table in the rhino
database:
CREATE TABLE abc (id int PRIMARY KEY);
Finally, create a subscription that will manage the data replication from rhino
into hippo
:
CREATE SUBSCRIPTION zoo
CONNECTION 'host=rhino-primary.postgres-operator.svc user=logic dbname=zoo password=$LOGIC_PASSWORD'
PUBLICATION zoo;
In a few moments, you should see the data replicated into your table:
TABLE abc;
which yields:
id
----
1
2
3
4
5
6
7
8
9
10
(10 rows)
You can further test that logical replication is working by modifying the data on rhino
in the abc
table, and the verifying that it is replicated into hippo
.