# Create a Postgres Cluster
If you came here through the quickstart, you may have already created a cluster. In that case, feel free to skip ahead, or read onward for a more in-depth look at cluster creation!
## Create a PostgreSQL Cluster
Creating a cluster is simple with the `pgo create cluster` command:

```
pgo create cluster hippo
```
with output similar to:

```
created cluster: hippo
workflow id: 25c870a0-5d27-42c2-be00-92f0ba8768e7
database name: hippo
users:
    username: testuser password: securerandomlygeneratedpassword
```
This creates a new PostgreSQL cluster named `hippo` with a database in it named `hippo`. This operation may take a few moments to complete. Note the name of the database user (`testuser`) and password (`securerandomlygeneratedpassword`) for when we connect to the PostgreSQL cluster.
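As a quick sanity check, you can try connecting with `psql` using those credentials. This is just a sketch: the IP address is a placeholder for your cluster's primary Service address, which the `pgo test` command shown below will report:

```
# Placeholder Service IP; substitute the address reported by "pgo test hippo"
PGPASSWORD=securerandomlygeneratedpassword psql -h 10.96.179.126 -U testuser hippo
```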
To make it easier to copy and paste statements used throughout this tutorial, you can set the password of `testuser` as part of creating the PostgreSQL cluster:

```
pgo create cluster hippo --password=securerandomlygeneratedpassword
```
You can check on the status of the cluster creation using the `pgo test` command. The `pgo test` command checks to see if the Kubernetes Services and the Pods that comprise the PostgreSQL cluster are available to receive connections. This includes:
- Testing that the Kubernetes Endpoints are available and able to route requests to healthy Pods.
- Testing that each PostgreSQL instance is available and ready to accept client connections, by performing a connectivity check similar to the one performed by `pg_isready` (a manual equivalent is sketched below).
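For reference, here is roughly what that second check looks like when performed by hand with `pg_isready`; the host is a placeholder for your own Service address:

```
# Placeholder host; use the Service address reported by "pgo test hippo"
pg_isready -h 10.96.179.126 -p 5432 -d hippo -U testuser
```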
For example, when the `hippo` cluster is ready, `pgo test hippo` will yield output similar to:
```
cluster : hippo
    Services
        primary (10.96.179.126:5432): UP
    Instances
        primary (hippo-57675d4f8f-wwx64): UP
```
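If you prefer to inspect the underlying Kubernetes objects directly, something like the following should work. This sketch assumes your cluster lives in the `pgo` Namespace and that the Operator applies a `pg-cluster` label to its Pods; adjust both for your installation:

```
# Assumes the "pgo" Namespace and a pg-cluster label; adjust for your install
kubectl -n pgo get pods --selector=pg-cluster=hippo
```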
## The Create Cluster Process
So what just happened? Let’s break down what occurs during the create cluster process.
First, the `pgo` client creates an entry in the PostgreSQL Operator's `pgcluster` custom resource definition with the attributes desired to create the cluster. In the case above, this fills in the name of the cluster (`hippo`) and leverages a lot of defaults from the PostgreSQL Operator configuration. We'll discuss more about the PostgreSQL Operator configuration later in the tutorial.

Once the custom resource is added, the PostgreSQL Operator begins provisioning the PostgreSQL instance and a pgBackRest repository, which is used to store backups. The following actions occur as part of this process:
- Creating persistent volume claims (PVCs) for the PostgreSQL instance and the pgBackRest repository.
- Creating Services that provide a stable network interface for connecting to the PostgreSQL instance and the pgBackRest repository.
- Creating Deployments that house each PostgreSQL instance and pgBackRest repository. Each of these is responsible for one Pod.
- The PostgreSQL Pod, when it is started, provisions a PostgreSQL database and performs other bootstrapping functions, such as creating `testuser`.
- The pgBackRest Pod, when it is started, initializes a pgBackRest repository. Note that the pgBackRest repository is not yet ready to take backups, but it will be after the next step!
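If you are curious, you can view the custom resource itself. This sketch assumes the cluster was created in the `pgo` Namespace and that the CRD's resource name is `pgclusters`, which may vary by Operator version:

```
# Assumes the "pgo" Namespace; prints the pgcluster custom resource
kubectl -n pgo get pgclusters hippo -o yaml
```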
When the PostgreSQL Operator detects that the PostgreSQL and pgBackRest Deployments are up and running, it creates a Kubernetes Job to create a pgBackRest stanza. This is necessary as part of initializing the pgBackRest repository to accept backups from our PostgreSQL cluster.
When the PostgreSQL Operator detects that the stanza creation is completed, it will take an initial backup of the cluster.
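You can watch these last two steps play out as Kubernetes Jobs. A sketch, assuming the `pgo` Namespace and the Job names listed later in this section:

```
# Assumes the "pgo" Namespace; Job names incorporate the cluster name
kubectl -n pgo get jobs hippo-stanza-create backrest-backup-hippo
```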
In order for a PostgreSQL cluster to be considered successfully created, all of these steps need to succeed. You can connect to the PostgreSQL cluster after step two completes, but note that for the cluster to be considered "healthy", pgBackRest needs to finish initializing.
You may ask yourself, "wait, why do I need the pgBackRest repository to be initialized for a cluster to be successfully created?" That is a good question! The reason is that pgBackRest plays a fundamental role in both the disaster recovery AND high availability systems of the PostgreSQL Operator, particularly around self-healing.
## What Is Created?
There are several Kubernetes objects that are created as part of the `pgo create cluster` command, including:
- A Deployment representing the primary PostgreSQL instance
- A PVC that persists the data of this instance
- A Service that can be used to connect to this instance
- A Deployment representing the pgBackRest repository
- A PVC that persists the data of this repository
- A Service that can be used to connect to this repository
- Secrets representing the following three user accounts:
  - `postgres`: the database superuser for the PostgreSQL cluster. This is in a Secret called `hippo-postgres-secret`.
  - `primaryuser`: the replication user. This is used for copying data between PostgreSQL instances. You should not need to log in as this user. This is in a Secret called `hippo-primaryuser-secret`.
  - `testuser`: the regular user account. This user has access to log into the `hippo` database that is created. This is the account you want to give out to your user / application. In a later section, we will see how we can change the default user that is created. This is in a Secret called `hippo-testuser-secret`, where `testuser` can be substituted for the name of the user account.
- ConfigMaps, including:
  - `hippo-pgha-config`, which allows you to customize the configuration of your PostgreSQL cluster. We will cover more about this topic in later sections.
  - `hippo-config` and `hippo-leader`, which are used by the high availability system. You should not modify these ConfigMaps.
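As an example of working with these objects, here is one way to recover the generated password from the `testuser` Secret. This sketch assumes the `pgo` Namespace and that the Secret stores the password under a `password` key, which may vary by Operator version:

```
# Assumes the "pgo" Namespace and a "password" key in the Secret
kubectl -n pgo get secret hippo-testuser-secret \
  -o jsonpath='{.data.password}' | base64 -d
```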
Each Deployment contains a single Pod. Do not scale the Deployments! Further into the tutorial, we will cover some commands that let you scale your PostgreSQL cluster.
Some Job artifacts may be left around after the cluster creation process completes, including the stanza creation Job (`hippo-stanza-create`) and the initial backup Job (`backrest-backup-hippo`). If the Jobs completed successfully, you can safely delete these objects.
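For instance, assuming the `pgo` Namespace, the completed Jobs could be cleaned up like so:

```
# Only delete these Jobs once they have completed successfully
kubectl -n pgo delete job hippo-stanza-create backrest-backup-hippo
```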
## Create a PostgreSQL Cluster With Monitoring
The PostgreSQL Operator Monitoring stack provides a convenient way to gain insights into the availability and performance of your PostgreSQL clusters. In order to collect metrics from your PostgreSQL clusters, you have to enable the `crunchy-postgres-exporter` sidecar alongside your PostgreSQL cluster. You can do this with the `--metrics` flag on `pgo create cluster`:

```
pgo create cluster hippo --metrics
```
Note that the `--metrics` flag just enables a sidecar that can be scraped. You will need to install the monitoring stack separately, or tie it into your existing monitoring infrastructure.
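To confirm the sidecar was added, you can list the containers running in the cluster's Pods and look for the exporter. This sketch assumes the `pgo` Namespace and a `pg-cluster` label on the Pods:

```
# Assumes the "pgo" Namespace and a pg-cluster label; prints container names
kubectl -n pgo get pods --selector=pg-cluster=hippo \
  -o jsonpath='{.items[*].spec.containers[*].name}'
```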
## Troubleshooting
### PostgreSQL / pgBackRest Pods Stuck in `Pending` Phase
The most common cause is PVCs not being bound. If for some reason you cannot use your Kubernetes environment's default storage class, or it is unavailable, ensure that you have configured your storage options correctly. Also ensure that you have enough persistent volumes available: your Kubernetes administrator may need to provision more.
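To see which PVCs are unbound and why, a sketch assuming the `pgo` Namespace:

```
# Assumes the "pgo" Namespace; look for PVCs stuck in "Pending"
kubectl -n pgo get pvc
# Describe a Pending PVC (substitute its actual name) to see scheduling events
kubectl -n pgo describe pvc <pvc-name>
```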
### `stanza-create` Job Never Finishes
The most common cause is the Kubernetes network blocking SSH connections between Pods. Ensure that your Kubernetes networking layer allows SSH connections over port 2022 in the Namespace that you are deploying your PostgreSQL clusters into.
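Inspecting the Job's logs can help confirm this. A sketch, assuming the `pgo` Namespace and the Job name mentioned earlier:

```
# Assumes the "pgo" Namespace; the Job name incorporates the cluster name
kubectl -n pgo logs job/hippo-stanza-create
```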
## Next Steps
Once your cluster is created, the next step is to connect to your PostgreSQL cluster. You can also learn how to customize your PostgreSQL cluster!