PostgreSQL Streaming Replication (SR) - Lessons Learned

PostgreSQL Streaming Replication (SR) - Lessons Learned

2021, May 21    

PostgreSQL is a complex database system, with many features. PostgreSQL a comprehensive documentation too on every aspect of the database system. However sometimes it may confuse you when you get real-world tasks, especially if you’re a beginner to PostgreSQL.

PostgreSQL is very flexible and powerful and requires a great deal of knowledge to administer,troubleshoot issues because there are so many moving parts which requires configuration changes. Also, there are numerous parameters which changes how PostgreSQL behaves.

Last few days, I was setting up PostgreSQL streaming replication. This blog post highlights what I have learned from that exercise.

What is Streaming Replication

PostgreSQL supports different types of replication. Stream replication is one of them. Streaming replication allows a standby server to stay up to date with its primary server by receiving WAL records as they are generated at primary server. Like water flows from one location to another location, in PostgreSQL, WAL segments flow (streams) from primary to standby sever which creates the replication stream.

Streaming replication is fairly easy to setup. I was using PostgreSQL 12 to setup streaming replication. After bootstrapping the standby server using pg_basebackup or similar method, you need to enable archive at primary server. See below command:

archive_mode = on
archive_command = 'rsync %p /pgwal_archive/data/%f'"

What confuses is the archive command. Why we need to archive WAL files to another location?

Standby server can connect to its primary to get the WAL records once you enable archive_mode. However, the primary server does not care about its standby and it will recycle the WAL segments to reclaim space for the operations of the primary server. Due to this reason, standby will not be able to find the required WAL segments from primary server’s pg_wal dir always. For this reason, it requires WAL archive so that WAL segments are available for the standby when it cannot find at the primary server’s pg_wal dir.

You will notice the errors like below at standby server’s postgres log, if it cannot find WAL segments from primary server’s pg_wal dir.

requested WAL segment 000000030000002A000000D8 has already been removed
LOG:  restored log file "000000030000002A000000D8" from archive

In the same log file, you will also notice that the same WAL segment was restored from the pg wal archive location.

The restore instruction from WAL archive needs to be given to standby server as a configuration option in postgresql.conf like below:

restore_command = 'cp /pgwal_archive/data/%f %p'
archive_cleanup_command = '/usr/edb/as12/bin/pg_archivecleanup /pgwal_archive/data %r'

Not only the restore command, a command to clean up the wal archive dir is also needed, otherwise, it will fill up the archive storage.

At one time, the cleanup command was not working in my replication setup, and it filled up the storage at archive dir. When I check the standby server’s log file, I saw the error below:

2021-05-20 02:40:46 EDT WARNING:  archive_cleanup_command "pg_archivecleanup /pgwal_archive/data %r": command not found

The reason was, even though pg_archivecleanup binary was there at bin dir, it was not able to locate due to the PATH variable been not setup correctly to watch the /usr/pgsql/12/bin folder.

That was the reason why wal archive folder got filled up.

Before I found the reason for not cleaning up the archive, I was trying to delete the WAL files with some sort of manual intervention. I found another command to perform that operation. The command is:

./pg_archivecleanup /pgwal_archive/data 000000030000002C0000003A

With the above command you need to mention the WAL file name. It will delete the WAL files older than the specified WAL file. I just followed the simple logical thinking to find the base WAL file for the above command. I just used the, “pg_stat_replication” meta data view at the primary server and found the replay LSN. See below:

replay-lsn

My thinking was, if I delete the files older than the replay LSN, it would be fine.

After that I used the pg_walfile_name() function to find out the WAL file name at primary location as stated below:

pg_walfile_name

Guess what, I was wrong. PostgreSQL is more complicated than I thought. Basically, after deleting the WAL files manually from the archive, the situation got worse and replication broke. I had to bootstrap the standby from the start.

After replication was setup successfully, I started restoring a database of around 200 GB of size. But it failed with insufficient space of pg_wal and WAL archive dirs, then failover had initiated. After this incident I was not able to bring the old master online. I tried various solutions but none of them worked. Probably there will be a way to resolve the issue, but I could not find one. When I looked at the error logs, I saw many WAL missing errors.

I even tried pg_resetwal utility. This utility destroys the data changes and that means data loss. However, if you decided to do this, there will be no going back. I did not take the backup of pg_wal dir as well because I had the backup of the database. Finally, I had to re-initialize the database server and re-setup replication because pg_resetwal did not work for me as I expected.

I also noticed many entries like below in PosrgreSQL logs which are pretty annoying. Also found a method to get rid of them.

cp: cannot stat ‘/pgwal_archive/data/0000000B.history’: No such file or directory
cp: cannot stat ‘/pgwal_archive/data/000000030000001900000054’: No such file or directory

I added a -q switch to the restore command at standby server to suppress the above messages.

Finally, I was able to restore the database and cluster was working properly.

So below are the lessons learned:

  • Never delete WAL logs either in pg_wal or pg_wal archive dirs manually unless you’re an expert on PostgreSQL

  • Don’t be so quick to run pg_resetwal command. If you do not know what you’re doing, it will make the situation worse. Take a backup of pg_wal before you are going in this direction because there will be a difficulty of going back.

  • Situation made it even worse when I used pg_archivecleanup to free up space on wal archive dir manually.

About the cover picture: It was taken during the visit to Royal Gorge Bridge & Park. From top of the bridge you can see the Arkansas River. Royal Gorge Bridge, Colorado.