Over the past couple of weeks I've had the opportunity to dig into Postgres 9's streaming replication. What I found was a relatively easy way to configure a system for basic replication, with very impressive speed. Postgres's streaming replication is an enhancement to the log-shipping warm standby that was available in previous versions. Instead of replaying write-ahead log (WAL) files as they become available on the slave's filesystem, the slave nodes connect directly to the master and stream the log as it is generated. This results in extremely fast replication, with typical lag under a second.
It can't match the cascading features of Slony, but if you're looking for a simple structure with a master database feeding one or more read-only slaves then Postgres 9 is a great choice.
Here's how I got everything working on Ubuntu 10.04.
UPDATE: I've changed the NFS details below to squash the user and group IDs and set the anonymous access user and group to the slave server's user and group IDs. This resolves issues where the postgres user has a different ID on the slave server than on the master, causing files received from the master to be owned by the wrong user.
Postgres 9
Postgres 9 isn't included in the Lucid repos (or Maverick, for that matter), so you'll need to use a PPA. I used Martin Pitt's.
:::sh
$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:pitti/postgresql
$ sudo apt-get update
$ sudo apt-get install postgresql postgresql-server-dev-9.0 postgresql-contrib-9.0
If you're using GeoDjango, you'll need to re-compile its geospatial requirements from source after 9.0 is installed. Make sure to use PostGIS 1.5.2, which is the first release to support Postgres 9.
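A from-source PostGIS build goes roughly like this (a sketch; the download URL and the presence of build dependencies like GEOS and PROJ are assumptions, so check the PostGIS site first):

:::sh
# Sketch of a from-source PostGIS 1.5.2 build; verify the download
# URL and install the GEOS/PROJ development headers before running this.
$ wget http://postgis.refractions.net/download/postgis-1.5.2.tar.gz
$ tar -xzf postgis-1.5.2.tar.gz
$ cd postgis-1.5.2
$ ./configure
$ make
$ sudo make install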
Because the default encoding is SQL_ASCII, you'll need to drop the cluster and re-initialize it as UTF-8.
:::sh
$ sudo -u postgres pg_dropcluster --stop 9.0 main
$ sudo -u postgres pg_createcluster --start -e UTF-8 9.0 main
At this point, you can restore any database backups needed. If you'd like, you can begin accepting live queries - the base backup needed below can run while the database is still available.
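For a plain-SQL dump, that restore might look something like this (mydb and backup.sql are placeholder names):

:::sh
# Restore a plain-SQL dump into a fresh database; adjust names to taste.
$ sudo -u postgres createdb mydb
$ sudo -u postgres psql mydb -f backup.sql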
NFS Sharing
I'm using log-shipping and streaming replication together, just in case something goes wrong with the direct connection. I use NFS to share a folder on the slave node and mount it on the master, so that the master can send WAL files to the slave.
On the Slave Server
:::sh
$ sudo apt-get install nfs-kernel-server nfs-common portmap
Note: If you're on Rackspace, you'll have trouble with nfs-kernel-server. See my post here for details on how to work around it.
:::sh
$ sudo dpkg-reconfigure portmap
When asked whether portmap should be bound to the loopback address, select 'no', so that other machines can reach it.
:::sh
$ sudo restart portmap
$ sudo emacs /etc/exports
Find out the user and group IDs for postgres. You'll use them in the next step.
:::sh
$ id -u postgres
104
$ id -g postgres
108
Add a line like this:
:::text
/path/for/backups 10.0.1.101/32(rw,async,no_subtree_check,all_squash,anonuid=104,anongid=108)
Replace the CIDR notation with your master server's address (use a CIDR calculator if needed), and replace the anonuid and anongid values with the IDs you found in the previous step.
Before restarting the server and exporting your shares, make sure the path exists on the filesystem and that it's writable by the user and group you squashed to.
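:::sh
# Create the shared directory and hand it to postgres; the path and
# ownership should match the exports line above.
$ sudo mkdir -p /path/for/backups
$ sudo chown postgres:postgres /path/for/backups

Then, restart the server and export your shares: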
:::sh
$ sudo /etc/init.d/nfs-kernel-server restart
$ sudo exportfs -a
On the Master Server
:::sh
$ sudo apt-get install nfs-common portmap
$ sudo emacs /etc/fstab
Add a line like this:
:::text
slavehostname:/path/for/backups /path/for/mount nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
Then mount it:
:::sh
$ sudo mount /path/for/mount
Test the NFS connection by touching a new file on the primary machine, and making sure it can be seen on the standby machine.
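Using the paths from the exports and fstab lines above, that check looks something like this:

:::sh
# On the master:
$ sudo touch /path/for/mount/nfs-test
# On the slave; you should see nfs-test listed:
$ ls /path/for/backups
$ sudo rm /path/for/backups/nfs-test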
Replication
On the Master Server
Edit postgresql.conf to enable WAL archiving:
:::text
wal_level = hot_standby
archive_mode = on
archive_command = 'cp -i %p /path/for/mount/%f </dev/null'
max_wal_senders = 1
Adjust max_wal_senders to the number of slave servers you plan on using.
Create a superuser for streaming replication:
:::sql
CREATE USER myuser WITH SUPERUSER ENCRYPTED PASSWORD 'mypassword';
Modify pg_hba.conf to allow the replication user to connect:
:::text
host replication myuser 10.0.1.102/32 md5
You'll need to restart Postgres at this point.
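One way to do that, assuming the 9.0/main cluster created earlier:

:::sh
# Restart the cluster so the WAL and archiving settings take effect.
$ sudo pg_ctlcluster 9.0 main restart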
Base Backup
Next, a base backup needs to be created. This must be a filesystem-level backup, not a logical backup like the pg_dump command creates. On the master server, run the pg_start_backup command:
:::sql
SELECT pg_start_backup('base_backup');
Then, tar and bzip the data directory:
:::sh
$ cd /var/lib/postgresql/9.0/
$ sudo tar -cjf ~/postgres-data.tar.bz2 main
This will take a while. Once it's done, run the pg_stop_backup command:
:::sql
SELECT pg_stop_backup();
Then, copy the archive to the standby server; scp works fine for this:
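:::sh
# slavehostname is a placeholder; copy the base backup to the standby.
$ scp ~/postgres-data.tar.bz2 slavehostname:~/

Make sure Postgres is not running on the standby, then unzip the archive: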
:::sh
$ cd /var/lib/postgresql/9.0/
# Make SURE you're on the standby server here
$ sudo rm -rf main
$ sudo tar -xjf ~/postgres-data.tar.bz2
$ sudo chown -R postgres:postgres main
On the Slave Server
Edit postgresql.conf to enable hot standby:
:::text
hot_standby = on
Create a recovery.conf in /var/lib/postgresql/9.0/main/ similar to this:
:::text
standby_mode = 'on'
primary_conninfo = 'host=10.0.1.101 port=5432 user=myuser password=mypassword'
restore_command = 'cp /path/for/backups/%f %p'
trigger_file = '/path/for/backups/trigger_file'
archive_cleanup_command = 'pg_archivecleanup /path/for/backups %r'
The postgresql-contrib-9.0 package doesn't link pg_archivecleanup into your PATH. You can do so manually:
:::sh
$ sudo ln -s /usr/lib/postgresql/9.0/bin/pg_archivecleanup /usr/bin/
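With the recovery configuration in place, start Postgres on the standby (again assuming the 9.0/main cluster) and it should connect to the master and begin streaming:

:::sh
# Start the standby; it will enter recovery and stream from the master.
$ sudo pg_ctlcluster 9.0 main start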
Smoke Test
To make sure everything is working properly, first check that the WAL sender and receiver processes are alive on the master and slave servers. On the primary you should see a wal sender process, and on the standby a wal receiver:
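:::sh
# On the primary:
$ ps -ef | grep "wal sender"
# On the standby:
$ ps -ef | grep "wal receiver"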
Next, create a test table on the primary:
:::sql
CREATE TABLE test (test varchar(30));
INSERT INTO test VALUES ('Testing 1 2 3');
On the standby, query the table:
:::sql
SELECT * FROM test;
You should see the following output:
:::text
test
---------------
Testing 1 2 3
(1 row)
You can then delete your test table:
:::sql
DROP TABLE test;
Load Balancing
For load balancing, I decided to leverage the excellent multidb capability introduced in Django 1.2. To do this, I created the django-balancer project. It builds on the example of database routers in the Django docs, a post on Eli Bendersky's blog, and Jeff Balogh's django-multidb-router project.
For my project, I decided on the WeightedMasterSlaveRouter. I want to include my master database in the read pool, but to a lesser extent than my slave database.
I borrowed master-pinning from Jeff Balogh's project for django-balancer, but I ended up not needing it in my own project. In my tests, I never had an issue with the databases being out of sync. The replication was really that fast! Even in situations where a user had just created an object and was being redirected to a rendered view of that object, the replication was speedy enough to keep up and prevent a 404.
Installation
Install with pip:
:::sh
$ pip install django-balancer
Then, add the desired router to your DATABASE_ROUTERS setting, along with any settings required by that router:
:::python
DATABASE_ROUTERS = ['balancer.routers.WeightedMasterSlaveRouter']
DATABASE_POOL = {
    'default': 1,
    'slave1': 2,
}
MASTER_DATABASE = 'default'
Make sure to configure your DATABASES setting to add the new slave database.
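That might look something like this (hosts, names, and credentials are placeholders):

:::python
# A sketch of the DATABASES setting; all connection details below are
# placeholders for your own master and slave servers.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': '10.0.1.101',
    },
    'slave1': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': '10.0.1.102',
    },
}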
And with that, you should be all set!
Create Your Own
You can also use the classes and mixins inside the django-balancer project to create your own routers. If you come up with one that you'd like to share, fork me on GitHub. I'd be glad to pull it into the project.