Software Performance Engineering: PostgreSQL High Availability, Load Balancing and Replication

Introduction

A secondary server taking over quickly while primary server fails to serve is called High Availability.
Several database machines serving same data is called as Load Balancing.

Ideally, web servers serving static web pages and read-only databases can be combined easily by merely load balancing the requests to be served. Unfortunately, most of the database servers receive mix of read and write requests. This makes hard to get consistent data from one of the available servers at any given point of time. The data written to the server must be propagated to all other servers.

This synchronization problem is solved by many solutions:
One Server to modify data & other servers to perform read operations
Using one server (Master or Primary server) as write only and other servers (Standby or Secondary server) as read only servers which track changes in the primary server.

Primary or Master server
Secondary or Standby server

Warm standyby server (cannot be connected until it is promoted to a master server)
Hot standyby server (serves read only queries)

Data modification is not considered as committed until all the servers have committed the transaction

This solution guarantees:

fail over (There is no loss of data)
all load balanced servers will retrieve consistent data no matter whatever the serve is queried

Comparison of different solutions

Shared Disk Failover (sharing the hardware functionality)
Uses single disk array that is shared by multiple servers. As soon as primary server server fails, secondary server will be able to mount and start.
Is advantageous as there:

is no overhead of synchronization of the primary server with the secondary server.
and allows rapid failover with no data loss.

Also has some serious limitations since:

If shared disk array fails or corrupts all the servers becomes non-functional.
and standby server should never access the shared storage when primary server is accessing

File System Replication (file system replication Eg: DRBD)

has restriction that, writes to standby server must be done in the same order of the data changes in primary server to ensure consistency to the standby server.

Write-Ahead Log shipping

Synchronously or Asynchronously warm and hot standby servers are kept updated by reading stream of WAL records and is achieved by

either:

File based log shipping
Streaming replication
or combination of both

This is done on entire database

Logical Replication

Logical replication constructs a stream of logical data modifications (at table level is also possible) from the WAL of the server and sends to the other servers (can be from master to standby or standby to master).

Trigger-Based Master-Standyby Replication (Updates standby servers asynchronously in batches) Eg: Slony-I
This is 2 step process

Master-Standby replication setup server asynchronously sends all the modification queries to the master server.
Master server asynchronously sends the data to the standby server.

Asynchronously sends all the data modification queries to the standby servers as a batch with help of which standby servers can answer read-only queries while the primary server is running. This case may lead to possible data loss during failover.
Statement-Based Replication Middleware (Each server operated independently)
A program intercepts every SQL Query and broadcasts the same to all the servers to execute. This can cause data inconsistency if the queries contains functions like CURRENT_TIMESTAMP(), random(), sequences which can have different value on different servers etc. Care must be taken that transaction either commits or rollbacks on all the servers, perhaps using the 2 phase commit (PREPARE TRANSACTION and COMMIT TRANSACTION)

One solution for this would be that middleware should query server dependent functions from the one server and make to use same in all other servers.

Asynchronous Multimaster Replication

Each server works independently and periodically communicates each other and resolve the conflicting transactions. These conflicting transactions can be resolved by user or using some rules like Bucardo.

Synchronous Multimaster Replicatiom

Each server can accept the write requests and modified data is sent to all other servers. Since, data is sent to all other servers, we won't face any issue with non-deterministic functions like random() etc. Heavy write activity may lead to excessive locking leading to the poor performance. However, read operations take advantage since we need not partition the workload and the any server can serve the read request. The problem with write overhead is reduced by using shared disk.

This replication should be implemented in the application using 2 phase commit transactions since, PostgreSQL doesn't implicit;y provide us the functionality.

References

https://www.postgresql.org/docs/current/static/high-availability.html

Software Performance Engineering

Saturday, 8 September 2018

PostgreSQL High Availability, Load Balancing and Replication

Introduction

Comparison of different solutions

References

No comments:

Post a Comment

Linux Metricbeat Counters Mapping

Report Abuse