Redis with failover replication
Redis is a nice tool for storing key-value data in different formats. Here is a pretty easy way to add failover to replication. This setup is sometimes called a Redis cluster, but that is not accurate: it is just a few servers (preferably 3, for sentinel quorum) with one master and slaves in various configurations (slave of a slave, slave by priority, local slave, and so on).
We need 3 servers, each running redis+sentinel. Tested on Redis versions 2.x and 3.x.
1 - 10.1.1.1
2 - 10.1.1.2
3 - 10.1.1.3
r1_redis.conf
#bind 127.0.0.1
protected-mode no
port 6379
...
r2_redis.conf r3_redis.conf
#bind 127.0.0.1
protected-mode no
port 6379
...
slaveof 10.1.1.1 6379
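After r2 and r3 start, it is worth checking that replication is actually up. A minimal sketch with redis-py (the client choice is my assumption; any Redis client works), reading the same INFO replication fields that haproxy will check later:

import redis

# Ask a slave (10.1.1.2 here, same for 10.1.1.3) about its replication state.
slave = redis.Redis(host='10.1.1.2', port=6379, socket_timeout=2)
info = slave.info('replication')
print(info['role'])                    # expect: slave
print(info.get('master_link_status'))  # expect: up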
Sentinel configuration. The commented part is not needed for bootstrap; sentinel appends it itself after start. I keep it as an example.
r1_sentinel.conf r2_sentinel.conf r3_sentinel.conf
daemonize yes
pidfile "/var/run/redis/redis-sentinel.pid"
logfile "/var/log/redis/redis-sentinel.log"
port 16379
dir "/var/lib/redis"
protected-mode no
#sentinel myid 4809b5ae33b617b24e4ee061222a3cb11f4457cd
sentinel monitor redis-ha 10.1.1.1 6379 2
sentinel down-after-milliseconds redis-ha 3000
sentinel failover-timeout redis-ha 6000
#sentinel config-epoch redis-ha 24
#sentinel leader-epoch redis-ha 24
#sentinel known-slave redis-ha 10.1.1.2 6379
#sentinel known-slave redis-ha 10.1.1.3 6379
#sentinel known-sentinel redis-ha 10.1.1.2 16379 1316ff79cb3a4558119c53ca5038f86271683fa7
#sentinel known-sentinel redis-ha 10.1.1.3 16379 e36c6b59c49cbdc8f8056877a410644cda7a6255
#sentinel current-epoch 24
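Apps can also bypass haproxy and ask sentinel directly for the current master. A short sketch with redis-py's Sentinel helper, using the redis-ha name and port 16379 from the configs above (redis-py itself is an assumption):

from redis.sentinel import Sentinel

sentinels = [('10.1.1.1', 16379), ('10.1.1.2', 16379), ('10.1.1.3', 16379)]
sentinel = Sentinel(sentinels, socket_timeout=2)

print(sentinel.discover_master('redis-ha'))  # (ip, port) of the current master
print(sentinel.discover_slaves('redis-ha'))  # [(ip, port), ...] of reachable slaves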
And finally, if you need to offload the network, you can start a local Redis as a slave of the redis-ro pool, again with failover. The haproxy configuration:
listen redis-rw
    bind 127.0.0.1:6379
    mode tcp
    balance leastconn
    option tcplog
    option tcp-check
    tcp-check connect
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    server redis-1 10.1.1.1:6379 check inter 2s backup
    server redis-2 10.1.1.2:6379 check inter 2s backup
    server redis-3 10.1.1.3:6379 check inter 2s backup

listen redis-ro
    bind 127.0.0.2:6379
    mode tcp
    balance leastconn
    option tcplog
    option tcp-check
    tcp-check connect
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string master_link_status:up
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    server redis-1 10.1.1.1:6379 check inter 2s
    server redis-2 10.1.1.2:6379 check inter 2s
    server redis-3 10.1.1.3:6379 check inter 2s
    server redis-rw 127.0.0.1:6379 backup

listen redis-local-ro
    bind 127.0.0.3:6379
    mode tcp
    balance leastconn
    option tcplog
    option tcp-check
    tcp-check connect
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string master_link_status:up
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    server redis-local 127.0.0.4:6379 check inter 2s
    server redis-ro 127.0.0.2:6379 backup
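The tcp-check sequences are what make the routing work: a server is up in redis-rw only while INFO replication reports role:master, and up in the RO pools only while master_link_status:up. A rough Python equivalent of the redis-rw probe, for illustration only (assuming redis-py):

import redis

def is_master(host, port=6379):
    # Same idea as the redis-rw tcp-check: PING, then INFO replication.
    try:
        r = redis.Redis(host=host, port=port, socket_timeout=2)
        return r.ping() and r.info('replication').get('role') == 'master'
    except redis.RedisError:
        return False

for host in ('10.1.1.1', '10.1.1.2', '10.1.1.3'):
    print(host, is_master(host))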
What happens
127.0.0.1:6379 - Redis Cluster RW
127.0.0.2:6379 - Redis Cluster RO
127.0.0.3:6379 - Local Redis RO with fallback to Redis Cluster RO
127.0.0.4:6379 - Local Redis, slave of Redis Cluster RO

In this configuration we can lose every server except one: the survivor becomes master and the apps need no reconfiguration. At the moment of the switch an app may briefly try to write to an RO node, but within a few seconds the haproxy tcp-check detects the new node states and fixes the routing on the fly.
Apps should use 127.0.0.1:6379 as the master and 127.0.0.3:6379 as the slave.
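A minimal application sketch for this layout (the key name is hypothetical, redis-py is assumed): writes go through the RW endpoint, reads through the local RO one:

import redis

rw = redis.Redis(host='127.0.0.1', port=6379)  # haproxy redis-rw -> current master
ro = redis.Redis(host='127.0.0.3', port=6379)  # haproxy redis-local-ro -> local slave

rw.set('greeting', 'hello')  # writes always land on the master
print(ro.get('greeting'))    # reads may lag behind; see known problem 1 below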
Known problems:
1. Replication delay, in cases where you write and then immediately read from the local slave (see the WAIT sketch after this list)
2. Replication breaks down when RO/RW traffic is too high (more than 200 Mbps one way on the master, tested on AWS)
3. Many problems once you have more than 5-6 slaves, and constant replication problems with 10+ slaves
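For problem 1, Redis 3.0+ has the WAIT command, which blocks until the given number of slaves acknowledge the preceding writes. A hedged sketch (this bounds the lag but does not guarantee that the local slave specifically has caught up):

import redis

rw = redis.Redis(host='127.0.0.1', port=6379)
rw.set('greeting', 'hello')
# Block until at least 1 slave acknowledges, or give up after 100 ms.
acked = rw.execute_command('WAIT', 1, 100)
print('slaves acked:', acked)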