pg流复制_切换

主备判断方法
看WAL进程是sender还是receiver

--主
[postgres@pg93 ~]$ ps -fu postgres|egrep 'walsender|walreceiver'|grep -v egrep
postgres   6065   6054  0 16:35 ?        00:00:00 postgres: walsender repuser 192.168.150.132(50722) streaming 2/10001C0
--备
[postgres@pg93s ~]$ ps -fu postgres|egrep 'walsender|walreceiver'|grep -v egrep
postgres   7628   6966  0 00:44 ?        00:00:03 postgres: walreceiver   streaming 2/10001C0

查pg_stat_replication和pg_stat_wal_receiver视图

--主：pg_stat_replication有返回
postgres@postgres:select pid,state,sync_state,usename,client_addr,application_name from pg_stat_replication;
 pid  |   state   | sync_state | usename |   client_addr   | application_name 
------+-----------+------------+---------+-----------------+------------------
 6065 | streaming | sync       | repuser | 192.168.150.132 | pg93std
(1 row)

--备：pg_stat_wal_receiver有返回
postgres=# select pid,status,sender_host,sender_port,last_msg_send_time from pg_stat_wal_receiver;
 pid  |  status   |   sender_host   | sender_port |      last_msg_send_time       
------+-----------+-----------------+-------------+-------------------------------
 7628 | streaming | 192.168.150.130 |        6000 | 2020-12-20 17:31:53.520146+08
(1 row)

查看是否处于恢复状态，是为备，否为主
select pg_is_in_recovery(0;
查看控制信息

--主
[postgres@pg93 ~]$ pg_controldata -D /pgdata |grep cluster
Database cluster state:               in production
--备
[postgres@pg93s ~]$ pg_controldata -D /pgdata |grep cluster
Database cluster state:               in archive recovery

比对参数
12以前版本查看PGDATA下有没recovery.conf，有就是备
12和以上版本查看PGDATA下有没standby.signal，有就是备
主备切换（文件法）

--备添加参数，并reload或重启生效
promote_trigger_file = '/pgdata/.pg_prompt_6000'
postgres=# show promote_trigger_file;
  promote_trigger_file   
-------------------------
 /pgdata/.pg_prompt_6000
(1 row)

--停主库
[postgres@pg93 pgdata]$ pg_ctl stop -D /pgdata
waiting for server to shut down.... done
server stopped

--备库连接主失败，创建trigger文件后promote为主
[postgres@pg93s pgdata]$ touch /pgdata/.pg_prompt_6000
--查看备日志：
2020-10-07 02:18:47.670 CST [7908] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.150.130" and accepting
                TCP/IP connections on port 6000?
2020-10-07 02:18:52.676 CST [7909] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.150.130" and accepting
                TCP/IP connections on port 6000?
2020-10-07 02:18:57.680 CST [6968] LOG:  promote trigger file found: /pgdata/.pg_prompt_6000
2020-10-07 02:18:57.682 CST [6968] LOG:  redo done at 2/2000028
2020-10-07 02:18:57.682 CST [6968] LOG:  last completed transaction was at log time 2020-12-20 13:34:46.657621+08
2020-10-07 02:18:57.722 CST [6968] LOG:  selected new timeline ID: 2
2020-10-07 02:18:57.912 CST [6968] LOG:  archive recovery complete
2020-10-07 02:18:57.973 CST [6966] LOG:  database system is ready to accept connections
--可见备先是连接不到主库，在创建了trigger文件后自动检测到trigger文件，结束了recovery，变成主模式
--检查还发现trigger文件和standby.signal都被删除了
[postgres@pg93s pgdata]$ ls -l /pgdata/.pg_prompt_6000
ls: cannot access /pgdata/.pg_prompt_6000: No such file or directory
[postgres@pg93s ~]$ ls -l /pgdata/standby.signal
ls: cannot access /pgdata/standby.signal: No such file or directory

注意：

备库在启动状态时，只要检测到存在ptomote_trigger_file指定的文件，就会立即中断与主的同步（如果在同步的话）并将自身提升为主库，而此时主库并不会自动切换为备库（此时两个都是主库），所以如果是要做主备切换的话，需要先手动干净关闭主库，再prompt备库。

主备切换（pg_ctl prompt）

--停主
[postgres@pg93s ~]$ pg_controldata |grep cluster
Database cluster state:               in production
[postgres@pg93s ~]$ pg_ctl stop -D /pgdata
waiting for server to shut down.... done
server stopped
--提升备
[postgres@pg93 pgdata]$ pg_ctl promote -D /pgdata
waiting for server to promote.... done
server promoted
[postgres@pg93 pgdata]$ pg_con
pg_config       pg_controldata  
[postgres@pg93 pgdata]$ pg_controldata |grep cluster
Database cluster state:               in production

pg_rewind修复双主：

在主机器损坏等严重故障中，没有条件将主干净关闭，或者提升备时忘记了先关闭主了，后续将主设置为备时会报错，此种情况下通常需要重新搭建原主为备，但也可以通过使用pg_rewind来实现从新主同步过来达到修复的目的（原主未传到备的部分事务将丢失）：

--主备同步状态中，将备直接提升为主，此时两边都是主
[postgres@pg93s pgdata]$ pg_ctl promote -D /pgdata/
waiting for server to promote.... done
server promoted
[postgres@pg93s pgdata]$ pg_controldata |grep cluster
Database cluster state:               in production
[postgres@pg93 ~]$ pg_controldata |grep clus
Database cluster state:               in production
--再将原主启动到备模式后，不能正常同步，报错：
2020-12-21 00:46:00.053 CST [8553] LOG:  fetching timeline history file for timeline 4 from primary server
2020-12-21 00:46:00.075 CST [8553] FATAL:  could not start WAL streaming: ERROR:  requested starting point 2/D000000 on timeline 3 is not in this server's history
        DETAIL:  This server's history forked from timeline 3 at 2/C0001F8.
--rewind：停掉库，执行rewind
[postgres@pg93 ~]$ pg_rewind --target-pgdata /pgdata --source-server='host=192.168.150.132 port=6000 user=postgres dbname=postgres' -P
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 2/C0001F8 on timeline 3
pg_rewind: rewinding from last common checkpoint at 2/C000148 on timeline 3
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 104 MB (total source directory size is 3749 MB)
107316/107316 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!

--重新启动到备模式，可正常同步

PostgreSQL pg

pg流复制_切换

主备判断方法

主备切换（文件法）

注意：

主备切换（pg_ctl prompt）

pg_rewind修复双主：

相关

pg执行计划

pg流复制_延迟备库

pg流复制_一主多备

【Gerrit】Gerrit+Postgresql安装方式的数据备份

Python PDF 转 JPG 推荐使用

CSharpGL(47)你好，Framebuffer！

CSharpGL(31)[译]OpenGL渲染管道那些事

pg_probackup

PostgreSQL数据库常用SQL命令示例

使用pgcli工具连接postgreSQL数据库服务

postgres数据库表空间收缩之pg_squeeze，pg_repack

postgresql-shared_buffers

标签