logo

PostgreSQL 备库会同步主库修改的参数吗?

作者:makedata2023.07.15 18:59浏览量:1017

简介:PostgreSQL 备库会同步主库修改的参数吗?

1前言

今天①群里一位筒子问了我这样一个问题:

大佬,今天偶然操作发现alter system修改参数,reload备库不会跟着应用,调用的操作系统写auto.conf文件。但是这应该算是个ddl,这个要怎么理解呢?

这个是不是跟pg_setting里边的signal,superuser,postmaster有关

其实这个问题涉及到了不少原理,值得写出来分享一下。

2何如

这个问题不难理解,理解流复制的原理就能知晓
通俗一点来说就是主库不断产生 WAL 并且持续发送给备库,备库收到了再进行回放。因此,这个问题就转变成了主库修改参数是否会记录 WAL record

那让我们试验一下:

postgres=# alter system set deadlock_timeout to '5s';
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# select pg_walfile_name(pg_current_wal_lsn());
     pg_walfile_name      
--------------------------
 0000000100000000000000C5
(1 row)

postgres=# create table t2(id int);
CREATE TABLE

pg_waldump 解析一下

[postgres@xiongcc pg_wal]$ pg_waldump 0000000100000000000000C5
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/C5000028, prev 0/C401CBA8, desc: RUNNING_XACTS nextXid 53284 latestCompletedXid 53283 oldestRunningXid 53284
rmgr: Storage     len (rec/tot):     42/    42, tx:          0, lsn: 0/C5000060, prev 0/C5000028, desc: CREATE base/24927/25043
rmgr: Heap2       len (rec/tot):     60/    60, tx:      53284, lsn: 0/C5000090, prev 0/C5000060, desc: NEW_CID rel 1663/24927/1247; tid 15/17; cmin: 0, cmax: 4294967295, combo: 4294967295
...

可以看到,确实没有相关的 WAL record。因此,备库不会复制也就是”理所应当”了。但是是否所有的参数都不会记录呢?其实不然,让我们瞅瞅代码:
WAL 日志的类型很多,具体每一种的目的之前文章已经写过,这里我们只关注 XLOG_PARAMETER_CHANGE,顾名思义,参数变更,参数变更之后会记录一条相关日志。

else if (info == XLOG_PARAMETER_CHANGE)
 {
  xl_parameter_change xlrec;
  const char *wal_level_str;
  const struct config_enum_entry *entry;

  memcpy(&xlrec, rec, sizeof(xl_parameter_change));

  /* Find a string representation for wal_level */
  wal_level_str = "?";
  for (entry = wal_level_options; entry->name; entry++)
  {
   if (entry->val == xlrec.wal_level)
   {
    wal_level_str = entry->name;
    break;
   }
  }

  appendStringInfo(buf, "max_connections=%d max_worker_processes=%d "
       "max_wal_senders=%d max_prepared_xacts=%d "
       "max_locks_per_xact=%d wal_level=%s "
       "wal_log_hints=%s track_commit_timestamp=%s",
       xlrec.MaxConnections,
       xlrec.max_worker_processes,
       xlrec.max_wal_senders,
       xlrec.max_prepared_xacts,
       xlrec.max_locks_per_xact,
       wal_level_str,
       xlrec.wal_log_hints ? "on" : "off",
       xlrec.track_commit_timestamp ? "on" : "off");

可以看到,总共涉及到 8 个参数:max_connectionsmax_worker_processesmax_wal_sendersmax_prepared_transactionswal_levelwal_log_hintstrack_commit_timestamp 。那让我们再次验证一下:

postgres=# alter system set max_connections = 105;
ALTER SYSTEM

rmgr: XLOG        len (rec/tot):     54/    54, tx:          0, lsn: 0/C50180A8, prev 0/C5018030, desc: PARAMETER_CHANGE max_connections=105 max_worker_processes=8 max_wal_senders=10 max_prepared_xacts=0 max_locks_per_xact=64 wal_level=logical wal_log_hints=off track_commit_timestamp=off
pg_waldump: error: error in WAL record at 0/C50180A8: invalid record length at 0/C50180E0: wanted 24, got 0

果然,这次日志里面就记录了一条 PARAMETER_CHANGE 的 WAL record。那么为什么是这些参数?或者说,为什么几百个参数仅这些参数?看过我之前文章的朋友到这里应该就知道了(为什么备库的某些参数必须比主库大),因为这些 GUC 参数对于备库来说,至关重要,所以也全都是必须要重启生效 (postmaster)。

postgres=# select context,name from pg_settings where name in ('max_connections','max_worker_processes','max_wal_senders','max_prepared_transactions','max_locks_per_transaction','wal_level','wal_log_hints','track_commit_timestamp');
  context   |           name            
------------+---------------------------
 postmaster | max_connections
 postmaster | max_locks_per_transaction
 postmaster | max_prepared_transactions
 postmaster | max_wal_senders
 postmaster | max_worker_processes
 postmaster | track_commit_timestamp
 postmaster | wal_level
 postmaster | wal_log_hints
(8 rows)

The settings of some parameters determine the size of shared memory for tracking transaction IDs, locks, and prepared transactions. These shared memory structures must be no smaller on a standby than on the primary in order to ensure that the standby does not run out of shared memory during recovery. For example, if the primary had used a prepared transaction but the standby had not allocated any shared memory for tracking prepared transactions, then recovery could not continue until the standby’s configuration is changed. The parameters affected are

至于其他参数,比如 statement_timeout ,无关紧要,因为备库完全可以有自己的一套标准,控制承载的查询超时时间、锁超时时间等。而 max_connections 这类参数要求备库上的这些共享内存结构必须不小于主库上的结构,以确保备库在恢复期间不会耗尽共享内存。

另外各位可以去测一下 max_connections ,当比备库小的时候,流复制会暂停,在 14 以前的版本,重启主库,备库会挂掉,因为在 replay 这一条 record 的时候检测到了错误,在 14 进行了优化,仅仅是暂停流复制 (sent_lsn 停滞) 。

3小结

这个问题其实说难不难,从底层原理思考就能明白 🤔。

相关文章推荐

发表评论