Ubuntu 24.04 : Pacemaker : Add or Remove Nodes : Server World

Pacemaker : Add or Remove Nodes

2024/07/23

This page shows how to add a node to an existing cluster.

As an example, [node03] is added to the existing cluster as a new node.

                       +--------------------+
                       | [  ISCSI Target  ] |
                       |    dlp.srv.world   |
                       +----------+---------+
                         10.0.0.30|
                                  |
+----------------------+          |          +----------------------+
| [  Cluster Node#1  ] |10.0.0.51 | 10.0.0.52| [  Cluster Node#2  ] |
|   node01.srv.world   +----------+----------+   node02.srv.world   |
+----------------------+          |          +----------------------+
                                  |
                                  |10.0.0.53
                      +-----------------------+
                      | [  Cluster Node#3  ]  |
                      |   node03.srv.world    |
                      +-----------------------+

[1]	On the node to be added, install Pacemaker beforehand, referring to [1] of the linked installation page.
[2]	Add the new node to the existing cluster.

root@node01:~#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node01.srv.world (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Tue Jul 23 04:38:34 2024 on node01.srv.world
  * Last change:  Tue Jul 23 04:38:19 2024 by root via cibadmin on node01.srv.world
  * 2 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node01.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node01.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

# authorize the new node

root@node01:~#

pcs host auth node03.srv.world

Username: hacluster
Password:
node03.srv.world: Authorized

# add the new node

root@node01:~#

pcs cluster node add node03.srv.world

No addresses specified for host 'node03.srv.world', using 'node03.srv.world'
Disabling sbd...
node03.srv.world: sbd disabled
Sending 'corosync authkey', 'pacemaker authkey' to 'node03.srv.world'
node03.srv.world: successful distribution of the file 'corosync authkey'
node03.srv.world: successful distribution of the file 'pacemaker authkey'
Sending updated corosync.conf to nodes...
node02.srv.world: Succeeded
node01.srv.world: Succeeded
node03.srv.world: Succeeded
node02.srv.world: Corosync configuration reloaded
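
Before adding a node, it can be useful to confirm that it is not already listed as online. The following is a minimal sketch, assuming the [Online:] line format shown in the [pcs status] output above. The function name [node_is_online] is hypothetical; it only reads text from standard input, does not call [pcs] itself, and does not look at offline members.

```shell
# Hypothetical helper: succeed if the given node appears in an
# "Online: [ ... ]" line of `pcs status` output read from stdin.
node_is_online() {
    node="$1"
    # Keep only the Online line(s), strip the brackets, put one name per
    # line, then look for an exact match of the node name.
    grep -E '^[[:space:]]*\* Online: \[' \
        | sed -e 's/.*\[//' -e 's/\].*//' \
        | tr ' ' '\n' \
        | grep -Fx -- "$node" >/dev/null
}

# Usage on a cluster node (not executed here):
#   pcs status | node_is_online node03.srv.world && echo "already online"
```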

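[3]	Update the fence device configuration. If SCSI fencing is configured as the fence device, as in this example, first log in to the shared storage used for the fence device on the new node and install the SCSI fence agent ([2], [3]). Then update the fence device configuration as follows.

# update the fence device's host list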

root@node01:~#

pcs stonith update scsi-shooter pcmk_host_list="node01.srv.world node02.srv.world node03.srv.world"

root@node01:~#

pcs stonith config scsi-shooter

Resource: scsi-shooter (class=stonith type=fence_scsi)
  Attributes: scsi-shooter-instance_attributes
    devices=/dev/disk/by-id/wwn-0x6001405fd08aa7cd2fe4f8cad7b28412
    pcmk_host_list="node01.srv.world node02.srv.world node03.srv.world"
  Meta Attributes: scsi-shooter-meta_attributes
    provides=unfencing
  Operations:
    monitor: scsi-shooter-monitor-interval-60s
      interval=60s
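
Because [pcmk_host_list] must name every node in the cluster, it is easy to leave one out when typing it by hand. The following is a minimal sketch of building the command from a list of node names. The function name [print_fence_update] is hypothetical, and it only prints the command rather than running it.

```shell
# Hypothetical helper: print the `pcs stonith update` command for a fence
# device ($1) and the full list of cluster nodes (remaining arguments).
# It only prints the command; review it before running it on a cluster node.
print_fence_update() {
    device="$1"
    shift
    # "$*" joins the remaining arguments with single spaces (default IFS).
    printf 'pcs stonith update %s pcmk_host_list="%s"\n' "$device" "$*"
}

print_fence_update scsi-shooter node01.srv.world node02.srv.world node03.srv.world
```

The printed line matches the command used in this step, so it can be compared against the [pcs stonith config] output afterwards.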

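[4]	If resources are already configured in the existing cluster, each resource needs matching configuration on the new node so that the node can become active after a failover. For example, if LVM shared storage is configured as in the linked example, the new node must recognize the LVM shared storage beforehand.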

root@node03:~#

iscsiadm -m discovery -t sendtargets -p 10.0.0.30

10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target01
10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target02

root@node03:~#

iscsiadm -m node --login --targetname iqn.2022-01.world.srv:dlp.target02

root@node03:~#

iscsiadm -m session -o show

tcp: [1] 10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target01 (non-flash)
tcp: [2] 10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target02 (non-flash)

root@node03:~#

lvm pvscan --cache --activate ay

  pvscan[4061] PV /dev/vda3 online, VG ubuntu-vg is complete.
  pvscan[4061] PV /dev/sdb1 ignore foreign VG.
  pvscan[4061] VG ubuntu-vg run autoactivation.
  1 logical volume(s) in volume group "ubuntu-vg" now active
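
To confirm that the new node really has a session to the expected target, the session list can be checked by script. The following is a minimal sketch, assuming the line format of the [iscsiadm -m session] output shown above. The function name [has_iscsi_session] is hypothetical and it only reads text from standard input.

```shell
# Hypothetical helper: succeed if the given target IQN appears as a
# session in `iscsiadm -m session` output read from stdin.
has_iscsi_session() {
    iqn="$1"
    # The IQN is the fourth whitespace-separated field in lines such as:
    #   tcp: [2] 10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target02 (non-flash)
    awk -v want="$iqn" '$4 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on the new node (not executed here):
#   iscsiadm -m session | has_iscsi_session iqn.2022-01.world.srv:dlp.target02
```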

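[5]	The same applies to any other resource configured in the cluster. For example, if Apache httpd is configured as in the linked example, the new node needs the configuration from [1] of that page.
[6]	When the configuration for all resources is complete, start the cluster services on the new node.

# start the newly added node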

root@node01:~#

pcs cluster start node03.srv.world

node03.srv.world: Starting Cluster...

root@node01:~#

pcs cluster enable node03.srv.world

node03.srv.world: Cluster Enabled

root@node01:~#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node01.srv.world (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Tue Jul 23 04:49:56 2024 on node01.srv.world
  * Last change:  Tue Jul 23 04:49:47 2024 by hacluster via crmd on node01.srv.world
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world node03.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node01.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node01.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
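
After the new node starts, the [N nodes configured] line should show the larger count. The following is a minimal sketch for reading that number from [pcs status] output, assuming the summary format shown above. The function name [configured_node_count] is hypothetical and it only reads text from standard input.

```shell
# Hypothetical helper: print the number from the first
# "* N nodes configured" line of `pcs status` output read from stdin.
configured_node_count() {
    awk '$1 == "*" && $3 == "nodes" && $4 == "configured" && !seen {
        print $2
        seen = 1
    }'
}

# Usage on a cluster node (not executed here):
#   [ "$(pcs status | configured_node_count)" = "3" ] && echo "node count OK"
```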

[7]	Run fencing and confirm that failover works normally with the new node in the cluster. In this example, the resources move to [node02].

root@node03:~#

pcs stonith fence node01.srv.world

Node: node01.srv.world fenced

root@node03:~#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node02.srv.world (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Tue Jul 23 04:51:01 2024 on node03.srv.world
  * Last change:  Tue Jul 23 04:49:47 2024 by hacluster via crmd on node01.srv.world
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node02.srv.world node03.srv.world ]
  * OFFLINE: [ node01.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node02.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node02.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
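
To check where a resource ended up after fencing, the node name can be read from the resource list. The following is a minimal sketch, assuming the [Started] line format shown in the [pcs status] output above. The function name [resource_node] is hypothetical and it only reads text from standard input.

```shell
# Hypothetical helper: print the node on which the given resource is
# reported as Started in `pcs status` output read from stdin.
resource_node() {
    res="$1"
    # Matching lines look like:
    #   * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node02.srv.world
    awk -v res="$res" '$1 == "*" && $2 == res && $(NF-1) == "Started" { print $NF }'
}

# Usage on a cluster node (not executed here):
#   pcs status | resource_node lvm_ha
```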

[8]	To remove a node, run the following commands.

root@node01:~#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node02.srv.world (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Tue Jul 23 04:56:45 2024 on node01.srv.world
  * Last change:  Tue Jul 23 04:49:47 2024 by hacluster via crmd on node01.srv.world
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world node03.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node02.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node02.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

root@node01:~#

pcs cluster node remove node03.srv.world

Destroying cluster on hosts: 'node03.srv.world'...
node03.srv.world: Successfully destroyed cluster
Sending updated corosync.conf to nodes...
node02.srv.world: Succeeded
node01.srv.world: Succeeded
node01.srv.world: Corosync configuration reloaded

# update the fence device's host list

root@node01:~#

pcs stonith update scsi-shooter pcmk_host_list="node01.srv.world node02.srv.world"

root@node01:~#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node02.srv.world (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Tue Jul 23 04:59:46 2024 on node01.srv.world
  * Last change:  Tue Jul 23 04:59:38 2024 by hacluster via crmd on node02.srv.world
  * 2 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node02.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node02.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
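
The removal in this step is two commands: remove the node, then shrink [pcmk_host_list] to the remaining nodes. The following is a minimal sketch that prints both commands for review. The function name [print_remove_node] is hypothetical, and it does not run [pcs] itself.

```shell
# Hypothetical helper: print the commands for removing one node ($2) and
# shrinking the host list of the fence device ($1) to the remaining nodes.
# The remaining arguments are the current cluster nodes.
print_remove_node() {
    device="$1"
    gone="$2"
    shift 2
    remaining=""
    for n in "$@"; do
        # Keep every node except the one being removed, in the given order.
        if [ "$n" != "$gone" ]; then
            remaining="${remaining:+$remaining }$n"
        fi
    done
    printf 'pcs cluster node remove %s\n' "$gone"
    printf 'pcs stonith update %s pcmk_host_list="%s"\n' "$device" "$remaining"
}

print_remove_node scsi-shooter node03.srv.world \
    node01.srv.world node02.srv.world node03.srv.world
```

The two printed lines correspond to the two commands run in this step.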