Monthly Andrew-kun (December Issue)

December has crept up on us, but as usual this month's issue brings release information and a tips corner.

The two big highlights among the releases are:

The release of Pacemaker 1.0.12
The release of Cluster-glue 1.0.9

In the tips corner, I'll introduce a trick (?) for the IPaddr2 RA.

1. Release information

1-1. Pacemaker 1.0.12

The latest version of the Pacemaker 1.0 series has been released! Linux-HA Japan is preparing to publish rpm packages for it together with Cluster-glue, described below. Please wait a little longer.

Main changes

cib: Call gnutls_bye() and shutdown() when disconnecting from remote TLS connections
cib: Remove disconnected remote connections from mainloop
When the peer node stops, connections from that node are now torn down completely.

crmd: Cancel timers for actions that were pending on dead nodes
crmd: Do not wait for actions that were pending on dead nodes
When the peer node stops, actions that were running on that node may be left in a pending state. If the node is down, the next action is now executed immediately without waiting for the pending actions to time out.

crmd: Ensure we do not attempt to perform action on failed nodes
Actions are no longer executed on failed nodes.

PE: Correctly recognise which recurring operations are currently active
When a failure occurs, an operation may be invoked many times. This fix makes it possible to determine accurately which operation is active.

PE: Demote from Master does not clear previous errors
So when a Master resource was demoted, was its failure history being deleted? (I never noticed...) The failure history is now preserved after a demote.

PE: Ensure restarts due to definition changes cause the start action to be re-issued not probes
When a definition related to the start operation is changed, the resource is now forcibly restarted.

PE: Ensure role is preserved for unmanaged resources
PE: Ensure unmanaged resources have the correct role set so the correct monitor operation is chosen
By preserving the resource's role, correct behavior is now obtained even in the unmanaged state.

PE: Move master based on failure of colocated group
When a constraint is set between a Master/Slave resource and a group, the Master can now move when the group fails. This changeset improves the behavior introduced in Monthly Andrew-kun (November issue)!

pengine: Correctly determine the state of multi-state resources with a partial operation history
The behavior of Master/Slave resources whose operation history was not saved because of a failure has been improved.

PE: Only allocate master/slave resources once
Fixed so that placement of Master/Slave resources no longer loops.

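Shell: implement -w,--wait option to wait for the transition to finish
A wait option has been added to crm shell. It waits until the command executed from crm shell has finished.

Shell: repair template list command
The behavior of crm shell's template list command has been improved.

The full changelog is available here.

Also, the Pacemaker 1.0 series has moved to a repository on GitHub. The repository for the latest version (Pacemaker 1.1 series) is here.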

1-2. Reusable Cluster Components (“glue”) 1.0.9

Cluster-glue 1.0.9 has been released!

Main changes

stonith: external/ipmi: add missing double quote
Fixes a problem where the ipmi plugin could not be executed.

stonith: external/ipmi: add the priv parameter (ipmitool -L)
The ipmi plugin can now pass the -L option to the ipmitool command.

LRM: lrmd: set op status to cancelled for running monitor operations
Fixes a problem where monitor operations stopped after a resource was moved.
Note: this does not happen every time, but depending on timing it can happen with any resource.

ha_log: increase MAXENTITY size to accommodate long stonith strings
The character limit for the header included in log messages has been raised.

hb_report: improve destination directory handling (bnc#727295)
It is no longer necessary to specify a full path for the dest option of the hb_report command. If no path is specified, the report is written to the directory where hb_report was run.

The changelog can also be viewed here.

1-3. LCMC 1.1.0

LCMC 1.1.0 has been released.

Main changes

disable Heartbeat installation on Opensuse 12 and Fedora 16 onward
Heartbeat installation is disabled on openSUSE 12 and on Fedora 16 and later.

use icons of different sizes
Icons of different sizes are now used.

add “create exe” ant task
An ant task ("create exe") has been added.

don’t add corosync/hb to rc.d after installation automatically
After Corosync or Heartbeat is installed, automatic startup is no longer configured for it.

fix colors of buttons in dialogs
The colors of buttons in dialogs have been fixed.

fix cleanup of resources with failcount < INFINITY
Fixed the behavior when clearing a fail count smaller than INFINITY.

add --cluster (and company) options to define clusters
A --cluster option for defining clusters has been added.

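In addition, instead of adding each node to the cluster one by one from the LCMC GUI, you can now register multiple nodes in a cluster in one go from the command line.

Example: lcmc --cluster alice-bob --host 192.168.122.2 --host 192.168.122.3

In the example above, the two nodes 192.168.122.2 and 192.168.122.3 are added to the cluster named alice-bob.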
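
If you have more than a couple of nodes, typing out every --host by hand gets tedious. Here is a minimal sketch that assembles the same command line from a cluster name and a list of addresses. Note that build_lcmc_cmd is a hypothetical helper of my own, not something LCMC ships, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: build_lcmc_cmd is a hypothetical helper, not part of LCMC.
# It prints an lcmc command line using the --cluster and --host options
# shown in the example above; it does not execute lcmc.
# The function body runs in a subshell, so its variables stay local.
build_lcmc_cmd() (
    cluster="$1"
    shift
    cmd="lcmc --cluster $cluster"
    for host in "$@"; do
        cmd="$cmd --host $host"
    done
    printf '%s\n' "$cmd"
)

# Prints the same command line as the example in the text.
build_lcmc_cmd alice-bob 192.168.122.2 192.168.122.3
```

Once the printed line looks right, you can paste it into a terminal on the machine where LCMC is installed.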

And there is now an LCMC user guide too!

Well, it's getting nicely wild, isn't it. I've never met Rasto, but he seems like a funny guy.

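1-4. Other repository changes

crm shell, which Dejan is developing, has become independent of the Pacemaker repository, as previously announced.

crm shell project page
crm shell repository

The Python GUI (pacemaker-mgmt) is also being prepared for a move to its own repository.

pacemaker-mgmt repository

2. Tips corner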

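Question (1): With HB3 + Pacemaker, is there some way to build a cluster when the interface that IPaddr2 attaches the address to has a different name on each node? (original source)

In other words, you want to assign the virtual IP address to eth1 on node01 and to eth2 on node02, for example.

Network configuration of node01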

[root@node01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:39:91:00
 inet addr:192.168.28.121  Bcast:192.168.28.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:422 errors:0 dropped:0 overruns:0 frame:0
 TX packets:190 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:58908 (57.5 KiB)  TX bytes:20725 (20.2 KiB)

eth1      Link encap:Ethernet  HWaddr 00:16:3E:39:91:01
 inet addr:192.168.200.121  Bcast:192.168.200.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:0 errors:0 dropped:0 overruns:0 frame:0
 TX packets:24 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:0 (0.0 b)  TX bytes:2831 (2.7 KiB)

lo        Link encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 UP LOOPBACK RUNNING  MTU:16436  Metric:1
 RX packets:1334 errors:0 dropped:0 overruns:0 frame:0
 TX packets:1334 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:2147876 (2.0 MiB)  TX bytes:2147876 (2.0 MiB)

Network configuration of node02

[root@node02 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:39:92:00
 inet addr:192.168.28.122  Bcast:192.168.28.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:281 errors:0 dropped:0 overruns:0 frame:0
 TX packets:50 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:45680 (44.6 KiB)  TX bytes:7261 (7.0 KiB)

eth2      Link encap:Ethernet  HWaddr 00:16:3E:39:92:02
 inet addr:192.168.200.122  Bcast:192.168.200.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:0 errors:0 dropped:0 overruns:0 frame:0
 TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:0 (0.0 b)  TX bytes:2633 (2.5 KiB)

lo        Link encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 UP LOOPBACK RUNNING  MTU:16436  Metric:1
 RX packets:1473 errors:0 dropped:0 overruns:0 frame:0
 TX packets:1473 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:2318404 (2.2 MiB)  TX bytes:2318404 (2.2 MiB)

We will assign 192.168.200.200 as the virtual IP address. Now take a close look at the crm configuration file.

[root@node01 ~]# cat IPaddr2.crm
property \
    no-quorum-policy="ignore" \
    stonith-enabled="false" \
    startup-fencing="false" \
    crmd-transition-delay="2s"

rsc_defaults \
    resource-stickiness="INFINITY" \
    migration-threshold="1"

primitive p_ip ocf:heartbeat:IPaddr2 \
    params \
    ip="192.168.200.200" \
    cidr_netmask="24" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="block"

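What differs from the usual setup is that the nic parameter is not set. The ip parameter is required, but the nic parameter is not, so it can be left out. When nic is not set, the interface is selected automatically by consulting the routing table. Let's use the crm command to display how the IPaddr2 RA is configured.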

# crm ra info IPaddr2
Manages virtual IPv4 addresses (Linux specific version) (ocf:heartbeat:IPaddr2)

This Linux-specific resource manages IP alias IP addresses.
It can add an IP alias, or remove one.
In addition, it can implement Cluster Alias IP functionality
if invoked as a clone resource.

Parameters (* denotes required, [] the default):

ip* (string): IPv4 address
 The IPv4 address to be configured in dotted quad notation, for example
 "192.168.1.1".

nic (string, [eth0]): Network interface
 The base network interface on which the IP address will be brought
 online.

 If left empty, the script will try and determine this from the
 routing table.

 Do NOT specify an alias interface in the form eth0:1 or anything here;
 rather, specify the base interface only.

 Prerequisite:

 There must be at least one static IP address, which is not managed by
 the cluster, assigned to the network interface.

 If you can not assign any static IP address on the interface,
 modify this kernel parameter:
 sysctl -w net.ipv4.conf.all.promote_secondaries=1
 (or per device)

cidr_netmask (string): CIDR netmask
 The netmask for the interface in CIDR format
 (e.g., 24 and not 255.255.255.0)

 If unspecified, the script will also try to determine this from the
 routing table.

<omitted>
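
To get a feel for which interface would be picked when nic is left empty, you can check which interface already has an address in the same subnet as the virtual IP. The following is a rough sketch of that idea only. The RA's own lookup logic may well differ, and find_nic_for_ip is a hypothetical helper name of mine. It assumes the one-line-per-address output format of `ip -o -f inet addr show`.

```shell
#!/bin/sh
# Sketch only: find_nic_for_ip is a hypothetical helper, not part of the RA.
# Reads `ip -o -f inet addr show` style lines on stdin and prints each
# interface whose configured IPv4 subnet contains the address given as $1.
find_nic_for_ip() {
    awk -v target="$1" '
        # Convert a dotted-quad string into a single number.
        function ip2int(s,    p) {
            split(s, p, ".")
            return ((p[1] * 256 + p[2]) * 256 + p[3]) * 256 + p[4]
        }
        $3 == "inet" {
            split($4, cidr, "/")        # cidr[1] = address, cidr[2] = prefix length
            size = 2 ^ (32 - cidr[2])   # number of addresses in the subnet
            same = (int(ip2int(cidr[1]) / size) == int(ip2int(target) / size))
            if (same && !seen[$2]++)    # print each interface only once
                print $2
        }'
}

# On a cluster node (needs the iproute2 "ip" command):
if command -v ip >/dev/null 2>&1; then
    ip -o -f inet addr show | find_nic_for_ip 192.168.200.200
fi
```

With the network configurations shown earlier, this should name eth1 on node01 and eth2 on node02. If it prints nothing, no interface sits on the same segment as the virtual IP.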

Now let's load the crm configuration file into the cluster.

[root@node01 ~]# crm configure load update IPaddr2.crm

The virtual IP has started on node01.

[root@node02 ~]# crm_mon -1 -Af
============
Last updated: Thu Dec  1 16:55:21 2011
Stack: Heartbeat
Current DC: node02 (22222222-2222-2222-2222-222222222222) - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node01 node02 ]

p_ip    (ocf::heartbeat:IPaddr2):       Started node01

Node Attributes:
* Node node01:
 + node02-eth0                       : up
* Node node02:
 + node01-eth0                       : up

Migration summary:
* Node node01:
* Node node02:

The virtual IP address 192.168.200.200 has been assigned to eth1 on node01.

[root@node01 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:91:00 brd ff:ff:ff:ff:ff:ff
 inet 192.168.28.121/24 brd 192.168.28.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:91:01 brd ff:ff:ff:ff:ff:ff
 inet 192.168.200.121/24 brd 192.168.200.255 scope global eth1
 inet 192.168.200.200/24 brd 192.168.200.255 scope global secondary eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:91:02 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:91:03 brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:91:04 brd ff:ff:ff:ff:ff:ff

Now let's stop the cluster on node01.

[root@node01 ~]# service heartbeat stop

The virtual IP has failed over to node02.

[root@node02 ~]# crm_mon -1 -Af
============
Last updated: Thu Dec  1 16:56:16 2011
Stack: Heartbeat
Current DC: node02 (22222222-2222-2222-2222-222222222222) - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node02 ]
OFFLINE: [ node01 ]

 p_ip   (ocf::heartbeat:IPaddr2):       Started node02

Node Attributes:
* Node node02:
 + node01-eth0                       : up

Migration summary:
* Node node02:

The virtual IP address 192.168.200.200 has been assigned to eth2 on node02.

[root@node02 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:92:00 brd ff:ff:ff:ff:ff:ff
 inet 192.168.28.122/24 brd 192.168.28.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:92:01 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:92:02 brd ff:ff:ff:ff:ff:ff
 inet 192.168.200.122/24 brd 192.168.200.255 scope global eth2
 inet 192.168.200.200/24 brd 192.168.200.255 scope global secondary eth2
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:92:03 brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:92:04 brd ff:ff:ff:ff:ff:ff

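As in this example, if you want to use a different interface on each node, leaving the nic parameter unset gives the expected behavior. If you have no such requirement, however, I think it is better to set the nic parameter explicitly. For example, note that with nic unset, the start fails with "unknown error" if no interface exists on the same segment as the ip parameter.

Example error (/var/log/ha-log)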

res_IPaddr2_1_start_0 (node=minky0, call=93, rc=1, status=complete): unknown error
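
Lines like this are easy to miss in a busy log. As a small sketch, the helper below picks out entries of the shape shown above whose rc is non-zero. failed_ops is a hypothetical name of mine, and the pattern is based only on the single sample line here, so treat the exact log format as an assumption.

```shell
#!/bin/sh
# Sketch only: failed_ops is a hypothetical helper.
# Prints lines that contain "(node=..., call=N, rc=M," with a non-zero rc,
# which is the shape of the sample error line above. Reads the files given
# as arguments, or stdin when no file is given.
failed_ops() {
    grep -E '\(node=[^,]+, call=[0-9]+, rc=[1-9][0-9]*,' "$@"
}

# On a cluster node, scan the log file named in the text if it is readable.
if [ -r /var/log/ha-log ]; then
    failed_ops /var/log/ha-log
fi
```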

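Reference 1, Reference 2

Question (2): I want to assign a real IP address with IPaddr2.

For example, something like starting the cluster with no IP address configured on eth1 of node01 and node02, and then having the cluster assign an IP address to eth1?

To cut to the conclusion: apparently it sometimes does not work well.

An example where it did not work: "The failover itself was fine, but afterwards the behavior was unstable (in my environment the ssh connection dropped), and both nodes ended up claiming 'I am the master!'" (original source)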

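When the IPaddr2 RA starts, it runs an arp command to update ARP tables, but during a switchover the MAC address change and the ARP update can be out of step, so sessions such as ssh may be interrupted. This depends heavily on environmental factors such as the performance of the network switch, so even with a virtual IP you cannot rule out sessions being interrupted. Since the report above was about trying a real IP and having it not work, I checked the behavior in a similar environment here as well. (I have not been given the details of that environment, so I may be misunderstanding something.)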
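
One way to see whether a client has picked up the new MAC address is to look at its neighbour (ARP) table before and after the switchover. Here is a small sketch for that. neigh_mac is a hypothetical helper of mine, and it assumes the usual `ip neigh show` line format of "ADDRESS dev IFACE lladdr MAC STATE".

```shell
#!/bin/sh
# Sketch only: neigh_mac is a hypothetical helper.
# Reads `ip neigh show` style lines on stdin and prints the MAC address
# (the word following "lladdr") recorded for the IP address given as $1.
neigh_mac() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++)
                if ($i == "lladdr")
                    print $(i + 1)
        }'
}

# On a client machine (needs the iproute2 "ip" command). Run it once before
# and once after the failover and compare the two results.
if command -v ip >/dev/null 2>&1; then
    ip neigh show | neigh_mac 192.168.200.200
fi
```

If the client still shows the old node's MAC for a while after the failover, that lag is a plausible cause of the dropped sessions.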

Network configuration of node01

[root@node01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:39:91:00
 inet addr:192.168.28.121  Bcast:192.168.28.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:143 errors:0 dropped:0 overruns:0 frame:0
 TX packets:73 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:19092 (18.6 KiB)  TX bytes:9635 (9.4 KiB)

lo        Link encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 UP LOOPBACK RUNNING  MTU:16436  Metric:1
 RX packets:1259 errors:0 dropped:0 overruns:0 frame:0
 TX packets:1259 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:2053800 (1.9 MiB)  TX bytes:2053800 (1.9 MiB)

Network configuration of node02

[root@node02 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:39:92:00
 inet addr:192.168.28.122  Bcast:192.168.28.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:108 errors:0 dropped:0 overruns:0 frame:0
 TX packets:54 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:16270 (15.8 KiB)  TX bytes:7436 (7.2 KiB)

lo        Link encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 UP LOOPBACK RUNNING  MTU:16436  Metric:1
 RX packets:1358 errors:0 dropped:0 overruns:0 frame:0
 TX packets:1358 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:2265620 (2.1 MiB)  TX bytes:2265620 (2.1 MiB)

We set the IP address 192.168.200.200 on eth1.

[root@node01 ~]# cat IPaddr2.crm
property \
    no-quorum-policy="ignore" \
    stonith-enabled="false" \
    startup-fencing="false" \
    crmd-transition-delay="2s"

rsc_defaults \
    resource-stickiness="INFINITY" \
    migration-threshold="1"

primitive p_ip ocf:heartbeat:IPaddr2 \
    params \
    ip="192.168.200.200" \
    nic="eth1" \
    cidr_netmask="24" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="block"

Let's load the crm configuration file into the cluster.

[root@node01 ~]# crm configure load update IPaddr2.crm

The IP address 192.168.200.200 has been configured on node01.

[root@node02 ~]# crm_mon -1 -Af
============
Last updated: Thu Dec  1 17:16:45 2011
Stack: Heartbeat
Current DC: node02 (22222222-2222-2222-2222-222222222222) - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node01 node02 ]

 p_ip   (ocf::heartbeat:IPaddr2):       Started node01

Node Attributes:
* Node node01:
 + node02-eth0                       : up
* Node node02:
 + node01-eth0                       : up

Migration summary:
* Node node02:
* Node node01:

[root@node01 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:91:00 brd ff:ff:ff:ff:ff:ff
 inet 192.168.28.121/24 brd 192.168.28.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:91:01 brd ff:ff:ff:ff:ff:ff
 inet 192.168.200.200/24 brd 192.168.200.255 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:91:02 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:91:03 brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:91:04 brd ff:ff:ff:ff:ff:ff

[root@node01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:39:91:00
 inet addr:192.168.28.121  Bcast:192.168.28.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:851 errors:0 dropped:0 overruns:0 frame:0
 TX packets:431 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:179548 (175.3 KiB)  TX bytes:96557 (94.2 KiB)

eth1      Link encap:Ethernet  HWaddr 00:16:3E:39:91:01
 inet addr:192.168.200.200  Bcast:192.168.200.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:0 errors:0 dropped:0 overruns:0 frame:0
 TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:0 (0.0 b)  TX bytes:2528 (2.4 KiB)

lo        Link encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 UP LOOPBACK RUNNING  MTU:16436  Metric:1
 RX packets:1259 errors:0 dropped:0 overruns:0 frame:0
 TX packets:1259 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:2053800 (1.9 MiB)  TX bytes:2053800 (1.9 MiB)

From a node other than node01 and node02 (node03 here), run ping against 192.168.200.200.

[root@node03 ~]# ping 192.168.200.200
PING 192.168.200.200 (192.168.200.200) 56(84) bytes of data.
64 bytes from 192.168.200.200: icmp_seq=1 ttl=64 time=0.565 ms
64 bytes from 192.168.200.200: icmp_seq=2 ttl=64 time=0.159 ms
64 bytes from 192.168.200.200: icmp_seq=3 ttl=64 time=0.167 ms

Now stop the cluster on node01.

[root@node01 ~]# service heartbeat stop

The IP address 192.168.200.200 has failed over to node02.

[root@node02 ~]# crm_mon -1 -Af
============
Last updated: Thu Dec  1 17:18:42 2011
Stack: Heartbeat
Current DC: node02 (22222222-2222-2222-2222-222222222222) - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node02 ]
OFFLINE: [ node01 ]

 p_ip   (ocf::heartbeat:IPaddr2):       Started node02

Node Attributes:
* Node node02:
 + node01-eth0                       : up

Migration summary:
* Node node02:

[root@node02 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:92:00 brd ff:ff:ff:ff:ff:ff
 inet 192.168.28.122/24 brd 192.168.28.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
 link/ether 00:16:3e:39:92:01 brd ff:ff:ff:ff:ff:ff
 inet 192.168.200.200/24 brd 192.168.200.255 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:92:02 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:92:03 brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
 link/ether 00:16:3e:39:92:04 brd ff:ff:ff:ff:ff:ff

[root@node02 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:39:92:00
 inet addr:192.168.28.122  Bcast:192.168.28.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:1106 errors:0 dropped:0 overruns:0 frame:0
 TX packets:667 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:186902 (182.5 KiB)  TX bytes:233273 (227.8 KiB)

eth1      Link encap:Ethernet  HWaddr 00:16:3E:39:92:01
 inet addr:192.168.200.200  Bcast:192.168.200.255  Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:39 errors:0 dropped:0 overruns:0 frame:0
 TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:5349 (5.2 KiB)  TX bytes:3347 (3.2 KiB)

lo        Link encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 UP LOOPBACK RUNNING  MTU:16436  Metric:1
 RX packets:1422 errors:0 dropped:0 overruns:0 frame:0
 TX packets:1422 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:2268820 (2.1 MiB)  TX bytes:2268820 (2.1 MiB)

So, what happened to the ping...

[root@node03 ~]# ping 192.168.200.200

<omitted>
64 bytes from 192.168.200.200: icmp_seq=14 ttl=64 time=0.159 ms
64 bytes from 192.168.200.200: icmp_seq=15 ttl=64 time=0.169 ms
64 bytes from 192.168.200.200: icmp_seq=17 ttl=64 time=3.22 ms
64 bytes from 192.168.200.200: icmp_seq=18 ttl=64 time=0.226 ms
64 bytes from 192.168.200.200: icmp_seq=19 ttl=64 time=0.177 ms
<omitted>

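Sure enough, there is a momentary hiccup: icmp_seq=16 is missing and icmp_seq=17 took 3.22 ms. However, in the environment where I checked the behavior, I could not confirm any session actually being dropped.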
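
Spotting a gap like this by eye gets harder when the ping runs for a long time. As a minimal sketch, the helper below lists every icmp_seq number that is absent between the first and last reply it sees. missing_seqs is a hypothetical name of mine, and it relies only on the "icmp_seq=N" field visible in the ping output above.

```shell
#!/bin/sh
# Sketch only: missing_seqs is a hypothetical helper.
# Reads ping output on stdin, extracts the icmp_seq numbers in order, and
# prints each sequence number skipped between consecutive replies.
missing_seqs() {
    sed -n 's/.*icmp_seq=\([0-9][0-9]*\).*/\1/p' |
    awk 'NR > 1 { for (s = prev + 1; s < $1; s++) print s }
         { prev = $1 }'
}

# Demo using the reply lines quoted in the text; prints 16.
missing_seqs <<'EOF'
64 bytes from 192.168.200.200: icmp_seq=14 ttl=64 time=0.159 ms
64 bytes from 192.168.200.200: icmp_seq=15 ttl=64 time=0.169 ms
64 bytes from 192.168.200.200: icmp_seq=17 ttl=64 time=3.22 ms
64 bytes from 192.168.200.200: icmp_seq=18 ttl=64 time=0.226 ms
64 bytes from 192.168.200.200: icmp_seq=19 ttl=64 time=0.177 ms
EOF
```

In a real test you would save the ping output to a file during the failover and feed that file to the function afterwards.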

Incidentally, node01, node02, and node03 are virtual machines (Xen). No network switch is involved.

[root@ikedaj-81 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
[root@ikedaj-81 ~]# xm list
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0      967     2 r-----   9522.9
node01                                    13      256     1 -b----     26.3
node02                                    14      256     1 -b----     25.0
node03                                     3      256     1 -b----   2653.8

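Come to think of it, the fact that an "I'm the master" state was seen on both nodes probably means it was reproduced with some other failure pattern rather than by stopping the cluster on node01. I may be misunderstanding something.

I could not reproduce the symptom this time, but when you verify cluster behavior, don't stop at looking at the crm_mon display and saying "the resource failed over, so we're good!". Test from a variety of angles.

That's all for this month. Poof! εεεεεヾ(*´ー`)ﾉ

As for ipmi, I'm the one who introduced that bug. Sorry about that!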