Service takeover problem after restarting heartbeat

grantlee1988 2013-07-18 06:00:58

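Two machines, avengers (192.168.12.101) and titanic (192.168.12.102), are set up as an HA pair with the virtual IP 192.168.12.100. The OS is Ubuntu 12.04 and the heartbeat version is 3.0.2.
When I run /etc/init.d/heartbeat restart on the primary, the services taken over on the standby get started twice. Can anyone explain why?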

haresources contains: avengers 192.168.12.100 sh1 sh2 (sh1 and sh2 here are my own custom scripts)
ha.cf is configured as follows (I have left out the commented-out settings):
#debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 3
warntime 5
initdead 120
udpport 694
ucast eth0 192.168.12.102
auto_failback off
node A
node B
respawn hacluster /usr/lib/heartbeat/ipfail
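
To make the timing directives above easier to reason about, here is a minimal sketch that parses the ha.cf text from this post and compares keepalive, warntime and deadtime. The helper names `parse_ha_cf` and `timing_warnings` are hypothetical, the values are assumed to be plain seconds as they are written here, and the rule it checks (a warning threshold at or above deadtime can never trigger before the node is declared dead) is my reading of what these directives mean, not something heartbeat itself reports.

```python
# The ha.cf from the post, copied verbatim.
HA_CF = """\
#debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 3
warntime 5
initdead 120
udpport 694
ucast eth0 192.168.12.102
auto_failback off
node A
node B
respawn hacluster /usr/lib/heartbeat/ipfail
"""


def parse_ha_cf(text):
    """Return {directive: [value, ...]}, skipping blank lines and lines starting with '#'."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        directives.setdefault(name, []).append(value.strip())
    return directives


def timing_warnings(directives):
    """Compare keepalive/warntime/deadtime, assuming each is a plain number of seconds."""
    def seconds(name):
        values = directives.get(name)
        return float(values[-1]) if values else None

    keepalive = seconds("keepalive")
    warntime = seconds("warntime")
    deadtime = seconds("deadtime")
    problems = []
    if keepalive is not None and deadtime is not None and keepalive >= deadtime:
        problems.append("keepalive (%g) is not below deadtime (%g)" % (keepalive, deadtime))
    if warntime is not None and deadtime is not None and warntime >= deadtime:
        problems.append("warntime (%g) is not below deadtime (%g)" % (warntime, deadtime))
    return problems


if __name__ == "__main__":
    for problem in timing_warnings(parse_ha_cf(HA_CF)):
        print(problem)
```

With the values in this post the sketch flags warntime 5 against deadtime 3. That is probably unrelated to the double start, but it may be worth a look while the file is open.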

Below is the standby's log after heartbeat was restarted on the primary:
Jul 13 00:10:00 titanic heartbeat: [7055]: info: Clock jumped backwards. Compensating.
Jul 13 00:39:49 titanic heartbeat: [7055]: info: Received shutdown notice from 'avengers'.
Jul 13 00:39:49 titanic heartbeat: [7055]: info: Resources being acquired from avengers.
Jul 13 00:39:49 titanic heartbeat: [13158]: info: acquire all HA resources (standby).
Jul 13 00:39:49 titanic heartbeat: [13159]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys titanic] to acquire.
ResourceManager[13185]: 2013/07/13_00:39:49 info: Acquiring resource group: avengers 192.168.12.100 sms_monitor memcached_list
IPaddr[13212]: 2013/07/13_00:39:49 INFO: Resource is stopped
ResourceManager[13185]: 2013/07/13_00:39:49 info: Running /etc/ha.d/resource.d/IPaddr 192.168.12.100 start
IPaddr[13270]: 2013/07/13_00:39:49 INFO: Using calculated nic for 192.168.12.100: eth0
IPaddr[13270]: 2013/07/13_00:39:49 INFO: Using calculated netmask for 192.168.12.100: 255.255.255.0
IPaddr[13270]: 2013/07/13_00:39:49 INFO: eval ifconfig eth0:0 192.168.12.100 netmask 255.255.255.0 broadcast 192.168.12.255
IPaddr[13258]: 2013/07/13_00:39:49 INFO: Success
ResourceManager[13185]: 2013/07/13_00:39:49 info: Running /etc/init.d/sms_monitor start
ResourceManager[13185]: 2013/07/13_00:39:51 info: Running /etc/init.d/memcached_list start
Jul 13 00:39:51 titanic heartbeat: [13158]: info: all HA resource acquisition completed (standby).
Jul 13 00:39:51 titanic heartbeat: [7055]: info: Standby resource acquisition done [all].
harc[13506]: 2013/07/13_00:39:51 info: Running /etc/ha.d//rc.d/status status
mach_down[13521]: 2013/07/13_00:39:51 info: Taking over resource group 192.168.12.100
ResourceManager[13546]: 2013/07/13_00:39:51 info: Acquiring resource group: avengers 192.168.12.100 sms_monitor memcached_list
IPaddr[13573]: 2013/07/13_00:39:51 INFO: Running OK
ResourceManager[13546]: 2013/07/13_00:39:51 info: Running /etc/init.d/sms_monitor start
ResourceManager[13546]: 2013/07/13_00:39:53 info: Running /etc/init.d/memcached_list start
mach_down[13521]: 2013/07/13_00:39:53 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[13521]: 2013/07/13_00:39:53 info: mach_down takeover complete for node avengers.
Jul 13 00:39:53 titanic heartbeat: [7055]: info: mach_down takeover complete.
Jul 13 00:39:55 titanic heartbeat: [7055]: WARN: node avengers: is dead
Jul 13 00:39:55 titanic heartbeat: [7055]: info: Dead node avengers gave up resources.
Jul 13 00:39:55 titanic ipfail: [7078]: info: Status update: Node avengers now has status dead
Jul 13 00:39:55 titanic heartbeat: [7055]: info: Link avengers:eth0 dead.
Jul 13 00:39:55 titanic ipfail: [7078]: info: NS: We are dead. :<
Jul 13 00:39:55 titanic ipfail: [7078]: info: Link Status update: Link avengers/eth0 now has status dead
Jul 13 00:39:56 titanic ipfail: [7078]: info: We are dead. :<
Jul 13 00:39:56 titanic ipfail: [7078]: info: Asking other side for ping node count.
Jul 13 00:40:05 titanic heartbeat: [7055]: info: Heartbeat restart on node avengers
Jul 13 00:40:05 titanic heartbeat: [7055]: info: Link avengers:eth0 up.
Jul 13 00:40:05 titanic heartbeat: [7055]: info: Status update for node avengers: status init
Jul 13 00:40:05 titanic ipfail: [7078]: info: Link Status update: Link avengers/eth0 now has status up
Jul 13 00:40:05 titanic ipfail: [7078]: info: Status update: Node avengers now has status init
Jul 13 00:40:05 titanic heartbeat: [7055]: info: Status update for node avengers: status up
Jul 13 00:40:05 titanic ipfail: [7078]: info: Status update: Node avengers now has status up
harc[14295]: 2013/07/13_00:40:05 info: Running /etc/ha.d//rc.d/status status
harc[14310]: 2013/07/13_00:40:05 info: Running /etc/ha.d//rc.d/status status
Jul 13 00:40:07 titanic heartbeat: [7055]: info: Status update for node avengers: status active
Jul 13 00:40:07 titanic ipfail: [7078]: info: Status update: Node avengers now has status active
harc[14352]: 2013/07/13_00:40:07 info: Running /etc/ha.d//rc.d/status status
Jul 13 00:40:07 titanic ipfail: [7078]: info: Asking other side for ping node count.
Jul 13 00:40:07 titanic heartbeat: [7055]: info: remote resource transition completed.
Jul 13 00:40:12 titanic ipfail: [7078]: info: No giveup timer to abort.
Jul 13 02:10:00 titanic heartbeat: [7055]: info: Clock jumped backwards. Compensating.
Jul 13 04:10:01 titanic heartbeat: [7055]: info: Clock jumped backwards. Compensating.
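
The double start can be made explicit by grouping the "Running ... start" lines in the log above by ResourceManager PID. The sketch below does that on lines copied from the log; `starts_by_pid` and `started_more_than_once` are hypothetical helper names, and the regular expression only assumes the line layout visible in this post.

```python
import re
from collections import Counter, defaultdict

# The ResourceManager "start" lines from the log above, copied verbatim.
LOG = """\
ResourceManager[13185]: 2013/07/13_00:39:49 info: Running /etc/ha.d/resource.d/IPaddr 192.168.12.100 start
ResourceManager[13185]: 2013/07/13_00:39:49 info: Running /etc/init.d/sms_monitor start
ResourceManager[13185]: 2013/07/13_00:39:51 info: Running /etc/init.d/memcached_list start
ResourceManager[13546]: 2013/07/13_00:39:51 info: Running /etc/init.d/sms_monitor start
ResourceManager[13546]: 2013/07/13_00:39:53 info: Running /etc/init.d/memcached_list start
"""

# PID in brackets, a timestamp token, then "Running <script> [args...] start".
START_RE = re.compile(
    r"^ResourceManager\[(\d+)\]: \S+ info: Running (\S+)(?: \S+)* start$"
)


def starts_by_pid(log_text):
    """Map each ResourceManager PID to the scripts it ran with 'start', in log order."""
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = START_RE.match(line.strip())
        if match:
            pid, script = match.groups()
            result[pid].append(script)
    return dict(result)


def started_more_than_once(log_text):
    """Return the scripts that were started more than once across all passes, sorted."""
    counts = Counter(
        script for scripts in starts_by_pid(log_text).values() for script in scripts
    )
    return sorted(script for script, n in counts.items() if n > 1)


if __name__ == "__main__":
    for pid, scripts in starts_by_pid(LOG).items():
        print(pid, scripts)
    print("started twice:", started_more_than_once(LOG))
```

Read this way, the log shows two separate acquisition passes: PID 13185 right after the shutdown notice ("acquire all HA resources (standby)"), and PID 13546 launched by mach_down two seconds later. In the second pass IPaddr reported "Running OK" and was not started again, while both init scripts were. That suggests the difference lies in how each resource answers a status query, but the post does not show the custom scripts, so this remains a guess.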
