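Thanks for your answer.
I also think split-brain refers specifically to a failure of the network heartbeat.
Is there a specific name for a failure of the disk heartbeat?
A while ago, one of our 10.2.0.5 RAC production environments had unstable access to the votedisk, timing out for 200 seconds, and the instances on both nodes were terminated. Here is an excerpt from the log: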
Wed Mar 16 13:54:03 CST 2011
LMON (ospid: 8471) is not heartbeating for 202 seconds.
LMON is not healthy and has no heartbeat.
Please check LMD0/LMS0 and DIAG trace files for detail.
Wed Mar 16 13:54:09 CST 2011
LMS0 (ospid: 8479) is terminating the instance.
LMS0: terminating instance due to error 484
Wed Mar 16 13:54:21 CST 2011
Termination issued to instance processes. Waiting for the processes to exit
Wed Mar 16 13:54:27 CST 2011
Instance termination failed to kill one or more processes
Instance terminated by LMS0, pid = 8479
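The instances were terminated, but the nodes were not rebooted. I looked into it afterwards and concluded that the cause was unstable access resulting from an incompatibility between the FC HBA card and the storage. However, the other server has no compatibility problem, and yet after the first node was terminated, the second node was terminated shortly afterwards as well!
Could it be, as moderator chensq said, that there are a lot of bugs?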
Originally posted by 尛样儿 on 2011-4-5 21:23
Node 1 was probably the master node. It evicted node 2, and then went down itself.