System configuration: Sun Cluster 3.1 + Oracle 9. Recently I have noticed that Oracle is frequently being restarted automatically by Sun Cluster:
Jul 13 04:01:41 cqdb SC[SUNWscor.oracle_server.monitor]:cqc-rg:ora_server: [ID 564643 local7.error] Fault monitor detected error DBMS_ERROR: 3114 DEFAULT Action=NONE : Not connected?
Jul 13 04:01:44 cqdb SC[SUNWscor.oracle_server.monitor]:cqc-rg:ora_server: [ID 564643 local7.error] Fault monitor detected error DBMS_ERROR: 1034 DEFAULT Action=RESTART : Oracle is not available
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource ora_server status on node cqdb change to R_FM_FAULTED
Jul 13 04:01:44 cqdb SC[SUNWscor.oracle_server.monitor]:cqc-rg:ora_server: [ID 995339 local7.error] Restarting using scha_control RESTART
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group cqc-rg state on node cqdb change to RG_PENDING_OFFLINE
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource ora_server state on node cqdb change to R_MON_STOPPING
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource ora_lnsr-res state on node cqdb change to R_MON_STOPPING
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource usr5res state on node cqdb change to R_MON_STOPPING
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource dg-res state on node cqdb change to R_MON_STOPPING
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource cqc state on node cqdb change to R_MON_STOPPING
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching methodfor resource , resource group , timeoutseconds
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching methodfor resource , resource group , timeoutseconds
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching methodfor resource , resource group , timeoutseconds
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method for resource , resource group , timeoutseconds
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method for resource , resource group , timeoutseconds
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 736390 daemon.notice] methodcompleted successfully for resource , resource group , time used: 0% of timeout
Jul 13 04:01:44 cqdb Cluster.RGM.rgmd: [ID 736390 daemon.notice] methodcompleted successfully for resource , resource group , time used: 0% of timeout
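After an initial check, it looks like the cluster could not determine Oracle's status, so it shut Oracle down and restarted it. But what would cause this?
I have now set Probe_timeout to 240.
One question: how does the cluster actually monitor Oracle's status?
At this point I can't tell whether this is a cluster problem or an Oracle problem.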
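To see which errors the fault monitor is reacting to, and which of them actually trigger a restart, it may help to tally the "Fault monitor detected error" lines from the system log. Below is a minimal Python sketch. The names `FAULT_RE`, `summarize_faults`, and `sample` are made up for illustration, and the pattern is based only on the two fault monitor messages shown in the log above, so other message layouts may not match.

```python
import re
from collections import Counter

# Pattern derived only from the two fault monitor messages quoted above
# (an assumption; other Sun Cluster messages may be laid out differently).
FAULT_RE = re.compile(
    r"Fault monitor detected error (?P<kind>\w+): (?P<code>\d+) \w+ "
    r"Action=(?P<action>\w+) : (?P<message>.*)"
)


def summarize_faults(log_text):
    """Count fault monitor events by (error code, action, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            key = (
                match.group("code"),
                match.group("action"),
                match.group("message").strip(),
            )
            counts[key] += 1
    return counts


# Excerpt of the two fault monitor messages from the log above.
sample = """\
[ID 564643 local7.error] Fault monitor detected error DBMS_ERROR: 3114 DEFAULT Action=NONE : Not connected?
[ID 564643 local7.error] Fault monitor detected error DBMS_ERROR: 1034 DEFAULT Action=RESTART : Oracle is not available
[ID 995339 local7.error] Restarting using scha_control RESTART
"""

if __name__ == "__main__":
    for (code, action, message), n in sorted(summarize_faults(sample).items()):
        print(f"error {code}: action={action} count={n} ({message})")
```

In the excerpt above, only error 1034 carries Action=RESTART, while 3114 is logged with Action=NONE. Running the same tally over a longer stretch of the log might show whether the restarts always follow that same pair of errors.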