Found some material that looks similar to our case; still looking into it.
http://www.itpub.net/showthread. ... 928&pagenumber=
Reason 2: An instance death was detected. This can happen if:
a) An instance fails to issue a heartbeat to the control file.
When the heartbeat is missing, LMON will issue a network ping to the instance
not issuing the heartbeat. As long as the instance responds to the ping,
LMON will consider the instance alive. If, however, the heartbeat is not
issued for the length of time of the control file enqueue timeout, the
instance is considered to be problematic and will be evicted.
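The heartbeat check described above can be sketched as a small decision function. This is only an illustration of the rule as the note words it, not Oracle's implementation: the names `InstanceStatus` and `classify_instance`, the "suspect" state, and the numeric values in the usage line are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class InstanceStatus:
    seconds_since_heartbeat: float  # time since the last control file heartbeat
    responds_to_ping: bool          # result of the network ping to that instance


def classify_instance(status, heartbeat_interval, enqueue_timeout):
    """Return 'alive', 'suspect', or 'evict' for one instance.

    heartbeat_interval and enqueue_timeout are hypothetical parameters
    (in seconds); the note does not give their actual values.
    """
    # Heartbeat arrived recently enough: nothing to do.
    if status.seconds_since_heartbeat <= heartbeat_interval:
        return "alive"
    # Heartbeat missing for the whole control file enqueue timeout: evict.
    if status.seconds_since_heartbeat >= enqueue_timeout:
        return "evict"
    # Heartbeat missing, but the instance still answers the ping.
    if status.responds_to_ping:
        return "alive"
    # Assumption: the note does not say what happens when the ping fails
    # before the timeout, so this case is only flagged here.
    return "suspect"


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(classify_instance(InstanceStatus(60.0, True), 3.0, 900.0))
```

The point of the sketch is that a successful ping only buys time: once the heartbeat has been missing for the full timeout, the instance is evicted whether or not it answers the ping.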
Common causes for an ORA-29740 eviction (Reason 2):
a) NTP (Time changes on cluster) - usually on Linux, Tru64, or IBM AIX
b) Network Problems (SAN).
c) Resource Starvation (CPU, I/O, etc.)
d) An Oracle bug.
Common bugs for reason 2 evictions:
If you feel that this eviction was not correct, do a search in Metalink or the
bug database for:
ORA-29740 'reason 2'
Important files to review are:
a) Each instance's alert log
b) Each instance's LMON trace file
c) Statspack reports from all nodes leading up to the eviction
d) The CKPT process trace file of the evicted instance
e) Other bdump or udump files...
f) Each node's syslog or messages file
g) iostat output before, after, and during evictions
h) vmstat output before, after, and during evictions
i) netstat output before, after, and during evictions
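When collecting the files listed above, a quick first pass is to search each one for the error code. A minimal sketch, assuming the files are plain text; the helper names `scan_lines` and `scan_files` are made up for this example, and the paths in the usage comment are placeholders, not real Oracle locations.

```python
from pathlib import Path


def scan_lines(name, lines, needle="ORA-29740"):
    """Return (name, line_number, text) for every line containing needle."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if needle in line:
            hits.append((name, lineno, line.rstrip("\n")))
    return hits


def scan_files(paths, needle="ORA-29740"):
    """Run scan_lines over each existing file in paths."""
    hits = []
    for path in paths:
        p = Path(path)
        if not p.is_file():
            continue  # skip files that were not collected
        # errors="replace" keeps the scan going on odd bytes in trace files
        with p.open(errors="replace") as fh:
            hits.extend(scan_lines(str(p), fh, needle))
    return hits


if __name__ == "__main__":
    # Placeholder file names; substitute the real alert log and trace files.
    for name, lineno, text in scan_files(["alert_node1.log", "lmon_node1.trc"]):
        print(f"{name}:{lineno}: {text}")
```

The same helper can be pointed at other strings from the note, for example the "Reporting Communication error with instance:" message, by passing a different `needle`.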
-----------------------------------------------------------------------------
Reason 3: Communications Failure. This can happen if:
a) The LMON processes lose communication between one another.
b) One instance loses communications with the LMD process of another
instance.
c) An LMON process is blocked, spinning, or stuck and is not
responding to the LMON process of the other instance(s).
d) An LMD process is blocked or spinning.
In this case the ORA-29740 error is recorded when there are communication
issues between the instances. It is an indication that an instance has been
evicted from the configuration as a result of an IPC send timeout. A
communications failure between a foreground process, or a background process
other than LMON, and a remote LMD will also generate an ORA-29740 with reason 3. When this
occurs, the trace file of the process experiencing the error will print a
message:
Reporting Communication error with instance:
If communication is lost at the cluster layer (for example, network cables
are pulled), the cluster software may also perform node evictions in the
event of a cluster split-brain. Oracle will detect a possible split-brain
and wait for cluster software to resolve the split-brain. If cluster
software does not resolve the split-brain within a specified interval,
Oracle proceeds with evictions.
Oracle Support has seen cases where resource starvation (CPU, I/O, etc.) can
cause an instance to be evicted with this reason code. The LMON or LMD process
could be blocked waiting for resources and not respond to polling by the remote
instance(s). This could cause that instance to be evicted. If you have
a statspack report available from the time just prior to the eviction on the
evicted instance, check for poor I/O times and high CPU utilization. Poor I/O
times would be an average read time of more than 20 ms.
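The 20 ms rule of thumb is easy to apply once the total read wait time and the number of reads are known. A minimal sketch: the function names and the input numbers are hypothetical, and where the two figures come from (statspack, iostat, or elsewhere) is left open.

```python
POOR_READ_MS = 20.0  # threshold quoted in the note above


def average_read_ms(total_read_wait_ms, read_count):
    """Average time per read in milliseconds; 0.0 when there were no reads."""
    if read_count == 0:
        return 0.0
    return total_read_wait_ms / read_count


def is_poor_io(total_read_wait_ms, read_count, threshold_ms=POOR_READ_MS):
    """True when the average read time is strictly above the threshold."""
    return average_read_ms(total_read_wait_ms, read_count) > threshold_ms


if __name__ == "__main__":
    # Made-up figures: 2000 reads that waited 50000 ms in total.
    print(average_read_ms(50000, 2000), is_poor_io(50000, 2000))
```

With these made-up figures the average is 25 ms per read, which is above the threshold and would count as poor I/O.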
Common causes for an ORA-29740 eviction (Reason 3):
a) Network Problems.
b) Resource Starvation (CPU, I/O, etc.)
c) Severe Contention in Database.
d) An Oracle bug.