I'm not sure whether your cmcfg.ora and ocmargs.ora are configured correctly. Here is a document you can use as a reference:
Subject:RAC Linux 9.2: Configuration of cmcfg.ora and ocmargs.ora
Doc ID:Note:222746.1 Type:REFERENCE
Last Revision Date:26-AUG-2004 Status:PUBLISHED
PURPOSE
-------
This article will document the parameters in the Oracle Cluster
Manager (oracm) configuration files cmcfg.ora and ocmargs.ora.
SCOPE & APPLICATION
-------------------
This article is intended for Linux System Administrators and DBAs
who are configuring Real Application Clusters on Linux.
To better understand Oracle9i Release 2 (9.2.0.2 and above) changes and new
features for the Oracle Cluster Manager (oracm), a brief description of the
architecture in Oracle9i Release 2 (9.2.0.1) is necessary. Oracle release
9.0.1 included these main components:
  o NM service (only in 9.0.1)
  o CM service (in 9.2.0.1, the NM and CM services were merged into
    a single service, called oracm)
  o Watchdog Daemon (watchdogd)
NOTE: There is a software implementation of Watchdog in the Linux kernel,
called softdog, which when in use in conjunction with the Watchdog
Daemon causes a hardware reset of the node if the Watchdog Daemon
does not send a notification (ping) to the softdog within a
specified amount of time (soft_margin).
The Watchdog daemon is an Oracle-supplied process which monitors
oracm and sends pings to softdog through the Watchdog device, /dev/watchdog,
at defined intervals. It also monitors each oracm thread that has registered
with watchdogd by receiving ping messages from that thread.
The Watchdog Daemon detects the following cases:
  o CM thread hang/death
  o User mode delay (scheduler / VM problem)
  o Kernel hang
In certain cases, systems with high loads resulted in unnecessary reboots.
As such, in Oracle9i Release 2 (9.2.0.2 and above) the Watchdog is detached from
the Cluster Manager.
New Watchdog Implementation in 9.2.0.2 and above
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In place of the Watchdog Daemon (watchdogd), the 9.2.0.2 and above version of
the oracm for Linux now includes the use of a Linux kernel module called
hangcheck-timer. The hangcheck-timer module monitors the Linux kernel
for long operating system hangs, and reboots the node if this occurs,
thereby preventing the database from potential corruptions. This is the
new I/O fencing mechanism for RAC on Linux.
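The detection rule can be illustrated with a small model. This is a simplified Python sketch, not the kernel module's code: the function name `hang_detected` and its argument names are assumptions made for illustration. It assumes the module's timer is expected to run every hangcheck_tick seconds and that a run arriving more than hangcheck_tick + hangcheck_margin seconds after the previous one is treated as a hang.

```python
def hang_detected(seconds_since_last_run: float,
                  hangcheck_tick: int = 30,
                  hangcheck_margin: int = 180) -> bool:
    """Return True if the timer ran later than tick + margin allows.

    Simplified model only; the real check happens inside the kernel.
    The defaults are the values used in this note's insmod example.
    """
    return seconds_since_last_run > hangcheck_tick + hangcheck_margin


# A timer that ran roughly on schedule is not a hang.
print(hang_detected(31))
# A 240-second gap exceeds 30 + 180 = 210 seconds, so the node would reset.
print(hang_detected(240))
```

In this model a reset decision depends only on how late the kernel timer ran, which matches the note's point that the check is independent of oracm itself.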
This approach offers three advantages over the Watchdog implementation,
including:
  o Node resets are triggered from within the Linux kernel, making them
    much less affected by the system load.
  o The oracm service on a node can easily be stopped and reconfigured,
    as its operation is completely independent of the kernel module.
  o The features provided by the hangcheck-timer module closely resemble
    features found in the implementation of the Cluster Manager for RAC
    on the Windows platform.
The removal of the watchdogd means that the following parameters included
in the cmcfg.ora file are no longer valid:
  o WatchdogTimerMargin
  o WatchdogSafetyMargin
Please remove these watchdog parameters from your cmcfg.ora file.
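One way to do this cleanup is sketched below in Python. It is a minimal example, assuming cmcfg.ora contains one Name=Value pair per line; the function name `strip_watchdog_params` and the watchdog values in the sample text are hypothetical and serve only to exercise the code.

```python
OBSOLETE_PARAMS = ("WatchdogTimerMargin", "WatchdogSafetyMargin")


def strip_watchdog_params(text: str) -> str:
    """Return the cmcfg.ora text without the obsolete watchdog lines."""
    kept = []
    for line in text.splitlines():
        # The parameter name is everything before the first '='.
        name = line.split("=", 1)[0].strip()
        if name in OBSOLETE_PARAMS:
            continue
        kept.append(line)
    return "\n".join(kept)


# Hypothetical pre-9.2.0.2 fragment; the watchdog values are made up.
old_cfg = """HeartBeat=15000
WatchdogTimerMargin=1000
WatchdogSafetyMargin=5000
ServicePort=9998"""

print(strip_watchdog_params(old_cfg))
```

To apply it to a real file, read the file contents, pass them through the function, and write the result back after keeping a backup copy.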
Configuration of $ORACLE_HOME/oracm/admin/cmcfg.ora:
----------------------------------------------------
Parameters for cmcfg.ora, from a working Red Hat Advanced Server 2.1
installation:
ClusterName=Oracle Cluster Manager, version 9i
KernelModuleName=hangcheck-timer
HeartBeat=15000
PollInterval=1000
MissCount=215
PrivateNodeNames=int-opcbr1 int-opcbr2
PublicNodeNames=opcbr1 opcbr2
ServicePort=9998
CmDiskFile=/u03/RAC/quorum.dbf
HostName=int-opcbr1
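For scripted checks, a file in this form can be read into a dictionary. The following Python sketch assumes one Name=Value pair per line and splits on the first "=" only; the function name `parse_cmcfg` is an assumption, not part of any Oracle tool.

```python
def parse_cmcfg(text: str) -> dict:
    """Parse Name=Value lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


SAMPLE = """ClusterName=Oracle Cluster Manager, version 9i
KernelModuleName=hangcheck-timer
HeartBeat=15000
PollInterval=1000
MissCount=215
PrivateNodeNames=int-opcbr1 int-opcbr2
PublicNodeNames=opcbr1 opcbr2
ServicePort=9998
CmDiskFile=/u03/RAC/quorum.dbf
HostName=int-opcbr1"""

cfg = parse_cmcfg(SAMPLE)
print(cfg["MissCount"])
# Node lists are space separated.
print(cfg["PrivateNodeNames"].split())
```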
Detailed explanation of each parameter:
---------------------------------------
ClusterName:
This is a fixed parameter, and needs to remain at the above, default value.
KernelModuleName:
This is a new parameter for 9.2.0.2, and is required in order to use the
new Oracle-supplied hangcheck-timer module.
HeartBeat:
Leave at default setting of HeartBeat=15000.
PollInterval:
Leave at default, PollInterval=1000.
MissCount:
The MissCount must be set to a large value (at least 60) and must be greater
than the sum of hangcheck_tick + hangcheck_margin. We recommend 215 seconds.
The hangcheck_tick and hangcheck_margin parameters are set when you load
the hangcheck-timer module, for example:
/sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Note that this value may need to be lowered for Transparent Application
Failover (TAF) environments.
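With the values shown above, hangcheck_tick + hangcheck_margin = 30 + 180 = 210, so MissCount=215 satisfies both conditions. The rule can be written as a small check; this Python sketch is illustrative only, and the function name `check_miss_count` is an assumption.

```python
def check_miss_count(miss_count: int,
                     hangcheck_tick: int,
                     hangcheck_margin: int) -> list:
    """Return a list of problems; an empty list means the rule holds."""
    problems = []
    if miss_count < 60:
        problems.append("MissCount is below 60")
    limit = hangcheck_tick + hangcheck_margin
    if miss_count <= limit:
        problems.append(
            f"MissCount {miss_count} is not greater than "
            f"hangcheck_tick + hangcheck_margin = {limit}"
        )
    return problems


# Values from this note: 215 > 30 + 180, so no problems are reported.
print(check_miss_count(215, 30, 180))
# A MissCount of 200 would violate the second condition.
print(check_miss_count(200, 30, 180))
```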
PrivateNodeNames:
These are the names from /etc/hosts for the private network used for RAC traffic.
PublicNodeNames:
These are the names returned by the hostname() system call. This list
is used by the Oracle Universal Installer (OUI) to determine what
available cluster members are present at software installation time.
ServicePort:
Default UDP port that the oracm service will open at startup for the
CM communications with other cluster nodes.
HostName:
The hostname (interface) where you want oracm to open its UDP port for
cluster communications.
CmDiskFile:
This mandatory parameter points to a raw device or cluster file system
(OCFS) file that will be used by all cluster nodes.
-- End of cmcfg.ora parameters --
Configuration of ocmargs.ora
----------------------------
Parameters for ocmargs.ora, from a working Red Hat Advanced Server 2.1
installation:
oracm
norestart 1800
Detailed explanation of each parameter:
---------------------------------------
oracm:
The name of the binary executable used to launch the Oracle Cluster
Manager daemon.
norestart:
The value used by the ocmstart.sh startup script to prevent too frequent
oracm restarts.
-- End of ocmargs.ora parameters --
NOTE: When adding a node to an existing cluster:
------------------------------------------------
You may want to make a note of the values contained within these two files
if they are different from the default values. When adding a node to an
existing cluster with OUI, the default values may get reinserted. You will
want to verify that the correct values are present (or reinsert them) after
the addition is completed.
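One way to make this verification is to record the parameters before the node addition and compare them afterwards. The Python sketch below compares two dictionaries of parameter values; the function name `changed_params` is an assumption, and the "after" values are hypothetical, chosen only to show a difference.

```python
def changed_params(before: dict, after: dict) -> dict:
    """Map each differing parameter to its (before, after) values.

    A parameter missing on one side is reported with None on that side.
    """
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Values saved before running OUI (taken from this note).
saved = {"MissCount": "215", "HeartBeat": "15000", "ServicePort": "9998"}
# Hypothetical values found after the node addition.
current = {"MissCount": "20", "HeartBeat": "15000", "ServicePort": "9998"}

for name, (old, new) in changed_params(saved, current).items():
    print(f"{name}: was {old}, now {new}")
```

Any parameter reported here would need to be set back to the saved value on each node.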
RELATED DOCUMENTS
-----------------
For more information on how to install and configure the Oracle
Cluster Manager, please read the Oracle Linux 9.2.0.2 README/Patchset