Veritas Cluster Server on Solaris System Administration Guide
by Brendan Choi
---------------------------------------------
VCS DEFAULT VALUES

Private Network Heartbeat frequency = 0.5 sec
   Can be modified in /etc/llttab in 1/100th secs.
Low-Pri Network Heartbeat frequency = 1.0 sec
   Can be modified in /etc/llttab in 1/100th secs.
Failover interval after reboot command (VCS 1.3 and up) = 60 sec
   Can be modified with the hasys ShutdownTimeout attribute.
Resource monitoring interval (by Resource Type) = 60 sec
Monitoring an offline Resource (by Resource Type) = 300 sec
LLT dead system declaration = 21 sec (16 sec peer inactive + 5 sec GAB stable timeout value)
   Peer inactive can be changed using "set-timer" in /etc/llttab in 1/100th secs.
   Stable timeout value can be changed using "gabconfig -t".
GAB-HAd heartbeat = 15 sec (set by the VCS_GAB_TIMEOUT environment variable in milliseconds; requires a restart of had)
Time GAB allows HAd to be killed before panic (IOFENCE) = 15 sec (set by gabconfig -f)
Max. Number of Network Heartbeats = 8
Max. Number of Disk Heartbeats = 4
VCS had engine port = 14141
VCS 2.0 Web Server port = 8181
LLT SAP value = 0xcafe
-------------------------------------------------------
GABCONFIG SETTINGS OPTIONS

Running "gabconfig -l" on a node will show the current GAB settings for that node. These values can be changed with the gabconfig command.

EXAMPLE:

draconis # gabconfig -l
GAB Driver Configuration
Driver state         : Configured
Partition arbitration: Disabled
Control port seed    : Enabled
Halt on process death: Disabled
Missed heartbeat halt: Disabled
Halt on rejoin       : Disabled
Keep on killing      : Disabled
Restart              : Enabled
Node count           : 2
Disk HB interval (ms): 1000
Disk HB miss count   : 4
IOFENCE timeout (ms) : 15000
Stable timeout (ms)  : 5000

Here is the gabconfig option that corresponds to each setting:

Driver state             -c
Partition arbitration    -s
Control port seed        -n2 or -x
Halt on process death    -p
Missed heartbeat halt    -b
Halt on rejoin           -j
Keep on killing          -k
IOFENCE timeout (ms)     -f
Stable timeout (ms)      -t

To test "Halt on process death", do kill -9 on the had/hashadow PID.
To test "Missed heartbeat halt", do kill -23 on the had PID.
In both tests, GAB will panic your system.
-------------------------------------------------------
VCS PACKAGES SOLARIS

Here are the Solaris packages for VCS Version 1.1.1:

optional VRTScsga  VERITAS Cluster Server Graphical Administrator
optional VRTSgab   VERITAS Group Membership and Atomic Broadcast
optional VRTSllt   VERITAS Low Latency Transport
optional VRTSperl  VERITAS Perl for VRTSvcs
optional VRTSvcs   VERITAS Cluster Server
optional VRTSvcsor VERITAS Cluster Server Oracle Enterprise Extension

NOTE: Veritas advises that versions 1.1 and 1.1.1 not be used in production.
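To check which of these VCS packages (and which version) are actually installed on a node, a quick pkginfo check like this works (standard Solaris commands, not specific to any VCS release):

pkginfo | grep VRTS
pkginfo -l VRTSvcs | grep VERSION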
Here are the Solaris packages for VCS Version 1.3:

optional VRTScscm  VERITAS Cluster Server Cluster Manager
optional VRTSgab   VERITAS Group Membership and Atomic Broadcast
optional VRTSllt   VERITAS Low Latency Transport
optional VRTSperl  VERITAS Perl for VRTSvcs
optional VRTSvcs   VERITAS Cluster Server
optional VRTSvcsor VERITAS Cluster Server Oracle Enterprise Extension

Here are the Solaris packages for VCS Version 2.0:

optional VRTSgab   VERITAS Group Membership and Atomic Broadcast
optional VRTSllt   VERITAS Low Latency Transport
optional VRTSperl  VERITAS Perl for VRTSvcs
optional VRTSvcs   VERITAS Cluster Server
optional VRTSvcsdb VERITAS Cluster Server Db2udb Enterprise Extension
optional VRTSvcsdc VERITAS Cluster Server Documentation
optional VRTSvcsw  VERITAS Cluster Manager (Web Console)

Here are the Solaris packages for VCS QuickStart Version 2.0:

optional VRTSappqw VERITAS Cluster Server Application QuickStart Wizard
optional VRTSgab   VERITAS Group Membership and Atomic Broadcast
system   VRTSlic   VERITAS Licensing Utilities
optional VRTSllt   VERITAS Low Latency Transport
optional VRTSperl  VERITAS Perl for VRTSvcs
optional VRTSvcs   VERITAS Cluster Server
optional VRTSvcsqw VERITAS Cluster Server QuickStart Cluster Manager (Web Console)
-----------------------------------------------
VCS INSTALLATION

Install packages in this order:

VRTSllt
VRTSgab
VRTSvcs
VRTSperl
VRTScscm
VRTSvcsor

Copy the gabtab and llttab files from /opt/VRTSgab and /opt/VRTSllt to /etc. Configure both files. Create /etc/llthosts (MANDATORY for version 1.3).
-----------------------------------------------
PROCESSES ON VCS SERVER

Some processes commonly found on a VCS node include:

root   577     1  0   Sep 14 ?    16:53 /opt/VRTSvcs/bin/had
root   582     1  0   Sep 14 ?     0:00 /opt/VRTSvcs/bin/hashadow
root   601     1  0   Sep 14 ?     2:33 /opt/VRTSvcs/bin/DiskGroup/DiskGroupAgent -type DiskGroup
root   603     1  0   Sep 14 ?     0:56 /opt/VRTSvcs/bin/IP/IPAgent -type IP
root   605     1  0   Sep 14 ?    10:17 /opt/VRTSvcs/bin/Mount/MountAgent -type Mount
root   607     1  0   Sep 14 ?    11:23 /opt/VRTSvcs/bin/NIC/NICAgent -type NIC
root   609     1  0   Sep 14 ?    31:14 /opt/VRTSvcs/bin/Oracle/OracleAgent -type Oracle
root   611     1  0   Sep 14 ?     3:34 /opt/VRTSvcs/bin/SPlex/SPlexAgent -type SPlex
root   613     1  0   Sep 14 ?     8:06 /opt/VRTSvcs/bin/Sqlnet/SqlnetAgent -type Sqlnet
root 20608 20580  0 12:04:03 pts/1 0:20 /opt/VRTSvcs/bin/../gui/jre1.1.6/bin/../bin/sparc/green_threads/jre -mx128m VCS
----------------------------------------------------
GENERAL NOMENCLATURE

VCS is concerned with the following components:

Cluster
Attributes
Agents
Systems
Service Groups
Resource Types
Resources

You can cluster many basic UNIX services, such as:

User home directories
NIS services
NTP time services

To cluster an application, you cluster individual services into Service Groups.

Resource types can be "On-Off", "OnOnly" or "Persistent". An NFS resource is "OnOnly" because NFS may be needed by filesystems outside of VCS control. A NIC (network card) resource is "Persistent" because VCS cannot stop or start a NIC.
-----------------------------------------------------
QUICKSTART SCRIPT

Earlier versions of VCS included a Quickstart Wizard. After installing the VCS packages, execute the Quickstart script from an Xterm.
/opt/VRTSvcs/wizards/config/quick_start
------------------------------------------------------
VCS STARTUP CONFIGURATION FILES

VCS startup and stop files include:

/etc/rc2.d/S70llt
/etc/rc2.d/S92gab
/etc/rc3.d/S99vcs
/etc/rc0.d/K10vcs

Important VCS config files include:

/etc/VRTSvcs/conf/config/main.cf
/etc/VRTSvcs/conf/config/types.cf
/etc/llttab
/etc/gabtab
/etc/llthosts

You should not edit config files in /etc/VRTSvcs/conf/config without bringing the Cluster down first.
------------------------------------------------------
SHUTDOWN VCS HASTOP

To shut down VCS on all the systems without bringing any user services down:

/opt/VRTSvcs/bin/hastop -all -force

NOTE: This is the way to stop the Cluster if it is open read-write. If the Cluster is open, you will have a .stale file. The next hastart will not start up service groups that were already offline.

To shut down VCS and Service Groups only on the local server:

/opt/VRTSvcs/bin/hastop -local

To shut down VCS only on the local server, and keep services up on the current node:

/opt/VRTSvcs/bin/hastop -local -force

NOTE: This is the way to stop the Cluster if it is open read-write. If the Cluster is open, you will have a .stale file.

To shut down locally and move Service Groups to another machine:

/opt/VRTSvcs/bin/hastop -local -evacuate

Stopping VCS on any node will write the configuration to the main.cf file on each node if -force is not used and the Cluster is closed (read-only).
------------------------------------------------------
LLT GAB COMMON INFORMATIONAL COMMANDS

/sbin/gabconfig -a         Verify LLT and GAB are up and running.
/sbin/lltstat -n           Show heartbeat status.
/sbin/lltstat -nvv         Show the heartbeats with MAC addresses for up to 32 nodes.
                           NOTE: lltstat displays "Link" using the tag you use in /etc/llttab.
/sbin/lltstat -p           Show port status.
/sbin/lltconfig -a list    See MAC addresses on LLT links.
/sbin/lltconfig -T query   Display heartbeat frequencies.

To test and watch LLT traffic between 2 nodes:

/opt/VRTSllt/llttest -p 1
>transmit -n <name of other node> -c 5000

/opt/VRTSllt/llttest -p 1     (on other node)
>receive -c 5000

/opt/VRTSllt/lltdump -f <network link device>     Show LLT traffic.
/opt/VRTSllt/lltshow -n <node name>               Show LLT kernel structures.
/opt/VRTSllt/dlpiping -vs <network link device>   Turn on your dlpiping server.
/opt/VRTSllt/dlpiping -c <network link device> <MAC address of other node>
    Send an LLT packet to the other node and see the response. The other node must have the dlpiping server running.
------------------------------------------------------
GABCONFIG LLTCONFIG SEEDING CLUSTER STARTUP LLTSTAT

GAB and LLT operate at Layer 2 (the Data Link layer) of the OSI network stack. LLT is a Data Link Provider Interface (DLPI) protocol.

GAB deals with:
(1) Cluster memberships
(2) Monitoring heartbeats
(3) Distributing information throughout the Cluster

LLT deals with:
(1) System ID's in the Cluster
(2) Setting Cluster ID's for multiple clusters
(3) Tuning network heartbeat frequencies

Heartbeat frequency is 0.5 seconds on a private network, and 1.0 seconds on a low-pri network. Use "/sbin/lltconfig -T query" to find out your heartbeat frequencies.

Use gabconfig to control Cluster seeding and startup.

EXAMPLE: If the Cluster normally has 4 nodes, then /etc/gabtab should contain:

/sbin/gabconfig -c -n 4

VCS will then not start until all 4 nodes are up. You should execute this on each node of the Cluster. To start VCS with fewer nodes, execute gabconfig with a lower node count.
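EXAMPLE: If only 3 of the 4 nodes will be available (say one node is down for repair), you could seed with a lower count on the surviving nodes instead:

/sbin/gabconfig -c -n 3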
To seed the Cluster manually if no other nodes are available, execute:

/sbin/gabconfig -c -x

NOTE: If no other nodes are available, you must do this if you want to start VCS on the current node.

To see that LLT and GAB are up and running:

/sbin/gabconfig -a

GAB Port Memberships
===============================================================
Port a gen 4b2f0011 membership 01
Port h gen a6690001 membership 01

The port "a" indicates GAB is communicating, port "h" indicates VCS is communicating. The "01" indicates node 0 and node 1. The gen strings are randomly generated numbers.

GAB Port Memberships
===================================
Port a gen a36e0003 membership 01
Port a gen a36e0003 jeopardy    1
Port h gen fd570002 membership 01
Port h gen fd570002 jeopardy    1

This output indicates one of the heartbeat links is down, so VCS is in jeopardy mode.

GAB Port Memberships
===============================================================
Port a gen 3a24001f membership 01
Port h gen a10b0021 membership 0
Port h gen a10b0021 visible    ;1

This output indicates that GAB on node 1 has lost contact with its VCS daemons.

GAB Port Memberships
===============================================================
Port a gen 3a240021 membership 01

This output indicates that the VCS daemons are down on the current node, but GAB and LLT are still up.

To see the current LLT configuration:

/sbin/lltconfig -a list

To shut down GAB:

/sbin/gabconfig -U

To unload the GAB (or LLT) kernel module:

modinfo | grep <gab | llt>     (to find the module number)
modunload -i <module number>

To shut down LLT:

lltconfig -U

Commands to monitor LLT status:

/sbin/lltstat -n      Shows heartbeat status
/sbin/lltstat -nvv    Shows the heartbeats with MAC addresses
/sbin/lltstat -p      Shows port status
------------------------------------------------------
NETWORK INTERFACES

For 2 Nodes, you need at least 4 interfaces on each server. The LLT interfaces use a VCS protocol, not IP, on their own private networks. You can have up to 8 LLT network links.

Here's a common configuration on Sun:

hme0 ----> VCS Private LAN 0 LLT connection
qfe0 ----> VCS Private LAN 1 LLT connection
qfe1 ----> Server's IP
qfe2 ----> Cluster Virtual IP (managed by VCS)

The VIP and server IP must belong to the same subnet. Do not create /etc/hostname.hme0 or /etc/hostname.qfe0 files if those are the LLT interfaces.

Important VCS files in /etc:

/etc/rc2.d/S70llt
/etc/rc2.d/S92gab
/etc/rc3.d/S99vcs
/etc/llttab
/etc/gabtab
/etc/llthosts

EXAMPLES:

Low Latency Transport configuration /etc/llttab:

set-node cp01
set-cluster 3
link hme1 /dev/hme:1 - ether - -
link qfe0 /dev/qfe:0 - ether - -
link-lowpri qfe4 /dev/qfe:4 - ether - -
start

NOTE: Each VCS Cluster on a LAN must have its own ID. The "set-cluster" value in /etc/llttab is the Cluster's ID number. The first string after "link" is a tag you can name any way you want. It is shown in the "lltstat" command.
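Once LLT is started with this llttab, you can confirm the link tags and heartbeat timers with the commands shown earlier (an optional sanity check, not a required step):

/sbin/lltstat -nvv
/sbin/lltconfig -T query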
Group Membership Atomic Broadcast configuration /etc/gabtab: /sbin/gabconfig -c -n3 Low Latency Hosts Table /etc/llthosts: 1 cp01 2 cp02 3 cp03 These files start the LLT and GAB communications: /etc/rc2.d/S70llt /etc/rc2.d/S92gab This symlink in /dev must exist: ln -s ../devices/pseudo/clone@0:llt llt In /devices/pseudo : crw-rw-rw- 1 root sys 11,109 Sep 21 10:38 clone@0:llt crw-rw-rw- 1 root sys 143, 0 Sep 21 10:39 gab@0:gab_0 crw-rw-rw- 1 root sys 143, 1 Feb 1 16:59 gab@0:gab_1 crw-rw-rw- 1 root sys 143, 2 Sep 21 10:39 gab@0:gab_2 crw-rw-rw- 1 root sys 143, 3 Sep 21 10:39 gab@0:gab_3 crw-rw-rw- 1 root sys 143, 4 Sep 21 10:39 gab@0:gab_4 crw-rw-rw- 1 root sys 143, 5 Sep 21 10:39 gab@0:gab_5 crw-rw-rw- 1 root sys 143, 6 Sep 21 10:39 gab@0:gab_6 crw-rw-rw- 1 root sys 143, 7 Sep 21 10:39 gab@0:gab_7 crw-rw-rw- 1 root sys 143, 8 Sep 21 10:39 gab@0:gab_8 crw-rw-rw- 1 root sys 143, 9 Sep 21 10:39 gab@0:gab_9 crw-rw-rw- 1 root sys 143, 10 Sep 21 10:39 gab@0:gab_a crw-rw-rw- 1 root sys 143, 11 Sep 21 10:39 gab@0:gab_b crw-rw-rw- 1 root sys 143, 12 Sep 21 10:39 gab@0:gab_c crw-rw-rw- 1 root sys 143, 13 Sep 21 10:39 gab@0:gab_d crw-rw-rw- 1 root sys 143, 14 Sep 21 10:39 gab@0:gab_e crw-rw-rw- 1 root sys 143, 15 Sep 21 10:39 gab@0:gab_f /etc/name_to_major (numbers differ on each system): llt 109 gab 143 ------------------------------------------------------ STARTUP VCS HASTART HACONF MAIN.CF DUMP VCS only starts up locally on a machine. You must manually start or reboot other nodes in sequence if the main.cf files are different. Startup the node with the main.cf that you want for the Cluster. To startup VCS: /opt/VRTSvcs/bin/hastart If another node has already started and been seeded, VCS will load that other node's main.cf into the memory of the current node. To start VCS and treat the configuration as stale even if it is valid: /opt/VRTSvcs/bin/hastart -stale This will make a .stale file throughout the Cluster. If VCS fails to start normally, the configuration might be stale. If a .stale file exists and you really need to start the cluster now, use the "force" option to override the stale file and start the cluster: /opt/VRTSvcs/bin/hastart -force After you start VCS on all the nodes, you must tell VCS to write the Cluster configuration to main.cf on disks on all nodes. This will remove the .stale file. In VCS 2.0, the .stale file is automatically removed on a forced startup. /opt/VRTSvcs/bin/haconf -dump -makero NOTE: The node that was started first will just have its main.cf file reloaded with the same information. The other nodes will have theirs updated. Main.cf, types.cf and any include files are written to automatically when a node joins a cluster and when the cluster changes configuration while it is online. -------------------------------------------------------- HASTATUS HASYS CHECK STATUS To verify that the Cluster is up and running: /opt/VRTSvcs/bin/hastatus (will show a real-time output of VCS events) /opt/VRTSvcs/bin/hastatus -sum /opt/VRTSvcs/bin/hasys -display -------------------------------------------------------- START AND STOP SERVICE GROUPS You can manually start (online) and stop (offline) Service Groups on a given server. 
hagrp -online <service group> -sys <host name>
hagrp -offline <service group> -sys <host name>
--------------------------------------------------------
HAGRP SWITCH MIGRATE FREEZE SERVICE GROUPS FAILOVER

To move (evacuate) Service Groups from one Node to another:

hagrp -switch <Group name> -to <Hostname of other Node>

To freeze a Service Group:

hagrp -freeze <Service Group> -persistent
--------------------------------------------------------
BINARIES MAN PAGES PATH

The man pages for VCS are in the following directories:

/opt/VRTSllt/man
/opt/VRTSgab/man
/opt/VRTSvcs/man

Most of the binaries are stored in:

/opt/VRTSvcs/bin
------------------------------------------------------
COMMON MONITORING DISPLAY COMMANDS

hastatus -summary                    Show current status of the VCS Cluster
hasys -list                          List all Systems in the Cluster
hasys -display                       Get detailed information about each System
hagrp -list                          List all Service Groups
hagrp -resources <Service Group>     List all Resources of a Service Group
hagrp -dep <Service Group>           List a Service Group's dependencies
hagrp -display <Service Group>       Get detailed information about a Service Group
haagent -list                        List all Agents
haagent -display <Agent>             Get information about an Agent
hatype -list                         List all Resource Types
hatype -display <Resource Type>      Get detailed information about a Resource Type
hatype -resources <Resource Type>    List all Resources of a Resource Type
hares -list                          List all Resources
hares -dep <Resource>                List a Resource's dependencies
hares -display <Resource>            Get detailed information about a Resource
haclus -display                      List attributes and attribute values of the Cluster
------------------------------------------------------
VCS COMMAND SET PROCESSES

Most commands are stored in /opt/VRTSvcs/bin.

hagrp      Evacuate Service Groups from a Node.
           Check groups, group resources, dependencies, attributes.
           Start, stop, switch, freeze, unfreeze, enable, disable and flush a group; enable and disable resources in a group.
hasys      Check Node parameters.
           List Nodes in the Cluster, attributes, resource types, resources, attributes of resources.
           Freeze, thaw a node.
haconf     Dump the HA configuration.
hauser     Manage VCS user accounts.
hastatus   Check Cluster status.
haclus     Check Cluster attributes.
hares      Check resources.
           Online and offline a resource, offline and propagate to children, probe, clear a faulted resource.
haagent    List agents, agent status, start and stop agents.
hastop     Stop VCS.
hastart    Start VCS.
hagui      Change Cluster configuration.
hacf       Generate the main.cf file. Verify the local configuration.
haremajor  Change the Major number on a shared block device.
gabconfig  Check status of GAB.
gabdisk    Control GAB Heartbeat Disks (VCS 1.1.x).
gabdiskx   Control GAB Heartbeat Disks.
gabdiskhb  Control GAB Heartbeat Disks.
lltstat    Check status of the link.
rsync      Distribute agent code to other Nodes.

Other processes:

had        The VCS engine itself. This is a high priority real time (RT) process. It might still get swapped out or sleep in a kernel system call.
hashadow   Monitors and restarts the VCS engine.
halink     Monitors communication links in the Cluster.
------------------------------------------------------
HACF CONFIGURATION

To verify the current configuration (works even if VCS is down):

cd /etc/VRTSvcs/conf/config
hacf -verify .

To generate a main.cf file:

hacf -generate

To generate a main.cmd from a main.cf:

hacf -cftocmd .

To generate a main.cf from a main.cmd:

hacf -cmdtocf .
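A safe way to experiment with the cftocmd/cmdtocf conversion is to work on a copy of the configuration rather than on the live directory (the scratch directory below is arbitrary; copy any other include files named in your main.cf as well, or hacf -verify will complain):

mkdir /var/tmp/vcsconf
cp /etc/VRTSvcs/conf/config/main.cf /etc/VRTSvcs/conf/config/types.cf /var/tmp/vcsconf
cd /var/tmp/vcsconf
hacf -verify .
hacf -cftocmd .

This leaves a main.cmd in the scratch directory, and /etc/VRTSvcs/conf/config is never touched.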
------------------------------------------------------ HACONF CONFIGURATION FILE MAIN.CF To set the VCS configuration file (main.cf) to read-write: haconf -makerw NOTE: This will create the .stale file. To set the VCS configuration file to read-only. haconf -dump -makero EXAMPLES: To add a VCS user: haconf -makerw hauser -add <username> haconf -dump -makero To add a new system called "sysa" to a group's SystemList with a priority of 2: haconf -makerw hagrp -modify group1 SystemList -add sysa 2 haconf -dump -makero ------------------------------------------------------ RHOSTS ROOT ACCESS Add .rhosts files for user root to Nodes for transparent rsh access between Nodes. To add the root user to VCS: haconf -makerw hauser -add root haconf -dump -makero To change the VCS root password: haconf -makerw hauser -update root haconf -dump -makero ------------------------------------------------------ HASYS SHUTDOWN REBOOT FAILOVER Starting with VCS 1.3, a reboot will cause a failover if the server goes offline (completes the shutdown) within a specified amount of time (default is 60 seconds). To change this amount of time, execute for each node: haconf -makerw hasys -modify <system name> ShutdownTimeout <seconds> haconf -dump -makero If you don't want a failover during a reboot, set the time to 0. ------------------------------------------------------ HAREMAJOR VERITAS VOLUMES MAJOR MINOR NUMBERS If a disk partition or volume is to be exported over NFS (e.g., for a high availability NFS server), then the major and minor numbers on all nodes must match. Veritas Volumes: To change the Major numbers of a Veritas Volume to be the same as the Major numbers on the other node: haremajor -vx <vxio major number> <vxspec major number> You can find these Major numbers by doing 'grep vx /etc/name_to_major' on the other node. If the minor numbers of a Veritas Volume do not match, you must use "vxdg" with the "reminor" option. Disk Partitions: To make the major numbers match, execute: haremajor -sd <new major number> If the minor numbers of a disk partition do not match, you must make the instance numbers in /etc/path_to_inst match. After doing all this, execute: reboot -- -rv ------------------------------------------------------ VCS AGENTS Agents are stored under /opt/VRTSvcs/bin. Typical agents may include: CLARiiON (commercial) Disk DiskGroup ElifNone FileNone FileOnOff FileOnOnly IP IPMultiNIC Mount MultiNICA NFS (used by NFS server) NIC Oracle (Part of Oracle Agent - commercial) Phantom Process Proxy ServiceGroupHB Share (used by NFS server) Sqlnet (Part of Oracle Agent - commercial) Volume These agents can appear in the process table running like this: /opt/VRTSvcs/bin/Volume/VolumeAgent -type Volume /opt/VRTSvcs/bin/MultiNICA/MultiNICAAgent -type MultiNICA /opt/VRTSvcs/bin/Sqlnet/SqlnetAgent -type Sqlnet /opt/VRTSvcs/bin/Oracle/OracleAgent -type Oracle /opt/VRTSvcs/bin/IPMultiNIC/IPMultiNICAgent -type IPMultiNIC /opt/VRTSvcs/bin/DiskGroup/DiskGroupAgent -type DiskGroup /opt/VRTSvcs/bin/Mount/MountAgent -type Mount /opt/VRTSvcs/bin/Wig/WigAgent -type Wig ------------------------------------------------------- SUN OBP SEND BREAK SERIAL PORT CORRUPTION A Sun machine will halt the processor and be sent into the OBP prompt if a "STOP-A" or a BREAK signal is sent from the serial console. This can cause VCS to corrupt data when the machine is brought back online. 
To prevent this from happening, add the following line in /etc/default/kbd: KEYBOARD_ABORT=disable Also, on some Sun Enterprise machines, you can switch the Key to the Padlock position to secure it from dropping accidentally to OBP. ------------------------------------------------------- SHARING DISKS INITIATOR IDS <<< THIS SECTION TO BE UPDATED >>> If 2 Nodes are sharing the same disks on the same SCSI bus, their SCSI host adapters must be assigned unique SCSI "initiator" ID's. The default SCSI initiator ID is 7. To set the SCSI initiator ID on a system to 5, do the following at the OBP: ok setenv scsi-initiator-id 5 ok boot -r ------------------------------------------------------- REMOVE VCS SOFTWARE To remove the VCS packages, execute these commands: /opt/VRTSvcs/wizards/config/quick_start -b rsh <Node hostname> 'sh /opt/VRTSvcs/wizards/config/quick_start -b' pkgrm <VCS packages> rm -rf /etc/VRTSvcs /var/VRTSvcs init 6 ------------------------------------------------------ SYNTAX MAIN.CF FILE The main.cf is structured like this: * include clauses * cluster definition * system definitions * snmp definition * service group definitions * resource type definitions * resource definitions * resource dependency clauses * service group dependency clauses Here's a template of what main.cf looks like: #### include "types.cf" include "<Another types file>.cf" . . . cluster <Cluster name> ( UserNames = { root = <Encrypted password> } CounterInterval = 5 Factor = { runque = 5, memory = 1, disk = 10, cpu = 25, network = 5 } MaxFactor = { runque = 100, memory = 10, disk = 100, cpu = 100, network = 100 } ) system <Hostname of the primary node> system <Hostname of the failover node> snmp vcs ( TrapList = { 1 = "A new system has joined the VCS Cluster", 2 = "An existing system has changed its state", 3 = "A service group has changed its state", 4 = "One or more heartbeat links has gone down", 5 = "An HA service has done a manual restart", 6 = "An HA service has been manually idled", 7 = "An HA service has been successfully started" } ) group <Service Group Name> ( SystemList = { <Hostname of primary node>, <Hostname of failover node> } AutoStartList = { <Hostname of primary node> } ) <Resource Type> <Resource> ( <Attribute of Resource> = <Attribute value> <Attribute of Resource> = <Attribute value> <Attribute of Resource> = <Attribute value> . . . ) . . . <Resource Type> requires <Resource Type> . . . // resource dependency tree // // group <Service Group name> // { // <Resource Type> <Resource> // { // <Resource Type> <Resource> // . // . // . // { // <Resource Type> <Resource> // } // } // <Resource Type> <Resource> // } --------------------------------------------------- MAIN.CF RESOURCES ATTRIBUTE VALUES By default, VCS monitors online resources every 60 seconds and offline resources every 300 seconds. These are user configurable. Each Resource Type has Attributes and Values you can set. Resource can be added in any order within a Service Group. Here are some examples of common values in main.cf. *** Service Groups: group oragrpa ( SystemList = { cp01, cp02 } AutoStart = 0 AutoStartList = { cp01 } PreOnline = 1 ) + AutoStart determines whether the Service Group will start automatically after the machines in the AutoStartList are rebooted. + PreOnline determines whether a preonline script is executed. *** Veritas Volume Manager Disk Groups: DiskGroup external00 ( DiskGroup = external00 ) + DiskGroup is the Veritas Volume Manager Disk Group name. 
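The same DiskGroup resource could also be added from the command line instead of editing main.cf (the group and resource names here are the ones from the example above):

haconf -makerw
hares -add external00 DiskGroup oragrpa
hares -modify external00 DiskGroup external00
hares -modify external00 Enabled 1
haconf -dump -makero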
*** IP MultiNIC Virtual IP:

IPMultiNIC ip_cpdb01 (
    Address = "151.144.128.107"
    NetMask = "255.255.0.0"
    MultiNICResName = mnic_oragrpa
    )

+ Address is the VIP address.

*** Mount points:

Mount u_u01_a (
    MountPoint = "/u/u01"
    BlockDevice = "/dev/vx/dsk/external00/u_u01"
    FSType = vxfs
    MountOpt = rw
    )

*** MultiNIC IP's:

MultiNICA mnic_oragrpa (
    Device @cp01 = { hme0 = "151.144.128.101", qfe4 = "151.144.128.101" }
    Device @cp02 = { hme0 = "151.144.128.102", qfe4 = "151.144.128.102" }
    NetMask = "255.255.0.0"
    NetworkHosts @cp01 = { "151.144.128.1", "151.144.128.102", "151.144.128.104" }
    NetworkHosts @cp02 = { "151.144.128.1", "151.144.128.101", "151.144.128.104" }
    )

+ MultiNICA is the Resource Type that comes with VCS.
+ These are the interfaces and IP's for the nodes in this Service Group.
+ NetworkHosts includes IP's of interfaces to ping to see if the NIC is up. One of the IP's is the default router, 151.144.128.1.
+ The "@" sign indicates this attribute will only be applied to this specific system.

*** Normal IP's:

IP group1_ip1 (
    Device = hme0
    Address = "192.168.1.1"
    )

*** Normal NIC's:

NIC group1_nic1 (
    Device = hme0
    NetworkType = ether
    )

*** Oracle Database:

Oracle cawccs02 (
    Sid = cawccs02
    Owner = oracle
    Home = "/usr/apps/oracle/product/8.1.6"
    Pfile = "/usr/apps/oracle/product/8.1.6/dbs/initcawccs02.ora"
    User = vcs
    Pword = vcs
    Table = vcs
    MonScript = "./bin/Oracle/SqlTest.pl"
    )

+ The Resource here is "cawccs02", which is the Oracle instance name.
+ VCS requires that the Oracle DBA create a table called "vcs" for VCS to monitor the Database instance.

*** Sqlnet Listener:

Sqlnet listenera (
    Owner = oracle
    Home = "/usr/apps/oracle/product/8.1.6"
    TnsAdmin = "/usr/apps/oracle/product/8.1.6/network/admin"
    Listener = LISTENER
    MonScript = "./bin/Sqlnet/LsnrTest.pl"
    )

+ The values for the Oracle and Sqlnet resources are items you have to get from the Oracle DBA.
---------------------------------------------------
SYNTAX TYPES.CF FILE

Here's a template of what types.cf looks like:

######
type <Resource Type> (
    static str ArgList[] = { <attribute>, <attribute>, ... }
    NameRule = resource.<attribute>
    static str Operations = <value>
    static int NumThreads = <value>
    static int OnlineRetryLimit = <value>
    str <attribute>
    str <attribute> = <value>
    int <attribute> = <value>
    int <attribute>
    )
.
.
.
---------------------------------------------------
GROUP TYPES

A Failover Group can only be online on one system. A Parallel Group can be online on multiple systems.

Groups can be brought online in three ways:

1. Command was issued
2. Reboot
3. Failover
---------------------------------------------------
COMMUNICATIONS CHANNELS

VCS nodes communicate with each other in several ways:

1. Network channels (up to 8).
2. Communication or Service Group Heartbeat Disks. GAB controls disk-based communications.

NOTE: Heartbeat disks CANNOT carry cluster state information.
---------------------------------------------------
NETWORK AND DISK HEARTBEATS

Here's a matrix of what happens when you are losing network and disk heartbeat links.

1 net          0 dhb          Jeopardy and Regular Memberships.
1 net-low-pri  0 dhb          Jeopardy and Regular Memberships. This is the only time a low-pri link carries cluster status information.
0 net          1 dhb          Jeopardy, and VCS splits your Cluster into mini-Clusters. That's because disk heartbeats can't carry cluster status information.
1 net          1 dhb          No Jeopardy, since you have 2 links. Everything is fine for now.
1 net          1 net-low-pri  No Jeopardy, since you have 2 links. Everything is fine for now.
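To see which row of this matrix you are actually in, check the link and membership state on each node with the commands described above:

/sbin/lltstat -nvv      (how many network links are up, and to which nodes)
/sbin/gabconfig -a      (regular vs. jeopardy membership)
gabdiskhb -l            (disk heartbeats, if any are configured)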
NOTES: VCS versions before 1.3.0 may panic a system upon rejoin from a cluster split. Here were the rules: 1. On a 2 node cluster, the node with the highest LLT host ID panics. 2. On multinode cluster, the largest mini-cluster stays up. 3. On multinode cluster splitting into equal size clusters, cluster with lowest node number stays up. In VCS 1.3.0, had is restarted on the node with the highest LLT host ID. 4. If low-pri or regular links are on the same VLAN or network, unique Service Access Point (SAP) values must be used. EXAMPLE: set-node 0 set-cluster 0 link-lowpri mylink0 /dev/hme:0 - ether - link-lowpri mylink1 /dev/hme:1 - ether 0xcaf0 - link mylink4 /dev/qfe:4 - ether - - link mylink5 /dev/qfe:5 - ether - - start To change the SAP value while VCS is online, freeze the Service Groups then do this: lltconfig -u mylink1 lltconfig -l -t mylink1 -d /dev/hme:1 -b ether -s 0xcaf0 The other low-pri link can stay at the default 0xcafe value or it can be changed to another value also. --------------------------------------------------- SEEDING HEARTBEATS FAILOVER JEOPARDY Seeding is the bringing in of a system into the Cluster. Systems by default are not seeded when they come up. They must be seeded by other nodes. Otherwise, they must seed themselves (if that is allowed by the System Administrator). 1. Assume ALL nodes boot with "gabconfig -c -nX" where X is number of nodes in your cluster. If all nodes are booting up, and ALL X nodes come up AND send heartbeats, "auto-seeding" begins on the first node that came up, VCS starts on that node, loads that node's main.cf into memory, and VCS starts on all other nodes. In other words, ALL nodes must be up before VCS starts automatically. Once all nodes are up and send heartbeats, VCS starts. If less than X nodes come up (e.g. one of the nodes crashes), the cluster does not auto-seed, and VCS does not start on any node. You must manually seed the cluster. Run "gabconfig -c -x", or "gabconfig -c -nY", where Y < X, on the node you want to load the main.cf from. That node will seed itself, VCS will start and load that node's main.cf, and other nodes will begin seeding. VCS will then start on the other nodes. 2. Assume a node boots with "gabconfig -c -x" and all others have "gabconfig -c -nX". Assume these other nodes are down. The "gabconfig -c -x" node will come up, seed itself, and start VCS. The other nodes will come up and be seeded by this cluster. If the "gabconfig -c -nX" nodes come up first, they will wait for the "gabconfig -c -x" node to come up. If the "gabconfig -c -x" node can't come up, you must do "gabconfig -c -x" on one of the nodes that is up. 3. A node is seeded if and only if (1) there is another node that is already seeded, or (2) all nodes in the cluster are up, or (3) it boots with "gabconfig -c -x" or that command is executed on some node. **** Heartbeats and Jeopardy **** If only 1 heartbeat link remains, automatic failover after a system crash is disabled. The cluster is in "jeopardy". You can still do manual failovers, and VCS will still failover on individual service group failures. But it is still very important that you replace or fix the broken link. That's because in this condition, VCS cannot decide, when the next and last heartbeat failure occurs, whether a system actually crashed or whether its just the heartbeat that had failed (e.g., another loose cable). 
When the next and last heartbeat is dead (because of cable failure or server failure), VCS is designed to split the cluster into mini-clusters, each capable of operating on its own. So, if there is no pro-active measure taken, when the last heartbeat is dead, each node will fire up the service groups, thinking the other node is dead. In this case, a split-brain will develop, with more than one system writing to the same storage, causing data corruption.

VCS must therefore act pro-actively to prevent a split-brain. It does this by disabling automatic failover between the nodes until the second heartbeat link is restored.

When an application crashes, VCS agents will detect this and fail over. When a system crashes, VCS will detect all heartbeats as being down at the same time, the agents will sense the application is down, and VCS will then do a failover. When only one or a few heartbeat links are down, but more than one heartbeat is still up, and VCS agents sense the application is still up, VCS will do automatic failover. But if only one heartbeat remains, and the agents say the application is up, VCS will then disable automatic failover if a system crashes (until a heartbeat is restored). This is a pro-active measure by VCS.

So, if your heartbeats consist of only 2 network links, and one link fails, automatic failover due to a system crash is disabled. It's better to add one of the public links as a low-priority 3rd heartbeat, or add a 3rd dedicated heartbeat/communications cable.

If the remaining link goes down, VCS partitions the nodes into mini-clusters. All failover types are disabled. You MUST, at this point, shut down VCS on all nodes BEFORE restoring ANY heartbeat links, otherwise VCS will panic your nodes (you will get a core dump to debug from). You can disable this with a "/sbin/gabconfig -r". Starting with VCS 1.3.0, VCS will restart on the node with the higher ID instead of panicking. To enable halt on rejoin, do "/sbin/gabconfig -j".
---------------------------------------------------
AGENT ENTRY POINTS

Entry points can be written in C++, or as scripts in Perl or shell. If your entry points are scripts, you can use /opt/VRTSvcs/bin/ScriptAgent to build your agent.

Mandatory entry points:

VCSAgStartup (must be in C++)
monitor (return 100 if offline, 110 if online)

Optional entry points:

online (return value 0 or 1)
offline (return value 0 or 1)
clean (return 0 clean, 1 not clean)
attr_changed
open
close
shutdown

Example agent binary source code:

#include "VCSAgApi.h"

void my_shutdown() {
    ...
}

void VCSAgStartup() {
    VCSAgEntryPointStruct ep;
    ep.open = NULL;
    ep.online = NULL;
    ep.offline = NULL;
    ep.monitor = NULL;
    ep.attr_changed = NULL;
    ep.clean = NULL;
    ep.close = NULL;
    ep.shutdown = my_shutdown;
    VCSAgSetEntryPoints(ep);
}

The entry point scripts are located in /opt/VRTSvcs/bin/<agent>/<entry point>.
----------------------------------------------------
SERVICE GROUP CANNOT START TRICKS

To start a Service Group, execute:

hagrp -online <service group> -sys <host name>

If a Service Group cannot start up on any node, try one of these tricks:

1. Check which systems have AutoDisabled set to 1.

hasys -display

Execute the following for each of those systems:

hagrp -autoenable <service group> -sys <host name>

2. Clear any faults for the service group.

hagrp -clear <service group> -sys <host name>

3. Clear any faulted resource.
hares -clear <resource> -sys <host name>
--------------------------------------------------------------
MAINTENANCE OFFLINE RESOURCE

Here's an example of how to do some filesystem maintenance while keeping the Service Group up (partially online). This will involve offlining a resource without affecting other resources or bringing the service group down. This procedure is detailed in Veritas TechNote ID: 232192

1. haconf -makerw
2. hagrp -freeze <service group> -persistent
3. haconf -dump -makero

Now do maintenance, e.g. unmount a filesystem. If you don't want resources monitored during maintenance, just do this before maintenance:

hagrp -disableresources <service group>

After maintenance, remount your filesystems.

4. haconf -makerw
5. hagrp -unfreeze <service group> -persistent

If you disabled resources:

hagrp -enableresources <service group>

6. haconf -dump -makero

Find out which resources are still down.

7. hastatus -sum
8. hares -clear <mount resource>
9. hares -online <mount resource> -sys <host name>

Verify the Service Group is completely up.

10. hastatus -sum
----------------------------------------------------
FILEONOFF CLUSTER MAIN.CF

Here's a main.cf for a simple 3-node Cluster where the 2 Service Groups bring a file online and check to see if that file exists.

############
include "types.cf"
include "OracleTypes.cf"

cluster cpdb (
    UserNames = { veritas = cD9MAPjJQm6go }
    CounterInterval = 5
    Factor = { runque = 5, memory = 1, disk = 10, cpu = 25, network = 5 }
    MaxFactor = { runque = 100, memory = 10, disk = 100, cpu = 100, network = 100 }
    )

system cp01
system cp02
system cp03

snmp vcs (
    TrapList = { 1 = "A new system has joined the VCS Cluster",
        2 = "An existing system has changed its state",
        3 = "A service group has changed its state",
        4 = "One or more heartbeat links has gone down",
        5 = "An HA service has done a manual restart",
        6 = "An HA service has been manually idled",
        7 = "An HA service has been successfully started" }
    )

group oragrpa (
    SystemList = { cp01, cp02 }
    AutoStart = 0
    AutoStartList = { cp01 }
    PreOnline = 1
    )

    FileOnOff filea (
        PathName = "/var/tmp/tempa"
        )

    // resource dependency tree
    //
    // group oragrpa
    // {
    // FileOnOff filea
    // }

group oragrpb (
    SystemList = { cp02, cp03 }
    AutoStart = 0
    AutoStartList = { cp03 }
    PreOnline = 1
    )

    FileOnOff fileb (
        PathName = "/var/tmp/tempb"
        )

    // resource dependency tree
    //
    // group oragrpb
    // {
    // FileOnOff fileb
    // }
---------------------------------------------------------------------------
FILEONOFF CLUSTER MAIN.CF CLI

Here's an example of how to build a simple 2-node cluster using the FileOnOff agent entirely from the command line. The Service Group is "bchoitest", and the systems are "anandraj" and "bogota".

hagrp -add bchoitest
hagrp -modify bchoitest SystemList anandraj 0 bogota 1
hagrp -modify bchoitest AutoStartList anandraj
hares -add filea FileOnOff bchoitest
hares -modify filea PathName "/tmp/brendan"
hagrp -enableresources bchoitest
    or
hares -modify filea Enabled 1
hagrp -online bchoitest -sys anandraj

To add another FileOnOff resource:

hares -add fileb FileOnOff bchoitest
hares -modify fileb Enabled 1
hares -modify fileb PathName "/tmp/brendan"
hares -online fileb -sys anandraj

To link the resources in a resource dependency (making fileb depend on filea):

hares -link fileb filea

The syntax here is "hares -link <parent> <child>".
---------------------------------------------------------------------------
VIOLATIONS

Concurrency Violation: This occurs when a resource in a failover service group is online on more than one system.
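One way to spot a concurrency violation (or simply to confirm where a failover group is online) is to check the group's state on every system; the group name below is the test group from the example above:

hastatus -sum
hagrp -display bchoitest | grep State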
---------------------------------------------------------------------------
TRIGGER SCRIPTS

Event trigger scripts have to be stored in this directory in order for them to work:

/opt/VRTSvcs/bin/triggers

Sample trigger scripts are stored in:

/opt/VRTSvcs/bin/sample_triggers

Trigger scripts include:

nfs_restart
preonline
violation
injeopardy
nofailover
postoffline
postonline
resfault
resnotoff
sysoffline
resstatechange (VCS 1.3.0P1 and VCS 2.0)

VCS 1.3.0 Patch1 introduced the resstatechange event trigger. It can be enabled with:

hagrp -modify <service group> TriggerResStateChange 1

Or modify the script and place it in /opt/VRTSvcs/bin/triggers.

The following is from the VCS 1.3.0 User's Guide:

Event triggers are invoked on the system where the event occurred, with the following exceptions:

* The InJeopardy, SysOffline, and NoFailover event triggers are invoked from the lowest-numbered system in RUNNING state.
* The Violation event trigger is invoked from all systems on which the service group was brought partially or fully online.
---------------------------------------------------------------------------
PARENT CHILD RESOURCES GROUPS DEPENDENCIES

NOTE: The Veritas definitions for "Parent" and "Child" are quite backward. They are not intuitive.

Resources and their dependencies form a "graph", with Parent resources at the "root", and Child resources as "leaves".

System A     Parent (Root)
   |
   |
System B     Child (Leaf)
   |
   |
System C

In this diagram, System A depends on B (A is the parent and B is the child), and B depends on C (B is the parent relative to C).

Under Online Group Dependencies, the Parent resource or group must wait for the Child to come online first before bringing itself online. Under Offline Group Dependencies, the Parent must wait for the Child to go offline before bringing itself online.

From the Veritas Cluster Server User's Guide:

* Online local    The child group must be online on a system before the parent group can go online on the same system.
* Online global   The child group must be online on a system before the parent group can go online on any system.
* Online remote   The child group must be online on a system before the parent group can go online on a different system.
* Offline local   The child group must be offline on a system before the parent group can go online on that same system, and vice versa.
-------------------------------------------------------------
NOTES ON SERVICE GROUP DEPENDENCIES

Rules:
1. A Parent can have ONLY one Child, but a Child can have multiple Parents.
2. An SGD tree can have up to 3 levels maximum.

GroupA depends on GroupB = GroupA is the parent of GroupB, GroupB is the child of GroupA.
EXAMPLE: application (parent) depends on database (child)

Categories of Dependencies = online, offline
Locations of Dependencies = local, global, remote
Types of Dependencies = soft, firm

Online SGD = parent must wait for child to come up before it can come up
Offline SGD = parent must wait for child to be offline before it can come up, and vice versa

Local = Parent on same system
Global = Parent on any system
Remote = Parent on any other system (on any system other than the local one)

Soft = parent may or may not automatically fail over if child dies

* Online Local Soft = Parent will fail over to the same system the child fails over to. If Parent dies and Child does not, Parent cannot fail over to another system. If Child cannot fail over, Parent stays where it is.

* Online Global Soft = Parent will NOT fail over if child fails over.
If Parent dies and Child does not, Parent can fail over to another system.

* Online Remote Soft = If the child fails over to the parent's system, the parent will fail over to another system. If the child fails over to a system other than the parent's, the parent stays where it is.

Firm = parent MUST be taken offline if child dies. Child cannot be taken offline while Parent is online. Parent remains offline and cannot fail over if Child dies and cannot come online.

* Online Local Firm = Parent fails over to the system the child fails over to.
* Online Global Firm = Parent fails over to any system when the child fails over.
* Online Remote Firm = Parent will fail over to a system, but NOT the one the child fails over to.

Offline Local = If the Child fails over to the Parent's system, the Parent will fail over to another system. The Parent can only be online on a system that the Child is offline on.

EXAMPLES: main.cf syntax

The "requires..." statements must come after the resource declarations and before the resource dependency statements.

Online local firm:   "requires group GroupB online local firm"
Online global soft:  "requires group GroupB online global soft"
Online remote soft:  "requires group GroupB online remote soft"
Offline local:       "requires group GroupB offline local"

"Online remote" and "Offline local" are very similar, except "Offline local" doesn't require the child to be online anywhere.

NOTE:
* Parallel parent/parallel child is not supported in online global or online remote.
* Parallel parent/failover child is not supported in online local.
* Parallel child/failover parent is supported in online local, but the failover parent group's name must be lexically before the child group name. That's because VCS onlines service groups in alphabetical order. See Veritas Support TechNote 237239.

EXAMPLE:

hagrp -link bchoitest apache1 offline local

hagrp -dep
#Parent     Child     Relationship
bchoitest   apache1   offline local

Inside the main.cf in the Parent Group:

requires group apache1 offline local
------------------------------------------------------------
LOGS TAGS ERROR MESSAGES

VCS logs are stored in:

/var/VRTSvcs/log

The logs show errors for the VCS engine and Resource Types.

EXAMPLE:

-rw-rw-rw-   1 root     other      22122 Aug 29 08:03 Application_A.log
-rw-rw-rw-   1 root     root        9559 Aug 15 13:02 DiskGroup_A.log
-rw-rw-rw-   1 root     other        296 Jul 17 17:55 DiskGroup_ipm_A.log
-rw-rw-rw-   1 root     root         746 Aug 17 16:27 FileOnOff_A.log
-rw-rw-rw-   1 root     root         609 Jun 19 18:55 IP_A.log
-rw-rw-rw-   1 root     root        1130 Jul 21 14:33 Mount_A.log
-rw-rw-rw-   1 root     other       5218 May 14 13:16 NFS_A.log
-rw-rw-rw-   1 root     root        7320 Aug 15 12:59 NIC_A.log
-rw-rw-rw-   1 root     other    1042266 Aug 23 10:46 Oracle_A.log
-rw-rw-rw-   1 root     root         149 Mar 20 13:10 Oracle_ipm_A.log
-rw-rw-rw-   1 root     other        238 Jun  1 13:07 Process_A.log
-rw-rw-rw-   1 root     other       2812 Mar 21 11:45 ServiceGroupHB_A.log
-rw-rw-rw-   1 root     root        6438 Jun 19 18:55 Sqlnet_A.log
-rw-rw-rw-   1 root     root         145 Mar 20 13:10 Sqlnet_ipm_A.log
-rw-r--r--   1 root     other   16362650 Aug 31 08:58 engine_A.log
-rw-r--r--   1 root     other        313 Mar 20 13:11 hacf-err_A.log
-rw-rw-rw-   1 root     root        1615 Jun 29 16:30 hashadow-err_A.log
-rw-r--r--   1 root     other    2743342 Aug  1 17:12 hashadow_A.log
drwxrwxr-x   2 root     sys         3072 Aug 27 12:41 tmp

These tags appear in the engine log:

TAG_A: VCS internal message. Contact Customer Support.
TAG_B: Messages indicating errors and exceptions.
TAG_C: Messages indicating warnings.
TAG_D: Messages indicating normal operations.
TAG_E: Messages from agents indicating status, etc.
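For example, to pull only the error and warning messages (TAG_B and TAG_C) out of the engine log listed above:

egrep "TAG_B|TAG_C" /var/VRTSvcs/log/engine_A.log | tail -50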
You can increase the log levels (get TAG F-Z messages) by changing the LogLevel Resource Type attribute. The default is "error". You can choose "none", "all", "debug", or "info".

hatype -modify <Resource Type> LogLevel <option>
------------------------------------------------------------
VCS MISCELLANEOUS INFORMATION

Here is some miscellaneous information about VCS:

1. VCS does not officially support both PCI and SBus on the same shared device. This could be more of a Sun restriction.
2. VCS Web Edition comes with an Apache agent.
3. The VCS VIP can only be created on an interface that is already plumbed.
4. VCS 1.3 supports Solaris 8.
5. Resources (not Resource Types) cannot have the same name within a Cluster. So if you have 2 Service Groups in a Cluster and they both use similar resources, you must name them differently.
6. Only manually edit the main.cf file when the Cluster has stopped.
7. Use "hastop -force -all" in case you mess up the Cluster before writing to main.cf; this will keep the main.cf file unchanged. This will also give you a .stale file if your cluster was in read-write mode.
8. Until you tell VCS to write to main.cf, it will update a backup of the main.cf instead (as you make changes).
9. Sun Trunking 1.1.2 using IP Source/IP Destination policy should work with VCS.
10. VCS has a current limit of 32 nodes per cluster.
11. GAB in version 1.1.2 is incompatible with GAB in 1.3, so if you are upgrading VCS, upgrade all nodes.
12. Oracle DBA's need to edit the following Oracle files during VCS setup:
    $ORACLE_HOME/network/admin/tnsnames.ora
    $ORACLE_HOME/network/admin/listener.ora
    If there are multiple instances within the cluster, each instance needs its own listener name in listener.ora.
13. VCS 2.0 supports the Solaris eri FastEthernet driver.
----------------------------------------------------------------
UPGRADE MAINTENANCE PROCEDURE

Here's a procedure to upgrade VCS or shut down VCS during hardware maintenance.

1. Open the configuration, freeze each Service Group, and close the VCS config.

haconf -makerw
hagrp -freeze <Service Group> -persistent
haconf -dump -makero

2. Shut down VCS but keep services up.

hastop -all -force

3. Confirm VCS has shut down on each system.

gabconfig -a

4. Confirm GAB is not running on any disks.

gabdisk -l     (use this if upgrading from VCS 1.1.x)
gabdiskhb -l
gabdiskx -l

If it is, remove it from the disks on each system.

gabdisk -d     (use this if upgrading from VCS 1.1.x)
gabdiskhb -d
gabdiskx -d

5. Shut down GAB and confirm it's down on each system.

gabconfig -U
gabconfig -a

6. Identify the GAB kernel module number and unload it from each system.

modinfo | grep gab
modunload -i <GAB module number>

7. Shut down LLT. On each system, type:

lltconfig -U

Enter "y" if any questions are asked.

8. Identify the LLT kernel module number and unload it from each system.

modinfo | grep llt
modunload -i <LLT module number>

9. Rename the VCS startup and stop scripts on each system.

cd /etc/rc2.d
mv S70llt s70llt
mv S92gab s92gab
cd /etc/rc3.d
mv S99vcs s99vcs
cd /etc/rc0.d
mv K10vcs k10vcs

10. Make a backup copy of /etc/VRTSvcs/conf/config/main.cf. Make a backup copy of /etc/VRTSvcs/conf/config/types.cf. Starting with VCS 1.3.0, preonline and other trigger scripts must be in /opt/VRTSvcs/bin/triggers. Also, all preonline scripts in previous versions (such as VCS 1.1.2) must now be combined in one preonline script.

11. Remove old VCS packages.
pkgrm VRTScsga VRTSvcs VRTSgab VRTSllt VRTSperl VRTSvcswz If you are upgrading from 1.0.1 or 1.0.2, you must also remove the package VRTSsnmp, and any packages containing a .2 extension, such as VRTScsga.2, VRTSvcs.2, etc. Also remove any agent packages such as VRTSvcsix (Informix), VRTSvcsnb (NetBackup), VRTSvcssor (Oracle), and VRTSvcssy (Sybase). Install new VCS packages. Restore your main.cf and types.cf files. 12. Start LLT, GAB and VCS. cd /etc/rc2.d mv s70llt S70llt mv s92gab S92gab cd /etc/rc3.d mv s99vcs S99vcs cd /etc/rc0.d mv k10vcs K10vcs /etc/rc2.d/S70llt start /etc/rc2.d/S92gab /etc/rc3.d/S99vcs start 13. Check on status of VCS. hastatus hastatus -sum 14. Unfreeze all Service Groups. haconf -makerw hagrp -unfreeze <Service Group> -persistent haconf -dump -makero ------------------------------------------------ USING VXEXPLORER SCRIPT When you call Veritas Tech Support for a VCS problem, they may have you download a script to run and send back information on your cluster. Here's the procedure: The URL is ftp://ftp.veritas.com/pub/support/vxexplore.tar.Z but you can also get it this way. 1. ftp ftp.veritas.com 2. Login as anonymous. 3. Use your e-mail address as your password. 4. cd /pub/support 5. bin 6. get vxexplore.tar.Z 7. Once downloaded, copy the file to all nodes. On each node, uncompress and un-tar the file: zcat vxexplore.tar.Z | tar xvf - 8. cd VRTSexplorer Read the README file. Run the Vxexplorer script on each node. ./VRTSexplorer Make sure the output filename has the CASE ID number. EXAMPLE: VRTSexplorer_999999999.tar.Z Now upload the file to ftp.veritas.com. 9. ftp ftp.veritas.com 10. Login as anonymous. 11. Use your e-mail address as your password. 12. cd /incoming 13. bin 14. put <Vxexplorer output filename> Upload the other file also. -------------------------------------------------- BUNDLED AGENTS ATTRIBUTES The following are Agents bundled with VCS 1.3.0 and the Resource attributes for each Resource Type. These attributes are listed in types.cf. Application User StartProgram StopProgram CleanProgram MonitorProgram PidFiles MonitorProcesses Disk Partition (required) DiskGroup DiskGroup (required) StartVolumes StopVolumes DiskReservation Disks (required) FailFast ConfigPercentage ProbeInterval ElifNone PathName (required) FileNone PathName (required) FileOnOff PathName (required) FileOnOnly PathName (required) IP Address (required) Device (required) ArpDelay IfconfigTwice NetMask Options IPMultiNIC Address (required) MultiNICResName (required) ArpDelay IfconfigTwice NetMask Options Mount BlockDevice (required) MountPoint (required) FSType (required) FsckOpt MountOpt SnapUmount MultiNICA Device (required) ArpDelay HandshakeInterval IfconfigTwice NetMask NetworkHosts Options PingOptimize RouteOptions NFS Nservers NIC Device (required) PingOptimize NetworkHosts NetworkType Phantom Process PathName (required) Arguments Proxy TargetResName (required) TargetSysName ServiceGroupHB Disks (required) AllOrNone Share PathName (required) Options Volume DiskGroup (required) Volume -------------------------------------------- ENTERPRISE STORAGE AGENTS ATTRIBUTES Here are Agents not included with VCS that you have to purchase, and their resource attributes. These attributes are also listed in types.cf. 
Apache ServerRoot PidFile IPAddr Port TestFile Informix Server (required) Home (required) ConfigFile (required) Version (required) MonScript NetApp (storage agent) NetBackup Oracle Oracle Agent Sid (required) Owner (required) Home (required) Pfile (required) User PWord Table MonScript Sqlnet Agent Owner (required) Home (required) TnsAdmin (required) Listener (required) MonScript PCNetlink SuiteSpot Sun Internet Mail Server (SIMS) Sybase SQL Server Agent Server (required) Owner (required) Home (required) Version (required) SA (required) SApswd (required) User UPword Db Table MonScript Backup Server Agent Server (required) Owner (required) Home (required) Version (required) Backupserver (required) SA (required) SApswd (required) ----------------------------------------------------------------- VCS SYSTEM CLUSTER SERVICE GROUP RESOURCE TYPE SNMP ATTRIBUTES Everything in VCS has attributes. Here is a list of attributes from VCS 1.3. Cluster Attributes (haclus -display): ClusterName CompareRSM (for internal use only) CounterInterval DumpingMembership EngineClass EnginePriority Factor (for internal use only) GlobalCounter GroupLimit LinkMonitoring LoadSampling (for internal use only) LogSize MajorVersion MaxFactor (for internal use only) MinorVersion PrintMsg (for internal use only) ProcessClass ProcessPriority ReadOnly ResourceLimit SourceFile TypeLimit UserNames Systems Attributes (hasys -display): AgentsStopped (for internal use only) ConfigBlockCount ConfigCheckSum ConfigDiskState ConfigFile ConfigInfoCnt (for internal use only) ConfigModDate DiskHbDown Frozen GUIIPAddr LinkHbDown LLTNodeId (for internal use only) Load LoadRaw MajorVersion MinorVersion NodeId OnGrpCnt ShutdownTimeout SourceFile SysInfo SysName SysState TFrozen TRSE (for internal use only) UpDownState UserInt UserStr Service Groups Attributes (hagrp -display): ActiveCount AutoDisabled AutoFailOver AutoRestart AutoStart AutoStartList CurrentCount Enabled Evacuating (for internal use only) ExtMonApp ExtMonArgs Failover (for internal use only) FailOverPolicy FromQ (for internal use only) Frozen IntentOnline LastSuccess (for internal use only) ManualOps MigrateQ (for internal use only) NumRetries (for internal use only) OnlineRetryInterval OnlineRetryLimit Parallel PathCount PreOffline (for internal use only) PreOnline PreOfflining (for internal use only) PreOnlining (for internal use only) Priority PrintTree ProbesPending Responding (for internal use only) Restart (for internal use only) SourceFile State SystemList SystemZones TargetCount (for internal use only) TFrozen ToQ (for internal use only) TriggerEvent (for internal use only) TypeDependencies UserIntGlobal UserStrGlobal UserIntLocal UserStrLocal Resource Types Attributes (hatype -display): These are common to all Resource Types. You can change values for each Resource Type. AgentClass AgentFailedOn AgentPriority AgentReplyTimeout AgentStartTimeout ArgList AttrChangedTimeout CleanTimeout CloseTimeout ConfInterval FaultOnMonitorTimeouts LogLevel MonitorInterval MonitorTimeout NameRule NumThreads OfflineMonitorInterval OfflineTimeout OnlineRetryLimit OnlineTimeout OnlineWaitLimit OpenTimeout Operations RestartLimit ScriptClass ScriptPriority SourceFile ToleranceLimit Resources Attributes (hares -display): NOTE: These are only some of the ones common to many kinds of resources. See the types.cf file for type-specific attributes for resources. 
ArgListValues AutoStart ConfidenceLevel Critical Enabled Flags Group IState LastOnline MonitorOnly Name (for internal use only) Path (for internal use only) ResourceOwner Signaled (for internal use only) Start (for internal use only) State TriggerEvent (for internal use only) Type SNMP (predefined): Enabled IPAddr Port SnmpName SourceFile TrapList ----------------------------------------------- VCS NODENAME If you change your server's hostname often, it might be a good idea to let VCS use its own names for your nodes. To use VCS's own nodenames instead of hostnames for the nodes in a Cluster, you need the /etc/VRTSvcs/conf/sysname file defined in /etc/llttab. Use these VCS nodenames in main.cf. EXAMPLE: /etc/llthosts: 0 sysA 1 sysB /etc/VRTSvcs/conf/sysname: sysA /etc/VRTSvcs/conf/sysname: sysB /etc/llttab: set-node /etc/VRTSvcs/conf/sysname ------------------------------------------------- RESOURCE ATTRIBUTE DATA TYPES DIMENSIONS Resource attributes are either type-independent or type-specific. Type-specific attributes appear in types.cf. Resources have 3 system created attributes that determine failover behavior. They are user modifiable. Critical = 1 If resource or its children fault, service group faults and fails over. AutoStart = 1 Command to bring service group online will also bring resource online. Enabled = 0 If 0, resource is not monitored by agent. This is the default value if resource is added in the CLI. Enabled = 1 if resource is defined in main.cf Static attributes are same for all resources within a resource type. All attributes have "definition" (Data Type) and a "value" (Dimension). Global attribute value means its the same throughout the cluster. Local attribute value means it applies only to a specific node. Attribute data types are: 1. str (string) DEFAULT Data Type. Can be in double quotes, backslash is \\, a double quote is \". No double quotes needed if string begins with letter and only has letters, numbers, dashes and underscores. EXAMPLE: Adding a string-scalar value to BlockDevice attribute for a Mount resource called "export1". hares -modify export1 BlockDevice "/dev/sharedg/voloracle" 2. int (integer) Base 10. 0-9. 3. boolean 0 (false) or 1 (true). EXAMPLE: hagrp -modify Group1 Parallel 1 This is a "boolean integer". Attribute Dimensions are: 1. scalar DEFAULT dimension. Only 1 value. 2. vector Ordered list of values, denoted by [] in types.cf file. Values are indexed by positive integers, starting at 0. EXAMPLES of string-vectors: Dependencies[] = { Mount, Disk, DiskGroup } Dependencies is the attribute. NetworkHosts = { "166.93.2.1", "166.99.1.2" } NetworkHosts @cp01 = { "151.144.128.1", "151.144.128.102", "151.144.128.104" } NetworkHosts @cp02 = { "151.144.128.1", "151.144.128.101", "151.144.128.104" } Command line would be: hares -local mul_nic NetworkHosts hares -modify mul_nic NetworkHosts 10.1.3.10 10.1.3.11 10.1.3.12 -sys cp01 hares -modify mul_nic NetworkHosts 10.1.3.10 10.1.3.13 10.1.3.12 -sys cp02 3. keylist Unordered list of unique strings. EXAMPLE: AutoStartList = { sysa, sysb, sysc } AutoStartList is the attribute. This is a "string keylist". hagrp -modify Group1 AutoStartList Server1 Server2 4. association Unordered list of name-value pairs, denoted by {} in types.cf file. EXAMPLE: SystemList{} = { sysa=1, sysb=2, sysc=3 } SystemList is the attribute. This is a "string association". hagrp -modify <Object> <Attribute> -add <Key> <Value> <Key> <Value> ... 
EXAMPLE: Adding MultiNICA and IPMultiNIC resources.
Suppose we wanted these IPMultiNIC and MultiNICA resources in main.cf:

IPMultiNIC ip_mul_nic (
   Address = "10.10.10.4"
   NetMask = "255.255.255.192"
   MultiNICResName = mul_nic
   )
NOTES: Address is a string-scalar. NetMask is a string-scalar. MultiNICResName is a string-scalar.

MultiNICA mul_nic (
   Device @sysA = { hme0 = "10.10.10.1", qfe0 = "10.10.10.1" }
   Device @sysB = { hme0 = "10.10.10.2", qfe0 = "10.10.10.2" }
   NetMask = "255.255.255.192"
   ArpDelay = 5
   Options = trailers
   RouteOptions = "default 10.10.10.5 0"
   IfconfigTwice = 1
   )
NOTES: Device is a string-association. NetMask is a string-scalar. ArpDelay is an integer-scalar. Options is a string-scalar. RouteOptions is a string-scalar. IfconfigTwice is an integer-scalar.

1. Open the Cluster.
haconf -makerw
2. Add the MultiNICA resource "mul_nic".
hares -add mul_nic MultiNICA groupx
3. Make the mul_nic resource local to the systems.
hares -local mul_nic Device
4. Add the attribute and values and enable the resource.
hares -modify mul_nic Device -add hme0 10.10.10.1 -sys sysA
hares -modify mul_nic Device -add qfe0 10.10.10.1 -sys sysA
hares -modify mul_nic Device -add hme0 10.10.10.2 -sys sysB
hares -modify mul_nic Device -add qfe0 10.10.10.2 -sys sysB
hares -modify mul_nic NetMask 255.255.255.192
hares -modify mul_nic ArpDelay 5
hares -modify mul_nic Options trailers
hares -modify mul_nic RouteOptions "default 10.10.10.5 0"
hares -modify mul_nic IfconfigTwice 1
hares -modify mul_nic Enabled 1
5. Add the IPMultiNIC resource "ip_mul_nic", add attributes and values, and enable it.
hares -add ip_mul_nic IPMultiNIC groupx
hares -modify ip_mul_nic Address 10.10.10.4
hares -modify ip_mul_nic NetMask 255.255.255.192
hares -modify ip_mul_nic MultiNICResName mul_nic
hares -modify ip_mul_nic Enabled 1
6. Make the IPMultiNIC resource dependent on the MultiNICA resource.
hares -link ip_mul_nic mul_nic
7. Close the Cluster.
haconf -dump -makero
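To verify the resources added above, you can list the dependency created in step 6 and check a few attributes. The resource names are the ones from this example; the exact output format varies by VCS version:

hares -dep ip_mul_nic
hares -display ip_mul_nic | grep State
hares -display mul_nic | grep Device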
-----------------------------------------------------------------
RESOURCE MONITORING SERVICE GROUP FAILOVER TIMES

Here are important attributes to be aware of in regards to resource monitoring and failovers. Default values are shown.

*** Resource Type Attributes ***

"Agent level" timing:

AgentReplyTimeout = 130   Time the engine waits to receive a heartbeat from an Agent (Resource Type) before restarting the Agent.
AgentStartTimeout = 60   Time the engine waits after starting an Agent for an initial Agent handshake before restarting the Agent.
AttrChangedTimeout = 60   Maximum time within which the "attr_changed" entry point must complete or be terminated.
CloseTimeout = 60   Maximum time within which the Close entry point must complete or be terminated.
CleanTimeout = 60   Maximum time within which the Clean entry point must complete or be terminated.
ConfInterval = 600   The amount of time a resource must stay online before previous faults and restart attempts are ignored by the Agent. Basically, how long a resource must stay up before VCS will "forget" that it had faulted recently; used by RestartLimit. If the resource stays up past ConfInterval, VCS "resets" the RestartLimit count. If the resource faults within ConfInterval, VCS looks at RestartLimit.
FaultOnMonitorTimeouts = 4   Number of consecutive times a monitor must time out before the Resource is faulted. Set this to 0 if you don't want monitor failures to indicate a resource fault.
MonitorInterval = 60   Time between monitoring attempts for an online or transitioning resource. This is the answer to the common question "how often does a resource get monitored?"
MonitorTimeout = 60   Maximum time the Agent allows a monitor entry point to run before timing it out. This is how long the Agent allows for monitoring a resource.
OfflineMonitorInterval = 300   This is how often an offline resource gets monitored. Same purpose as MonitorInterval.
OfflineTimeout = 300   Maximum time an Agent allows for offlining.
OnlineTimeout = 300   Maximum time an Agent allows for onlining. EXAMPLE: If fsck's take too long, make this bigger for the Mount Resource Type, otherwise VCS will have trouble mounting filesystems after a crash.
OnlineRetryLimit = 0 (for Resource Types)   Number of times to retry an online if an attempt to online has failed. Used only when a clean entry point exists.
OnlineWaitLimit = 2   Number of monitor intervals after an online procedure has completed (the online script/entry point exits) before the monitor sees a failure and reports it as a failure. Basically, how much time VCS will give the service to come completely online (even after online exits) well enough for it to be monitored. Increase this if an application takes a long time to be completely ready for monitoring even after the online script/entry point completes.
OpenTimeout = 60   Maximum time an Agent allows for opening.
RestartLimit = 0   Number of times the Agent tries to restart a resource that has faulted or is having problems coming online. Basically, the number of times VCS will try to restart a resource without faulting the Service Group.
ToleranceLimit = 0   Number of offlines that an Agent monitor must declare before a resource is faulted. By default, a monitor declaring offline will also declare the resource to be faulted.

*** Resource Attributes ***

MonitorOnly   If 1, the resource can't be brought online or offline. VCS sets this to 1 if the Service Group is frozen. There is no way to set this directly.
Critical   If 0, don't fail the Service Group if this resource fails. Default is 1.

*** Service Group Attributes ***

OnlineRetryInterval = 0   Time window in seconds: if a Service Group that has already faulted and restarted on the same system faults again within this interval, it is failed over to another system instead of being restarted locally. This prevents the Service Group from continuously faulting and restarting on the same system; used with OnlineRetryLimit for Service Groups.
OnlineRetryLimit = 0 (for Service Groups)   Number of times VCS will try to restart a faulted Service Group on the same system before failing it over to another system.

------------------------------------------------------------
HAUSER VCS 2.0

In VCS 2.0, you can add users, but they must be able to write a temp file to /var/VRTSvcs/lock. Also, if you want to disable password authentication by VCS, just do:
haclus -modify AllowNativeCliUsers 1

-------------------------------------------------------------
HAD VCS VERSION

To find the version of had you are running,
had -version

-------------------------------------------------------------
VCS 2.0 NEW FEATURES

VCS 2.0 introduces some new features:
1. Web console
2. User privileges
3. SMTP and SNMP notification
4. Workload balancing at the system level.
5. New Java console
6. ClusterService service group
7. Single-node support
8. New licensing scheme
9. Internationalization
10. Preonline IP check
11. hamsg command to display logs
12. LinkHbStatus and DiskHbStatus system attributes; LinkMonitoring cluster attribute is no longer used.
13. Password encryption
14. SRM can be a scheduling class for VCS processes.
15. VCS 1.3 enterprise agents can run on VCS 2.0.
16. New resstatechange event trigger.
17. Service group can come up with a disabled resource.
18. Sun eri FastEthernet driver supported.
19. No restriction on Sybase resources per service group.
20. New NotifierMngr bundled agent.
21. New attributes:
Cluster Attributes: Administrators AllowNativeCliUsers ClusterLocation ClusterOwner ClusterType HacliUserLevel LockMemory Notifier Operators VCSMode
Service Group Attributes: Administrators AutoStartIfPartial AutoStartPolicy Evacuate Evacuating GroupOwner Operators Prerequisites TriggerResStateChange
System Attributes: AvailableCapacity Capacity DiskHbStatus LinkHbStatus Limits Location LoadTimeCounter LoadTimeThreshold LoadWarningLevel SystemOwner
Resource Type Attributes: LogFileSize LogTags

--------------------------------------------
WEB GUI URLS

For VCS 2.0 and VCS 2.0 QuickStart, the URLs to the web GUI interfaces are:
VCS 2.0     http://<cluster IP>:8181/vcs
VCSQS 2.0   http://<cluster IP>:8181/vcsqs

---------------------------------------------
VCS QUICKSTART COMMAND LINE

The VCS QuickStart has a limited command line set. These are all the commands:
vcsqs -start
vcsqs -stop [-shutdown] [-all] [-evacuate]
vcsqs -grp [<group>]
vcsqs -res [<resource>]
vcsqs -config [<resource>]
vcsqs -sys
vcsqs -online <group> -sys <system>
vcsqs -offline <group> [-sys <system>]
vcsqs -switch <group> [-sys <system>]
vcsqs -freeze <group>
vcsqs -unfreeze <group>
vcsqs -clear <group> [-sys <system>]
vcsqs -flush <group> [-sys <system>]
vcsqs -autoenable <group> -sys <system>
vcsqs -users
vcsqs -addadmin <username>
vcsqs -addguest <username>
vcsqs -deleteuser <username>
vcsqs -updateuser <username>
vcsqs -intervals [<type>]
vcsqs -modifyinterval <type> <value>
vcsqs -version
vcsqs -help

-----------------------------------------------------
VCS 2.0 WORKLOAD MANAGEMENT

VCS 2.0 has new "service group workload management" features. First, we must distinguish between "AutoStartPolicy" and "FailOverPolicy" for Service Groups coming online or failing over.

The new AutoStartPolicy is either Order (default), Priority or Load. Order is the order in the AutoStartList; Priority is the priority in the SystemList.

The FailOverPolicy is either Priority (default), RoundRobin or Load. Priority is the priority in the SystemList. RoundRobin means VCS will pick the system in the SystemList with the least number of "online" Service Groups.

Before a Service Group comes online it must consider the systems in
(a) the AutoStartList if it is autostarting
(b) the SystemList if this is a failover
and then
(1) System Zones (soft restriction)
(2) System Limits and Service Group Prerequisites (hard restriction)
(3) System Capacity and Service Group Load (soft restriction)
Steps 1-3 are done serially during VCS startup, the System being chosen in lexical or canonical order. After this sequence, the actual onlining process begins (VCS will then check on Service Group Dependencies). Onlining of service groups is done in parallel.

1. System Zones (SystemZones), a Service Group attribute.
The zones are numbers created by the user.
EXAMPLE (2 zones called 0 and 1):
SystemZones = { LgSvr1=0, LgSvr2=0, MedSvr1=1, MedSvr2=1 }
VCS chooses the zone based on the servers currently available in the AutoStartList or SystemList, or on the zone the Service Group is currently running in.
If there are no more machines available in the SystemZones, VCS will go look at the System Limits and Service Group Prerequisites of machines in other System Zones. In subsequent failures, the Service Group will then stay in the new System Zone.

2. System Limits (Limits), a System attribute, and Service Group Prerequisites (Prerequisites), a Service Group attribute, for the Service Group.
These names and values are created by the user.
EXAMPLE:
System attribute:
Limits = { ShrMemSeg=20, Semaphores=10, Processors=12, GroupWeight=1 }
Service Group attribute:
Prerequisites = { ShrMemSeg=10, Semaphores=5, Processors=6, GroupWeight=1 }
Basically, VCS rations the Limits among the Service Groups already online, so there must be enough of these "Limits" left for another Service Group trying to come online. If there are not enough "Limits" left on the system, the Service Group can't come online on it. Limits and Prerequisites are hard restrictions. If VCS can't decide based on Limits and Prerequisites, VCS will go on and look at Loads and Capacity, if FailOverPolicy or AutoStartPolicy is Load.

3. Available Capacity (AvailableCapacity), total System Capacity (Capacity), both System attributes, and Service Group Load (Load), a Service Group attribute.
When either AutoStartPolicy or FailOverPolicy is "Load", we must talk about Available Capacity, Capacity and current load. Before VCS 2.0, "Load" was a System Attribute calculated by something outside of VCS that can do "hasys -load XX". Now, Load is a Service Group Attribute. A "system load" can still be set by "hasys -load" in VCS 2.0, but now the attribute is "DynamicLoad" (Dynamic System Load). In any case, when VCS must look at capacity and load to bring up a service group, it does this:
AvailableCapacity = Capacity - Current System Load
In other words,
AvailableCapacity = Capacity - (sum of all Load values of all online Service Groups)
or
AvailableCapacity = Capacity - DynamicLoad
"Capacity" and "Load" are values defined by the user. "AvailableCapacity" is calculated by VCS. (A small worked example appears at the end of this section.)

Here are 2 System attributes that are used by the new "loadwarning" trigger (defaults are shown):
LoadWarningLevel = 80   Percentage of Capacity that VCS considers critical.
LoadTimeThreshold = 600   Time in seconds the system must remain at or above LoadWarningLevel before the loadwarning trigger is fired.
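As a worked example of the arithmetic above (the system name, group name and numbers are made up for illustration): if sysA has Capacity = 100 and two Service Groups with Load = 40 and Load = 30 are online on it, then AvailableCapacity = 100 - (40 + 30) = 30. With FailOverPolicy = Load, VCS prefers whichever system currently has the largest AvailableCapacity. The user-defined values are set with the usual commands, for example:

hasys -modify sysA Capacity 100
hagrp -modify groupx Load 40
hasys -display sysA | grep Capacity

The last command should show both Capacity and the VCS-calculated AvailableCapacity.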
--------------------------------------------------------
MULTINICA MONITOR VCS 1.3 2.0

Here is what the MultiNICA monitor script does in VCS 1.3 (and the changes in VCS 2.0).

NOTE: $NumNICs = 2 x (largest number of NICs for MultiNIC on any node). Therefore, if one of the nodes has 2 NICs for MultiNIC, $NumNICs = 4. $NumNICs = 4 in the vast majority of cases since most nodes will dedicate 2 NICs for MultiNIC. $NumNICs is used to make an ordered list of NIC devices to test.

In the following example, hme0 and qfe0 are the MultiNIC devices, the NetworkHost is 192.168.1.1 and the Broadcast IP is 192.168.1.255.

1. The script finds the values of all the attributes from arguments passed to it (see hares -display <MultiNICA resource> to see the ArgListValues).

2. The script finds the active device NIC using GetActiveDevice, which uses "ifconfig -a" and the Base IP.

3. The script checks if the NIC is really active.
(1) If NetworkHosts is not used, run PacketCountTest.
(a) If PingOptimize=1, run netstat -in using GetCurrentStats to compare current network traffic with previous network traffic.
(b) If PingOptimize=0, run netstat -in to get current traffic, then ping the Broadcast IP 5 times, then run netstat -in again, then compare traffic.
(2) If NetworkHosts is used, ping a NetworkHost. Return success if any of the hosts is pingable.
NOTE: In VCS 2.0, if ping fails in this part, set $PingOptimize = 0.

************************************************

4. If failure is detected, try the following test up to 5 times (RetryLimit=5). Do the following up to 5 times:
(1) If NetworkHosts is not used, ping the Broadcast IP 5 times, then run PacketCountTest.
NOTE: In VCS 2.0, NetworkHosts is NOT tested in this part, however, the "ping Broadcast IP 5 times" still applies, and the rest of this section (PacketCountTest) still applies.
(a) If PingOptimize=1, run netstat -in using GetCurrentStats to compare current network traffic with previous network traffic.
(b) If PingOptimize=0, run netstat -in to get current traffic, then ping the Broadcast IP 5 times, then run netstat -in again, then compare traffic.
(2) If NetworkHosts is used, ping a NetworkHost. Return success if any of the hosts is pingable.
NOTE: In VCS 2.0, this part is omitted. See TechNote 243100 for more information on changes in VCS 2.0 for this section.

************************************************

5. If the last test in Step 4 was successful, then exit 110 (resource is online). If failure is still detected, and MonitorOnly=1, then exit 100 (resource is offline), but if MonitorOnly=0, then begin failover and do the following:
(1) Echo "Device hme0 FAILED" and "Acquired a WRITE Lock". Find the Logical IPMultiNIC IP addresses and store them in a table (StoreIPAddresses).
(2) Echo "Bringing down IP addresses". Run BringDownIPAddresses. This brings down the Logical IPs and, if this is Solaris 8, unplumbs the Logical NICs. The Base IP is then brought down and the NIC unplumbed.
(3) Plumb the next NIC. Bring up the new NIC with the Base IP. Echo "Trying to online Device qfe0".
(a) If IfconfigTwice=1, bring down the NIC and bring it up with the Base IP again.
(b) If $ArpDelay > 0, SLEEP for $ArpDelay seconds, then run ifconfig on the NIC to bring up the Broadcast IP. This helps update the ARP tables.
(c) Add routes from RouteOptions if any.
NOTE: In VCS 2.0, set $use_broadcast_ping = 0 before proceeding to Step "5. (4)".
(4) Do the following test loop *up to* $HandshakeInterval/10 times, but exit the test once there is a success:
(a) Echo "Sleeping 5 seconds". SLEEP 5 seconds.
(b) Update the ARP tables on the subnet by doing ifconfig on the NIC to bring up the Broadcast IP.
(c) If NetworkHosts is not used, echo "Pinging Broadcast address 192.168.1.255 on Device qfe0, iteration XX". Run PacketCountTest.
NOTE: In VCS 2.0, the test is "If NetworkHosts is not used *or* $use_broadcast_ping = 1, run PacketCountTest."
(i) If PingOptimize=1, run netstat -in using GetCurrentStats to compare current network traffic with previous network traffic.
(ii) If PingOptimize=0, run netstat -in to get current traffic, then ping the Broadcast IP 5 times, then run netstat -in again, then compare traffic.
(d) If NetworkHosts is used, echo "Pinging 192.168.1.1 with Device qfe0 configured: iteration 1", and ping a NetworkHost. Return success if any of the hosts is pingable.
(e) If ping tests fail, continue with the Step "5. (4)" test loop.
NOTE: In VCS 2.0, set $use_broadcast_ping = 1.
(5) If there is no success after all the tests, echo "Tried the PingTest 10 times" and "The network did not respond", and set PingReallyFailed=1.
Echo "Giving up on NIC qfe0". Bring down and unplumb the NIC. If there is a 3rd NIC, repeat the above steps from "5. (3)", otherwise echo "No more Devices configured. All devices are down. Returning offline" and exit 100 (resource offline).
NOTE: In VCS 2.0, there is an added check to see if the next NIC in the ordered list of NICs is actually the current active NIC that has failed. In VCS 1.3, the monitor script would return to the failover loop in STEP 5.(3) but immediately exit with a false ONLINE. See Incident 68872.

**********************************************

6. If failover is successful, migrate the Logical IPs by running MigrateIPAddresses and checking the Logical IP addresses table. Do the following for each Logical IP:
(a) Plumb the logical interface (if Solaris 8) and bring up the Logical IP.
(b) If IfconfigTwice=1, bring down the Logical IP and bring it up again.
(c) If $ArpDelay > 0, SLEEP for $ArpDelay seconds, then run ifconfig on the NIC to bring up the Broadcast IP. This updates the ARP tables.
(d) Echo "Migrated to Device qfe0" and "Releasing Lock".

---------------------------------------------------------------------
MULTINICA PINGOPTIMIZE NETWORKHOSTS

Basically, MultiNICA tests for NIC failure (or network failure) in 1 of 3 ways at a time.

1. NetworkHosts NOT being used, PingOptimize = 1. These are the defaults.
In this case, MultiNICA simply depends on incoming packets to tell whether a NIC is okay or not. If the packet count increases, it thinks the NIC is up; if the packet count does not increase, it thinks the NIC is down. It will check a few times before declaring the NIC dead and beginning failover to the next NIC. This is the simplest MultiNICA monitoring. On very quiet networks, it might generate a false reading. It might do a NIC-to-NIC failover if it sees no packets coming in after several retries.

2. NetworkHosts is being used.
In this case, the MultiNICA monitor script ignores what you have for PingOptimize. Whether you have PingOptimize = 0 or 1 will have no effect on the monitoring. MultiNICA will ping the IP address(es) listed in the NetworkHosts attribute to determine if a NIC is up or down. It doesn't depend on knowing the Broadcast address to test a failed NIC (plumb up and bring up the Base IP). This is a pretty popular option as it allows you to test connectivity to specific hosts (like a router). A minimal configuration sketch for this option is shown after this list.

3. NetworkHosts NOT being used, PingOptimize = 0.
In this case, MultiNICA will try to find out the broadcast address of the NIC and ping it to generate network traffic. It then runs a packet count test like in #1 above. Because network traffic is generated by the monitor script, this test is sometimes more reliable. This option is not too popular in places where the network admins don't want a lot of broadcast traffic being generated. In the VCS 2.0 and 1.3 monitor scripts, if #3 is what you have, MultiNICA will not plumb up and online your Base IP if the interface is already unplumbed (e.g., by VCS).
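Here is a minimal main.cf sketch for option #2 (NetworkHosts in use). The device names, netmask and addresses are carried over from the earlier MultiNICA example, and 10.10.10.5 (the router from that example) is an assumed NetworkHost; substitute values for your own network:

MultiNICA mul_nic (
   Device @sysA = { hme0 = "10.10.10.1", qfe0 = "10.10.10.1" }
   Device @sysB = { hme0 = "10.10.10.2", qfe0 = "10.10.10.2" }
   NetMask = "255.255.255.192"
   NetworkHosts = { "10.10.10.5" }
   )

With NetworkHosts set, the monitor pings that host to decide whether the NIC is up, and PingOptimize is ignored, as described in #2 above.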
---------------------------------------------------------------------
DISKGROUP NOAUTOIMPORT IMPORT DEPORT

The DiskGroup Agent online script imports Disk Groups with a -t option. This sets the "noautoimport" flag to "on", so that on reboots, the Disk Group will not be automatically imported by VxVM outside of VCS. Because of this, the Agent will first detect if a Disk Group it needs to import has already been imported, and if it was imported without the -t option. If the Disk Group was imported without the -t option, the online will intentionally fail. This is a safety feature to prevent split brains.

If a Disk Group was imported without the -t option, the "noautoimport" flag will be set to "off". The Disk Group should be deported before allowing VCS to import it.

To see the flag, do this:
vxprint -at <Disk Group>

EXAMPLE:
# vxprint -at oradg
Disk group: oradg
dg oradg tutil0=" tutil1=" tutil2=" import_id=0.2635 real_name=oradg comment=" putil0=" putil1=" putil2=" dgid=965068944.1537.anandraj rid=0.1025 update_tid=0.1030 disabled=off noautoimport=on nconfig=default nlog=default base_minor=70000 version=60 activation=read-write diskdetpolicy=invalid movedgname= movedgid= move_tid=0.0

The "noautoimport" flag is "on", and that is what should be seen in Disk Groups controlled by VCS.

------------------------------------------------------------------
GAB LLT PORTS

a GAB internal use
b I/O Fencing
d ODM (Oracle Disk Manager)
f CFS (VxFS cluster feature)
h VCS engine (HAD)
j vxclk monitor port
k vxclk synch port
l vxtd (SCRP) port
m vxtd replication port
o vcsmm (Oracle RAC/OPS membership module)
q qlog (VxFS QuickLog)
s Storage Appliance
t Storage Appliance
u CVM (Volume Manager cluster feature)
v CVM
w CVM
x GAB test user client
z GAB test kernel client

--------------------------------------------------
SCRIPTPRIORITY NICE VALUES PROCESS

Here is a table showing equivalent nice values between ScriptPriority, ps and top.

ScriptPriority   ps -elf nice value   top nice value
60               0                    -20
20               14                   -6
0                20                   0
-20              26                   6
-60              39                   20

--------------------------------------------------
VERITAS TCP PORTS

Here are some TCP ports used by VCS and related products:
8181   VCS and GCM web server
14141  VCS engine
14142  VCS engine test
14143  gabsim
14144  notifier
14145  GCM port
14147  GCM slave port
14149  tdd Traffic Director port
14150  cmd server
14151  GCM DNS
14152  GCM messenger
14153  VCS Simulator
15151  VCS GAB TCP port

---------------------------------------------------
SNMP MONITORING NOTIFIERMNGR AGENT

VCS 2.0 introduced a new Agent called NotifierMngr. This agent can send SNMP traps to your site's SNMP console. Here is an example of a main.cf on ClusterA sending SNMP traps to server "draconis".

NotifierMngr ntfr (
   PathName = "/opt/VRTSvcs/bin/notifier"
   SnmpConsoles = { draconis = Information }
   SmtpServer = localhost
   SmtpRecipients = { "root@localhost" = SevereError }
   )

On ClusterA, I can run commands like:
hagrp -online testGroup -sys tonic
hagrp -switch testGroup -to gin

On draconis, I can run the Solaris snoop command and see the SNMP traffic coming from ClusterA:

draconis # snoop -P -d hme0 port 162
Using device /dev/hme (non promiscuous)
gin -> draconis UDP D=162 S=39012 LEN=408
gin -> draconis UDP D=162 S=39012 LEN=439
gin -> draconis UDP D=162 S=39012 LEN=409
gin -> draconis UDP D=162 S=39012 LEN=406

You can also download SNMP software from the internet, like UCD-snmp for Solaris 8, and watch the trap information:

draconis # ./snmptrapd -P -e -d -n
2002-08-27 10:07:59 UCD-snmp version 4.2.3 Started.
Received 431 bytes from 10.140.16.16:39012
0000: 30 82 01 AB 02 01 01 04 06 70 75 62 6C 69 63 A7 0........public.
0016: 82 01 9C 02 01 01 02 01 00 02 01 00 30 82 01 8F ............0...
0032: 30 82 00 0F 06 08 2B 06 01 02 01 01 03 00 43 03 0.....+.......C.
0048: 1F 47 D0 30 82 00 1B 06 0A 2B 06 01 06 03 01 01 .G.0.....+......
0064: 04 01 00 06 0D 2B 06 01 04 01 8A 16 03 08 0A 02 .....+..........
0080: 02 07 30 82 00 11 06 0C 2B 06 01 04 01 8A 16 03 ..0.....+.......

You can download such a tool from places like here:
http://sourceforge.net/project/showfiles.php?group_id=12694

If your SNMP console is *inside* the cluster, the SNMP traps may be sent through the loopback 127.0.0.1 device. Solaris snoop cannot listen on that device, so you won't see any SNMP traffic using snoop.
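If you prefer the command line to editing main.cf, a resource like the ntfr example above could be added with something along these lines. This is only a sketch: the ClusterService group name is an assumption (it is the group VCS 2.0 normally uses for cluster-level resources), and the association attributes follow the same "-add <Key> <Value>" pattern shown earlier in this guide:

haconf -makerw
hares -add ntfr NotifierMngr ClusterService
hares -modify ntfr PathName "/opt/VRTSvcs/bin/notifier"
hares -modify ntfr SnmpConsoles -add draconis Information
hares -modify ntfr SmtpServer localhost
hares -modify ntfr SmtpRecipients -add "root@localhost" SevereError
hares -modify ntfr Enabled 1
haconf -dump -makero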
----------------------------------------------------------
LLT PACKETS SNOOP NETWORK

You can use a packet sniffer, like Solaris's snoop, to see if VCS is flooding your network with LLT packets. For example, you can do this on a machine outside of the cluster:

# snoop -Pr -t d -d eri0 -x 0,36 ethertype 0xcafe
Using device /dev/eri (non promiscuous)
 0.00000 ? -> (broadcast) ETHER Type=CAFE (Unknown), size = 70 bytes
 0: ffff ffff ffff 0800 20cf 7e73 cafe 0101 ........ .~s.þ..
 16: f40a 0000 0001 ffff 8c00 0000 0038 0000 .............8..
 32: 0000 8000 ....
 0.01630 ? -> (broadcast) ETHER Type=CAFE (Unknown), size = 70 bytes
 0: ffff ffff ffff 0800 20ab 4f1d cafe 0101 ........ .O..þ..
 16: 0b0a 0000 0001 ffff 8c00 0000 0038 0000 .............8..
 32: 0000 8000 ....

The 12 hex characters to the left of the string "cafe" are the source MAC address for that packet. So in the example above, you may want to look for machines with interfaces with MAC addresses of 08:00:20:cf:7e:73 and 08:00:20:ab:4f:1d.

You can find other pieces of information in the line labeled "16:" of the snoop hex dump. For example, in the second packet above, we find the following:
1. Source cluster-ID = 0b
2. The "0a" means this is a heartbeat packet
3. Source node-ID = 0001

By default, LLT packets are 70 bytes in size.
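If you would rather capture a sample of LLT traffic and look at it later than watch it live, snoop can write to and read from a capture file. The interface name, packet count and file path below are only examples:

# capture 1000 LLT frames to a file
snoop -d eri0 -c 1000 -o /tmp/llt.cap ethertype 0xcafe
# examine the capture later, with the same hex dump options as above
snoop -i /tmp/llt.cap -x 0,36 | more
-----------------------------------------------------------------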