HP Serviceguard NFS Toolkit - "Stale NFS file handle" when Package Failed Over
Overview
HP Serviceguard NFS Toolkit: "Stale NFS file handle" error after a package fails over.
Procedures
Issue
After a package failover, accessing the automounted NFS mount points returns "Stale NFS file handle".
We had already upgraded ONCplus from version B.11.31.10 to B.11.31.11, which ruled out the possibility that this issue is related to QXCR1001067886, titled "After an NFS package re-start, NFS client may not be able to access the NFS mount due to stale file handle error (ESTALE)". Also make sure all network patches are at the latest level.
Scenario:
1) Log in to NODE1 and make sure the package is up:
[formatted]
# date
Fri May 27 16:56:50 wib 2011
# cmviewcl

CLUSTER        STATUS
cluster_ecc    up

  NODE         STATUS       STATE
  NODE1        up           running

    PACKAGE    STATUS       STATE        AUTO_RUN     NODE
    dbciRP1    up           running      enabled      NODE1

  NODE         STATUS       STATE
  NODE2        up           running
[unformatted]
2) Make sure the automount NFS mountpoints are accessible:
[formatted]
# date
Fri May 27 16:56:58 wib 2011
# bdf
Filesystem          kbytes    used   avail %used Mounted on
..snap..
NODE1:/export/usr/sap/trans
                   42958848   77173 40201578    0% /usr/sap/trans
NODE1:/export/sapmnt/RP1
                   14647296 1327971 12487061   10% /sapmnt/RP1
[unformatted]
3) Log in to the other host:
[formatted]
# date
Fri May 27 16:57:13 wib 2011
# rlogin NODE2
# uname -a
HP-UX NODE2 B.11.31 U ia64 1843445582 unlimited-user license
[unformatted]
3) Make sure the automount NFS mount points are also accessible on this host:
[formatted]
# bdf
Filesystem          kbytes    used   avail %used Mounted on
..snap..
NODE1:/export/usr/sap/trans
                   42958848   77173 40201578    0% /usr/sap/trans
NODE1:/export/sapmnt/RP1
                   14647296 1327971 12487061   10% /sapmnt/RP1
[unformatted]
4) Halt the package from NODE2:
[formatted]
# date
Fri May 27 16:57:36 wib 2011
# uname -a
HP-UX NODE2 B.11.31 U ia64 1843445582 unlimited-user license
# cmhaltpkg dbciRP1
Disabling automatic failover for failover packages to be halted.
Halting package dbciRP1
Successfully halted package dbciRP1
One or more packages or package instances have been halted.
The failover packages have AUTO_RUN disabled and no new instance can
start automatically. To allow automatic start, enable AUTO_RUN via
cmmodpkg -e <package_name>
cmhaltpkg: Completed successfully on all packages specified
* syslog NODE2
May 27 16:57:41 syslog: cmhaltpkg dbciRP1
May 27 16:57:42 syslog: Request from root on node NODE2 to halt package dbciRP1
May 27 16:57:42 cmcld[9295]: Request from root on node NODE2 to halt package dbciRP1
May 27 16:57:42 cmcld[9295]: Request from node NODE1 to disable global switching for package dbciRP1.
May 27 16:57:45 cmcld[9295]: (NODE1) Halted package dbciRP1 on node NODE1.
* syslog NODE1
May 27 16:57:43 cmcld[8177]: Request from node NODE1 to disable global switching for package dbciRP1.
May 27 16:57:43 cmcld[8177]: Disabled switching for package dbciRP1.
May 27 16:57:43 cmcld[8177]: Request from node NODE1 to begin the halting process for package dbciRP1 on node NODE1.
May 27 16:57:43 cmcld[8177]: Halting package dbciRP1 on node NODE1 as requested by user.
May 27 16:57:43 cmcld[8177]: Request from node NODE1 to halt package dbciRP1 on node NODE1.
May 27 16:57:43 cmcld[8177]: Executing '/etc/cmcluster/RP1/dbciRP1.control.script stop' for package dbciRP1, as service PKG*78849.
May 27 16:57:43 cmserviced[8181]: Request to stop package dbciRP1
May 27 16:57:43 syslog: cmmodnet -r -i 10.170.12.3 10.170.12.0
May 27 16:57:47 LVM[8907]: vgchange -a n vg11
May 27 16:57:47 LVM[8911]: vgchange -a n vg12
May 27 16:57:47 LVM[8915]: vgchange -a n vg19
May 27 16:57:47 LVM[8919]: vgchange -a n vg02
May 27 16:57:47 LVM[8923]: vgchange -a n vg07
May 27 16:57:47 LVM[8927]: vgchange -a n vg13
May 27 16:57:47 LVM[8931]: vgchange -a n vg14
May 27 16:57:47 LVM[8935]: vgchange -a n vg08
May 27 16:57:47 LVM[8939]: vgchange -a n vg17
May 27 16:57:47 LVM[8943]: vgchange -a n vg15
May 27 16:57:47 LVM[8947]: vgchange -a n vg09
May 27 16:57:47 LVM[8951]: vgchange -a n vg16
May 27 16:57:47 LVM[8955]: vgchange -a n vg18
May 27 16:57:47 LVM[8959]: vgchange -a n vg10
May 27 16:57:47 LVM[8963]: vgchange -a n vg03
May 27 16:57:47 cmserviced[8181]: Package Script for dbciRP1 completed successfully with an exit(0).
May 27 16:57:47 cmcld[8177]: Halted package dbciRP1 on node NODE1.
* dbciRP1.control.script.log of NODE2
########### Node "NODE2": Halting package at Fri May 27 16:56:10 wib 2011 ###########
May 27 16:56:10 - Node "NODE2": Remove IP address 10.170.12.3 from subnet 10.170.12.0
HANFS -- May 27 16:56:10 - Node "NODE2": Unexporting filesystem on /export/sapmnt/RP1
HANFS -- May 27 16:56:10 - Node "NODE2": Unexporting filesystem on /export/usr/sap/trans
HANFS -- May 27 16:56:10 - Node "NODE2": Killing rpc.statd
HANFS -- May 27 16:56:10 - Node "NODE2": Killing rpc.lockd
HANFS -- May 27 16:56:10 - Node "NODE2": Restarting rpc.statd
HANFS -- May 27 16:56:11 - Node "NODE2": Restarting rpc.lockd
May 27 16:56:12 - Node "NODE2": Unmounting filesystem on /dev/vg10/lvol1
...
May 27 16:56:14 - Node "NODE2": Deactivating volume group vg03
Deactivated volume group in Exclusive Mode.
Volume group "vg03" has been successfully changed.
########### Node "NODE2": Package halt completed at Fri May 27 16:56:14 wib 2011 ###########
/etc/cmcluster/RP1/dbciRP1.control.script[369]: - o largefiles,delaylog, nodatainlog: not found.
...
/etc/cmcluster/RP1/dbciRP1.control.script[383]: - o largefiles,delaylog, nodatainlog: not found.
[unformatted]
5) Now run the package on NODE2:
[formatted]
# date
Fri May 27 16:57:47 wib 2011
# uname -a
HP-UX NODE2 B.11.31 U ia64 1843445582 unlimited-user license
# cmviewcl

CLUSTER        STATUS
cluster_ecc    up

  NODE         STATUS       STATE
  NODE1        up           running
  NODE2        up           running

UNOWNED_PACKAGES

    PACKAGE    STATUS       STATE        AUTO_RUN     NODE
    dbciRP1    down         halted       disabled     unowned

# cmrunpkg dbciRP1
Running package dbciRP1 on node NODE2
Successfully started package dbciRP1 on node NODE2
cmrunpkg: All specified packages are running
* dbciRP1.control.script.log of NODE2
########### Node "NODE2": Starting package at Fri May 27 16:58:04 wib 2011 ###########
May 27 16:58:04 - Node "NODE2": Activating volume group vg11 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg11" has been successfully changed.
...
May 27 16:58:04 - Node "NODE2": Checking filesystems:
/dev/vg19/lvol1
/dev/vg02/lvol1
/dev/vg07/lvol1
/dev/vg11/lvol1
/dev/vg12/lvol1
/dev/vg13/lvol1
/dev/vg14/lvol1
/dev/vg08/lvol1
/dev/vg17/lvol1
/dev/vg15/lvol1
/dev/vg09/lvol1
/dev/vg16/lvol1
/dev/vg18/lvol1
/dev/vg10/lvol1
/dev/vg03/lvol1
/dev/vg19/rlvol1:file system is clean - log replay is not required
...
May 27 16:58:05 - Node "NODE2": Mounting /dev/vg19/lvol1 at /export/usr/sap/trans
...
########### Node "NODE2": Package start completed at Fri May 27 16:58:05 wib 2011 ###########
/etc/cmcluster/RP1/dbciRP1.control.script[369]: - o largefiles,delaylog, nodatainlog: not found.
...
/etc/cmcluster/RP1/dbciRP1.control.script[383]: - o largefiles,delaylog, nodatainlog: not found.
[unformatted]
6) Check bdf; it now reports a "Stale NFS file handle" error:
[formatted]
# date
Fri May 27 16:58:06 wib 2011
# uname -a
HP-UX NODE2 B.11.31 U ia64 1843445582 unlimited-user license
# bdf
Filesystem          kbytes    used   avail %used Mounted on
..snap..
bdf: /usr/sap/trans: Stale NFS file handle
bdf: /sapmnt/RP1: Stale NFS file handle
[unformatted]
7) Go back to the first node and check bdf; the Stale NFS message is also seen there:
[formatted]
# uname -a
HP-UX NODE1 B.11.31 U ia64 2515239678 unlimited-user license
# bdf
Filesystem          kbytes    used   avail %used Mounted on
..snap..
bdf: /usr/sap/trans: Stale NFS file handle
bdf: /sapmnt/RP1: Stale NFS file handle
# date
Fri May 27 16:58:23 wib 2011
[unformatted]
Solution
Running ll /dev/vg*/* on both nodes reveals that the major and minor numbers of the shared volumes are not the same.
The names of the volume groups must be unique within the cluster, and the major and minor numbers associated with the volume groups must be the same on all nodes. In addition, the mount points and exported file system names must be the same on all nodes.
The preceding requirements exist because NFS uses the major number, minor number, inode number, and exported directory as part of a file handle to uniquely identify each NFS file. If differences exist between the primary and adoptive nodes, the client's file handle would no longer point to the correct file location after movement of the package to a different node.
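Because the file handle embeds the device major and minor numbers, one quick way to spot a mismatch is to save the ll /dev/vg*/* listing from each node and compare the two. The following is a minimal sketch and not part of the Serviceguard toolkit: compare_devnums is a hypothetical helper name, and it assumes the usual HP-UX ll layout for device files, where the fifth and sixth fields hold the major and minor numbers and the last field holds the path.

```shell
#!/bin/sh
# Compare device major/minor numbers captured from two cluster nodes.
# Input: two files holding the output of `ll /dev/vg*/*`, one per node.
# Assumed layout of each device line (an assumption, check it on your system):
#   mode links owner group MAJOR MINOR month day time-or-year path

compare_devnums() {
    # $1 = listing saved on the primary node
    # $2 = listing saved on the adoptive node
    awk '
        # Only block (b) and character (c) device entries carry device numbers.
        $1 !~ /^[bc]/ { next }
        # First file: remember "major minor" for every device path.
        NR == FNR { first[$NF] = $5 " " $6; next }
        # Second file: report paths whose numbers differ from the first file.
        ($NF in first) && (first[$NF] != ($5 " " $6)) {
            printf "MISMATCH %s: %s vs %s\n", $NF, first[$NF], ($5 " " $6)
        }
    ' "$1" "$2"
}
```

For example, after saving the listing on each node to a file and copying one of them across (the /tmp paths here are only illustrative), run compare_devnums /tmp/ll.NODE1 /tmp/ll.NODE2. No output means every device file present in both listings has the same numbers.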
It is recommended that filesystems used for NFS be created as journaled file systems (FStype vxfs). This ensures
the fastest recovery time in the event of a package switch to another node.
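We fixed the issue by re-creating the volume group definitions with vgexport/vgimport so that the major and minor numbers match on both nodes.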
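A common way to do this on HP-UX is to export a map file from the node whose device numbers are to be kept, then remove and re-import the volume group definition on the other node, using a group file created with the same minor number. The sketch below only prints the commands for review and runs nothing; it is not necessarily the exact sequence used here. print_reimport_plan is a hypothetical helper, the /tmp map-file path is illustrative, and major number 64 assumes LVM version 1.0 volume groups. The package should be halted and the volume group deactivated on the node where it is re-imported.

```shell
#!/bin/sh
# Print (do not run) the commands that would recreate a volume group
# definition on the adoptive node with the same group-file minor number
# as on the primary node.

print_reimport_plan() {
    vg=$1               # volume group name, e.g. vg19
    minor=$2            # group-file minor number read from the primary node
    map=/tmp/$vg.map    # illustrative map-file location (an assumption)

    echo "# On the primary node: write a map file in preview mode"
    echo "vgexport -p -v -s -m $map /dev/$vg"
    echo "# Copy $map to the adoptive node, then on the adoptive node:"
    echo "vgexport /dev/$vg"
    echo "mkdir /dev/$vg"
    # Major number 64 is an assumption (LVM version 1.0 volume groups).
    echo "mknod /dev/$vg/group c 64 $minor"
    echo "vgimport -v -s -m $map /dev/$vg"
}
```

Call it as print_reimport_plan vg19 <minor>, taking the minor number from ll /dev/vg19/group on the node you treat as the reference, and repeat for each shared volume group. Afterwards, run ll /dev/vg*/* on both nodes again to confirm the numbers match before restarting the package.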
Vgimport, vgexport, cmviewcl, NFS stale handle