UnixPedia : HPUX / LINUX / SOLARIS: June 2014

Friday, June 13, 2014

VERITAS : File system needs a full fsck when an I/O error occurs while reading the inode list



File system needs a full fsck when an I/O error occurs while reading the inode list.
Overview
File system needs a full fsck when an I/O error occurs while reading the inode list.
Procedures
ERROR MESSAGE:

Jun 11 21:49:29 esxdbp14 vmunix: vxfs: WARNING: msgcnt 3 mesg 079: V-2-79: vx_tranuninode - /db04 file system inode 120 marked bad ondisk
Jun 11 21:49:31 esxdbp14 vmunix: vxfs: WARNING: msgcnt 4 mesg 016: V-2-16: vx_ilisterr: vx_iupdat_local_0 - /db04 file system error reading inode 120
Jun 11 22:24:15 esxdbp14 vmunix: vxfs: WARNING: msgcnt 6 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6
Jun 11 22:24:19 esxdbp14 vmunix: vxfs: WARNING: msgcnt 7 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6
Jun 11 22:26:51 esxdbp14 vmunix: vxfs: WARNING: msgcnt 8 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6
Jun 11 22:26:52 esxdbp14 vmunix: vxfs: WARNING: msgcnt 9 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6
Jun 11 22:27:31 esxdbp14 vmunix: vxfs: WARNING: msgcnt 10 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6
Jun 11 22:27:47 esxdbp14 vmunix: vxfs: WARNING: msgcnt 11 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6


DETAILED DESCRIPTION

An I/O error occurred while reading the inode list. The VX_FULLFSCK flag is set.

When inode information is no longer dependable, the kernel marks it bad on disk. The most common reason for marking an inode bad is a disk I/O failure. If there is an I/O failure in the inode list, on a directory block, or an indirect address extent, the integrity of the data in the inode, or the data the kernel tried to write to the inode list, is questionable. In these cases, the disk driver prints an error message and one or more inodes are marked bad.

The kernel also marks an inode bad if it finds a bad extent address, invalid inode fields, or corruption in directory data blocks during a validation check. A validation check failure indicates the file system has been corrupted. This usually occurs because a user or process has written directly to the device or used fsdb to change the file system.

The VX_FULLFSCK flag is set in the super-block so fsck will do a full structural check the next time it is run.


We verified the disk information related to the /db04 file system.

The disks below were used to create the /db04 file system, and one of them is showing a "failing" status.
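The disk group listings below can be reproduced with VxVM's disk listing command; the `-o alldgs` option (an assumption about how this output was gathered) shows disks across all disk groups, including shared ones:

```shell
# List all disks with their access name, disk group membership and
# status; disks in trouble show "failing" in the STATUS column.
vxdisk -o alldgs list
```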


From the racdg disk group listing:

c5t1d0       auto:cdsdisk    racdisk6     racdg        online shared failing
c5t0d7       auto:cdsdisk    racdisk5     racdg        online shared
c5t1d4       auto:cdsdisk    racdisk0     racdg        online shared
c3t2d6       auto:cdsdisk    racdisk13    racdg        online shared
c3t3d0       auto:cdsdisk    racdisk15    racdg        online shared
c5t0d3       auto:cdsdisk    racdisk1     racdg        online shared
c5t0d4       auto:cdsdisk    racdisk2     racdg        online shared
c7t2d3       auto:cdsdisk    racdisk18    racdg        online shared
c7t2d4       auto:cdsdisk    racdisk19    racdg        online shared
c5t0d6       auto:cdsdisk    racdisk4     racdg        online shared


From the crsdg disk group listing:

c5t1d6       auto:cdsdisk    crsdsk1      crsdg        online shared failing
c5t1d7       auto:cdsdisk    crsdsk2      crsdg        online shared failing
c5t2d0       auto:cdsdisk    crsdsk3      crsdg        online shared failing


Veritas Storage Foundation lists the status of a disk as "failing" in response to errors that are detected while reading or writing to a disk. The status is designed to draw administrative attention to disks that have experienced errors. Reviewing the status of the disks in the disk array, as well as any connected storage area network (SAN) components, is recommended to determine if a hardware problem exists.

Since it is possible for a disk to be flagged as "failing" in response to an isolated event, this status does not necessarily mean that the disks have a hardware problem.


SOLUTION:

Check the console log for I/O errors. If the problem is a disk failure, replace the disk. If the problem is not related to an I/O failure, find out how the disk became corrupted. If no user or process is writing to the device, report the problem to your customer support organization. In either case, unmount the file system and use fsck to run a full structural check.
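The unmount and full structural check described above can be sketched as follows; the volume name db04vol and disk group racdg are assumptions, so substitute the actual VxVM volume backing /db04:

```shell
# Unmount the affected file system first; a full fsck must not be run
# on a mounted VxFS file system.
umount /db04

# Full structural check of the VxFS file system (volume name is a
# placeholder); this can take a long time on large file systems.
fsck -F vxfs -o full,nolog /dev/vx/rdsk/racdg/db04vol

# Remount once the check completes cleanly.
mount -F vxfs /dev/vx/dsk/racdg/db04vol /db04
```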

1. Check the disk array as well as any connected SAN components for hardware problems.
2. Review the messages log, for the operating system, for events that refer to disk read and write errors.
3. If no persistent I/O errors are discovered, the "failing" status may have been triggered by a transient error rather than a truly failing disk. In that case, you can simply clear the status. If the failing status keeps reappearing on the same disk, it may be a sign of a genuine hardware problem with the disk or with the SAN connectivity.

Verify the external storage device status with the SAN team.

To clear the failing flag, run the commands below:


#vxedit -g racdg set failing=off racdisk6

#vxedit -g crsdg set failing=off crsdsk1      

#vxedit -g crsdg set failing=off crsdsk2

#vxedit -g crsdg set failing=off crsdsk3      
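After clearing the flag, it is worth confirming that the "failing" keyword is gone from the status column for the affected disks:

```shell
# Re-list the disks in each group; the STATUS column should now read
# "online shared" without the "failing" keyword.
vxdisk -g racdg list
vxdisk -g crsdg list
```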


Keywords.
veritas

Monday, June 9, 2014

HPUX : How to configure Quorum server for the cluster



How to configure Quorum server for the cluster
Overview
How to configure Quorum server for the cluster : the Quorum server acts as a tiebreaker in a Serviceguard setup, to avoid split-brain syndrome.
Procedures

                1. Choose a node or nodes that are not part of this cluster. The nodes can be running either HP-UX or Linux.

                2. Install the Quorum Server software (B8467BA version A.02.00), either from the HP Serviceguard Distributed Components CD or as a free download from http://software.hp.com under "High Availability" (the last time I looked, it was titled "Serviceguard Quorum Server"). Use the appropriate tool, i.e., swinstall or rpm, to install the product.

                3. Put an entry in /etc/inittab so the daemon gets started at boot time and gets restarted if necessary (the respawn action in /etc/inittab). The entry should look something like this:

                                qs:345:respawn:/usr/lbin/qs >> /var/adm/qs/qs.log 2>&1

                4. Now we can start the qs daemon by running init q.
                5. Check that the qs daemon is running by monitoring the file /var/adm/qs/qs.log.
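Once the daemon is up, the cluster must be pointed at it when the cluster configuration is generated. A sketch, assuming a quorum host named qshost and two cluster nodes node1 and node2 (all hostnames and paths here are placeholders):

```shell
# On the quorum server host: authorize the cluster nodes to connect.
echo "node1.example.com" >> /etc/cmcluster/qs_authfile
echo "node2.example.com" >> /etc/cmcluster/qs_authfile

# On a cluster node: generate a cluster ASCII file that uses the
# quorum server (-q) instead of a cluster lock disk, then apply it.
cmquerycl -q qshost.example.com -n node1 -n node2 -C /etc/cmcluster/cluster.ascii
cmapplyconf -C /etc/cmcluster/cluster.ascii
```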

Keywords.
Cluster ,quorum, init.

Thursday, June 5, 2014

HPUX : After a firmware update, the rx7620 would not boot because the system defaulted to nPars mode and not vPars.



After a firmware update, the rx7620 would not boot because the system defaulted to nPars mode and not vPars.
Overview
After a firmware update, the rx7620 would not boot because the system defaulted to nPars mode and not vPars.
Procedures
Upon trying to boot the system using the defined boot menu option, the system fails to boot with:
(C) Copyright 2004 Hewlett-Packard Development Company, L.P. All rights reserved
HP-UX Boot Loader for IPF -- Revision 2.027
Press Any Key to interrupt Autoboot
\efi\hpux\AUTO ==> boot /stand/vpmon -a
Seconds left till autoboot - 0
AUTOBOOTING...> System Memory = 12151 MB
loading section 0
... (complete)
loading section 1
................................................................................
......... (complete)
loading symbol table
loading System Directory (boot.sys) to MFS
.....Launching /stand/vpmon
SIZE: Text:1126K + Data:45108K + BSS:18958K = Total:65194K
Console is on a Serial Device
ERROR: Please use the vparenv(1M) command in HP-UX or the vparconfig(1M) command at the EFI prompt to switch to vPars mode .
Unsupported boot environment for the vPar monitor. Resetting the system!
========
The /EFI/HPUX directory is missing the vparconfig command. The command needed to be reinstalled with efi_cp, copying it from /usr/newconfig/sbin/vparconfig.efi to /EFI/HPUX/vparconfig.efi.
We first booted the server in nPar mode in order to reach a point where we could issue the commands to copy the EFI command to the disk's EFI partition. The disk where the EFI partition resides is c0t6d0s1, and we used the commands:
# lvlnboot -v
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
/dev/dsk/c0t6d0s2 (0/0/0/2/0.6.0) -- Boot Disk
Boot: lvol1 on: /dev/dsk/c0t6d0s2
Root: lvol3 on: /dev/dsk/c0t6d0s2
Swap: lvol2 on: /dev/dsk/c0t6d0s2
Dump: lvol2 on: /dev/dsk/c0t6d0s2, 0

# efi_ls -d /dev/rdsk/c0t6d0s1 /efi/hpux
FileName Last Modified Size
. 5/ 5/2006 0
.. 5/ 5/2006 0
HPUX.EFI 5/ 5/2006 541306
NBP.EFI 5/ 5/2006 24576
AUTO 5/ 8/2006 21

total space 523251712 bytes, free space 519626752 bytes


# efi_cp -d /dev/rdsk/c0t6d0s1 vparconfig.efi /efi/hpux
# efi_cp -d /dev/rdsk/c0t6d0s1 crashdump.efi /efi/hpux

# efi_ls -d /dev/rdsk/c0t6d0s1 /efi/hpux
FileName Last Modified Size
. 5/ 5/2006 0
.. 5/ 5/2006 0
HPUX.EFI 5/ 5/2006 541306
NBP.EFI 5/ 5/2006 24576
AUTO 5/ 8/2006 21
crashdump.efi 4/16/2007 107990
vparconfig.efi 4/16/2007 101897
Total space 523251712 bytes, free space 519626752 bytes
While the system is still running in nPars mode, use the vparenv command to change to vPars mode:
# vparenv -m vpars
The other option would be to reboot the system and enter the EFI shell to run the vparconfig command:
Shell> fs0:

fs0:\> cd efi\hpux
fs0:\EFI\HPUX> vparconfig
Current mode: nPars.

vparconfig supports the following options:
vparconfig
vparconfig reboot vPars
vparconfig reboot nPars

fs0:\EFI\HPUX> vparconfig reboot vpars
Rebooting ...


Affected systems:
rx7620
rx7640
rx8620
rx8640
Superdome SX1000 SD32A SD64A
Superdome SX2000 SD32B SD64B
Keywords.
efi_cp, vparenv,

Tuesday, June 3, 2014

HPUX : PACKAGE EXECUTION STEPS IN SG

During Run Script Execution
Once the package manager has determined that the package can start on a particular node, it
launches the script that starts the package (that is, a package’s control script or master control
script is executed with the start parameter). This script carries out the following steps:
1. Executes any external_pre_scripts.
2. Activates volume groups or disk groups.
3. Mounts file systems.
4. Assigns package IP addresses to the NIC on the node (failover packages only).
5. Executes any customer-defined run commands (legacy packages only) or external_scripts.
6. Starts each package service.
7. Starts up any EMS (Event Monitoring Service) resources needed by the package that were
specially marked for deferred startup.
8. Exits with an exit code of zero (0).
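The start sequence above maps onto recognizable commands inside a legacy control script. A simplified, hypothetical sketch (the volume group, mount point, IP address, subnet and application path are all placeholders; real scripts are generated with cmmakepkg and are far longer):

```shell
# Sketch of the "start" branch of a legacy package control script.
vgchange -a e vgpkg                  # 2. activate the volume group (exclusive)
mount /dev/vgpkg/lvol1 /pkg/data     # 3. mount the package file systems
cmmodnet -a -i 192.0.2.10 192.0.2.0  # 4. add the package IP on its subnet
/pkg/bin/start_app &                 # 6. start the package services
exit 0                               # final step: exit with code zero
```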



 During Halt Script Execution
Once the package manager has detected the failure of a service or package that a failover package
depends on, or when the cmhaltpkg command has been issued for a particular package, the
package manager launches the halt script. That is, a package’s control script or master control
script is executed with the stop parameter.



1. Halts any deferred resources that had been started earlier.
2. Halts all package services.
3. Executes any customer-defined halt commands (legacy packages only) or external_scripts.
4. Removes package IP addresses from the NIC on the node.
5. Unmounts file systems.
6. Deactivates volume groups.
7. Executes any external_pre_scripts.
8. Exits with an exit code of zero (0).
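In practice, these run and halt sequences are triggered through the Serviceguard commands; for example (the package and node names are placeholders):

```shell
# Start the package on a node (runs the control script with "start").
cmrunpkg -n node1 pkg1

# Halt the package (runs the control script with "stop").
cmhaltpkg pkg1

# Confirm the package state after either operation.
cmviewcl -v -p pkg1
```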