UnixPedia : HPUX / LINUX / SOLARIS: VERITAS: File system needs a full fsck when an I/O error occurs while reading the inode list

Overview

A VxFS file system needs a full fsck after an I/O error occurs while reading the inode list. This note walks through one such case on the /db04 file system.

Procedures

ERROR MESSAGE:

Jun 11 21:49:29 esxdbp14 vmunix: vxfs: WARNING: msgcnt 3 mesg 079: V-2-79: vx_tranuninode - /db04 file system inode 120 marked bad ondisk

Jun 11 21:49:31 esxdbp14 vmunix: vxfs: WARNING: msgcnt 4 mesg 016: V-2-16: vx_ilisterr: vx_iupdat_local_0 - /db04 file system error reading inode 120

Jun 11 22:24:15 esxdbp14 vmunix: vxfs: WARNING: msgcnt 6 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6

Jun 11 22:24:19 esxdbp14 vmunix: vxfs: WARNING: msgcnt 7 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6

Jun 11 22:26:51 esxdbp14 vmunix: vxfs: WARNING: msgcnt 8 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6

Jun 11 22:26:52 esxdbp14 vmunix: vxfs: WARNING: msgcnt 9 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6

Jun 11 22:27:31 esxdbp14 vmunix: vxfs: WARNING: msgcnt 10 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6

Jun 11 22:27:47 esxdbp14 vmunix: vxfs: WARNING: msgcnt 11 mesg 008: V-2-8: vx_direrr: vx_readdir2_3 - /db04 file system dir inode 16385 dev/block 0/22490472 dirent inode 0 error 6
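When the log is long, it can help to collapse the repeated warnings into one line per message number and mount point. The following is a minimal sketch, not a Veritas tool: the function name `summarize_vxfs_warnings` is made up for illustration, and it assumes the warnings follow the same layout as the lines above (`mesg NNN:` followed later by `<mountpoint> file system`).

```shell
#!/bin/sh
# Summarize VxFS warnings read from stdin.
# Output: one line per (message number, mount point) as "count mesg mountpoint".
# Assumption: each warning contains "mesg NNN:" and "<mountpoint> file system".
summarize_vxfs_warnings() {
    awk '
        /vxfs: WARNING:/ {
            mesg = ""; fs = ""
            for (i = 1; i <= NF; i++) {
                # The field after the word "mesg" is the message number, e.g. "079:"
                if ($i == "mesg") { mesg = $(i + 1); sub(/:$/, "", mesg) }
                # The field before "file system" is the mount point, e.g. "/db04"
                if ($i == "file" && $(i + 1) == "system" && fs == "") { fs = $(i - 1) }
            }
            if (mesg != "" && fs != "") count[mesg " " fs]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k2,2 -k3,3
}
```

On HP-UX the function would typically be fed from the system log, for example `summarize_vxfs_warnings < /var/adm/syslog/syslog.log` (the path is an assumption; adjust it for your platform).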

DETAILED DESCRIPTION

An I/O error occurred while reading the inode list. The VX_FULLFSCK flag is set.

When inode information is no longer dependable, the kernel marks it bad on disk. The most common reason for marking an inode bad is a disk I/O failure. If there is an I/O failure in the inode list, on a directory block, or an indirect address extent, the integrity of the data in the inode, or the data the kernel tried to write to the inode list, is questionable. In these cases, the disk driver prints an error message and one or more inodes are marked bad.

The kernel also marks an inode bad if it finds a bad extent address, invalid inode fields, or corruption in directory data blocks during a validation check. A validation check failure indicates the file system has been corrupted. This usually occurs because a user or process has written directly to the device or used fsdb to change the file system.

The VX_FULLFSCK flag is set in the super-block so fsck will do a full structural check the next time it is run.

We checked the disks that back the /db04 file system.

The disks below were used to create the /db04 file system, and one of them shows a status of "failing".

From the racdg disk group:

c5t1d0 auto:cdsdisk racdisk6 racdg online shared failing

c5t0d7 auto:cdsdisk racdisk5 racdg online shared

c5t1d4 auto:cdsdisk racdisk0 racdg online shared

c3t2d6 auto:cdsdisk racdisk13 racdg online shared

c3t3d0 auto:cdsdisk racdisk15 racdg online shared

c5t0d3 auto:cdsdisk racdisk1 racdg online shared

c5t0d4 auto:cdsdisk racdisk2 racdg online shared

c7t2d3 auto:cdsdisk racdisk18 racdg online shared

c7t2d4 auto:cdsdisk racdisk19 racdg online shared

c5t0d6 auto:cdsdisk racdisk4 racdg online shared

From the crsdg disk group:

c5t1d6 auto:cdsdisk crsdsk1 crsdg online shared failing

c5t1d7 auto:cdsdisk crsdsk2 crsdg online shared failing

c5t2d0 auto:cdsdisk crsdsk3 crsdg online shared failing

Veritas Storage Foundation lists the status of a disk as "failing" in response to errors that are detected while reading or writing to a disk. The status is designed to draw administrative attention to disks that have experienced errors. Reviewing the status of the disks in the disk array, as well as any connected storage area network (SAN) components, is recommended to determine if a hardware problem exists.

Since it is possible for a disk to be flagged as "failing" in response to an isolated event, this status does not necessarily mean that the disks have a hardware problem.
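To see at a glance how the flagged disks are spread across disk groups, the listing can be tallied per group. This is a rough sketch under stated assumptions: the function name `failing_per_group` is hypothetical, and the input is assumed to have the column layout shown above (device, type, disk name, disk group, then one or more status words).

```shell
#!/bin/sh
# Count disks flagged "failing" per disk group.
# Reads a disk listing on stdin; prints "<group> <failing>/<total> failing".
# Assumption: columns are DEVICE TYPE DISK GROUP STATUS..., as in the listing above.
failing_per_group() {
    awk '
        # Skip the header line and disks that belong to no disk group ("-")
        NF >= 5 && $1 != "DEVICE" && $4 != "-" {
            total[$4]++
            for (i = 5; i <= NF; i++)
                if ($i == "failing") { bad[$4]++; break }
        }
        END { for (g in total) print g, (bad[g] + 0) "/" total[g], "failing" }
    ' | sort
}
```

A typical invocation would be `vxdisk -o alldgs list | failing_per_group`. If every disk in one group is flagged at the same time, that may point at a shared component such as a SAN path rather than at the individual disks, though this is only a heuristic.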

SOLUTION:

Check the console log for I/O errors. If the problem is a disk failure, replace the disk. If the problem is not related to an I/O failure, find out how the disk became corrupted. If no user or process is writing to the device, report the problem to your customer support organization. In either case, unmount the file system and use fsck to run a full structural check.
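The unmount and full structural check described above can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is executed. Several things here are assumptions: the function name `print_full_fsck_plan` is made up, the `-F vxfs` option form is the HP-UX and Solaris one (Linux uses `-t vxfs`), and the volume name is not given in this note, so it must be supplied by the caller.

```shell
#!/bin/sh
# Print (do not run) the commands for a full VxFS structural check.
# $1 = mount point, $2 = disk group, $3 = volume name.
# Assumption: the volume is reachable under /dev/vx/rdsk and /dev/vx/dsk.
print_full_fsck_plan() {
    mnt=$1; dg=$2; vol=$3
    # The file system must be unmounted before fsck runs
    echo "umount $mnt"
    # -o full forces the full structural check; -y answers yes to repairs
    echo "fsck -F vxfs -o full -y /dev/vx/rdsk/$dg/$vol"
    echo "mount -F vxfs /dev/vx/dsk/$dg/$vol $mnt"
}
```

For example, `print_full_fsck_plan /db04 racdg db04vol` prints the three commands for /db04, where `db04vol` is a placeholder for the real volume name. On a cluster-mounted file system the unmount would likely have to happen on every node first; that case is not covered by this sketch.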

1. Check the disk array as well as any connected SAN components for hardware problems.

2. Review the operating system's messages log for events that refer to disk read and write errors.

3. If no persistent I/O errors are discovered, the "failing" status may have been triggered by a transient error rather than a truly failing disk. In this case, you can simply clear the status. If the failing status keeps reappearing for the same disk, it may be a sign of a genuine hardware problem with the disk or with the SAN connectivity.

Verify the status of the external storage device with the SAN team.

To clear the failing flag, run the commands below.

# vxedit -g racdg set failing=off racdisk6

# vxedit -g crsdg set failing=off crsdsk1

# vxedit -g crsdg set failing=off crsdsk2

# vxedit -g crsdg set failing=off crsdsk3
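When many disks are flagged, the vxedit commands can be generated from the disk listing instead of being typed by hand. The sketch below only prints the commands for review and does not run them. The function name `print_clear_failing_cmds` is hypothetical, and the input is assumed to use the same columns as the listings earlier in this note (device, type, disk name, disk group, status words).

```shell
#!/bin/sh
# Generate one "vxedit ... set failing=off" command per disk flagged "failing".
# Reads a disk listing on stdin and prints the commands; nothing is executed.
# Assumption: field 3 is the disk name and field 4 is the disk group.
print_clear_failing_cmds() {
    awk '
        NF >= 5 && $4 != "-" {
            for (i = 5; i <= NF; i++) {
                if ($i == "failing") {
                    print "vxedit -g " $4 " set failing=off " $3
                    break
                }
            }
        }
    '
}
```

A possible use is `vxdisk -o alldgs list | print_clear_failing_cmds`, followed by a review of the printed commands. After running them, list the disks again and confirm that the "failing" word no longer appears in the status column.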

REFERENCE:

http://www.symantec.com/business/support/index?page=content&id=HOWTO24530

http://www.symantec.com/business/support/index?page=content&id=HOWTO24942

http://www.symantec.com/business/support/index?page=content&id=HOWTO24705

http://www.symantec.com/business/support/index?page=content&id=TECH61915
