UnixPedia : HPUX / LINUX / SOLARIS: UX:vxfs fsck: WARNING: V-3-20837: FILE SYSTEM HAD I/O ERROR(S) ON USER DATA.

Saturday, February 22, 2014

UX:vxfs fsck: WARNING: V-3-20837: FILE SYSTEM HAD I/O ERROR(S) ON USER DATA.

On system tiger (roar), a file system reported I/O errors.
As a result the DB crashed, because the corrupted FS holds the database binaries.
The UNIX team ran fsck, including a full check, on the FS, and the results were negative. It seems
there is no way it can be fixed.

Question: what caused this corruption of the FS on the server, and how and why?
Answer: At first look it seems that either someone damaged the FS or a SAN disk that
carried user data has been removed (we are in real trouble today).

In the hope of fixing the issue I ran the command below (fsck). The command is one character (s versus u) away from describing the situation we were in, and I hoped it would fix it. If everything worked as expected, the world would be a different place.

The production DB was down for more than 3 hours. My comment about FS corruption hit the panic
button, and the team went into critical-issue mode.

[root@tiger:/.root]#
#-> fsck -F vxfs -o full /dev/bc_vg_odd/lv_m01oracle
UX:vxfs fsck: WARNING: V-3-20837: file system had I/O error(s) on user data.
log replay in progress
UX:vxfs fsck: ERROR: V-3-25433: fsck write failure devid = 0, bno = 135153328, off = 0, len = 8192
full file system check required, exiting ...
[root@tiger:/.root]#
#-> fsck -F vxfs -y -o full /dev/bc_vg_odd/lv_m01oracle
UX:vxfs fsck: WARNING: V-3-20837: file system had I/O error(s) on user data.
log replay in progress
UX:vxfs fsck: ERROR: V-3-25433: fsck write failure devid = 0, bno = 135153328, off = 0, len = 8192
full file system check required, exiting ...
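Both runs above stop on the same write failure. A small sketch for confirming that from a saved fsck transcript is below; the function name `fsck_failed_blocks` is my own and not part of any VxFS tooling, and it only assumes the V-3-25433 message layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read fsck output on stdin and print, for every distinct
# failing block, "<times seen> <bno> <len>". It relies only on the
# "bno = N, ... len = N" layout of the V-3-25433 lines shown in this post.
fsck_failed_blocks() {
    awk '
        /V-3-25433/ {
            bno = ""; blen = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "bno") bno  = $(i + 2)   # value follows "bno ="
                if ($i == "len") blen = $(i + 2)   # value follows "len ="
            }
            sub(/,$/, "", bno)                     # drop the trailing comma
            if (bno != "") count[bno " " blen]++
        }
        END { for (k in count) print count[k], k }'
}
```

Feeding it the two runs from this post (for example `fsck_failed_blocks < fsck.log`, where `fsck.log` is a hypothetical capture file) yields a single line, which tells you the failure is tied to one fixed location rather than moving around.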


The VG shows one disk fewer in the active disk count than in the current disk count. I gave the missing disk the name
"ghost disk" (ghosts also exist on servers; if zombies can, why not ghosts?). The ghost disk's logical extents (LEs) are part of the corrupted FS. It seems that today's data growth landed on those extents.
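The disk count comparison can be scripted. The sketch below assumes the HP-UX `vgdisplay` layout with `Cur PV` and `Act PV` lines; the function name `pv_count_check` is hypothetical, and it reads the `vgdisplay` text from stdin so it can be tried on a saved copy of the output first.

```shell
#!/bin/sh
# Hypothetical helper: compare the "Cur PV" and "Act PV" counters found in
# vgdisplay output on stdin. Returns 0 when they match, 1 when a PV is
# missing, 2 when the counters cannot be found.
pv_count_check() {
    awk '
        /^[ \t]*Cur PV/ { cur = $NF }
        /^[ \t]*Act PV/ { act = $NF }
        END {
            if (cur == "" || act == "") {
                print "UNKNOWN: PV counts not found"
                exit 2
            }
            if (cur + 0 != act + 0) {
                printf "MISSING: %d of %d PVs not active\n", cur - act, cur
                exit 1
            }
            printf "OK: all %d PVs active\n", cur
        }'
}

# Example use (VG name taken from the device path in the fsck commands above):
#   vgdisplay /dev/bc_vg_odd | pv_count_check
```

A non-zero return makes it easy to hook into a monitoring script, so a ghost disk is noticed before data growth reaches its extents.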

Recommendation: there is no way to recover the data on the LV, so the decision was made to recreate the LV and restore the data from the BCV FS (when all options are closed, look for the light). The server runs a BCV script as a midnight backup, which has kept the data in good shape.

What is the hard part of the solution? Explaining the story of the corruption. Yes.
