The following steps can be used to handle these situations efficiently:
Case 1: When one of the storage paths goes “offline”
Refer to the instructions on resetting an HBA port under “Driver Specific HBA Reset Method”.
#-> adapter_info
/sys/class/scsi_host/host2: wwnn=0x20000090fa504ad2 wwpn=0x10000090fa504ad2 state=Offline
/sys/class/scsi_host/host3: wwnn=0x20000090fa504ad3 wwpn=0x10000090fa504ad3 state=Online
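adapter_info appears to be a site-local helper rather than a standard command. For systems where it is not installed, a rough equivalent can be sketched from the standard fc_host attributes in sysfs (node_name, port_name, port_state). The function name list_fc_hosts and the SYSFS_ROOT override are assumptions made for illustration, not part of the original tooling:

```shell
# Hypothetical stand-in for adapter_info: list FC HBA ports and their state.
# SYSFS_ROOT is an illustrative override so the function can be tried
# against a copy of the tree; it defaults to /sys on a real system.
list_fc_hosts() {
    local root="${SYSFS_ROOT:-/sys}" host name
    for host in "$root"/class/fc_host/host*; do
        [ -d "$host" ] || continue      # glob did not match: no FC hosts
        name="${host##*/}"              # e.g. host2
        printf '%s: wwnn=%s wwpn=%s state=%s\n' \
            "$root/class/scsi_host/$name" \
            "$(cat "$host/node_name")" \
            "$(cat "$host/port_name")" \
            "$(cat "$host/port_state")"
    done
}
```

Calling list_fc_hosts with no arguments prints one line per FC host in the same layout as the output above.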
Determine if the driver supports resetting the HBA.
[root@system.fnf.com:/sys/class/fc_host/host2]#
#-> modinfo lpfc | grep reset
parm: lpfc_enable_hba_reset:Enable HBA resets from the driver. (uint)
Verify the driver parameter is configured as needed.
This parameter is enabled by default.
The parameter itself is exposed under the scsi_host class within sysfs:
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> pwd
/sys/class/fc_host/host2/device/scsi_host:host2
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> ls -l |grep -i lpfc_enable_hba_reset
-r--r--r-- 1 root root 4096 Sep 1 08:15 lpfc_enable_hba_reset
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> cat lpfc_enable_hba_reset
1
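The same check can be wrapped in a small function for use in scripts. This is a minimal sketch that assumes the attribute is reachable as /sys/class/scsi_host/<host>/lpfc_enable_hba_reset (the same directory used for issue_reset later on) and that a value of 1 means enabled, as in the output above. The function name and the SYSFS_ROOT override are hypothetical:

```shell
# Sketch: succeed if driver-initiated HBA resets are enabled for a host.
# Returns 0 if the attribute reads 1, 1 if it reads anything else,
# and 2 if the attribute is missing (e.g. not an lpfc host).
hba_reset_enabled() {
    local attr="${SYSFS_ROOT:-/sys}/class/scsi_host/$1/lpfc_enable_hba_reset"
    [ -r "$attr" ] || return 2
    [ "$(cat "$attr")" = "1" ]
}
```

For example, `hba_reset_enabled host2 && echo "reset allowed"` prints the message only when the parameter is set.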
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> adapter_info
/sys/class/scsi_host/host2: wwnn=0x20000090fa504ad2 wwpn=0x10000090fa504ad2 state=Offline
/sys/class/scsi_host/host3: wwnn=0x20000090fa504ad3 wwpn=0x10000090fa504ad3 state=Online
Reset the HBA via the SCSI host/driver:
To reset the lpfc HBA in a RHEL 6 environment, echo the string 'selective' to the host's issue_reset attribute:
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> echo 'selective' > /sys/class/scsi_host/host2/issue_reset
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> adapter_info
/sys/class/scsi_host/host2: wwnn=0x20000090fa504ad2 wwpn=0x10000090fa504ad2 state=Online
/sys/class/scsi_host/host3: wwnn=0x20000090fa504ad3 wwpn=0x10000090fa504ad3 state=Online
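The reset and the follow-up state check can be combined into one step. The sketch below writes 'selective' to issue_reset as shown above and then polls the fc_host port_state attribute until it reads Online. The function name, the default of 30 one-second attempts, and the SYSFS_ROOT override are assumptions for illustration:

```shell
# Sketch: issue a selective reset on one host, then wait for the port
# to report Online. Usage: reset_and_wait <host> [max_attempts]
reset_and_wait() {
    local host="$1" tries="${2:-30}" root="${SYSFS_ROOT:-/sys}"
    local state_file="$root/class/fc_host/$host/port_state"
    echo selective > "$root/class/scsi_host/$host/issue_reset" || return 1
    while [ "$tries" -gt 0 ]; do
        [ "$(cat "$state_file" 2>/dev/null)" = "Online" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1    # port did not come back within the allowed attempts
}
```

Run as root, `reset_and_wait host2` returns success once the port is back Online and failure if it is still down after the timeout.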
[root@system.fnf.com:/sys/class/fc_host/host2/device/scsi_host:host2]#
#-> multipath -ll
mpathc (360060e801606a300000106a300004835) dm-4 HP,OPEN-V
[size=65G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=1][active]
\_ 3:0:0:1 sdd 8:48 [active][ready]
\_ 2:0:0:1 sdb 8:16 [active][ready]
mpathb (360060e801606a300000106a300004834) dm-5 HP,OPEN-V
[size=70G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=1][active]
\_ 3:0:0:0 sdc 8:32 [active][ready]
\_ 2:0:0:0 sda 8:0 [active][ready]
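To confirm from a script that every path is healthy again, the path lines of this output can be counted. This is only a sketch: it assumes path lines contain an H:C:T:L address followed by an sd device name and the word "ready" when healthy, as in the output above. The multipath -ll layout differs between multipath-tools versions, so the patterns may need adjusting. The function name is hypothetical:

```shell
# Sketch: read `multipath -ll` output on stdin and print the number of
# path lines that are not marked ready.
count_unready_paths() {
    grep -E '[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+ ' | { grep -vc 'ready' || true; }
}
```

For example, `multipath -ll | count_unready_paths` prints 0 when all paths are ready.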
Case 2: When multipath -ll hangs and the server’s load becomes very high
Reset the HBA card using hbacmd:
Red Hat:
#-> /usr/sbin/hbacmd listhba
SUSE:
#-> /usr/sbin/hbanyware/hbacmd listhba
Manageable HBA List
Port WWN : 10:00:00:00:c9:8e:76:f2
Node WWN : 20:00:00:00:c9:8e:76:f2
Fabric Name: 10:00:00:05:1e:9b:2b:e4
Flags : 8000fe0d
Host Name : system.fnf.com
Mfg : Emulex Corporation
Serial No. : MY193264UB
Port Number: 0
Mode : Initiator
Port WWN : 10:00:00:00:c9:8e:76:f3
Node WWN : 20:00:00:00:c9:8e:76:f3
Fabric Name: 10:00:00:05:1e:9b:18:5d
Flags : 8000fe0d
Host Name : system.fnf.com
Mfg : Emulex Corporation
Serial No. : MY193264UB
Port Number: 1
Mode : Initiator
#-> /usr/sbin/hbanyware/hbacmd listhba
#-> adapter_info
#-> multipath -ll
#-> /usr/sbin/hbanyware/hbacmd reset 10:00:00:00:c9:8e:76:f2    # argument is the Port WWN
#-> multipath -ll
The other ports can then be reset one at a time, after which the multipath hang should be resolved and all paths should be visible again. If the high load was caused by the multipath hang, the server load comes down on its own after the HBA reset.
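The port-by-port reset can be scripted roughly as follows. This is a sketch under stated assumptions: it uses only the hbacmd listhba and hbacmd reset subcommands shown above, and it expects "Port WWN : <value>" lines at the start of a line, so it may need adjusting if the real output is indented differently. The function names, the HBACMD and RESET_PAUSE variables, and the 30-second default pause are hypothetical:

```shell
# Sketch: extract Port WWNs from `hbacmd listhba` output read on stdin.
port_wwns() {
    sed -n 's/^Port WWN *: *//p'
}

# Sketch: reset each port in turn, pausing between resets so that at
# least one path stays up. HBACMD defaults to the Red Hat location;
# set it to /usr/sbin/hbanyware/hbacmd on SUSE.
reset_all_ports() {
    local hbacmd="${HBACMD:-/usr/sbin/hbacmd}" wwn
    for wwn in $("$hbacmd" listhba | port_wwns); do
        echo "Resetting port $wwn"
        "$hbacmd" reset "$wwn" || return 1
        sleep "${RESET_PAUSE:-30}"
    done
}
```

It is worth checking multipath -ll by hand after the first reset before letting the loop continue to the remaining ports.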