UnixPedia : HPUX / LINUX / SOLARIS: August 2013

Thursday, August 29, 2013

HPUX : swinstall hung during a software install.

While installing a software product on the server, swinstall hung.

Reason:

swinstall checks the file systems mounted on the server against /etc/fstab. To find the bad entry in fstab, run list_expander:

#-> /opt/ignite/lbin/list_expander | more
WARNING: Filesystem /usr/local/CPR/tmp/mnt_tmp is not mounted.  It will be
         ignored.
WARNING: Filesystem /export/oracle/PPJ/sapbackup is not mounted.  It will be
         ignored.
WARNING: Filesystem /export/oracle/PPJ/archivelog_backup is not mounted.  It
         will be ignored.
WARNING: statvfs failed.  '/sap_backup' in /etc/fstab or /etc/mnttab not found:
         Value too large to be stored in data type (errno = 72)
WARNING: statvfs failed.  '/sap_backup' in /etc/fstab or /etc/mnttab not found:
         Value too large to be stored in data type (errno = 72)

 

Recommendation:

 

Comment out (hash) the stale entry in /etc/fstab and rerun swinstall.
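
A minimal sketch of the steps, assuming the stale entry is the /sap_backup line reported above (the depot path and product name are only illustrative):

grep sap_backup /etc/fstab                 # locate the stale entry
vi /etc/fstab                              # put a leading '#' on that line
/opt/ignite/lbin/list_expander | more      # confirm the warning is gone
swinstall -s /var/tmp/mydepot PRODUCT      # re-run the installation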

 

Saturday, August 24, 2013

HPUX : HOW TO RECOVER FILE FROM IGNITE IMAGE

To recover a file (here /etc/passwd) from an Ignite image:

Issue :

/etc/passwd has been deleted

If needed, you can recover the file from the system's Ignite image file:
On ignite server:
# cd /var/opt/ignite/recovery/archives/mymachine
   # ls
   2010-12-12,15:02  -- Name of the archive we extract from.


   # gzcat ./2010-12-12,15:02 | pax -r -f - etc/passwd
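
pax extracts the file relative to the current directory, so it ends up in ./etc/passwd on the Ignite server and still has to be copied back to the affected client. A minimal sketch, assuming the client is reachable as mymachine and a root login still works (host name is illustrative):

   # ls -l etc/passwd
   # ssh mymachine 'cp -p /etc/passwd /etc/passwd.old'   # back up whatever is there now, if anything
   # scp etc/passwd mymachine:/etc/passwd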




Recovering hosts file :

[root@mickey:/Unix_Ignite_Image8/archives/itmpbus1]#
#-> cp -p /etc/hosts /etc/hosts.backup
[root@mickey:/Unix_Ignite_Image8/archives/Donald]#
#-> gzcat 2014-05-01,08:01 | pax -r -f - etc/hosts
[root@mickey:/Unix_Ignite_Image8/archives/Donald]#
#-> ll
total 10720728
-rw------- 1 65534 65534 5467472874 May 1 08:39 2014-05-01,08:01
drwxr-xr-x 2 65534 65534 4096 May 7 14:48 etc
[root@mickey:/Unix_Ignite_Image8/archives/Donald]#
#-> cd etc
[root@mickey:/Unix_Ignite_Image8/archives/Donald/etc]#
#-> ll
total 8
-rw-r--r-- 1 65534 65534 446 Apr 25 10:56 hosts
[root@mickey:/Unix_Ignite_Image8/archives/Donald/etc]#
#-> cat hosts
#
# hosts This file describes a number of hostname-to-address
# mappings for the TCP/IP subsystem. It is mostly
# used at boot time, when no name servers are running.
# On small systems, this file can be used instead of a
# "named" name server.
# Syntax:
#
# IP-Address Full-Qualified-Hostname Short-Hostname
#

127.0.0.1 localhost
XX.xXX.XX.XX Donald.tnx.com Donald
[root@mickey:/Unix_Ignite_Image8/archives/Donald/etc]#
#-> uptime
2:52pm up 159 days, 10:38, 1 user, load average: 0.41, 0.35, 0.25
[root@mickey:/Unix_Ignite_Image8/archives/Donald/etc]#
 

Thursday, August 22, 2013

HPUX : Centrify issue: LDAP connect error.

Attempting bind to jnj.com(site:RaritanUS) as SA-NA-centrifySA@IN.PHP.COM on itsinraphpdc12.php.com
Unexpected LDAP Error Connect error
 due to unexpected configuration or network error.

Action:

On the Unix server, we tried to join the client to the AD domain controller; the script exited with an LDAP connect error.
In the past this issue has been resolved by a service restart on the DC.


#-> adinfo
Local host name:   itsinovq
Joined to domain:  php.com
Joined as:         itsinovq.ph.com
Pre-win2K name:    itsnaovq
Current DC:        itsinraphpdc12.php.com
Preferred site:    DELHI,IN
Zone:              php.com/Program Data/Centrify/Zones/ITSINOVQ-HPUX-NA
Last password set: 2013-08-22 18:53:17 EDT
CentrifyDC mode:   connected
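
Once the service on the DC has been restarted, the join can be retried. A rough sketch, using the zone and account names from the output above (adflush/adjoin options are standard Centrify DirectControl usage; verify them against your agent version):

#-> nslookup itsinraphpdc12.php.com                          # confirm the DC resolves and is reachable
#-> adflush                                                  # flush the Centrify cache
#-> adjoin -z ITSINOVQ-HPUX-NA -u SA-NA-centrifySA php.com   # retry the join
#-> adinfo                                                   # should report 'connected' again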

HPUX/LINUX : NFS STALE HANDLE AT NFS CLIENT

The VG minor numbers on the prod server and the QA server are inconsistent; this results in NFS stale file handle errors on the NFS clients after a package failover.

File system /sapmnt/DC1 is exported to a Linux server when the package fails over to SYSTEMP12.
The exported file system on the Linux client then shows an NFS stale file handle.
Issue:
The minor number of the volume group /dev/vg_DC1_bin is different on the two nodes.
Recommendation:
Use the same minor number on both nodes.

For DC1 the minor number is not the same, resulting in an NFS stale file handle for the exported FS.
[root@SYSTEMS12:/etc/cmcluster/DC1]#
#-> ll /dev/vg_DC1_bin/group
crw-r--r--   1 root       sys         64 0x100000 May 28  2009 /dev/vg_DC1_bin/group
[root@SYSTEMS12:/etc/cmcluster/DC1]#
#-> ssh SYSTEMP12 ll /dev/vg_DC1_bin/group
crw-r-----   1 root       sys         64 0x0d0000 Aug  6 20:17 /dev/vg_DC1_bin/group

For AB1 the minor numbers match, so the NFS stale handle issue does not occur.
[root@SYSTEMS12:/etc/cmcluster/DC1]#
#-> ssh SYSTEMP12 ll /dev/vg_AB1_bin/group
crw-r--r--   1 root       sys         64 0x020000 May 22  2009 /dev/vg_AB1_bin/group
[root@SYSTEMS12:/etc/cmcluster/DC1]#
#-> ll /dev/vg_AB1_bin/group
crw-r--r--   1 root       sys         64 0x020000 Dec 11  2011 /dev/vg_AB1_bin/group
[root@SYSTEMS12:/etc/cmcluster/DC1]#
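
A rough sketch of aligning SYSTEMP12 with the 0x100000 minor number used on SYSTEMS12 (run with the package halted and the VG deactivated on that node; the map file path is illustrative):

#-> vgexport -p -s -m /tmp/vg_DC1_bin.map /dev/vg_DC1_bin   # preview and save the map file
#-> vgexport /dev/vg_DC1_bin                                # remove the VG definition (data is untouched)
#-> mkdir /dev/vg_DC1_bin
#-> mknod /dev/vg_DC1_bin/group c 64 0x100000               # recreate the group file with the matching minor number
#-> vgimport -s -m /tmp/vg_DC1_bin.map /dev/vg_DC1_bin      # re-import the VG
#-> ll /dev/vg_DC1_bin/group                                # should now show 0x100000 on both nodes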








VCS : VERITAS CLUSTER COMMANDS

VERITAS OPERATIONAL COMMANDS

Starting and Stopping the cluster
"-stale" instructs the engine to treat the local config as stale
"-force" instructs the engine to treat a stale config as a valid one
hastart [-stale|-force]
Bring the cluster into running mode from a stale state using the configuration file from a particular server hasys -force <server_name>
stop the cluster on the local server but leave the application/s running, do not failover the application/s hastop -local
stop cluster on local server but evacuate (failover) the application/s to another node within the cluster hastop -local -evacuate
stop the cluster on all nodes but leave the application/s running hastop -all -force
Cluster Status
display cluster summary hastatus -summary
continually monitor cluster hastatus
verify the cluster is operating hasys -display
Cluster Details

information about a cluster haclus -display
value for a specific cluster attribute haclus -value <attribute>
modify a cluster attribute haclus -modify <attribute name> <new>
Enable LinkMonitoring haclus -enable LinkMonitoring
Disable LinkMonitoring haclus -disable LinkMonitoring
Resource Operations
Online a resource hares -online <resource> [-sys]
Offline a resource hares -offline <resource> [-sys]
display the state of a resource (online, offline, etc.) hares -state
display the parameters of a resource hares -display <resource>
Offline a resource and propagate the command to its children hares -offprop <resource> -sys <sys>
Cause a resource agent to immediately monitor the resource hares -probe <resource> -sys <sys>
Clearing a resource (automatically initiates the onlining) hares -clear <resource> [-sys]
Resource Types
Add a resource type hatype -add <type>
Remove a resource type hatype -delete <type>
List all resource types hatype -list
Display a resource type hatype -display <type>
List the resources of a particular type hatype -resources <type>
Display the value of a particular resource type attribute hatype -value <type> <attr> 
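
A short example session with the commands above (resource and node names are purely illustrative):

hastatus -summary                       # overall cluster and group state
hares -state                            # state of every resource
hares -clear ora_listener -sys node1    # clear a faulted resource
hares -online ora_listener -sys node1   # bring it online on node1
hares -display ora_listener             # confirm its attributes and state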

Monday, August 19, 2013

HPUX : SFTP: Connection closed

Issue :

When trying to connect with the sftp command from server system05.jnj.com to server ABCD012.na.jnj.com as the user plabtcp, the following message is shown:

$ sftp plabtcp@ABCD012.na.jnj.com

########################################################################

#                                                                      #

#                          WARNING NOTICE:                             #

#  "You are about to enter a Private Network that is intended for the  #

#  authorized  use of a Private  Company and its affiliate  companies  #

#  (the "Company") for business purposes only. The actual or attempted #

#  unauthorized access,use or modification of this network is strictly #

#  prohibited by the Company. Unauthorized users  and/or unauthorized  #

#  use are subject to Company disciplinary proceedings and/or criminal #

#  and  civil penalties in  accordance  with applicable  domestic and  #

#  foreign laws. The use of this system may be monitored and recorded  #

#  for administrative and security reasons. If such monitoring and/or  #

#  recording  reveals possible  evidence of  criminal  activity,  the  #

#  Company may provide the monitored evidence of such activity to law  #

#  enforcement officials."                                             #

#                                                                      #

########################################################################

Connection closed

 

In syslog entry :

Aug 19 03:53:08 system05 sshd[11255]: Accepted publickey for plabtcp from 10.xx.xx.69 port 40958 ssh2

Aug 19 03:53:08 system05 sshd[11309]: subsystem request for sftp

Aug 19 03:53:08 system05 sshd[11309]: Received disconnect from 10.xx.xx.69: 11: disconnected by user

Aug 19 03:53:08 system05 sshd[11309]: ERROR: Unexpected error in conversation: (6)

Aug 19 03:53:09 system05 sshd[11376]: Accepted publickey for plabtcp from 10.xx.xx.69 port 22675 ssh2

Aug 19 03:53:09 system05 sshd[11388]: subsystem request for sftp

Aug 19 03:53:09 system05 sshd[11388]: Received disconnect from 10.xx.xx.69: 11: disconnected by user

 

 

Solution :

The account plabtcp is in a locked state on ABCD012.na.jnj.com. Unlocking the user resolves the issue.

 

i=plabtcp

/usr/lbin/getprpw $i

/usr/lbin/modprpw -k -l $i; /usr/lbin/modprpw -v -l $i
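
The same fix as a commented sketch (HP-UX trusted system; the final getprpw -m lockout check is an extra verification step, not part of the original fix — see getprpw(1M)/modprpw(1M) for the exact option semantics):

i=plabtcp
/usr/lbin/getprpw $i               # dump the protected password database entry; look at the lock-related fields
/usr/lbin/modprpw -k -l $i         # unlock the account
/usr/lbin/modprpw -v -l $i         # reset the password validity, as used in this post
/usr/lbin/getprpw -m lockout $i    # re-check just the lockout field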

 

 

 

HPUX : File system busy while mounting the LV in a package.

Issue :

File system is busy while mounting the LV in package.

 

vxfs mount: V-3-21264: /dev/vgappp1/lv_tcs_mdm is already mounted, /tcs_mdm is busy,

                allowable number of mount points exceeded

        WARNING:   Running fuser on /tcs_mdm to remove anyone using the busy mount point directly.

/tcs_mdm:

vxfs mount: V-3-21264: /dev/vgappp1/lv_tcs_mdm is already mounted, /tcs_mdm is busy,

                allowable number of mount points exceeded

        ERROR:  Function freeup_busy_mountpoint_and_mount_fs

        ERROR:  Failed to mount /dev/vgappp1/lv_tcs_mdm to /tcs_mdm

        ERROR:  Function check_and_mount

        ERROR:  Failed to mount /dev/vgappp1/lv_tcs_mdm

 

 

Solution :

This issue happens because the /tcs_mdm file system was shared (exported) locally before being shared by the package, which makes the /tcs_mdm directory busy and causes the mount of the LV to fail.

 

Recommendation :

 

For such an issue, check a few points.

Check whether the FS is shared (exported) locally:

#-> showmount -e | grep <FS>

# unshare <FS>

Check whether any process is using it:

#-> fuser -cu <FS>   # list processes using the mount point

#-> fuser -ku <FS>   # kill/release any process still holding it
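
Applied to the failing /tcs_mdm mount point from the error above, a minimal sketch (whether unshare or exportfs -u applies depends on the NFS version in use; the package name is illustrative):

#-> showmount -e | grep tcs_mdm      # is /tcs_mdm exported locally?
#-> exportfs -u /tcs_mdm             # unexport it (or: unshare /tcs_mdm)
#-> fuser -cu /tcs_mdm               # anything still using the mount point?
#-> fuser -ku /tcs_mdm               # kill it so the package can mount the LV
#-> cmrunpkg pkg_tcs_mdm             # retry the package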

HPUX : ndd parameter tcp_fin_wait_2_timeout and close_wait

The ndd parameter tcp_fin_wait_2_timeout can be tuned to adjust how long the side in FIN_WAIT_2 remains in that state; the default is forever, set by the value '0' (zero). On the two systems, 'system1' and 'system2', 'tcp_fin_wait_2_timeout' is set to '0', so the connection will never time out and will stay open until either the FIN is received or the system is rebooted. Setting it to a non-zero value causes the connection to be reset once that time (in milliseconds) has elapsed.

CLOSE_WAIT is also one of the states of a TCP socket during the life of the connection and does not generally indicate a problem.
The CLOSE_WAIT state indicates that the remote end of the connection has finished transmitting data and that the remote application has issued a close or shutdown call. The local TCP stack is now asking the local application that owns the socket to close the local socket as well. When a socket remains in CLOSE_WAIT for a long period, it typically indicates some issue with the local application that prevents it from closing the socket: the application may have coding issues, may be hung, or may simply be busy with other work. There is no way, from the TCP side, to tell why the socket has not been closed; that requires examining the application(s) that still hold it open.
To determine which process or processes still have the socket open locally, the public domain tool 'lsof' can be used. From its output, look for the specific socket with a run string such as:

# lsof -i tcp:<port>    # shows the process(es) that currently have a socket open on the specified local port

Action plan to rectify issues with FIN_WAIT_2:

# ndd -set /dev/tcp tcp_fin_wait_2_timeout 600000    # set tcp_fin_wait_2_timeout to 10 minutes


Edit /etc/rc.config.d/nddconf and add an entry for tcp_fin_wait_2_timeout so the value is retained across reboots, as in the sketch below.
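
A sketch of the corresponding nddconf entry (index 0 is only illustrative; continue whatever numbering the file already uses):

TRANSPORT_NAME[0]=tcp
NDD_NAME[0]=tcp_fin_wait_2_timeout
NDD_VALUE[0]=600000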

Sunday, August 18, 2013

HPUX : RESTORE LVM volume group configuration from backup file

vgcfgrestore - display or restore LVM volume group configuration from
backup file


  /usr/sbin/vgcfgrestore -n vg_name -l [-v]
  /usr/sbin/vgcfgrestore [-R] [-F] -n vg_name [-o old_pv_path] pv_path
  /usr/sbin/vgcfgrestore -f vg_conf_path -l [-v]
  /usr/sbin/vgcfgrestore [-R] [-F] -f vg_conf_path [-o old_pv_path]
       pv_path


vgcfgrestore cannot be performed on devices attached to activated
  volume groups.  Prior to restoring a backup configuration to a disk,
  detach the physical volume from the volume group using the pvchange
  command (see pvchange(1M)), or deactivate the volume group using the
  vgchange command (see vgchange(1M)).


  vgcfgrestore will refuse to restore a configuration to a physical
  volume with a block size different than the block size stored in the
  configuration backup file for that physical volume.

          EXAMPLES
  Restore the LVM configuration information for the physical volume
  /dev/rdsk/c0t7d0 that was saved in the default file
  /etc/lvmconf/vg00.conf:


       vgcfgrestore -n /dev/vg00 /dev/rdsk/c0t7d0


  Force restore of the LVM configuration data when the volume group is still
  active:


       vgcfgrestore -R -n /dev/vg00 /dev/rdsk/c0t7d0


  Restore the LVM configuration information to physical volume
  /dev/rdsk/c0t4d0 using alternate configuration file /tmp/vg00.backup:


       vgcfgrestore -f /tmp/vg00.backup /dev/rdsk/c0t4d0


  List backup information saved in default configuration file
  /etc/lvmconf/vg00.conf:


       vgcfgrestore -n /dev/vg00 -l


  Above command might display the following:


       Volume Group Configuration information in "/etc/lvmconf/vg00.conf"
       VG Name /dev/vg00
        ---- Physical volumes : 2 ----
           /dev/rdsk/c0t6d0 (Bootable)
           /dev/rdsk/c0t5d0 (Non-bootable)


  Restore LVM configuration information stored for /dev/rdsk/c0t7d0 in
  default configuration file /etc/lvmconf/vg01.conf to physical volume
  /dev/rdsk/c0t6d0:


       vgcfgrestore -n /dev/vg01 -o /dev/rdsk/c0t7d0 /dev/rdsk/c0t6d0
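
Putting the restrictions and examples together, a typical restore sequence looks roughly like this (VG and device names are illustrative):

       vgchange -a n /dev/vg01                        # deactivate the VG (or pvchange -a n the affected PV)
       vgcfgrestore -n /dev/vg01 /dev/rdsk/c0t7d0     # restore the LVM metadata to the disk
       vgchange -a y /dev/vg01                        # reactivate the VG
       vgdisplay -v /dev/vg01                         # verify the configuration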

HPUX : Glance no longer working after OM Agent 11 is installed

Glance is no longer working after OM Agent 11 was installed. When glance is run, the following error appears:
ERROR : license error :The entity is not licensed.
These servers are running the DC-OE; Glance is licensed as part of the OS build.


# swlist | grep OE
HPUX11i-DC-OE B.11.31.1003 HP-UX Data Center Operating Environment


Cause
Starting from Operations Agent version 11 (AgentOne or OA11), GlancePlus is also part of the package. The installation of AgentOne should automatically recognize that GlancePlus is installed and, if /var/opt/perf/gkey is found, set the license to permanent.


With the new OA11 licensing scheme, there is no need for gkey anymore. All we need is "oalicense -check GP" to go through to be able to launch Glance successfully.

Fix
First of all, Run the command "/opt/OV/bin/oalicense -get -all" and check if GlancePlus has a valid license.

For example, on a licensed system with a permanent license for Glance:

# oalicense -get -all
LICENSE NAME TYPE ACTIVATION EXPIRY EXTN
------------------------------------------------------------------------------------------------------------
HP Operations OS Inst Adv SW LTU PERMANENT 10/Mar/2010 N/A N/A
HP Operations OS Instance Software LTU EVALUATION 10/Mar/2010 09/May/2010 0
HP Operations Real-Time UpG OS Instance Software LTU EVALUATION 10/Mar/2010 09/May/2010 0
HP Operations OS Instance Software LTU PERMANENT 10/Mar/2010 N/A N/A
HP Glance OS Instance LTU PERMANENT 11/Nov/2010 N/A N/A
HP Operations Real-Time UpG OS Instance Software LTU PERMANENT 17/Mar/2011 N/A N/A

Glance runs the command "/opt/OV/bin/oalicense -check GP" to determine whether GlancePlus has a permanent license. If that check fails, you may see the error mentioned above.

To resolve the "ERROR : license error :The entity is not licensed." run the following command to apply the permanent license:


# oalicense -set -type PERMANENT "Glance Software LTU"


The output of oalicense -get -all is printed by reading a file called lic.dat, located in /var/opt/OV/datafiles/sec/lic/, and the output of oalicense -check <ENTITY> is printed by reading a cache file called reslic.dat, located in the same place as lic.dat.

If "oalicense -get -all" does show a permanent license for GlancePlus, you can delete the reslic.dat file and run "oalicense -resolve", which recreates reslic.dat. Then run "oalicense -check GP"; if the output is "Success", you should be able to launch Glance without any issues. The same steps are sketched below.
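
In command form (paths as given above):

# /opt/OV/bin/oalicense -get -all | grep -i Glance     # confirm a PERMANENT Glance LTU is present
# rm /var/opt/OV/datafiles/sec/lic/reslic.dat          # remove the cached resolution file
# /opt/OV/bin/oalicense -resolve                       # recreate reslic.dat
# /opt/OV/bin/oalicense -check GP                      # should report Success
# glance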

HPUX : BUSINESS COPY STEPS IN A NUTSHELL


ON P-vol server

1. Create a file named rdsk containing the raw disk (rdsk) device names.

### Create horcm instance
cat rdsk |/HORCM/usr/bin/mkconf.sh -g test_vgname -i 98

### Take the disk info from the newly created horcm file and add it to the existing horcm file

### If the disks are from a new array, also add the chip disk information

### Now stop and start the Horcm instance
horcmshutdown.sh 0
horcmstart.sh 0

### export the Horcm instance
export HORCMINST=0    # instance number
export HORCC_MRCF=1
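
### (Sketch, not part of the original steps) optionally confirm the instance responds
### before working with pairs; raidqry is part of the same RAID Manager / CCI toolset
raidqry -l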

### Make entries in /etc/horcmperm0.conf in case of ITSAPP servers

Now go to s-vol server

ON S-vol server

### Create horcm instance
cat rdsk |/HORCM/usr/bin/mkconf.sh -g test_vgname -i 99

### Take the disk info from the newly created horcm file and add it to the existing horcm file

### If the disks are from a new array, also add the chip disk information

### Now stop and start the Horcm instance
horcmshutdown.sh 1
horcmstart.sh 1

### export the Horcm instance
export HORCMINST=1    # instance number
export HORCC_MRCF=1

### Make entries in /etc/horcmperm1.conf in case of ITSAPP servers

Once this is done, come back to P-vol server

### do a pairdisplay on test vg
pairdisplay -g test_vgprm02 -fxcd

### Do a paircreate
paircreate -g test_vgprm02 -vl

### Check the status until it is 100 %
pairdisplay -g test_vgprm02 -fxcd
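
### (Sketch, not part of the original steps) instead of re-running pairdisplay by hand,
### poll until the copy leaves the COPY state; COPY/PAIR are the usual CCI pair states,
### verify them against your array documentation
while pairdisplay -g test_vgprm02 -fxcd | grep -q COPY
do
    sleep 60
done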

### Once the paircreate is done, run a pairsplit
pairsplit -g test_vgprm02

*************************************************************************

Once the Test part is complete

### Rename the test_vgname disks to the normal vg names; pay special attention to the disk numbers on both the P-vol and S-vol servers

### Once done, stop/start the horcm

### Export env variables

### On the P-vol server, run pairdisplay and check whether the disks are correctly synced
pairdisplay -g vgprm02 -fxcd