Problem Analysis for Live Partition Mobility

–    Problem with  LPM:
— Error Message:

HSCL400A There was a problem running the VIOS command.
HSCLA29A The RMC command issued to partition VIO2102 failed.
The partition command is:
migmgr -f get_adapter -t vscsi -s U9117.MMC.6583EEB-V2-C38 -w 13857705817653313572 -W 13857705817653313573 -d 2
The RMC return code is:
0
The OS command return code is:
82
The OS standard out is:
Running method '/usr/lib/methods/mig_vscsi'
82

— Show the log file on the VIO:

alog -t cfg -o | less

C0 7405736 20:06:37 mig_vscsi.c 555 leaving mig_vscsi fn= get_adapter, rc= 83
CS 11993122 10223736 /usr/sbin/migmgr -f get_adapter -t vscsi -s U9117.MMC.65D615E-V1-C63 -w 13857705817655803990 -W 13857705817655803991 -d 2
C4 11993122 Running method '/usr/lib/methods/mig_vscsi'
CS 11993122 10223736 20:06:37 mig_vscsi.c 485 /usr/sbin/migmgr -f get_adapter -t vscsi -s U9117.MMC.65D615E-V1-C63 -w 13857705817655803990 -W 13857705817655803991 -d 2
C0 11993122 20:06:38 vsmig_util.c 1750 original pipe_ctrl from RMC =0x1
M0 11993122 20:06:38 vsmig_util.c 1886 testing string adapter/vdevice/IBM,vfc-server
C0 11993122 20:06:38 npiv_wwpn.c 190 GPN_FT failed for wwpn c0507605173d0056
C0 11993122 20:06:38 npiv_wwpn.c 196 GPN_FT failure:
cmd_rsp_code:    0x8001
reason_code:     0x9
reason_code_exp: 0x7
C0 11993122 20:06:38 mig_vscsi.c 555 leaving mig_vscsi fn= get_adapter, rc= 83
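
The interesting lines in this log are the "GPN_FT failed for wwpn" entries. As a minimal sketch, the function below pulls the affected WWPNs out of cfg log text so they can be checked one by one. The function name failed_wwpns is hypothetical, and the sketch assumes the log format shown above, with the WWPN either on the same line or wrapped onto the next one.

```shell
#!/bin/sh
# Print each WWPN for which the cfg log reports "GPN_FT failed".
# Reads the files given as arguments, or stdin when none are given.
failed_wwpns() {
    awk '
        want { print $1; want = 0; next }     # WWPN wrapped onto this line
        /GPN_FT failed for wwpn/ {
            if ($NF == "wwpn") want = 1       # value follows on the next line
            else print $NF                    # value is on the same line
        }
    ' "$@" | sort -u
}

# Sample copied from the log excerpt above.
sample='C0 11993122 20:06:38 npiv_wwpn.c 190 GPN_FT failed for wwpn
c0507605173d0056
C0 11993122 20:06:38 npiv_wwpn.c 196 GPN_FT failure:'

printf '%s\n' "$sample" | failed_wwpns    # prints c0507605173d0056
```

On a VIO Server the same function could be fed from the live log, for example with alog -t cfg -o | failed_wwpns.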

–    Problem analysis

— Clear the cfglog on all VIO Servers

rm /var/adm/ras/cfglog
echo "log cleared" | alog -t cfg
echo "Create cfg log1" | alog -t cfg

# Create the file /etc/drlog.cmd containing the line below:

echo "CFGLOG=timestamp,detail,verbosity:9" > /etc/drlog.cmd

— Run the validation again:

Recreate the failure using the following migrlpar syntax and save the output to the file migrlpar_lshwres.log.

To list the managed system names:
# lssyscfg -r sys -F name

To recreate the failure during validation:
# migrlpar -o v -m <source_managed_system> -t <target_managed_system> -p <mobile_partition_name> -i "source_msp_id=<source_VIO_lpar_ID>,source_msp_ipaddr=<source_VIO_IPaddr>,dest_msp_id=<target_VIO_lpar_ID>,dest_msp_ipaddr=<target_VIO_IPaddr>" -d 5 -v

To recreate the failure during the actual migration, change '-o v' to '-o m'.
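
The -i attribute string is easy to mistype, so it can help to assemble it from separate values. The sketch below only builds and prints the command line; it does not run migrlpar. The helper name msp_attrs, its argument order, and all sample values (LPAR IDs, IP addresses, SRC_SYS, DST_SYS, LPAR_NAME) are hypothetical placeholders.

```shell
#!/bin/sh
# Build the -i attribute string from four values:
#   source MSP LPAR ID, source MSP IP, destination MSP LPAR ID, destination MSP IP
# The argument order is a convention of this sketch only.
msp_attrs() {
    printf 'source_msp_id=%s,source_msp_ipaddr=%s,dest_msp_id=%s,dest_msp_ipaddr=%s\n' \
        "$1" "$2" "$3" "$4"
}

# Hypothetical values, for illustration only.
attrs=$(msp_attrs 1 192.0.2.11 2 192.0.2.12)

# Print the validation command instead of executing it.
echo "migrlpar -o v -m SRC_SYS -t DST_SYS -p LPAR_NAME -i \"$attrs\" -d 5 -v"
```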

— Check alog again

alog -t cfg -o > /tmp/alog_cfg_`date +%Y%m%d_%H%M%S`.log

"DELAYED_INTS" error in errpt on the VIO Server

We saw the following entry on the VIO Server:

Excessive interrupt disablement time

Problem description
LABEL:          DELAYED_INTS
IDENTIFIER:     A2205861

Date/Time:       Thu Jul 10 20:34:35 CEST 2014
Sequence Number: 28277
Machine Id:      00C7DE574C00
Node Id:         apu001
Class:           S
Type:            PERF
WPAR:            Global
Resource Name:   SYSPROC

Description
Excessive interrupt disablement time

Probable Causes
SOFTWARE PROGRAM

Failure Causes
SOFTWARE PROGRAM

        Recommended Actions
        REPORT DETAILED DATA

Here is IBM's explanation:

The error you highlighted in your errpt isn't showing a problem; it's
just a warning message. The DELAYED_INTS errpt mechanism is designed to
give a "heads-up" regarding possible problems within the kernel, such as
an I/O routine taking too long. A DELAYED_INTS entry does not
mean an actual error or failed state in the system; the
entries are simply logged to keep track of areas of the kernel which
stay disabled for long periods of time.
A device driver may disable interrupts for certain actions. If the time
spent with interrupts disabled (as measured by the kernel) seems long, an
entry is logged in errpt.

Further information on the error can be found at:
http://www-1.ibm.com/support/docview.wss?uid=isg3T1000678

Since the DELAYED_INTS entry does not indicate an actual error or
failed state in the system, the entries are simply logged to keep
track of areas of the kernel which stay disabled for long
periods of time. The stack showed that it is about krlocks(). This
may point to an issue with krlock(), possibly too much time spent
spinning while waiting for the krlock.

You can turn error checking persistently off at the system level with
 # errctrl errcheckoff -c all
or persistently on with
 # errctrl errcheckon -c all

HMC ALERT: VAR USAGE EXCEEDS THRESHOLD

Check the disk usage by logging in as hscroot and opening a restricted shell.
Enter the command monhmc -r disk -n 0 to check the disk usage.
Enter the command chhmcfs -o f -f /var -s 1000 to delete unused files.

Changing the VIO FC mapping on the fly because of a problem with Live Partition Mobility

We want to move the LPAR to another CEC, but we get the following error:

Migr: LPAR=nim9, Src_CEC=mars-9117-MMB-SNXXXXXX, Trgt_CEC=jupiter-9117-MMD-SNXXXXX, Src_VIO=xxx.xxx.xxx.xxx, Trgt_VIO=xxx.xxx.xxx.xxx
Partition migration failed with the following errors:
HSCLA27C The operation to get the physical device location for adapter U9117.MMB.06xxxxx-V2-C49 on the virtual I/O server partition VIO2 has failed.
The partition command is:
migmgr -f get_adapter -t vscsi -s U9117.MMB.XXXXXX-V2-C49 -w xxxxxxxxxxxxx -W xxxxxxxxxxxxx -d 2
The partition standard error is:

HSCL400A There was a problem running the VIOS command. HSCLA29A The RMC command issued to partition  VIO2 failed.
The partition command is:
migmgr -f get_adapter -t vscsi -s U9117.MMB.xxxxxP-V2-C49 -w xxxxxxxxxx -W xxxxxxxxxx -d 2

How to create the /etc/niminfo file on the client

root@selix  [/home/root]
# niminit -a name=selix -a master=nim1 -a connect=nimsh
0513-059 The nimsh Subsystem has been started. Subsystem PID is 6095084.

root@selix  [/home/root]

Create SEA Failover

Create two virtual adapters: one for the SEA and one for the control channel. For the SEA adapter, set trunk priority 1 on the primary VIO Server and trunk priority 2 on the secondary VIO Server (VIO2), and keep the "Access external network" checkbox selected.

Here is an example:
$ entstat -all ent2 | grep VLAN    ( For SEA)
Invalid VLAN ID Packets: 0
Port VLAN ID:     20
VLAN Tag IDs:  None
$ entstat -all ent3 | grep VLAN    ( For Control Channel)
Invalid VLAN ID Packets: 0
Port VLAN ID:     29
VLAN Tag IDs:  None
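
As a quick sanity check, the two Port VLAN IDs can be compared from the entstat text. This is a sketch that works on captured output rather than calling entstat itself. The function name port_vlan_id is hypothetical, and the check assumes the control channel should sit on its own VLAN, as it does in the example above (20 versus 29).

```shell
#!/bin/sh
# Extract the Port VLAN ID from "entstat -all entX | grep VLAN" text on stdin.
port_vlan_id() {
    awk -F: '/Port VLAN ID/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Output copied from the example above.
sea_out='Invalid VLAN ID Packets: 0
Port VLAN ID:     20
VLAN Tag IDs:  None'
ctl_out='Invalid VLAN ID Packets: 0
Port VLAN ID:     29
VLAN Tag IDs:  None'

sea=$(printf '%s\n' "$sea_out" | port_vlan_id)
ctl=$(printf '%s\n' "$ctl_out" | port_vlan_id)

if [ "$sea" != "$ctl" ]; then
    echo "ok: SEA PVID $sea, control channel PVID $ctl"
else
    echo "warning: SEA and control channel share PVID $sea"
fi
```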

Configure SEA on the VIO

1) $ lsdev -type adapter  (look at the output)

ent0             Available   4-Port 10/100/1000 Base-TX PCI-Express Adapter (14106803)
ent1             Available   4-Port 10/100/1000 Base-TX PCI-Express Adapter (14106803)
ent2             Available   4-Port 10/100/1000 Base-TX PCI-Express Adapter (14106803)
ent3             Available   4-Port 10/100/1000 Base-TX PCI-Express Adapter (14106803)
ent4             Available   EtherChannel / IEEE 802.3ad Link Aggregation

Volume Groups limit on AIX

[Image: VG limits table]

example1: Normal VG

mkvg -s 16 -t 2 -y datavg hdisk4   (partition size is 16 MB, factor (-t) is 2: 1016 x 2 = 2032 physical partitions per disk)

(That means this volume group can have up to 16 disks; see the table.)

example2: Big VG

mkvg -B -t 16 -y data2vg hdisk2 hdisk3 hdisk4   (16 x 1016 = 16256 physical partitions per disk)

(This volume group can have up to 8 disks.)
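
The arithmetic behind both examples can be sketched as a small helper. It assumes the usual rule for the -t factor: the factor multiplies the 1016 partitions-per-disk limit and divides the maximum number of disks (32 for a normal VG, 128 for a big VG). The function name vg_limits is hypothetical.

```shell
#!/bin/sh
# vg_limits TYPE FACTOR
# Prints "<max partitions per disk> <max disks>" for a normal or big VG.
vg_limits() {
    case "$1" in
        normal) base_pvs=32 ;;     # normal VG: up to 32 disks at factor 1
        big)    base_pvs=128 ;;    # big VG: up to 128 disks at factor 1
        *) echo "usage: vg_limits normal|big factor" >&2; return 1 ;;
    esac
    echo "$((1016 * $2)) $((base_pvs / $2))"
}

vg_limits normal 2    # prints: 2032 16
vg_limits big 16      # prints: 16256 8
```

The trade-off is visible in the output: a larger factor allows bigger disks (more partitions per disk) at the cost of fewer disks in the volume group.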

superblock corrupt in AIX?

Each file system has two superblocks: one in logical block 1 and a copy in logical block 31. We can copy the superblock from block 31 to block 1. For the root file system, here is the command:

dd count=1 bs=4k skip=31 seek=1 if=/dev/hd4 of=/dev/hd4
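
Before touching a real logical volume, the skip/seek logic can be rehearsed on a scratch file. The sketch below is an illustration only: it creates a 32-block temporary file, places a marker in block 31, copies that block to block 1 with the same dd options, and reads the marker back. The function name rehearse_sb_copy and the marker text are hypothetical. conv=notrunc is added only because the target here is a regular file, which dd would otherwise truncate.

```shell
#!/bin/sh
# Rehearse "copy block 31 to block 1" on a temporary file, not on a device.
rehearse_sb_copy() {
    img=$(mktemp) || return 1
    # 32 zero-filled blocks of 4 KB (blocks 0..31).
    dd if=/dev/zero of="$img" bs=4k count=32 2>/dev/null
    # Put a recognisable marker at the start of block 31 (the "backup").
    printf 'BACKUP-SB' | dd of="$img" bs=4k seek=31 conv=notrunc 2>/dev/null
    # Same skip/seek as the command above, applied to the scratch file.
    dd count=1 bs=4k skip=31 seek=1 if="$img" of="$img" conv=notrunc 2>/dev/null
    # Read back the first 9 bytes of block 1 (byte offset 4096).
    dd if="$img" bs=1 skip=4096 count=9 2>/dev/null
    echo
    rm -f "$img"
}

rehearse_sb_copy    # prints BACKUP-SB
```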

Boot problem on AIX

[Image: boot]
