Title:                   ase
Moderator:               SMURF::GROSSO
Created:                 Thu Jul 29 1993
Last Modified:           Fri Jun 06 1997
Last Successful Update:  Fri Jun 06 1997
Number of topics:        2114
Total number of notes:   7347
1860.0. "3.2g/1.3/hsz40 remove LSM mirror - hangs process" by NETRIX::"[email protected]" (decatl::johnson) Wed Feb 05 1997 11:33
I am submitting this here because I have not yet obtained the specifics
required for an IPMT from the CSC, and I do not have access to the resources
or systems needed for follow-up answers:
Digital UNIX 3.2G, ASE 1.3
Installed patches for LSM vold and ASE 1.3; I am assuming these are
PATCH IDs: OSF375-350244, ASE130-015, & ASE130-014
(vold, voldisk), (lsm_dg_action), (am.o, am_scsi.o)
Initially the contact blamed the behavior on the am.o patch, but his problem
statement (below) further narrows down his suspects.
The problem is that when a disk was removed from a mirrored LSM
volume, a "# ls /filesystem" on that volume hung indefinitely.
They then:
    set the ASE service offline,
    manually put the LSM disk/disk group online,
    mounted the /filesystem (outside of ASE),
    and repeated the test.
This time the ls command did not hang. (A rough sketch of those manual
steps follows.)
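For reference, here is a rough sketch of that manual bring-up, assuming the
prod_data disk group and the sap_PRD_domain4#sapdata1 fileset shown in the
output further below (the device names, the volrecover step, and the mount
point are illustrative assumptions, not confirmed commands from the site):

    # ASE service already set offline via the asemgr menus
    voldisk online rza33 rza41                   # bring the LSM disks back online
    voldg import prod_data                       # import the disk group outside of ASE
    volrecover -g prod_data -sb                  # start/resync volumes in background (assumed step)
    mount sap_PRD_domain4#sapdata1 /oracle/PRD/sapdata1   # mount one filesystem manually
    ls /oracle/PRD/sapdata1                      # repeat the test; this time it completes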
I have the checkit output from the system and expect the output from the
other system soon, plus a 16-page fax of HSZ> output, but I do not have
additional specifics as to when these events took place (which syslog
entries, or where in uerf); there is a lot of uerf information, and it
looks like reservation-related errors...
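If the time of the disk pull can be pinned down, uerf can be narrowed to that
window; a sketch, with placeholder dates and the time-selection syntax quoted
from memory (verify against the uerf man page):

    uerf -R -o full | more                                          # all records, newest first
    uerf -t s:05-feb-1997,00:00:00 e:05-feb-1997,23:59:59 -o full   # restrict to one day (example dates)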
Anyway, if you have any idea of the problem from this info,
it would help the issue::
=============================================================================
This is from Dave Ascher, Systems Integration, at a hot customer
production site:
=============================================================================
The problem which I am reporting is that when a disk was pulled
out of the shelf (I'm afraid I cannot identify which disk at
this time), I/O to the LSM VOLUME of which that disk was one
of TWO mirrored PLEXes hung indefinitely. The I/O was
generated using an "ls" command.
When the DECsafe service was set OFFLINE, the disks were brought
ONLINE manually using the "voldisk online" command, the disk group
was imported using the "sapdg import" command, and the file systems
were mounted manually - the same experiment yielded the expected
behavior, that is, I/O to the VOLUME completed normally.
We have installed the newest lsm_dg_action script (v 1.2.21.2?), the
newest versions of asehsm and aseagent, and the newest am.o
and am_scsi.o. While I originally suspected that the problem
was most likely in the AM driver, I now believe that asehsm
is an equally likely culprit.
Clearly, the microcode version of the HSZ or the Rev level of the disk
or the KZPSA, etc... is not a factor as they remain constant
in our experiment. The experiment points right at a DECsafe
component.
......remainder deleted, dave can include it if he desires......
==============================================================================
Additional info from Sid, to help with the above email::
> Clearly, the microcode version of the HSZ or the Rev level of the disk
> or the KZPSA, etc... is not a factor as they remain constant
----------------------------------------------
from email/uerf: kzpsa (rev A10) -->
from uerf: bus (scsi4 through scsi8) -->
from hsz/uerf: hsz40 (rev 27 or 30) -->
from hsz: dual redundant -->
from hsz: units (D0-D1,D101,D202,etc____) -->
from hsz: disks (DISK110,etc) -->
from hsz: rz29B's (most at 016, some at 014)
----------------------------------------------
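For completeness, the controller/unit/disk details above are the sort of thing
that comes out of the HSZ40 CLI; roughly the following, though I have not seen
the fax and the exact commands used are an assumption:

    HSZ> SHOW THIS_CONTROLLER FULL     # firmware revision, dual-redundant configuration
    HSZ> SHOW UNITS FULL               # D0-D1, D101, D202, ... and their current state
    HSZ> SHOW DISKS                    # DISK110, ... physical devices behind the units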
this is some of the output from df::
sap_PRD_domain4#sapdata1   16753952   5981952  10644144   36%   /oracle/PRD/sapdata1
sap_PRD_domain5#sapdata2   16753952   5777040  10964240   35%   /oracle/PRD/sapdata2
sap_PRD_domain6#sapdata3   16753952   5294496  11446784   32%   /oracle/PRD/sapdata3
sap_PRD_domain7#sapdata4   16753952   7639360   9101920   46%   /oracle/PRD/sapdata4
sap_PRD_domain8#sapdata5   16753952   6512960  10228320   39%   /oracle/PRD/sapdata5
this is some of the output from volprint::
----------------------------------------------
v proddisk07 fsgen ENABLED ACTIVE 8376988 SELECT -
pl proddisk07-01 proddisk07 DISABLED NODEVICE 8376988 CONCAT - RW
sd rza33-01 proddisk07-01 0 0 8376988 rza33 -
pl proddisk07-02 proddisk07 ENABLED ACTIVE 8376988 CONCAT - RW
sd rza41-01 proddisk07-02 0 0 8376988 rza41 rza41
v proddisk08 fsgen ENABLED ACTIVE 8376988 SELECT -
pl proddisk08-01 proddisk08 ENABLED ACTIVE 8376988 CONCAT - RW
sd rzb33-01 proddisk08-01 0 0 8376988 rzb33 rzb33
pl proddisk08-02 proddisk08 ENABLED ACTIVE 8376988 CONCAT - RW
sd rzb41-01 proddisk08-02 0 0 8376988 rzb41 rzb41
v proddisk09 fsgen ENABLED ACTIVE 8376988 SELECT -
pl proddisk09-01 proddisk09 ENABLED ACTIVE 8376988 CONCAT - RW
sd rzc33-01 proddisk09-01 0 0 8376988 rzc33 rzc33
pl proddisk09-02 proddisk09 ENABLED ACTIVE 8376988 CONCAT - RW
sd rzc41-01 proddisk09-02 0 0 8376988 rzc41 rzc41
----------------------------------------------
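Note the DISABLED/NODEVICE plex proddisk07-01 built on rza33 above - that is
consistent with rza33 being the pulled disk. The listing looks like volprint
output; something along these lines would reproduce it (the exact flags used
are an assumption):

    volprint -ht -g prod_data | more    # hierarchical, typed listing of the prod_data records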
I asked if the pulled disk was rza33; the answer was "don't know".
----------------------------------------------
rza33 sliced - - online
rza34 sliced rza34 prod_data online
rza40 sliced rza40 prod_data online
rzf73 sliced rzf73 prod_data online
rzf74 sliced rzf74 prod_data online
- - rza33 prod_data failed was:rza33
----------------------------------------------
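The table above reads like "voldisk list" output, with rza33 failed and
detached from prod_data. Once the physical disk is back and seen by the
system, the usual LSM reattach would be roughly as follows; the option
spellings follow the generic vx*-to-vol* command mapping and are an
assumption, not something confirmed at this site:

    voldisk list                                  # confirm which device is failed
    voldisk online rza33                          # bring the device back online
    voldg -g prod_data -k adddisk rza33=rza33     # reattach the disk media record to the group
    volrecover -g prod_data -sb                   # resynchronize the stale plex in the background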
There is only one service defined in asemgr; part of the output is
----------------------------------------------
Status for DISK service `dbfailover`
    Status      Relocate   Placement Policy        Favored Member(s)
    on copley   no         Restrict to Member(s)   copley,pierre
Storage configuration for DISK service `dbfailover`
Mount Table (device, mount point, type, options)
/dev/vol/prod_data/proddisk37 NONE ufs NONE
/dev/vol/prod_data/proddisk38 NONE ufs NONE
...
/dev/vol/prod_data/proddisk20 NONE ufs NONE
/dev/vol/prod_data/proddisk19 NONE ufs NONE
LSM Configuration
Disk Group: Device(s):
prod_data rza32 rza33 rza34 rza40 rza41 rza42 rza48 rza49 rza50 rza56 rza57
  rza58 rza64 rza65 rza66 rza72 rza73 rza74 rzb32 rzb33 rzb34 rzb40 rzb41 rzb42
  rzb48 rzb49 rzb50 rzb56 rzb57 rzb58 rzb64 rzb65 rzb66 rzb72 rzb73 rzb74 rzc32
  rzc33 rzc34 rzc40 rzc41 rzc42 rzc48 rzc49 rzc50 rzc56 rzc57 rzc58 rzc64 rzc65
  rzc66 rzc72 rzc73 rzc74 rzd32 rzd33 rzd34 rzd40 rzd41 rzd42 rzd48 rzd49 rzd50
  rzd56 rzd57 rzd58 rzd64 rzd65 rzd66 rzd72 rzd73 rzd74 rze32 rze33 rze34 rze40
  rze41 rze42 rze48 rze49 rze50 rze56 rze57 rze58 rze64 rze65 rze66 rze72 rze73
  rze74 rzf32 rzf33 rzf34 rzf40 rzf41 rzf42 rzf48 rzf49 rzf50 rzf56 rzf57 rzf58
  rzf64 rzf65 rzf66 rzf72 rzf73 rzf74
----------------------------------------------
grep ASE rc.config
ASELOGGER="1"
export ASELOGGER
ASE_PARTIAL_MIRRORING="ON"
export ASE_PARTIAL_MIRRORING
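Those rc.config variables are normally read and set with rcmgr rather than by
editing the file; a quick check on both members might look like this (my
reading of ASE_PARTIAL_MIRRORING - that it lets a service run on a degraded
mirror - is an assumption worth verifying):

    rcmgr get ASELOGGER                     # should match the ASELOGGER="1" shown above
    rcmgr get ASE_PARTIAL_MIRRORING         # currently ON
    # rcmgr set ASE_PARTIAL_MIRRORING OFF   # how it would be changed, if that were wanted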
-------------------------------------
This is the /etc/fdmns listing (the AdvFS domains, plus the LSM volumes
backing two of the sap_PRD domains)::
drwxr-xr-x   2 root   system   8192 Oct 3 15:34 root_domain
drwxr-xr-x   2 root   system   8192 Jan 25 14:02 sap_PRD_domain0
drwxr-xr-x   2 root   system   8192 Jan 30 02:51 sap_PRD_domain1
drwxr-xr-x   2 root   system   8192 Jan 30 20:48 sap_PRD_domain2
drwxr-xr-x   2 root   system   8192 Jan 30 20:51 sap_PRD_domain3
drwxr-xr-x   2 root   system   8192 Jan 30 23:30 sap_PRD_domain4
drwxr-xr-x   2 root   system   8192 Jan 30 23:30 sap_PRD_domain5
drwxr-xr-x   2 root   system   8192 Jan 30 23:30 sap_PRD_domain6
drwxr-xr-x   2 root   system   8192 Jan 30 23:30 sap_PRD_domain7
drwxr-xr-x   2 root   system   8192 Jan 30 23:30 sap_PRD_domain8
drwxr-xr-x   2 root   system   8192 Jan 30 23:30 sap_PRD_domain9
drwxr-xr-x   2 root   system   8192 Jan 27 23:37 test_domain
drwxr-xr-x   2 root   system   8192 Oct 3 17:51 tmp_domain
drwxr-xr-x   2 root   system   8192 Oct 3 17:51 usr_domain
drwxr-xr-x   2 root   system   8192 Oct 3 17:51 var_domain

/etc/fdmns/sap_PRD_domain4:
total 0
lrwxr-xr-x   1 root   system   29 Jan 30 02:51 proddisk16 -> /dev/vol/prod_data/proddisk16
lrwxr-xr-x   1 root   system   29 Jan 25 13:38 proddisk37 -> /dev/vol/prod_data/proddisk37

/etc/fdmns/sap_PRD_domain5:
total 0
lrwxr-xr-x   1 root   system   29 Jan 30 02:51 proddisk22 -> /dev/vol/prod_data/proddisk22
lrwxr-xr-x   1 root   system   29 Jan 30 18:27 proddisk43 -> /dev/vol/prod_data/proddisk43
-------------------------------------
This is from another person working the issue::
Production
----------
CPU E2061-DA REV A01
MEMORY MS7CC-DF REV B01
MS7CC-DP REV B01
KFTIA (E2054-AA) REV C09
KFTHA (E2052-AA) REV D03
KZPSA-BA QUANTITY=4
KZPSA REV A10 QUANTITY=6
FDDI QUANTITY=2
ALPHA FIRMWARE UPGRADED WITH VERSION 3.8 CD
QA
--
CPU E2061-DA REV A01
MEMORY MS7CC-FP REV B01
KFTIA (E2054-AA) REV C09
KFTHA (E2052-AA) REV D02
KZPSA-BA QUANTITY=4
KZPSA REV A10 QUANTITY=6
FDDI QUANTITY=2
ALPHA FIRMWARE UPGRADED WITH VERSION 3.8 CD
-------------------------------------
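The hardware inventories above are the kind of detail usually pulled from the
SRM console; something like the following would show the module revisions and
adapters, though I do not know how this particular list was gathered:

    >>> show config      # modules, revisions, and firmware on the system bus
    >>> show device      # adapters and disks visible to the console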
[Posted by WWW Notes gateway]