Title: | VAX and Alpha VMS |
Notice: | This is a new VMSnotes, please read note 2.1 |
Moderator: | VAXAXP::BERNARDO |
|
Created: | Wed Jan 22 1997 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 703 |
Total number of notes: | 3722 |
575.0. "OpenVMS Alpha V7.1 MSCP serving problem" by BSS::JILSON (WFH in the Chemung River Valley) Thu May 08 1997 16:56
A customer has a very strange MSCP disk serving problem in OpenVMS Alpha
V7.1. While we wait for them to force crashes of the MSCP server and the
non-local system for an IPMT, I thought I'd post it here. The problem
happens with random disks on random nodes in this cluster and has only
started happening under V7.1. There is a VAX system in the cluster, but the
problem happens even if that system is left powered off. The problem is
that disks that are supposed to be MSCP served are not seen by the
non-local nodes.
Node ALONSO, an AlphaServer 2100 4/233, has $7$DKA100, which is supposed to
be served. Node LEAR, an AlphaServer 2100 4/200, does not see it, yet it
sees other $7$ disks from ALONSO. SCSCONNCNT is sufficient, as is NPAGEDYN,
on both nodes. I have killed and restarted the CONFIGURE process on LEAR
but $7$DKA100 still doesn't show up. On ALONSO, SHOW DEVICE/SERVED shows
$7$DKA100 as being served. I also verified that there is an MSCP$DISK SYSAP
open to both ALONSO & LEAR. From the uptime we see that LEAR was booted
first, so this may be some new synchronization problem when ALONSO booted,
as the customer claims they could reboot LEAR and $7$DKA100 would then show
up on LEAR.
Has anyone seen anything similar?
Jilly
alonso$ show dev dka
Device Device Error Volume Free Trans Mnt
Name Status Count Label Blocks Count Cnt
$1$DKA0: (DUNCAN) Mounted 0 INSTR_DRV1 1309208 1 7
$1$DKA100: (DUNCAN) Mounted 0 INSTR_DRV2 2572076 1 7
$1$DKA400: (OBERON) Mounted 0 USER_DRV1 1069668 1 7
$1$DKA600: (OBERON) Online 0
$5$DKA600: (LEAR) Online 0
$7$DKA0: (ALONSO) Mounted 1 AXPSYS2 2536644 585 5
$7$DKA100: (ALONSO) Mounted 0 PWRK_DRV2 271596 118 2
$7$DKA600: (ALONSO) Online wrtlck 0
$54$DKA0: (JUNO) Mounted 0 JUNO_1078 765843 1 5
$54$DKA400: (JUNO) Mounted 0 AXPSYS 598683 1 5
$56$DKA0: (MENTOR) Mounted 0 MENTOR_1080 176811 1 5
alonso$ show dev/served
MSCP-Served Devices on ALONSO 8-MAY-1997 15:30:36.09
Queue Requests
Device: Status Total Size Current Max Hosts
7$DKA0 Online 4110480 0 0 4
7$DKA100 Online 4110480 0 0 1
7$DKA600 Avail 0 0 0 0
alonso$ mcr sysgen show mscp
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
MSCP_LOAD 1 0 0 16384 Coded-valu
MSCP_SERVE_ALL 1 0 0 2 Coded-valu
MSCP_BUFFER 128 128 16 -1 Coded-valu
MSCP_CREDITS 8 8 2 128 Coded-valu
MSCP_CMD_TMO 600 600 0 2147483647 CNTLRTMOs D
alonso$ show cpu
ALONSO, a AlphaServer 2100 4/233
alonso$ show system/noproc
OpenVMS V7.1 on node ALONSO 8-MAY-1997 15:31:58.37 Uptime 2 07:22:53
I/O data structures
-------------------
$7$DKA100 [ALONSO$DKA100] DEC RZ28M UCB: 810A5500
Device status: 08021810 online,valid,unload,lcl_valid,exfunc_supp
Characteristics: 1C4D4008 dir,fod,shr,avl,mnt,elg,idv,odv,rnd
21010281 clu,srv,nnm,nlt,scsi,dtn
Owner UIC [000001,000004] Operation count 369895 ORB address 8108BDC0
PID 00000000 Error count 0 DDB address 8105B080
Alloc. lock ID 43000126 Reference count 114 DDT address 87DFDFA0
Alloc. class 7 Online count 2 VCB address 81174780
Class/Type 01/36 Retry cnt/max 16/16 CRB address 8105B100
Def. buf. size 512 BOFF 00000C00 I/O wait queue 810A556C
DEVDEPEND 0BAC1056 Byte count 00000400
DEVDEPND2 00000000 SVAPTE 8152C608
DEVDEPND3 01000001 DEVSTS 00000004
FLCK index 3A
DLCK address 8105B180
--- Primary Class Driver Data Block (CDDB) 81070680 ---
Status: 00000000
Controller Flags: 0000
Allocation class 7 CDRP Queue 00000000 DDB address 8105B080
System ID 00000000 Restart Queue 00000000 CRB address 8105B100
00000000 DAP Count 0 CDDB link 00000000
Contrl. ID 00000000 Contr. timeout 0 PDT address 00000000
00000000 Reinit Count 0 Original UCB 00000000
Response ID 00000000 Wait UCB Count 0 UCB chain 00000000
MSCP Cmd status 00000000
*** PORT I/O queue is empty ***
*** DEVICE I/O queue is empty ***
*** I/O request queue is empty ***
--- Volume Control Block (VCB) 81174780 ---
Volume: PWRK_DRV2 Lock name: PWRK_DRV2
Status: A0 extfid,system
Status2: 14 mountver,nohighwater
Status3: 00000000
Mount count 1 Rel. volume 0 AQB address 810AFFC0
Transactions 122 Max. files 411048 RVT address 810A5500
Free blocks 271416 Rsvd. files 9 FCB queue 81177500
Window size 7 Cluster size 4 Cache blk. 81144400
Vol. lock ID 1700028C Def. extend sz. 5
Block. lock ID 0200029F Record size 0
--- ACP Queue Block (AQB) 810AFFC0 ---
ACP requests are serviced by the eXtended Qio Processor (XQP)
Status: 14 defsys,xqioproc
Mount count 25 ACP type f11v2 Request queue 00000000
ACP class 0
*** ACP request queue is empty ***
--- CDT Summary Page ---
CDT Address Local Process Connection ID State Remote Node
----------- ------------- ------------- ----- -----------
8102A0A0 SCS$DIRECTORY 626F0000 listen
8102A230 MSCP$TAPE 626F0001 listen
8102A3C0 VMS$VAXcluster 626F0002 listen
8102A550 MSCP$DISK 626F0003 listen
8102A6E0 VMS$SDA_AXP 626F0004 listen
8102A870 SCA$TRANSPORT 626F0005 listen
8102AA00 SCA$TRANSPORT 62870006 open OBERON
8102AB90 VMS$VAXcluster 626F0007 open JUNO
8102AD20 MSCP$DISK 626F0008 open OBERON
8102AEB0 VMS$VAXcluster 626F0009 open OBERON
8102B040 MSCP$DISK 626F000A open JUNO
8102B1D0 VMS$VAXcluster 626F000B open DUNCAN
8102B360 VMS$VAXcluster 626F000C open LEAR
8102B4F0 MSCP$DISK 626F000D open DUNCAN
8102B680 MSCP$DISK 626F000E open LEAR
8102B810 VMS$DISK_CL_DRVR 626F000F open OBERON
8102B9A0 VMS$DISK_CL_DRVR 626F0010 open DUNCAN
8102BB30 VMS$DISK_CL_DRVR 626F0011 open LEAR
8102BCC0 VMS$DISK_CL_DRVR 626F0012 open JUNO
8102BE50 VMS$TAPE_CL_DRVR 626F0013 open OBERON
8102BFE0 MSCP$DISK 62710014 open CERES
8102C170 VMS$DISK_CL_DRVR 626F0015 open CERES
8102C300 VMS$VAXcluster 62700016 open CERES
8102C490 PATHWORKScluster 62700017 listen
8102C620 PATHWORKScluster 62710018 open LEAR
8102C7B0 PATHWORKScluster 62740019 open LEAR
8102C940 VMS$VAXcluster 626F001A open MENTOR
8102CAD0 VMS$DISK_CL_DRVR 626F001B open MENTOR
8102CC60 MSCP$DISK 626F001C open MENTOR
Number of free CDT's: 13
alonso$ show mem/pool/full
System Memory Resources on 8-MAY-1997 15:33:42.19
Nonpaged Dynamic Memory (Lists + Variable)
Current Size (bytes) 24289280 Current Size (pagelets) 47440
Initial Size 24289280 Initial Size (pagelets) 47440
Maximum Size 114688000 Maximum Size (pagelets) 224000
Free Space (bytes) 14376320 Space in Use (bytes) 9912960
Largest Variable Block 12643072 Smallest Variable Block 64
Number of Free Blocks 6296 Free Blocks LEQU 64 Bytes 657
Free Blocks on Lookasides 1016 Lookaside Space (bytes) 470912
lear$ show system/noproc
OpenVMS V7.1 on node LEAR 8-MAY-1997 15:36:06.00 Uptime 2 07:26:55
lear$ show cpu
LEAR, a AlphaServer 2100 4/200
lear$ show dev dka
Device Device Error Volume Free Trans Mnt
Name Status Count Label Blocks Count Cnt
$1$DKA0: (DUNCAN) Mounted 0 INSTR_DRV1 1308928 1 7
$1$DKA100: (DUNCAN) Mounted 0 INSTR_DRV2 2572040 1 7
$1$DKA400: (OBERON) Mounted 0 USER_DRV1 1069760 2 7
$1$DKA600: (DUNCAN) Online 0
$5$DKA600: (LEAR) Online wrtlck 0
$7$DKA0: (ALONSO) Mounted 0 AXPSYS2 2536644 1 5
$7$DKA600: (ALONSO) Online 0
$54$DKA0: (JUNO) Mounted 0 JUNO_1078 765843 1 5
$54$DKA400: (JUNO) Mounted 0 AXPSYS 598683 1 5
$56$DKA0: (MENTOR) Mounted 0 MENTOR_1080 176811 1 5
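Comparing the two SHOW DEVICE listings by eye is error-prone, so here is a minimal Python sketch that diffs them. It assumes the listings have been captured as plain text exactly as pasted in this note; the function name `device_names` and the regular expression are illustrative choices, not part of any VMS tool.

```python
import re

# Matches the device name at the start of a SHOW DEVICE line, e.g. "$7$DKA100:".
DEVICE_RE = re.compile(r"^(\$\d+\$[A-Z]+\d+):")


def device_names(show_dev_output):
    """Collect device names from captured SHOW DEVICE text."""
    names = set()
    for line in show_dev_output.splitlines():
        match = DEVICE_RE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


# Device lines copied from the two listings in this note.
ALONSO_LISTING = """\
$1$DKA0: (DUNCAN) Mounted 0 INSTR_DRV1 1309208 1 7
$1$DKA100: (DUNCAN) Mounted 0 INSTR_DRV2 2572076 1 7
$1$DKA400: (OBERON) Mounted 0 USER_DRV1 1069668 1 7
$1$DKA600: (OBERON) Online 0
$5$DKA600: (LEAR) Online 0
$7$DKA0: (ALONSO) Mounted 1 AXPSYS2 2536644 585 5
$7$DKA100: (ALONSO) Mounted 0 PWRK_DRV2 271596 118 2
$7$DKA600: (ALONSO) Online wrtlck 0
$54$DKA0: (JUNO) Mounted 0 JUNO_1078 765843 1 5
$54$DKA400: (JUNO) Mounted 0 AXPSYS 598683 1 5
$56$DKA0: (MENTOR) Mounted 0 MENTOR_1080 176811 1 5
"""

LEAR_LISTING = """\
$1$DKA0: (DUNCAN) Mounted 0 INSTR_DRV1 1308928 1 7
$1$DKA100: (DUNCAN) Mounted 0 INSTR_DRV2 2572040 1 7
$1$DKA400: (OBERON) Mounted 0 USER_DRV1 1069760 2 7
$1$DKA600: (DUNCAN) Online 0
$5$DKA600: (LEAR) Online wrtlck 0
$7$DKA0: (ALONSO) Mounted 0 AXPSYS2 2536644 1 5
$7$DKA600: (ALONSO) Online 0
$54$DKA0: (JUNO) Mounted 0 JUNO_1078 765843 1 5
$54$DKA400: (JUNO) Mounted 0 AXPSYS 598683 1 5
$56$DKA0: (MENTOR) Mounted 0 MENTOR_1080 176811 1 5
"""

# Devices that ALONSO lists but LEAR does not.
missing_on_lear = sorted(device_names(ALONSO_LISTING) - device_names(LEAR_LISTING))
print(missing_on_lear)
```

With the listings above, the only name in the difference is $7$DKA100, which matches the symptom described.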
lear$ show mem/pool/full
System Memory Resources on 8-MAY-1997 15:36:59.42
Nonpaged Dynamic Memory (Lists + Variable)
Current Size (bytes) 17227776 Current Size (pagelets) 33648
Initial Size 17227776 Initial Size (pagelets) 33648
Maximum Size 71958528 Maximum Size (pagelets) 140544
Free Space (bytes) 7364672 Space in Use (bytes) 9863104
Largest Variable Block 5655296 Smallest Variable Block 64
Number of Free Blocks 5001 Free Blocks LEQU 64 Bytes 561
Free Blocks on Lookasides 1370 Lookaside Space (bytes) 603200
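To back up the remark that NPAGEDYN looks sufficient, the two SHOW MEMORY/POOL/FULL displays can be reduced to a free-space percentage. The following is only a rough Python sketch using the byte counts shown in this note; the helper name `free_pool_percent` is made up for illustration, and how much headroom counts as "enough" remains a judgment call.

```python
def free_pool_percent(current_size, free_space, in_use):
    """Return free nonpaged pool as a percentage of the current pool size.

    Raises ValueError if the three figures are not self-consistent,
    which would suggest a transcription error in the captured display.
    """
    if free_space + in_use != current_size:
        raise ValueError("free + in-use does not equal current size")
    return 100.0 * free_space / current_size


# Byte counts copied from the Nonpaged Dynamic Memory displays in this note.
alonso_free_pct = free_pool_percent(24289280, 14376320, 9912960)
lear_free_pct = free_pool_percent(17227776, 7364672, 9863104)

print(f"ALONSO: {alonso_free_pct:.1f}% free, LEAR: {lear_free_pct:.1f}% free")
```

Both nodes come out with a large share of nonpaged pool free (roughly 59% on ALONSO and 43% on LEAR), and neither pool has expanded beyond its initial size.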
lear$ mcr sysgen show mscp
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
MSCP_LOAD 1 0 0 16384 Coded-valu
MSCP_SERVE_ALL 2 0 0 2 Coded-valu
MSCP_BUFFER 128 128 16 -1 Coded-valu
MSCP_CREDITS 8 8 2 128 Coded-valu
MSCP_CMD_TMO 600 600 0 2147483647 CNTLRTMOs D
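The two SYSGEN displays are nearly identical, so a quick diff helps spot the one parameter that differs. This is a minimal Python sketch that assumes the output was captured as text in the format shown in this note; `current_values` is a hypothetical helper name, and the sketch draws no conclusion about whether the difference matters for this problem.

```python
def current_values(sysgen_output):
    """Map each MSCP_* parameter name to its Current value."""
    values = {}
    for line in sysgen_output.splitlines():
        fields = line.split()
        # Parameter rows start with the name, followed by the Current value.
        if len(fields) >= 2 and fields[0].startswith("MSCP_"):
            values[fields[0]] = int(fields[1])
    return values


# Parameter rows copied from the two SYSGEN displays in this note.
ALONSO_SYSGEN = """\
MSCP_LOAD 1 0 0 16384 Coded-valu
MSCP_SERVE_ALL 1 0 0 2 Coded-valu
MSCP_BUFFER 128 128 16 -1 Coded-valu
MSCP_CREDITS 8 8 2 128 Coded-valu
MSCP_CMD_TMO 600 600 0 2147483647 CNTLRTMOs D
"""

LEAR_SYSGEN = """\
MSCP_LOAD 1 0 0 16384 Coded-valu
MSCP_SERVE_ALL 2 0 0 2 Coded-valu
MSCP_BUFFER 128 128 16 -1 Coded-valu
MSCP_CREDITS 8 8 2 128 Coded-valu
MSCP_CMD_TMO 600 600 0 2147483647 CNTLRTMOs D
"""

alonso = current_values(ALONSO_SYSGEN)
lear = current_values(LEAR_SYSGEN)

# Keep only parameters whose Current value differs: name -> (ALONSO, LEAR).
differences = {
    name: (alonso[name], lear.get(name))
    for name in sorted(alonso)
    if alonso[name] != lear.get(name)
}
print(differences)
```

On the data above, the only difference reported is MSCP_SERVE_ALL, which is 1 on ALONSO and 2 on LEAR.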
--- CDT Summary Page ---
CDT Address Local Process Connection ID State Remote Node
----------- ------------- ------------- ----- -----------
80C2A020 SCS$DIRECTORY 87120000 listen
80C2A1B0 MSCP$TAPE 87120001 listen
80C2A340 VMS$VAXcluster 87120002 listen
80C2A4D0 MSCP$DISK 87120003 listen
80C2A660 VMS$SDA_AXP 87120004 listen
80C2A7F0 SCA$TRANSPORT 87120005 listen
80C2A980 SCA$TRANSPORT 872C0006 open OBERON
80C2AB10 VMS$VAXcluster 87120007 open DUNCAN
80C2ACA0 MSCP$DISK 87120008 open DUNCAN
80C2AE30 VMS$VAXcluster 87120009 open OBERON
80C2AFC0 MSCP$DISK 8712000A open OBERON
80C2B150 MSCP$DISK 8712000B open JUNO
80C2B2E0 VMS$VAXcluster 8712000C open JUNO
80C2B470 VMS$VAXcluster 8712000D open ALONSO
80C2B600 VMS$DISK_CL_DRVR 8712000E open ALONSO
80C2B790 VMS$DISK_CL_DRVR 8712000F open DUNCAN
80C2B920 VMS$DISK_CL_DRVR 87120010 open OBERON
80C2BAB0 VMS$DISK_CL_DRVR 87120011 open JUNO
80C2BC40 VMS$TAPE_CL_DRVR 87120012 open OBERON
80C2BDD0 MSCP$DISK 87120013 open ALONSO
80C2BF60 MSCP$DISK 87140014 open CERES
80C2C0F0 VMS$DISK_CL_DRVR 87120015 open CERES
80C2C280 VMS$VAXcluster 87130016 open CERES
80C2C410 PATHWORKScluster 87130017 listen
80C2C5A0 PATHWORKScluster 87140018 open ALONSO
80C2C730 PATHWORKScluster 87160019 open ALONSO
80C2C8C0 VMS$DISK_CL_DRVR 8712001A open MENTOR
80C2CA50 VMS$VAXcluster 8712001B open MENTOR
80C2CBE0 MSCP$DISK 8712001C open MENTOR
Number of free CDT's: 13
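The check that an MSCP$DISK SYSAP connection is open in both directions can also be done mechanically from the two CDT summary pages. The following Python sketch assumes the summaries were captured as text in the five-column layout shown in this note; `open_sysaps` is an illustrative name, and only a subset of the rows is reproduced here to keep the example short.

```python
def open_sysaps(cdt_summary, remote_node):
    """Return the local process names with an open connection to remote_node."""
    names = []
    for line in cdt_summary.splitlines():
        fields = line.split()
        # Open rows have five columns: address, process, ID, state, remote node.
        # Rows in the listen state have no remote node and are skipped.
        if len(fields) == 5 and fields[3] == "open" and fields[4] == remote_node:
            names.append(fields[1])
    return names


# Selected rows copied from the CDT summary on ALONSO.
ALONSO_CDT = """\
8102A550 MSCP$DISK 626F0003 listen
8102AD20 MSCP$DISK 626F0008 open OBERON
8102B360 VMS$VAXcluster 626F000C open LEAR
8102B680 MSCP$DISK 626F000E open LEAR
8102BB30 VMS$DISK_CL_DRVR 626F0011 open LEAR
8102C620 PATHWORKScluster 62710018 open LEAR
8102C7B0 PATHWORKScluster 62740019 open LEAR
"""

# Selected rows copied from the CDT summary on LEAR.
LEAR_CDT = """\
80C2A4D0 MSCP$DISK 87120003 listen
80C2ACA0 MSCP$DISK 87120008 open DUNCAN
80C2B470 VMS$VAXcluster 8712000D open ALONSO
80C2B600 VMS$DISK_CL_DRVR 8712000E open ALONSO
80C2BDD0 MSCP$DISK 87120013 open ALONSO
80C2C5A0 PATHWORKScluster 87140018 open ALONSO
80C2C730 PATHWORKScluster 87160019 open ALONSO
"""

alonso_to_lear = open_sysaps(ALONSO_CDT, "LEAR")
lear_to_alonso = open_sysaps(LEAR_CDT, "ALONSO")

# Both the server side and the disk class driver side should appear each way.
for sysap in ("MSCP$DISK", "VMS$DISK_CL_DRVR"):
    print(sysap, sysap in alonso_to_lear, sysap in lear_to_alonso)
```

For the rows above, both MSCP$DISK and VMS$DISK_CL_DRVR show as open in each direction, which is consistent with the connections being in place even though $7$DKA100 is not configured on LEAR.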
T.R | Title | User | Personal Name | Date | Lines |
---|
575.1 | | EEMELI::MOSER | Orienteers do it in the bush... | Fri May 09 1997 08:52 | 3 |
| are you using PAC (port allocation class), i.e. is DEVICE_NAMING = 1?
/cmos
|
575.2 | | BSS::JILSON | WFH in the Chemung River Valley | Fri May 09 1997 10:36 | 3 |
| Nope DEVICE_NAMING is 0 on all nodes. Hadn't thought about that one.
Jilly
|
575.3 | | SOS6::BERNARD | Bernard Ourghanlian, Alpha Resource Center | Wed May 14 1997 06:27 | 14 |
| I had the exact same problem to troubleshoot here.
After some time, it appeared that the problem was linked to the way an
ongoing tagged I/O is detected. It looked like a phase timing problem that
occurs when you have a Fast SCSI drive that does not support Tagged Command
Queuing (TCQ). When the non-TCQ drive is under a high I/O load while a
slower device such as a CD-ROM drive is also performing a high number of
I/Os, the non-TCQ device periodically suffers a phase timing problem that
results in a Bus Reset, which causes the device to go into Mount
Verification.
I fixed the problem by installing new (not yet released) SCSI
port and class drivers.
But I don't know if this is your problem...
|
575.4 | | BSS::JILSON | WFH in the Chemung River Valley | Wed May 14 1997 10:56 | 4 |
| Thanks. I have now forced crashes on these 2 systems and will be IPMT'ing
this case.
Jilly
|
575.5 | FYI | BSS::JILSON | WFH in the Chemung River Valley | Wed May 14 1997 16:07 | 1 |
| IPMT case is HPAQ50PA9.
|
575.6 | SCSI driver issue was separate issue | VMSSPT::DIFABIO | MOVL #OPINION,EXE$GL_BLAKHOLE | Thu May 15 1997 15:38 | 5 |
| The fix was within SYS$PKEDRIVER and would not affect devices being
served to a node (since that uses SYS$DUDRIVER) or the serving of a
device (MSCP).
Mark d.
|
575.7 | | SOS6::BERNARD | Bernard Ourghanlian, Alpha Resource Center | Thu May 22 1997 13:09 | 2 |
| I do not agree with this analysis. I did fix this problem using the new
SYS$PKEDRIVER.
|