T.R | Title | User | Personal Name | Date | Lines |
---|
586.1 | No Answers, Just Questions... | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Wed May 14 1997 10:28 | 18 |
|
That message is from SYSINIT debugging, and should only appear when
the DEBUG_MSG_FLAG cell is set non-zero in the SYSINIT image. (Have
you turned on extra logging in the bootstrap?)
What is the DSSI bus configuration?
Are all three nodes running V7.1?
Shadowing?
I will assume that you have checked the DSSI unit numbers on all DSSI
disks, DSSI tapes, and all host DSSI controllers, making sure that all
are set to unique values. I will assume that all three hosts are set
to the same non-zero host allocation class if there is storage on the
DSSI (or shared SCSI), and that all volumes on all hosts in the same
(host or SCSI port) allocation class have unique unit numbers.
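The uniqueness check above can be sketched as a small script. This is only
an illustration: the inventory format, the serial strings, and the
find_duplicate_units name are all made up here, and the list is assumed to
have been gathered by hand from each host.

```python
from collections import defaultdict

def find_duplicate_units(inventory):
    """Return device names claimed by more than one physical device.

    inventory: iterable of (alloclass, device, serial) tuples, one per
    device as seen from each host. The serial is any hypothetical tag
    that identifies the physical device.
    """
    seen = defaultdict(set)
    for alloclass, device, serial in inventory:
        seen[(alloclass, device)].add(serial)
    # A name is a conflict only if it maps to two different physical devices.
    return {
        f"${ac}${dev}": sorted(serials)
        for (ac, dev), serials in seen.items()
        if len(serials) > 1
    }

# Hypothetical inventory: a shared disk legitimately shows up once per host.
inventory = [
    (1, "DIA0", "disk-a"),  # seen from the first host
    (1, "DIA0", "disk-a"),  # same disk seen from the second host
    (1, "DIA1", "disk-b"),
    (1, "DIA1", "disk-c"),  # different disk, same name: conflict
]
conflicts = find_duplicate_units(inventory)
print(conflicts)
```

A shared disk seen from several hosts is not reported; only a name that
resolves to two different physical devices is.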
|
586.2 | this it? | CTHU26::S_BURRIDGE | | Wed May 14 1997 14:38 | 83 |
| If the system is running OpenVMS 7.1 or has the ALPCLUSIO or
ALPCOMPAT_62 patches installed, and if the error message is actually
"%SYSINIT-I-LOCKWAIT, waiting for locks on system disk", then the
problem may be the one described in the following:
(This is mail I got from the Colorado CSC. Apparently a STARS article
is in the works but has not yet appeared.)
Hi,
This sounds exactly like a problem I worked on.
It is caused by PerfectCache.
After upgrading to 7.1, or installing the ...CLUSIO or the ..COMPAT_062 kits,
boots may hang with the following error:
%SYSINIT-I-LOCKWAIT, waiting for locks on system disk
The new MOUNT96 code that is installed with the 7.1, ...CLUSIO, and
..COMPAT_062 kits now requires the use of the DMT$ lock on the device to
ensure proper synchronization.
If PerfectCache is also running on the cluster, it also uses the DMT$
lock, which causes the MOUNT/BOOT to hang.
We are in the process of working with PerfectCache to resolve this problem.
Analysis:
Looking at [MOUNT96.LIS]SYSMOU.LIS, the module that generates the
SYSINIT-I-LOCKWAIT message, we simply do the following:
Loop:
    attempt to get the MNT$physical_device lock for device_index=0
        in EX mode
    attempt to find the device using search_device
        This takes out the MNT$physical_device lock for device_index=j
            in EX mode
        This takes out the DMT$physical_device lock for device_index=j
            in PW mode
        This takes out the SYS$device_name lock in PW mode
    If any of these attempts fail with MOUN$_DEVBUSY
    then
        release the DMT$ lock (which was taken out in search_device)
        release the MNT$ lock (which was taken out in search_device)
        release the MNT$ lock (for device index = 0)
        Write the "SYSINIT-I-LOCKWAIT..." message once
        wait, then goto loop
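To make the loop concrete, here is a toy model of it. It is a sketch, not
the SYSMOU code: LockTable, mount_system_disk, and max_tries are made up
for illustration, the device_index=0 lock is left out, and the real loop
retries until it succeeds. The mode compatibility table is the standard
lock manager one.

```python
# A request is granted only if its mode is compatible with every mode
# already granted on the resource to other owners.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}

class LockTable:
    """Toy stand-in for the lock manager (hypothetical)."""

    def __init__(self):
        self.granted = {}  # resource name -> list of (owner, mode)

    def try_lock(self, owner, resource, mode):
        holders = self.granted.setdefault(resource, [])
        if any(m not in COMPATIBLE[mode] for o, m in holders if o != owner):
            return False  # plays the role of the DEVBUSY failure
        holders.append((owner, mode))
        return True

    def release_all(self, owner):
        for holders in self.granted.values():
            holders[:] = [(o, m) for o, m in holders if o != owner]

def mount_system_disk(table, device, max_tries=3):
    """Try to take the MNT$, DMT$ and SYS$ locks; back off on any failure."""
    wanted = [("MNT$" + device, "EX"),
              ("DMT$" + device, "PW"),
              ("SYS$" + device, "PW")]
    messages = []
    for _ in range(max_tries):  # bounded here so the sketch terminates
        if all(table.try_lock("SYSINIT", res, mode) for res, mode in wanted):
            return True, messages
        table.release_all("SYSINIT")
        if not messages:  # the message is written only once
            messages.append(
                "%SYSINIT-I-LOCKWAIT, waiting for locks on system disk")
    return False, messages

table = LockTable()
table.try_lock("PerfectCache", "DMT$_$1$DUA700:", "PR")
ok, messages = mount_system_disk(table, "_$1$DUA700:")
print(ok, messages)
```

With another process holding the DMT$ lock at PR, the PW request never
succeeds, so the loop releases everything and retries indefinitely. That is
the hang seen at boot.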
So, to determine what is causing the hang, we need to see the lock
information. Since this happens at startup, the customer will have to
force a crash while the system is hung.
The above locks can then be examined, and you follow the PIDs to find the
process that is holding the lock you are waiting for. The following is the
display for one such lock:
Lock id:  010008A0   PID:     00010028   Flags: SYNCSTS SYSTEM NODLCKW
Par. id:  00000000   SUBLCKs:        0          NODLCKB
LKB:      83781D80   BLKAST:  00010A98
PRIORTY:      0000
Granted at      PR   00000000-FFFFFFFF
Resource:     2431245F 24544D44  DMT$_$1$   Status:
 Length  15   003A3030 37415544  DUA700:.
 Exec. mode   00000000 00000000  ........
 System       00000000 00000000  ........
Process copy of lock 210018B6 on system 00010001
Process index: 0028   Name: .PerfectCache..   Extended PID: 20601028
--------------------------------------------------------------------
Process status:        01840011  RES,PSWAPM,PHDRES,NODELET
Required capabilities: 0000000C  QUORUM,RUN
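If the ASCII column of the Resource dump is unclear, the name can be
recovered from the hex longwords. Each row is printed with the highest
address on the left, and each longword is little-endian. The helper below
is a small sketch (decode_dump_row is a made-up name) applied to the two
rows shown above.

```python
def decode_dump_row(hex_words):
    """Turn one dump row (longwords as printed, left to right) into bytes."""
    data = b""
    # The lowest-address longword is printed rightmost, so walk the row
    # in reverse; each longword is stored little-endian.
    for word in reversed(hex_words):
        data += int(word, 16).to_bytes(4, "little")
    return data

rows = [
    ["2431245F", "24544D44"],
    ["003A3030", "37415544"],
]
raw = b"".join(decode_dump_row(row) for row in rows)
resource = raw[:15].decode("ascii")  # "Length 15" in the display
print(resource)
```

For this lock the result is the DMT$ prefix followed by the device name
_$1$DUA700:, which matches the ASCII column in the display.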
|
586.3 | Valid SYSINIT Message | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Wed May 14 1997 14:58 | 4 |
|
re: 2.
"SYSINIT-I-LOCK, taking out lock on system device" is a valid
message displayed -- when debugging is turned on -- by SYSINIT.
|
586.4 | No debug, checking DSSI config... | CRLRFR::BLUNT | | Wed May 14 1997 15:02 | 6 |
|
Indirectly an answer to both .1 and .2. No, SYSINIT debugging is not
turned on. I will check the DSSI unit numbers, ALLOCLASS, and the related
questions Steve posted in .1.
bob
|
586.5 | boot flags? | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Wed May 14 1997 15:38 | 3 |
|
...What Alpha console boot command flags were used here?
|
586.6 | | VMSSG::FRIEDRICHS | Ask me about Young Eagles | Wed May 14 1997 17:26 | 16 |
| I suspect that PerfectCache is running....
It was very recently discovered that PerfectCache has an "unusual
usage" of the DMT$device lock.
The newly re-written MOUNT, to fix a number of synch. problems, now
also takes out the DMT$device lock as part of the mounting process.
This deadlock leads to the SYSINIT message you are seeing.
The easy workaround is to turn off PerfectCache. We are working to
get this resolved.
Cheers,
jeff
|
586.7 | | VMSSG::FRIEDRICHS | Ask me about Young Eagles | Thu May 15 1997 14:45 | 15 |
| This morning I talked with the engineer at RAXCO...
V6.0 of PerfectCache does not use the DMT$ lock. Customers who are
running PerfectCache and want to upgrade to V7.1 or CLUSIO01_062
should first upgrade to V6.0.
As a workaround, V5.0 can be disabled.
We are working with Storage engineering to be sure that V6.0 gets
shipped with the HSxxx boxes and we are working on STARS/BLITZ
articles to inform the field.
Cheers,
jeff
|
586.8 | | SSDEVO::DESKO | Rick in Storage - DTN 522-3905 | Thu May 15 1997 20:50 | 8 |
| > We are working with Storage engineering to be sure that V6.0 gets
> shipped with the HSxxx boxes and we are working on STARS/BLITZ
> articles to inform the field.
Actually, PerfectCache never shipped with HSxxx FDDI Servers.
It has only shipped with SWXNA (the latest generation of FDDI Servers).
Rick
|