T.R | Title | User | Personal Name | Date | Lines |
---|
3311.1 | OpenVMS engineers comments and observations | GIDDAY::SETHI | Ahhhh (-: an upside down smile from OZ | Wed Sep 22 1993 03:52 | 87 |
| Here are the notes from the OpenVMS group that may give an insight into
the problem.
Please note that the base note is based on my observations and on the
notes I took when I talked to the customer and while I was logged on to
their system.
If you require further information please let me know and I will do
what I can.
A process (in=19f) was in RES,DELPEN,INQUAN,RESPEN,PHDRES state. It was
waiting on this lock:
Lock data:
Lock id: 20E20028 PID: 004E019F Flags: VALBLK SYNCSTS
SYSTEM
Par. id: 01000072 SUBLCKs: 0 NOQUOTA
LKB: 87737D00 BLKAST: 00000000
PRIORTY: 0000 RQSEQNM: 19CE
Waiting for EX 00000000-FFFFFFFF
Resource: 0001002B 24534D52 RMS$+... Status: ASYNC NOQUOTA
Length 26 5F464144 53020000 ...SDAF_
Exec. mode 00202020 20445F41 A_D .
System 00000000 00000000 ........
Local copy
for which the resource is:
Resource database
-----------------
Address of RSB: 85747410 GGMODE: EX Status: VALID WTFULRG
CVTFULR
Parent RSB: 85B10A10 CGMODE: EX
Sub-RSB count: 0 FGMODE: EX
Lock Count: 137 CSID: 00000000
BLKAST count: 1 RQSEQNM: 1C69
Resource: 0001002B 24534D52 RMS$+... Valblk: 000B3400 0000015E
Length 26 5F464144 53020000 ...SDAF_ 00000000 00000000
Exec. mode 00202020 20445F41 A_D .
System 00000000 00000000 ........ Seqnum: 000021BE
Granted queue (Lock ID / Gr mode / Range):
4CB70034 EX 00000000-FFFFFFFF
Conversion queue (Lock ID / Gr mode / Range -> Rq mode / Range):
2B007632 NL 00000000-FFFFFFFF / EX 00000000-FFFFFFFF
5ADF0001 NL 00000000-FFFFFFFF / EX 00000000-FFFFFFFF
0800A127 NL 00000000-FFFFFFFF / EX 00000000-FFFFFFFF
4200D2A0 NL 00000000-FFFFFFFF / EX 00000000-FFFFFFFF
09004F69 NL 00000000-FFFFFFFF / EX 00000000-FFFFFFFF
Waiting queue (Lock ID / Rq mode / Range):
20E20028 EX 00000000-FFFFFFFF 6600B36A EX 00000000-FFFFFFFF
7B0016DD EX 00000000-FFFFFFFF 5900FB24 EX 00000000-FFFFFFFF
.
. (hundreds)
.
but it has been granted to lock 4CB70034 (in EX mode). This lock
(unfortunately or by mistake) belongs to the same process. So the same
process is requesting a lock in EX mode on a resource for which it already
holds a granted lock in EX mode!!!
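A rough sketch of the check described above, for anyone who wants to run it over lock queues copied out of SDA by hand. It is only an illustration: the `Lock` class and the `self_blocked` function are made-up names, the owner of 4CB70034 is taken from the statement above that it belongs to the same process, and the owner of 6600B36A is not shown in the display, so a placeholder stands in for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    """One lock block, reduced to the fields used in this check (hypothetical type)."""
    lock_id: str
    pid: str
    mode: str  # lock mode as shown by SDA, e.g. "EX" or "NL"


def self_blocked(granted, waiting):
    """Return (waiting lock, granted lock) pairs owned by the same process,
    where the granted lock is held in EX mode."""
    pairs = []
    for w in waiting:
        for g in granted:
            if g.mode == "EX" and w.pid == g.pid:
                pairs.append((w.lock_id, g.lock_id))
    return pairs


# Lock IDs are from the display above. Only 20E20028 has its PID shown there;
# 4CB70034 is assumed to share it, as stated in this note.
granted = [Lock("4CB70034", "004E019F", "EX")]
waiting = [
    Lock("20E20028", "004E019F", "EX"),
    Lock("6600B36A", "UNKNOWN", "EX"),  # placeholder: owner not shown in the dump
]

print(self_blocked(granted, waiting))
```

The sketch only compares PIDs, because the PID is the only owner information visible in the lock display.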
One thing is not clear (it cannot be seen in the dump), and that is a
busy channel to a NET device (seen while looking at the live system):
NET9299 Unknown UCB address: 831F5870
Device status: 00010010 online,deleteucb
Characteristics: 0C1C2000 net,avl,mnt,mbx,idv,odv
00000000
Owner UIC [016002,000025] Operation count 12 ORB address 831F5920
PID 004E019F Error count 0 DDB address 878FA700
Class/Type 00/00 Reference count 1 DDT address 82F4A878
Def. buf. size 256 BOFF 009F VCB address 82F50450
DEVDEPEND 00000001 Byte count 0005 CRB address 878FA680
DEVDEPND2 00000000 SVAPTE 8557BA90 I/O wait queue empty
FLCK index 34 DEVSTS 0002 DLCK address 8126D2F0
Charge PID 004E019F
|
3311.2 | The lock manager does not understand threading | IOSG::MARCHANT | I'd sink therefore I swam | Wed Sep 22 1993 15:18 | 29 |
| Sunil,
> but it has been granted to lock 4CB70034 (in EX mode). This lock
> (unfortunately or by mistake) belongs to the same process. So the same
> process is requesting a lock in EX mode on a resource for which it already
> holds a granted lock in EX mode!!!
It may belong to the same process (the FCS) but as far as the FCS is concerned
the locks belong to *different* threads. Unfortunately the VMS lock manager
does not understand threading, and so the lock manager has tried to resolve
what it thinks is a deadlock situation, and this has caused your customer's
problem.
Paul Chinnick came across this problem whilst he was here, and one of his
suggested workarounds was:
`` The simplest form of workaround available is to increase the timeout
period for deadlock search initiation which is specified by the SYSGEN
parameter DEADLOCK_WAIT. Increasing this value to 30 seconds or more would
allow extra processing time to complete resource operations and hence
prevent premature and false detection of deadlocks. Unfortunately, such an
increase would delay the detection of any genuine deadlocks which may
adversely impact other software and applications including such sensitive
components as database systems. ''
That last point is an important consideration before using this
workaround.
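To make the trade-off concrete, here is a deliberately simplified model. It assumes a deadlock search starts once a request has waited DEADLOCK_WAIT seconds, and that 10 seconds is the current setting. Neither is taken from this customer's system, and the 20-second operation is a made-up figure. The function names are hypothetical.

```python
def deadlock_search_started(wait_seconds, deadlock_wait):
    """Simplified model: a search starts once a lock request has waited
    at least deadlock_wait seconds."""
    return wait_seconds >= deadlock_wait


def earliest_detection(deadlock_wait):
    """In the same model, a genuine deadlock cannot be noticed sooner
    than deadlock_wait seconds after the request is queued."""
    return deadlock_wait


slow_operation = 20  # hypothetical: seconds another thread keeps the EX lock

for setting in (10, 30):  # 10 is assumed to be the current value
    false_alarm = deadlock_search_started(slow_operation, setting)
    print(
        f"DEADLOCK_WAIT={setting}: search during slow operation={false_alarm}, "
        f"genuine deadlock seen after >= {earliest_detection(setting)}s"
    )
```

With the larger value the slow operation no longer triggers a search, but every genuine deadlock on the system also waits at least that long before it is looked at.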
Cheers,
Paul.
|
3311.3 | 787.8 ? | AIMTEC::VOLLER_I | Gordon (T) Gopher for President | Thu Sep 23 1993 19:47 | 10 |
| Sunil,
How did the user exit ALL-IN-1 ?
Have you checked Note 787.8 ? It could be a variation on the
process run down problem that we analyzed.
Cheers,
Iain.
|
3311.4 | It appears to be a similar problem | BUSHIE::SETHI | My name is Sunil without the H | Fri Sep 24 1993 01:35 | 17 |
| Hi Iain,
>How did the user exit ALL-IN-1 ?
EX. When the user tried to log in some time later, she was given an error
message. The customer is not sure what it said.
>Have you checked Note 787.8 ? It could be a variation on the
>process run down problem that we analyzed.
I can confirm that it appears to be so. We will try to get as much
information from the customer as possible, but don't hold your breath.
Regards,
Sunil
|
3311.5 | Dump file still available ?? | AIMTEC::VOLLER_I | Gordon (T) Gopher for President | Fri Sep 24 1993 16:33 | 8 |
| Sunil,
If the dump is still available somewhere it should be easy to
confirm/disprove the process rundown theory.
Let me know if I can help at all.
Iain.
|