T.R | Title | User | Personal Name | Date | Lines |
---|
5009.1 | information from the crash dump | jpalp1.jp.oracle.com::HKATSURA | | Fri Feb 07 1997 03:56 | 304 |
| The following is some information from the crash dump.
Regards,
Haruko Katsurai
---------------------------------------------------------------------------
1. DBR status
================
RDM_RB_1 and RDM_RB_2 were the only ones waiting for the freeze lock.
Other processes were waiting for the DBR.
SDA> sh proc/lock
Process index: 0199 Name: RDM_RB_1 Extended PID: 20800599
-------------------------------------------------------------
Lock data:
Lock id: 04051389 PID: 00010199 Flags: SYNCSTS SYSTEM PROTECT
Par. id: 04051388 SUBLCKs: 0
LKB: 85FCB080 BLKAST: 00000000
PRIORTY: 0000 RQSEQNM: 0000
Waiting for PR 00000000-FFFFFFFF
Resource: 00000000 00000046 F....... Status: ASYNC PROTCT
Length 04 00000000 00000000 ........
Exec. mode 00000000 00000000 ........
System 00000000 00000000 ........
Local copy
SDA> sh proc/lock
Process index: 01C7 Name: RDM_RB_2 Extended PID: 208005C7
-------------------------------------------------------------
Lock data:
Lock id: 1004F25B PID: 000101C7 Flags: SYNCSTS SYSTEM PROTECT
Par. id: 1004DD84 SUBLCKs: 0
LKB: 853F3180 BLKAST: 00000000
PRIORTY: 0000 RQSEQNM: 05E8
Waiting for PR 00000000-FFFFFFFF
Resource: 00000000 00000046 F....... Status: ASYNC PROTCT
Length 04 00000000 00000000 ........
Exec. mode 00000000 00000000 ........
System 00000000 00000000 ........
Local copy
2. Freeze lock information of RDM_RB_2
=======================================
SDA> sh res/lock=1004F25B
Resource database
Address of RSB: 8396C0C0 GGMODE: CW Status: WTFULRG
Parent RSB: 822D5340 CGMODE: CW
Sub-RSB count: 0 FGMODE: CW
Lock Count: 55 CSID: 00000000
BLKAST count: 1 RQSEQNM: 083B
Resource: 00000000 00000046 F....... Valblk: 00000000 00000000
Length 4 00000000 00000000 ........ 00000000 00000000
Exec. mode 00000000 00000000 ........
System 00000000 00000000 ........ Seqnum: 00000000
Granted queue (Lock ID / Gr mode / Range):
09035D91 CW 00000000-FFFFFFFF
Conversion queue (Lock ID / Gr mode / Range -> Rq mode / Range):
*** EMPTY QUEUE ***
Waiting queue (Lock ID / Rq mode / Range):
04051389 PR 00000000-FFFFFFFF 6005138B PR 00000000-FFFFFFFF
710313D1 PR 00000000-FFFFFFFF 70044850 PR 00000000-FFFFFFFF
1203A046 PR 00000000-FFFFFFFF 2702DEDE PR 00000000-FFFFFFFF
2602E6DD PR 00000000-FFFFFFFF 03006ECD PR 00000000-FFFFFFFF
03006ECF PR 00000000-FFFFFFFF 09038F3C PR 00000000-FFFFFFFF
0D037C81 PR 00000000-FFFFFFFF 0D02D550 PR 00000000-FFFFFFFF
03053074 PR 00000000-FFFFFFFF 0305307C PR 00000000-FFFFFFFF
030530AB PR 00000000-FFFFFFFF 03053065 PR 00000000-FFFFFFFF
03053054 PR 00000000-FFFFFFFF 030530B0 PR 00000000-FFFFFFFF
030530B2 PR 00000000-FFFFFFFF 03053059 PR 00000000-FFFFFFFF
03053049 PR 00000000-FFFFFFFF 0905304E PR 00000000-FFFFFFFF
03053091 PR 00000000-FFFFFFFF 0305302F PR 00000000-FFFFFFFF
0A053075 PR 00000000-FFFFFFFF 0C05309F PR 00000000-FFFFFFFF
030530A2 PR 00000000-FFFFFFFF 0603047C PR 00000000-FFFFFFFF
0305306B PR 00000000-FFFFFFFF 0C0530A5 PR 00000000-FFFFFFFF
030349E4 PR 00000000-FFFFFFFF 0304465A PR 00000000-FFFFFFFF
090530E5 PR 00000000-FFFFFFFF 090530DA PR 00000000-FFFFFFFF
0905309A PR 00000000-FFFFFFFF 0905309D PR 00000000-FFFFFFFF
030530DF PR 00000000-FFFFFFFF 0305302C PR 00000000-FFFFFFFF
0A02D1F9 PR 00000000-FFFFFFFF 03053259 PR 00000000-FFFFFFFF
42043865 PR 00000000-FFFFFFFF 1204B8B7 PR 00000000-FFFFFFFF
7C04AFD0 PR 00000000-FFFFFFFF 5804B7E8 PR 00000000-FFFFFFFF
14036307 PR 00000000-FFFFFFFF 0A04B016 PR 00000000-FFFFFFFF
6C0398B1 PR 00000000-FFFFFFFF 7102CDC5 PR 00000000-FFFFFFFF
25041BA6 PR 00000000-FFFFFFFF 3002C8E1 PR 00000000-FFFFFFFF
09030E84 PR 00000000-FFFFFFFF 1004F25B PR 00000000-FFFFFFFF
3A0336EB PR 00000000-FFFFFFFF 3903A00A CW 00000000-FFFFFFFF
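With a waiting queue this long, it can help to tally the entries mechanically rather than by eye. The following is a minimal Python sketch, not part of the original analysis, that parses a few lines copied from the display above. It assumes SDA prints queue entries left to right, two per line, in queue order. The names `SAMPLE`, `ENTRY`, and `parse_queue` are illustrative only.

```python
import re
from collections import Counter

# A few lines copied from the "Waiting queue" display above (not the full list).
SAMPLE = """\
04051389 PR 00000000-FFFFFFFF 6005138B PR 00000000-FFFFFFFF
09030E84 PR 00000000-FFFFFFFF 1004F25B PR 00000000-FFFFFFFF
3A0336EB PR 00000000-FFFFFFFF 3903A00A CW 00000000-FFFFFFFF
"""

# One queue entry: lock ID, requested mode, then the range as start-end.
ENTRY = re.compile(
    r"\b([0-9A-F]{8})\s+(NL|CR|CW|PR|PW|EX)\s+([0-9A-F]{8})-([0-9A-F]{8})\b"
)

def parse_queue(text):
    """Return (lock_id, requested_mode) pairs in the order they were printed."""
    return [(m.group(1), m.group(2)) for m in ENTRY.finditer(text)]

waiters = parse_queue(SAMPLE)

# How many waiters request each mode.
by_mode = Counter(mode for _, mode in waiters)

# Zero-based position of each lock ID within the parsed sample.
position = {lock_id: i for i, (lock_id, _) in enumerate(waiters)}

print(by_mode)               # Counter({'PR': 5, 'CW': 1})
print(position["1004F25B"])  # 3
```

Feeding the full queue text to `parse_queue` instead of the three sample lines would give the total number of waiters and the position of the RDM_RB_2 lock (1004F25B) within it.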
3. More information on Freeze lock
====================================
SDA> sh lock 09035D91
Lock database
-------------
Lock id: 09035D91 PID: 0001008E Flags: SYSTEM
Par. id: 01003554 SUBLCKs: 0
LKB: 86152E80 BLKAST: 002A65A0
PRIORTY: 0000
Granted at CW 00000000-FFFFFFFF
Resource: 00000000 00000046 F....... Status: BLASTQD
Length 04 00000000 00000000 ........
Exec. mode 00000000 00000000 ........
System 00000000 00000000 ........
Local copy
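The granted lock above is held at CW, while the waiters on the resource request PR. The sketch below encodes the standard lock manager mode compatibility table to show why a PR request cannot be granted while a CW lock is held. It is an illustration in Python, not code from Rdb or VMS, and the names `COMPATIBLE` and `is_compatible` are made up for this example.

```python
# Lock mode compatibility: for each granted mode, the set of requested
# modes that can be granted at the same time on the same resource.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}

def is_compatible(granted, requested):
    """True if `requested` can be granted while `granted` is held."""
    return requested in COMPATIBLE[granted]

print(is_compatible("CW", "PR"))  # False: PR waiters stall behind a CW holder
print(is_compatible("CW", "CW"))  # True
```

Under this table, every PR waiter stays blocked until the CW holder (lock 09035D91) releases or demotes its lock, which is what its blocking AST is meant to trigger.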
4. Information on the Freeze lock holder (RDM_ALS_2)
=====================================================
SDA> sh proc /ind=0001008E
Process index: 008E Name: RDM_ALS_2 Extended PID: 2080048E
Process status: 00040001 RES,PHDRES
Required capabilities: 0000000C QUORUM,RUN
PCB address 81EFA980 JIB address 82381680
PHD address B9B30000 Swapfile disk address 00000000
Master internal PID 0001008E Subprocess count 0
Internal PID 0001008E Creator internal PID 00000000
Extended PID 2080048E Creator extended PID 00000000
State HIB Termination mailbox 0045
Previous CPU Id 00000002 Current CPU Id 00000002
Previous ASNSEQ 00000000000163BD Previous ASN 000000000000004A
Current priority 15 # of threads 0000000000000000
Initial process priority 15 Delete pending count 0
Base priority 15 AST's active NONE
UIC [00001,000004] AST's remaining 3999
Mutex count 0 Buffered I/O count/limit 1000/1000
Waiting EF cluster 0 Direct I/O count/limit 1000/1000
Abs time of last event 01B24C13 BUFIO byte count/limit 99999167/99999167
Event flag wait mask FEFFFFFF # open files allowed left 1997
5. Info on RDM_ALS_2
=======================
SDA> sh proc
Process index: 008E Name: RDM_ALS_2 Extended PID: 2080048E
--------------------------------------------------------------
Process status: 00040001 RES,PHDRES
Required capabilities: 0000000C QUORUM,RUN
PCB address 81EFA980 JIB address 82381680
PHD address B9B30000 Swapfile disk address 00000000
Master internal PID 0001008E Subprocess count 0
Internal PID 0001008E Creator internal PID 00000000
Extended PID 2080048E Creator extended PID 00000000
State HIB Termination mailbox 0045
Previous CPU Id 00000002 Current CPU Id 00000002
Previous ASNSEQ 00000000000163BD Previous ASN 000000000000004A
Current priority 15 # of threads 0000000000000000
Initial process priority 15 Delete pending count 0
Base priority 15 AST's active NONE
UIC [00001,000004] AST's remaining 3999
Mutex count 0 Buffered I/O count/limit 1000/1000
Waiting EF cluster 0 Direct I/O count/limit 1000/1000
Abs time of last event 01B24C13 BUFIO byte count/limit 99999167/99999167
Event flag wait mask FEFFFFFF # open files allowed left 1997
6. Activated images for RDM_ALS_2
===================================
Process index: 008E Name: RDM_ALS_2 Extended PID: 2080048E
--------------------------------------------------------------
Process activated images
------------------------
IMCB Start End Sym Vect Type Image Name Major ID,Minor ID
-------- -------- -------- -------- ------------ -----------------------------
7FEFC5C8 00010000 001813FF 00000000 MAIN RDMALS 0,0
7FEFE060 002A6000 003486E8 00336000 GLBL PRT SHR RDMPRV 1,1
7FEFE140 7FE4C000 7FEADFFF 7FE60F80 GLBL SHR DECC$SHR 1,1
Base End ImageOff Section Type
80566000 80684000 00000000 System Resident Code
7FE4C000 7FE67000 00120000 Shareable Address Data
7FE6C000 7FE75A00 00140000 Read-Write Data
7FE7C000 7FE85400 00150000 Shareable Read-Only Data
7FE8C000 7FE8C200 00160000 Read-Write Data
7FE9C000 7FEA0600 00170000 Demand Zero Data
7FEAC000 7FEADC00 00180000 Read-Write Data
7FEFEBA0 7FBBC000 7FD1DFFF 7FC0A370 GLBL SHR DPML$SHR 1,0
804A0000 80564600 00090000 System Resident Code
7FBBC000 7FBF2600 00000000 Shareable Read-Only Data
7FBFC000 7FC0DE00 00040000 Shareable Address Data
7FC1C000 7FC1C400 00060000 Read-Write Data
7FC2C000 7FC48200 00070000 Shareable Read-Only Data
7FD1C000 7FD1D000 00160000 Read-Write Data
7FEFE760 7FB7C000 7FBADFFF 7FB7C540 GLBL SHR CMA$TIS_SHR 1,1
Base End ImageOff Section Type
8049E000 8049EE00 00020000 System Resident Code
7FB7C000 7FB7CA00 00000000 Shareable Address Data
7FB8C000 7FB8C200 00010000 Read-Write Data
7FBAC000 7FBAC200 00030000 Read-Write Data
7FEFE3E0 00204000 002A47FF 00288930 GLBL SHR SMGSHR 1,104
7FEFE4C0 00182000 00202DFF 00195950 GLBL SHR SORTSHR 2,28
7FEFE300 7FADC000 7FB2DFFF 7FAE63C0 GLBL SHR LIBRTL 1,1
80400000 8048DE00 00000000 System Resident Code
7FADC000 7FAE8800 00090000 Shareable Address Data
7FAEC000 7FAED000 000A0000 Read-Write Data
7FAFC000 7FB07C00 000B0000 Shareable Read-Only Data
7FB0C000 7FB0C200 000C0000 Read-Write Data
7FB1C000 7FB1D000 000D0000 Demand Zero Data
7FB2C000 7FB2D600 000E0000 Read-Write Data
7FEFE220 7FB3C000 7FB6DFFF 7FB4C000 GLBL SHR LIBOTS 1,3
Base End ImageOff Section Type
8048E000 8049CC00 00020000 System Resident Code
7FB3C000 7FB3E600 00000000 Shareable Read-Only Data
7FB4C000 7FB4DC00 00010000 Shareable Address Data
7FB6C000 7FB6C200 00030000 Read-Write Data
7FEFE5A0 B13909A0 B139D6B0 B13909A0 GLBL SYS$BASE_IMAGE 18,4611610
7FEFE680 B1382AF8 B13840E8 B1382AF8 GLBL SYS$PUBLIC_VECTORS 69,8183133
7. PCB for RDM_ALS_2
=======================
SDA> for 81EFA980
81EFA980 PCB$L_SQFL 81A8E840
81EFA984 PCB$L_SQBL 81F76500
81EFA988 PCB$W_SIZE 0280
81EFA98A PCB$B_TYPE 0C
81EFA98B 00
81EFA98C PCB$L_AST_PENDING 00000000
81EFA990 PCB$Q_PHYPCB 24CBA080
81EFA994 00000000
81EFA998 PCB$L_LEFC_0_SWAPPED 00000000
PCB$Q_LEFC_SWAPPED
81EFA99C PCB$L_LEFC_1_SWAPPED 00000000
81EFA9A0 PCB$L_ASTQFL_SPK 81EFA9A0 PCB+00020
81EFA9A4 PCB$L_ASTQBL_SPK 81EFA9A0 PCB+00020
81EFA9A8 PCB$L_ASTQFL_K 81EFA9A8 PCB+00028
81EFA9AC PCB$L_ASTQBL_K 81EFA9A8 PCB+00028
81EFA9B0 PCB$L_ASTQFL_E 81EFA9B0 PCB+00030
81EFA9B4 PCB$L_ASTQBL_E 81EFA9B0 PCB+00030
81EFA9B8 PCB$L_ASTQFL_S 81EFA9B8 PCB+00038
81EFA9BC PCB$L_ASTQBL_S 81EFA9B8 PCB+00038
81EFA9C0 PCB$L_ASTQFL_U 81EFA9C0 PCB+00040
81EFA9C4 PCB$L_ASTQBL_U 81EFA9C0 PCB+00040
Process SYSTEM_1 logged out at 31-JAN-1997 13:53:31.22
8. Instruction at PC for RDM_ALS_2
===================================
SDA> ex/i @pc
SYS$PUBLIC_VECTORS_NPRO+00304: BIS SP,R31,R28
9. Call Frames for RDM_ALS_2
================================
SDA> sh call
Call Frame Information
----------------------
Stack Frame Procedure Descriptor
Flags: Base Register = FP, No Jacket, Native
Procedure Entry: 00000000 00064608
Handler at 00000000 7FB4D460, Data = 00000000 00000020
Return address on stack = 00000000 00064500
Registers saved on stack
------------------------
7F9A74A0 00000000 00010160 Saved R2 SYS$K_VERSION_16+00120
7F9A74A8 00000000 00041AEC Saved R3
7F9A74B0 00000000 00000000 Saved R4
7F9A74B8 00000000 00041AC8 Saved R5
7F9A74C0 00000000 00368400 Saved R6
7F9A74C8 00000000 00000000 Saved R7
7F9A74D0 00000000 00000000 Saved R8
7F9A74D8 FFFFFFFF F38312C8 Saved R9 EXE$CATCH_ALL
7F9A74E0 00000000 7F9A7500 Saved R29
SDA> sh call/n
Call Frame Information
----------------------
Stack Frame Procedure Descriptor
Flags: Base Register = FP, No Jacket, Native
Procedure Entry: 00000000 00063B00
Handler at 00000000 7FB4D460, Data = 00000000 00000078
Return address on stack = FFFFFFFF F3821C44 EXE$PROC_IMGACT_C+00384
Registers saved on stack
------------------------
7F9A7B48 00000000 7FFBF880 Saved R2 MMG$IMGHDRBUF+00080
7F9A7B50 00000000 7FFBF934 Saved R3 MMG$IMGHDRBUF+00134
7F9A7B58 FFFFFFFF 81EFA980 Saved R4 PCB
7F9A7B60 00000000 7FF40000 Saved R5
7F9A7B68 FFFFFFFF F21EE620 Saved R6
7F9A7B70 00000000 00001000 Saved R7 PRV$M_PSWAPM
7F9A7B78 00000000 7F9A7BA0 Saved R29
|
5009.2 | | HOTRDB::PMEAD | Paul, [email protected], 719-577-8032 | Fri Feb 07 1997 10:03 | 8 |
| Haruko,
You are very close to getting the right info. We need to determine why
the ALS is not responding. I can't see any reason from the info
supplied. Could you do more SHOW CALL/NEXT commands until it returns
an error?
Paul
|
5009.3 | Call Frame for ALS | jpalp1.jp.oracle.com::HKATSURA | | Tue Feb 11 1997 21:18 | 71 |
| Hi,
Paul, thank you for your response. SHOW CALL/NEXT displays one more call frame,
and I receive an error beyond that. The call frames for the other hibernating
ALSs seem to be exactly the same.
Thanks,
Haruko
9. Call Frames for RDM_ALS_2 (modified)
========================================
SDA> sh call
Call Frame Information
----------------------
Stack Frame Procedure Descriptor
Flags: Base Register = FP, No Jacket, Native
Procedure Entry: 00000000 00064608
Handler at 00000000 7FB4D460, Data = 00000000 00000020
Return address on stack = 00000000 00064500
Registers saved on stack
------------------------
7F9A74A0 00000000 00010160 Saved R2 SYS$K_VERSION_16+00120
7F9A74A8 00000000 00041AEC Saved R3
7F9A74B0 00000000 00000000 Saved R4
7F9A74B8 00000000 00041AC8 Saved R5
7F9A74C0 00000000 00368400 Saved R6
7F9A74C8 00000000 00000000 Saved R7
7F9A74D0 00000000 00000000 Saved R8
7F9A74D8 FFFFFFFF F38312C8 Saved R9 EXE$CATCH_ALL
7F9A74E0 00000000 7F9A7500 Saved R29
SDA> sh call/n
Call Frame Information
----------------------
Stack Frame Procedure Descriptor
Flags: Base Register = FP, No Jacket, Native
Procedure Entry: 00000000 00063B00
Handler at 00000000 7FB4D460, Data = 00000000 00000078
Return address on stack = FFFFFFFF F3821C44 EXE$PROC_IMGACT_C+00384
Registers saved on stack
------------------------
7F9A7B48 00000000 7FFBF880 Saved R2 MMG$IMGHDRBUF+00080
7F9A7B50 00000000 7FFBF934 Saved R3 MMG$IMGHDRBUF+00134
7F9A7B58 FFFFFFFF 81EFA980 Saved R4 PCB
7F9A7B60 00000000 7FF40000 Saved R5
7F9A7B68 FFFFFFFF F21EE620 Saved R6
7F9A7B70 00000000 00001000 Saved R7 PRV$M_PSWAPM
7F9A7B78 00000000 7F9A7BA0 Saved R29
SDA> sh call/n
Call Frame Information
----------------------
Stack Frame Procedure Descriptor
Flags: Base Register = FP, No Jacket, Native
Procedure Entry: FFFFFFFF F3821B20 EXE$PROC_IMGACT_C+00260
Handler at FFFFFFFF F38311A0
Return address on stack = FFFFFFFF F3821B1C EXE$PROC_IMGACT_C+0025C
Registers saved on stack
------------------------
7F9A7BC0 00000000 00000028 Saved R2
7F9A7BC8 00000000 7FF91F40 Saved R3
7F9A7BD0 FFFFFFFF F3831180 Saved R13 EXE$PRCDELMSG+00020
7F9A7BD8 00000000 00000000 Saved R29
SDA> show call/n
%SDA-E-NOTINPHYS, 00000000 : virtual data not in physical memory
|
5009.4 | | HOTRDB::PMEAD | Paul, [email protected], 719-577-8032 | Wed Feb 12 1997 12:19 | 22 |
| Well Haruko, I'm stumped. I don't know VMS internals that well, but
from what I can see VMS appears to have lost the blocking AST for the
process. That is, the PCB shows nothing on the kernel-mode AST queue
(the FREEZE lock is a kernel-mode lock), yet the lock itself shows a
BLASTQD status.
Perhaps you can find a VMS person to verify this, but it is my
suspicion that the BLASTQD status should not be set if an AST has been
delivered.
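The "nothing on the kernel-mode queue" reading comes from the PCB dump in 5009.1, where each AST queue forward link holds the address of its own listhead. The Python sketch below shows one way to check that mechanically. It assumes the usual convention that an empty doubly linked queue has a header pointing at itself, and the names `PCB_DUMP`, `LINE`, and `empty_ast_queues` are illustrative only.

```python
import re

# AST queue listhead lines copied from the PCB display in 5009.1.
PCB_DUMP = """\
81EFA9A0 PCB$L_ASTQFL_SPK 81EFA9A0 PCB+00020
81EFA9A4 PCB$L_ASTQBL_SPK 81EFA9A0 PCB+00020
81EFA9A8 PCB$L_ASTQFL_K 81EFA9A8 PCB+00028
81EFA9AC PCB$L_ASTQBL_K 81EFA9A8 PCB+00028
81EFA9B0 PCB$L_ASTQFL_E 81EFA9B0 PCB+00030
81EFA9B4 PCB$L_ASTQBL_E 81EFA9B0 PCB+00030
81EFA9B8 PCB$L_ASTQFL_S 81EFA9B8 PCB+00038
81EFA9BC PCB$L_ASTQBL_S 81EFA9B8 PCB+00038
81EFA9C0 PCB$L_ASTQFL_U 81EFA9C0 PCB+00040
81EFA9C4 PCB$L_ASTQBL_U 81EFA9C0 PCB+00040
"""

# Field address, field name, field contents.
LINE = re.compile(
    r"^([0-9A-F]{8})\s+(PCB\$L_ASTQ[FB]L_\w+)\s+([0-9A-F]{8})", re.M
)
FLINK_PREFIX = "PCB$L_ASTQFL_"

def empty_ast_queues(text):
    """Map each queue suffix (K, E, S, U, SPK) to True if its forward link
    points back at the listhead itself, i.e. the queue looks empty."""
    result = {}
    for addr, field, value in LINE.findall(text):
        if field.startswith(FLINK_PREFIX):
            result[field[len(FLINK_PREFIX):]] = (addr == value)
    return result

print(empty_ast_queues(PCB_DUMP))
```

For this dump every queue, including the kernel-mode one (`K`), comes out as empty, which matches the observation that no AST was pending for RDM_ALS_2 at the time of the crash.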
Btw, we have seen a number of situations over the years where a $DEQ
appears to fail for some strange reason and even though the process
thinks it has released the lock it really has not. In every case
disabling VMS dynamic lock remastering has caused the problem to
disappear. If the customer continues to see the problem you might have
the customer set SYSGEN parameter PE1 to a small value (like 1) to see
if that prevents the problem from occurring. If it does prevent the
problem then they might want to pursue the problem with Digital.
fwiw,
Paul
|
5009.5 | | NOVA::MCGEE | Oracle Rdb Mission Critical Engineering | Wed Feb 12 1997 12:37 | 12 |
| >Perhaps you can find a VMS person to verify this, but it is my
>suspicion that the BLASTQD status should not be set if an AST has been
>delivered.
I think the BLASTQD is set when a BLAST is queued. When it is set, it
tells VMS not to re-blast the lock. The flag is not cleared until the
lock is DEQueued or demoted.
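As a toy illustration of the behaviour described in this reply (not verified against VMS sources), the flag can be modelled as a latch: it is set when a blocking AST is queued, suppresses further blocking ASTs, and is cleared only when the lock is dequeued or demoted. The class and method names in this Python sketch are hypothetical.

```python
class Lock:
    """Toy model of a granted lock carrying a BLASTQD-style flag."""

    def __init__(self, mode):
        self.mode = mode
        self.blastqd = False     # a blocking AST has already been queued
        self.blasts_queued = 0   # how many blocking ASTs were queued in total

    def request_blocking_ast(self):
        """Queue a blocking AST unless one is already outstanding.
        Returns True if a new blocking AST was queued."""
        if self.blastqd:
            return False         # do not re-blast the lock
        self.blastqd = True
        self.blasts_queued += 1
        return True

    def demote(self, new_mode):
        """Convert to a lower mode; this clears the flag in the model."""
        self.mode = new_mode
        self.blastqd = False

    def dequeue(self):
        """Release the lock; this also clears the flag in the model."""
        self.mode = None
        self.blastqd = False


freeze = Lock("CW")
print(freeze.request_blocking_ast())  # True: first incompatible request
print(freeze.request_blocking_ast())  # False: flag still set, no re-blast
print(freeze.blastqd)                 # True
```

In this model a lock that still shows the flag after its blocking AST was queued is expected, so BLASTQD alone would not show whether the AST was lost or delivered.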
Anyway, disabling dynamic lock remastering is a good test since it only
seems to cause problems anyway.
Steve.
|
5009.6 | | jpalp1.jp.oracle.com::HKATSURA | | Thu Feb 13 1997 04:35 | 4 |
| I will suggest this to the customer.
Thank you both for your input,
Haruko
|