Title: | + OpenVMS Clusters - The best clusters in the world! + |
Notice: | This conference is COMPANY CONFIDENTIAL. See #1.3 |
Moderator: | PROXY::MOORE |
|
Created: | Fri Aug 26 1988 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 5320 |
Total number of notes: | 23384 |
5286.0. "invexceptn in dudriver 6.2" by CSC32::REIGELMAN () Thu Apr 17 1997 03:10
Has anyone seen this before? A customer had a system crash with an INVEXCEPTN,
but for reasons unknown to me they halted the machine before it finished
writing the dump file. Here is what I was able to gather from the VCS and
the live system. The crash appears to be in DUDRIVER, but I can't match the
code with the listing we have here at the center. Also, I can't find a patch kit
with the link date and file ID that match the version of DUDRIVER they are
running. Just prior to the crash, some shadow sets went
into mount verify.
Image Identification Information
image name: "DUDRIVER"
image file identification: "X-96"
link date/time: 6-FEB-1997 12:08:23.32
linker identification: "05-13"
Patch Information
There are no patches at this time.
From the VCS:
**** Fatal BUG CHECK, version = V6.2 INVEXCEPTN, Exception while above ASTDEL or on interrupt stack
Crash CPU: 00 Primary CPU: 00
Active/available CPU masks: 0000000F/0000000F
Current process = NULL
Register dump
R0 = 00000008
R1 = 04080000
R2 = BD2BF6E0
R3 = 00000000
R4 = BBEF2790
R5 = BBFD2ED8
R6 = 00000000
R7 = 000000CC
R8 = BBEF3140
R9 = BBFA8280
R10= 7FE80F3E
R11= 7FE80F0A
AP = 7FE21970
FP = 7FE2194C
SP = CC621D34
PC = B6357C7F
PSL= 04080009
Kernel/interrupt/boot stack
CC621D3C 00000004
CC621D40 7FE2194C : FP of establisher
CC621D44 FFFFFFFD : depth scan
CC621D48 00000001 : R0
CC621D4C 00000000 : R1
CC621D50 00000001 : flags
CC621D54 00000005 : # of argument
CC621D58 0000000C :
CC621D5C 00000000 : reason mask
CC621D60 00000000 : failing VA
CC621D64 BFFC914E : failing PC
CC621D68 04080004 : PSL
CC621D6C 00000001
CC621D70 00000000
CC621D74 BD2BF6E0
CC621D78 BBE2C440
CC621D7C BBEF2790
CC621D80 BBFD2ED8
CC621D84 BFFC5804
CC621D88 BBEEC85F
CC621D8C BBE2BEC0
CC621D90 BBEF2790
CC621D94 BBEED3DC
CC621D98 BD2BF6C0
CC621D9C BD2BF709
CC621DA0 00000000
CC621DA4 BBEF3150
CC621DA8 BBEEDD85
CC621DAC BBEED46C
CC621DB0 BBEF3150
CC621DB4 CC620210
CC621DB8 00000034
CC621DBC 00003768
CC621DC0 7FE80E80
CC621DC4 7FE80F3E
CC621DC8 7FE80F0A
CC621DCC 7FE21970
CC621DD0 B648F318
CC621DD4 BBEED445
CC621DD8 00000001
CC621DDC 00000000
CC621DE0 00000008
CC621DE4 CC620000
CC621DE8 BD7D7240
CC621DEC CCCB5400
CC621DF0 0000C418
CC621DF4 00000000
CC621DF8 B63DCD53
CC621DFC 04C30004
Loaded images
[SYSMSG]SYSMSG.EXE B6225A00 B6265800
[SYS$LDR]SYSLDR_DYN.EXE B6495200 B6497200
[SYS$LDR]DDIF$RMS_EXTENSION.EXE B6497800 B6498A00
[SYS$LDR]RECOVERY_UNIT_SERVICES.EXE B6498C00 B6499400
[SYS$LDR]RMS.EXE B6265C00 B6290E00
VBSS.EXE B6308A00 B630A600
VAXCLUSTER_CACHE.EXE B630AC00 B630F600
SYS$NETWORK_SERVICES.EXE B630FC00 B630FE00
SYS$UTC_SERVICES.EXE B6310400 B6311200
SYS$TRANSACTION_SERVICES.EXE B6311800 B631D800
SYS$IPC_SERVICES.EXE B631DC00 B6330000
CPULOA.EXE B6330200 B6335200
LMF$GROUP_TABLE.EXE B6337600 B6339000
SYSLICENSE.EXE B6339400 B633B200
SNAPSHOT_SERVICES.EXE B633B800 B633C400
SYSGETSYI.EXE B633CA00 B633E200
SYSDEVICE.EXE B633E600 B6340E00
MESSAGE_ROUTINES.EXE B6341400 B6347400
EXCEPTION.EXE B6357A00 B6362200
LOGICAL_NAMES.EXE B6362A00 B6364E00
SECURITY.EXE B6365800 B636F200
LOCKING.EXE B636FC00 B6376A00
PAGE_MANAGEMENT.EXE B6377400 B6381000
WORKING_SET_MANAGEMENT.EXE B63C1E00 B63C7A00
IMAGE_MANAGEMENT.EXE B63C8400 B63CB800
EVENT_FLAGS_AND_ASTS.EXE B63CBE00 B63CDE00
IO_ROUTINES.EXE B63CE800 B63DB000
PROCESS_MANAGEMENT.EXE B63DCC00 B63E8200
ERRORLOG.EXE B6487600 B6488200
PRIMITIVE_IO.EXE B6488800 B6489A00
SYSTEM_SYNCHRONIZATION_SPC.EXE B6489E00 B648E200
SYSTEM_PRIMITIVES_MIN.EXE B648E800 B6492600
**** Starting memory dump, writing dump to HBVS member with unit number of 95
Header and error log buffers dumped...
SPT & GPT dumped...
System space dumped...
Global pages dumped...
AUDIT_SERVER dumped...
NETACP dumped...
REMACP dumped...
CONFIGURE dumped...
PO_RPT_PTR dumped...
IPCACP dumped...
ERRFMT dumped...
CACHE_SERVER dumped...
CLUSTER_SERVER dumped...
OPCOM dumped...
JOB_CONTROL dumped...
SHADOW_SERVER dumped...
SECURITY_SERVER dumped...
SMISERVER dumped...
TP_SERVER dumped...
SYMBIONT_104 dumped...
MULTINET_SERVER dumped...
LATACP dumped...
EVL dumped...
H i t m a n dumped...
SYMBIONT_94 dumped...
RDMS_MONITOR dumped...
DDS$055_i1 dumped...
SYMBIONT_97 dumped...
SYMBIONT_106 dumped...
LH_HPLAS_D dumped...
PSDC$DC_SERVER dumped...
MRLISTEN_8213 dumped...
SYMBIONT_109 dumped...
SYMBIONT_98 dumped...
SYMBIONT_105 dumped...
ROBO_SERVER dumped...
ROBO_ACTION dumped...
SYMBIONT_107 dumped...
NSCHED dumped...
LPS_CYLPSA dumped...
SYMBIONT_99 dumped...
ROBOCHG_COLLECT dumped...
OPENV_Server dumped...
SYMBIONT_96 dumped...
LH_HPLAS_K dumped...
RPC$SWL dumped...
DCE$RPCD dumped...
LH_HPLAS_C dumped...
SYMBIONT_93 dumped...
SYMBIONT_92 dumped...
LH_HPLAS_B dumped...
SYMBIONT_91 dumped...
SYMBIONT_85 dumped...
SYMBIONT_84 dumped...
SYMBIONT_81 dumped...
SYMBIONT_80 dumped...
MRLOGGER dumped...
SCHED_REMOTE dumped...
LH_HPLAS_E dumped...
SYMBIONT_82 dumped...
MR$T_N_1 dumped...
MR$T_N_2 dumped...
SYMBIONT_78 dumped...
SYMBIONT_79 dumped...
DQS$NOTIFIER dumped...
DENNY_DM dumped...
SYMBIONT_103 dumped...
DDS$055_lt2 dumped...
OA$FCV dumped...
DDS$055_lt1 dumped...
SYMBIONT_182 dumped...
DDS$LSTN_055_1 dumped...
SYMBIONT_111 dumped...
ALLIN1_103 dumped...
CPU:0 Console entry reason: ^P or Node Halt
Entry PC: BFFCE7F0 Entry PSL:041F8200
P00>>>
P00>>>
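For readers following along, the labelled longwords in the stack dump above (from "# of argument" through "PSL") can be read as a VAX signal array. The sketch below decodes them in Python. It is only an illustration: the values are copied from the console output, 0000000C is taken to be SS$_ACCVIO, and the reason-mask bit meanings (bit 0 length violation, bit 1 page-table reference, bit 2 write intent) are stated from memory of the VAX architecture and should be treated as an assumption. The function name is hypothetical.

```python
SS_ACCVIO = 0x0000000C  # VMS condition value for an access violation

# Longwords copied from the stack dump, starting at the argument count (CC621D54).
FRAME = [0x00000005, 0x0000000C, 0x00000000, 0x00000000, 0xBFFC914E, 0x04080004]


def decode_signal_array(longwords):
    """Decode a signal array: count, condition, optional arguments, PC, PSL."""
    count = longwords[0]
    args = longwords[1:1 + count]
    if len(args) != count:
        raise ValueError("signal array is truncated")
    decoded = {"condition": args[0], "pc": args[-2], "psl": args[-1]}
    if args[0] == SS_ACCVIO and count == 5:
        reason = args[1]
        # Assumed bit layout of the ACCVIO reason mask (see lead-in).
        decoded["length_violation"] = bool(reason & 0x1)
        decoded["pte_reference"] = bool(reason & 0x2)
        decoded["write_intent"] = bool(reason & 0x4)
        decoded["failing_va"] = args[2]
    return decoded


if __name__ == "__main__":
    for key, value in decode_signal_array(FRAME).items():
        shown = f"{value:08X}" if isinstance(value, int) and not isinstance(value, bool) else value
        print(f"{key:17} {shown}")
```

Read this way, the frame describes a read-type access violation on virtual address 00000000 at PC BFFC914E, which is the PC used for the SDA examination that follows.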
From the live system, using the failing PC from the console output:
SDA> exam/inst BFFC914E
DUDRIVER+0508E: CMPF #3C,#30510830
SDA> exam/inst BFFC914E-40;50
DUDRIVER+0504E: MOVQ (SP)+,R4
DUDRIVER+05051: RSB
DUDRIVER+05052: REMQUE @00B8(R3),R5
DUDRIVER+05057: BVS DUDRIVER+05069
DUDRIVER+05059: BISL3 60(R5),64(R5),R0
DUDRIVER+0505F: BNEQ DUDRIVER+05052
DUDRIVER+05061: MOVAB 60(R5),R5
DUDRIVER+05065: BSBB DUDRIVER+05070
DUDRIVER+05067: BRB DUDRIVER+05052
DUDRIVER+05069: RSB
DUDRIVER+0506A: REMQUE -60(R5),-(SP)
DUDRIVER+0506E: TSTL (SP)+
DUDRIVER+05070: PUSHR #3F
DUDRIVER+05072: REMQUE @-20(R5),R0
DUDRIVER+05076: BVS DUDRIVER+05087
DUDRIVER+05078: PUSHL R1
DUDRIVER+0507A: MOVZWL #0830,R1
DUDRIVER+0507F: BSBW DUDRIVER+0530A
DUDRIVER+05082: MOVL (SP)+,R1
DUDRIVER+05085: BRB DUDRIVER+05072
DUDRIVER+05087: REMQUE @-28(R5),R0
DUDRIVER+0508B: BVS DUDRIVER+0509C
DUDRIVER+0508D: PUSHL R1
DUDRIVER+0508F: MOVZWL #0830,R1
DUDRIVER+05094: BSBW DUDRIVER+0530A
DUDRIVER+05097: MOVL (SP)+,R1
DUDRIVER+0509A: BRB DUDRIVER+05087
DUDRIVER+0509C: MOVAB -60(R5),R0
DUDRIVER+050A0: JSB @#V_COM$DRVDEALMEM
SDA> sho sym/all mmg$gl_npag
Symbols sorted by name
----------------------
MMG$GL_NPAGEDYN = 80008460 : BBDB8000
MMG$GL_NPAGNEXT = 80008464 : BFFD2000
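As a rough cross-check, the addresses in the crash output can be classified against the loaded-image list and the two pool symbols shown above. This Python sketch assumes the two columns of the image list are start and end addresses, and that MMG$GL_NPAGEDYN and MMG$GL_NPAGNEXT bound the currently allocated nonpaged pool, where a loaded driver would be expected to live. Only three images are copied in for brevity, and `classify` is a hypothetical helper.

```python
# (start, end) ranges copied from the "Loaded images" list in the console output.
LOADED_IMAGES = {
    "EXCEPTION.EXE": (0xB6357A00, 0xB6362200),
    "PROCESS_MANAGEMENT.EXE": (0xB63DCC00, 0xB63E8200),
    "SYSTEM_PRIMITIVES_MIN.EXE": (0xB648E800, 0xB6492600),
}

# Contents of the two symbols shown by SDA on the live system.
NPAGEDYN = 0xBBDB8000  # assumed: start of nonpaged pool
NPAGNEXT = 0xBFFD2000  # assumed: end of the currently allocated pool


def classify(addr):
    """Return 'IMAGE+offset', 'nonpaged pool', or 'unknown' for an address."""
    for name, (start, end) in LOADED_IMAGES.items():
        if start <= addr < end:
            return f"{name}+{addr - start:05X}"
    if NPAGEDYN <= addr < NPAGNEXT:
        return "nonpaged pool"
    return "unknown"


if __name__ == "__main__":
    for label, addr in [("register PC", 0xB6357C7F), ("failing PC", 0xBFFC914E)]:
        print(f"{label}: {addr:08X} -> {classify(addr)}")
```

Under these assumptions the register-dump PC falls inside EXCEPTION.EXE, while the failing PC falls in nonpaged pool. That is consistent with driver code, but it does not by itself identify which driver was loaded there at the time of the crash.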
T.R | Title | User | Personal Name | Date | Lines |
---|
5286.1 | It may not have been in DUDRIVER | CSC32::B_HIBBERT | When in doubt, PANIC | Thu Apr 17 1997 11:19 | 21 |
| Hi Tim,
Comparing the crash PC to the live system gives only a chance at
finding the correct module where the crash occurred. In this case I
suspect that something other than DUDRIVER was loaded at BFFC914E.
Note that this address is in the middle of an instruction on the live
system.
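The mid-instruction observation can be checked mechanically. The sketch below, in Python, uses the instruction start offsets copied from the SDA listing in .0 and the offset SDA reported for the failing PC (DUDRIVER+0508E). The helper names are hypothetical, and the implied base address is only as good as the assumption that the live system's driver sits where the crashed one did.

```python
import bisect

# Instruction start offsets near the failing PC, copied from the SDA listing in .0.
INSTRUCTION_STARTS = [0x05087, 0x0508B, 0x0508D, 0x0508F,
                      0x05094, 0x05097, 0x0509A, 0x0509C]

FAILING_PC = 0xBFFC914E
REPORTED_OFFSET = 0x0508E  # SDA showed the failing PC as DUDRIVER+0508E


def implied_base(pc, offset):
    """Base address SDA must be using for the driver on the live system."""
    return pc - offset


def containing_instruction(offset, starts):
    """Return (start of the instruction covering offset, True if offset is a start)."""
    index = bisect.bisect_right(starts, offset) - 1
    if index < 0:
        raise ValueError("offset precedes the first listed instruction")
    return starts[index], starts[index] == offset


if __name__ == "__main__":
    start, on_boundary = containing_instruction(REPORTED_OFFSET, INSTRUCTION_STARTS)
    print(f"implied base: {implied_base(FAILING_PC, REPORTED_OFFSET):08X}")
    print(f"covering instruction starts at +{start:05X}, on boundary: {on_boundary}")
```

Offset 0508E lands one byte into the PUSHL R1 at DUDRIVER+0508D, so the live image does not have an instruction starting at the failing PC.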
You might try using CLUE to see if you can get anything out of the
dump file (sometimes it works, other times it doesn't). First check
SYS$ERRORLOG to see if there is a CLUE output file for this crash; if
not, you can TRY the following:
$ CLUE :== $CLUE !define a foreign command
$ CLUE /CANASTA SYS$SYSTEM:SYSDUMP.DMP
You can specify an output file with the /CANASTA=file.name switch.
If this works, it will give you a more accurate module and offset than
trying to compare the fault address with the live system.
Brian Hibbert
|
5286.2 | Get Latest Shadowing Patches... | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Thu Apr 17 1997 11:55 | 16 |
|
Get the CLUE output (if you can -- you will want to
explain to the customer that they should not halt
the system before the dump has completed), and send
it to the CANASTA e-mail server -- see VMSNOTES 233.*
for pointers. Then take a look at the CANASTA response.
See if anything is turned up by the COMET search engine
(at http://comet.alf.dec.com/).
That 6-Feb-1997 date makes it look like there are some
DUDRIVER-related patches installed, probably shadowing.
(And I'd definitely get the latest shadowing patches,
and I'd seriously consider upgrading to the "redhawk"
V7.1 compatibility kit.)
|
5286.3 | | VMSSG::FRIEDRICHS | Ask me about Young Eagles | Thu Apr 17 1997 12:41 | 16 |
| The driver is from the CLUSIO01_062 TIMA kit. This is the most current
V6.2 remedial kit for Shadowing and DUDRIVER, and it is also a superset
of the "V7.1 Cluster Compatibility Kit".
I took a look and the stack doesn't make a lot of sense given the code
path. R0 should have the address of the item removed from the
CDRP$L_ABTDQFL queue. Kevin suggested that if the queue was corrupt,
it might lead to an ACCVIO during the REMQUE.
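To illustrate how a corrupt queue could turn a REMQUE into an access violation, here is a small Python simulation. It is a sketch only: memory is modelled as a dictionary of longwords, the queue addresses are invented, and the REMQUE semantics (unlink the entry, set V when there was nothing to remove) are paraphrased from the VAX architecture rather than taken from the driver source. Nothing here shows that this is what happened on the customer's system.

```python
class AccessViolation(Exception):
    """Raised when the simulated memory has no longword at an address."""


def read_long(mem, addr):
    if addr not in mem:
        raise AccessViolation(f"ACCVIO reading {addr:08X}")
    return mem[addr]


def write_long(mem, addr, value):
    if addr not in mem:
        raise AccessViolation(f"ACCVIO writing {addr:08X}")
    mem[addr] = value


def remque(mem, entry):
    """Unlink entry from its queue; return (entry, v_bit)."""
    flink = read_long(mem, entry)
    blink = read_long(mem, entry + 4)
    v_bit = entry == blink              # nothing to remove: entry is the header
    write_long(mem, blink, flink)       # predecessor's forward link
    write_long(mem, flink + 4, blink)   # successor's backward link
    return entry, v_bit


def drain(mem, header):
    """Mimic the REMQUE @disp(R5),R0 / BVS loop: remove entries until V is set."""
    removed = []
    while True:
        entry, v_bit = remque(mem, read_long(mem, header))
        if v_bit:
            return removed
        removed.append(entry)


if __name__ == "__main__":
    # Healthy queue: header at 1000 with entries at 2000 and 3000 (made-up addresses).
    good = {0x1000: 0x2000, 0x1004: 0x3000,
            0x2000: 0x3000, 0x2004: 0x1000,
            0x3000: 0x1000, 0x3004: 0x2000}
    print([f"{e:08X}" for e in drain(good, 0x1000)])

    # Corrupt queue: the header's forward link has been zeroed.
    bad = {0x1000: 0x00000000, 0x1004: 0x1000}
    try:
        drain(bad, 0x1000)
    except AccessViolation as exc:
        print(exc)
```

In the corrupt case the first read through the zeroed forward link faults on address 00000000, which is at least the same shape as the failing VA in the exception frame.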
Of course, all of this is based on the probability that DUDRIVER got
reloaded at the same base address. If not, who knows where the PC
was at the time of the crash.
Cheers,
jeff
|
5286.4 | thanks | CSC32::REIGELMAN | | Thu Apr 17 1997 21:23 | 7 |
| Thank you to all who replied. Yes, looking on the live system
is a best guess, and we have told the customer several times:
don't touch that button. But do they listen? When I couldn't
find DUDRIVER with that link date, I just wanted to be sure
that I didn't miss a patch.
Tim
|