T.R | Title | User | Personal Name | Date | Lines |
---|
2250.1 | please use CANASTA | HAN::HALLE | Volker Halle MCS @HAO DTN 863-5216 | Wed Apr 30 1997 11:34 | 18 |
| Karel,
could you please use the CANASTA Mail Server to check your CLUE file
for a known footprint?
CANASTA is a Digital-internal crash analysis tool. It has a knowledge
database of known crash footprints and solutions, and also a large
database of crash footprints, which is searched automatically if no
solution is found for your crash.
For details, please read note VAXAXP::VMSNOTES 233.
As a first step during crash analysis, please ALWAYS obtain the CLUE
file and send it to the CANASTA Mail Server.
Thanks,
Volker.
|
2250.2 | CANASTA status: UNIDENTIFIED | DECPRG::ZVONAR | | Fri May 02 1997 02:11 | 18 |
| Volker,
I used CANASTA as the first step of problem solving. The result from CANASTA
was STATUS: UNIDENTIFIED. I am also searching COMET, TIMA, etc. So far I have
found only NON-FATAL ACCVIO crashes in SLS$SYSBAK. BUGCHECKFATAL on the customer
system is set to 0. There are no errors in the error log (only the FATAL
BUGCHECK message).
Installed ECOs:
ALPCPUC06_62, ALPLIBR05_70, ALPSYS04_62, ALPDISM01_62, ALPSCSI02_70,
ALPDRIV04_70, ALPSHAD05_62, ALPINIT01_70, ALPSMUP01_70, ALPMANA02_70
I am looking for similar SLS 2.8A crashes before I start a deeper analysis of
the crash dump.
Thanks,
Karel
|
2250.3 | the problem identified, no solution yet | DECPRG::ZVONAR | | Fri May 02 1997 11:01 | 52 |
| I have more crashes on the same node at the customer site, and there has been
some progress in the analysis:
1. The system crashed with DOUBLDEALO, Double deallocation of memory block
I am not sure whether an SLS job was running.
The CLUE output and dump were not accessible at that time.
On our recommendation, the customer then set POOLCHECK to ON.
2. Crash approx. 1 hour later
POOLCHECK, Corruption or inconsistency in pool discovered by pool checker
There are 2 SLS batch jobs in LEF state; one job has device RDEVA0: busy.
RDEVA0: -> TZ87 drive in TL820, I/O request queue is empty, errorcount = 0
CANASTA status: UNIDENTIFIED. I have some similar rules.
After the reboot, the customer set POOLCHECK to off without a further reboot
(ACTIVE POOLCHECK off, CURRENT POOLCHECK on).
3. Crash approx. 1 hour later
BADDALRQSZ, Bad memory deallocation request size or address
There are 2 SLS batch jobs; the current job has RDEVA0: busy.
CANASTA status: PARTIAL RULE - 009B385C-DD25A4E0-31015A. At first glance
it looks like the same problem. SLSE02028 is not installed. IPMT CFS_50555
does not offer a final solution.
Crash dump information:
CPU 03 reason for Bugcheck: BADDALRQSZ, Bad memory deallocation request size or
address
Process currently executing on this CPU: BATCH_198
Current image file: DSA2:[SLS$FILES.][TTI_RDEV]RDCONTROL_A62.EXE;1
Current IPL: 2 (decimal)
Image Identification Information
image name: "RDCDRIVER_A62"
image file identification: "X-3"
image file build identification: ""
link date/time: 11-APR-1997 07:48:54.93
linker identification: "A11-20"
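When several dumps from the same node have to be compared, the fields in a
summary like the one above can be pulled out mechanically. The following is
only a minimal Python sketch under stated assumptions: the helper name
`parse_summary` and the dictionary keys are hypothetical, it is not part of
CLUE, SDA or CANASTA, and it assumes the summary text has exactly the layout
quoted in this note.

```python
import re

# Sample text copied from the crash dump summary quoted in this note.
SUMMARY = """\
CPU 03 reason for Bugcheck: BADDALRQSZ, Bad memory deallocation request size or
address
Process currently executing on this CPU: BATCH_198
Current image file: DSA2:[SLS$FILES.][TTI_RDEV]RDCONTROL_A62.EXE;1
Current IPL: 2 (decimal)
"""

# Hypothetical field names mapped to patterns for the lines shown above.
PATTERNS = {
    "bugcheck": r"reason for Bugcheck:\s*(\w+),",
    "process": r"Process currently executing on this CPU:\s*(\S+)",
    "image": r"Current image file:\s*(\S+)",
    "ipl": r"Current IPL:\s*(\d+)",
}


def parse_summary(text):
    """Return a dict with whichever of the known fields appear in text."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1)
    # The IPL is reported as a decimal number, so store it as an int.
    if "ipl" in fields:
        fields["ipl"] = int(fields["ipl"])
    return fields


if __name__ == "__main__":
    for key, value in parse_summary(SUMMARY).items():
        print(f"{key}: {value}")
```

Running the same helper over the summaries of the three crashes would make it
easy to see whether the current image and process match in each case.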
On Monday I will receive the dumps from all crashes.
POOLCHECK is now set off in both CURRENT and ACTIVE.
So it looks like this is a known problem without a solution yet.
Karel
|
2250.4 | me too have two | MUNICH::REIN | How come holes in SWISS CHEESE?? | Mon May 05 1997 03:56 | 12 |
| Hello Karel,
We also have two customers reporting this type of crash. Once the
crash dumps are available, we will escalate the cases.
A lot points to the rdclient software.
I found one open IPMT on the subject, cfs.50606, which says that TTI is
involved.
regards
Volker
|
2250.5 | | COOKIE::MCCLELLAND | Marty, SLS/MDMS Engineering | Fri May 16 1997 13:31 | 8 |
|
I agree your crashes are the same as some already reported against
RDF's RDCDRIVER. Touch Technologies, Inc. (the developer/maintainer
of RDF) is currently working on the crashes. It is my understanding
they have a fix for each of the reported crash types and they are
doing exhaustive testing to make sure they have covered all bases.
Marty
|