Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
Notice: | Welcome to the Digital UNIX Conference |
Moderator: | SMURF::DENHAM |
Created: | Thu Mar 16 1995 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 10068 |
Total number of notes: | 35879 |
9308.0. "8400, KZPSA -> MTI, pciaerror panic" by DYPSS1::SCHAFER (Kalh�un!) Wed Mar 26 1997 21:06
This sounds suspiciously similar to 9302, but what the heck ...
Configuration:
Digital UNIX V4.0B, with the 2-27-97 patches installed.
Alpha 8400 (EV56/440) x 8, 8GB memory (4 boards)
13 KZPSAs; 9 connected to MTI shelves
Each MTI shelf = 1 controller serving 1 8-member RAID-5 array
All 9 RAID-5 arrays are used by one rootdg-based LSM volume. Each
individual disk is sliced (g public, h private). Strangely, each
disk's c partition cylinder count does not match the sum of the g and
h partitions (g+h > c). I don't know whether this is related to the
problem.
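As a rough way to quantify the g+h > c mismatch, here is a minimal sketch that checks one disk's slices for consistency. It assumes each partition is given as an (offset, size) pair in a single unit (cylinders or sectors), as read by hand from the disklabel. The function name and the sample numbers are hypothetical and are not taken from the customer's disks.

```python
def check_slices(c, g, h):
    """Report inconsistencies between the whole-disk slice c and slices g, h.

    Each argument is an (offset, size) pair in the same unit.
    Returns a list of human-readable problem descriptions (empty if none).
    """
    problems = []
    c_end = c[0] + c[1]
    # A slice must lie entirely inside the whole-disk partition.
    for name, (off, size) in (("g", g), ("h", h)):
        if off < c[0] or off + size > c_end:
            problems.append(f"{name} extends outside c")
    # The slice that starts first must end before the other begins.
    first, second = sorted((g, h))
    if first[0] + first[1] > second[0]:
        problems.append("g and h overlap")
    # The condition described in the note: g + h larger than c.
    if g[1] + h[1] > c[1]:
        problems.append("g + h is larger than c")
    return problems


if __name__ == "__main__":
    # Hypothetical values chosen only to show the g+h > c case.
    for problem in check_slices(c=(0, 1000), g=(0, 900), h=(900, 200)):
        print(problem)
```

Running the same check against every member disk would show whether the mismatch is uniform across all nine arrays or limited to some of them.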
One AdvFS domain/fileset (db_domain#db) uses this volume.
Problem:
The customer is encountering several types of panics, including pciaerror,
machine check, and AdvFS-related crashes. While the panics are not always
consistent, one is easily reproduced by the following steps, in this order:
1. Introduce a *large* I/O load (the test program is a Perl script).
2. Cause one RAID-5 set to enter a reconstruct state.
3. Cause a second RAID-5 set to enter a reconstruct state.
The panic cannot be duplicated if the I/O load is introduced AFTER the
reconstruction operations have begun. It does not matter which RAID
sets are forced to fail.
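The customer's Perl test program is not included in this note. As a stand-in, here is a minimal sketch of a load generator of the same general kind: several workers writing and re-reading large files in parallel. The function names, worker count, and block sizes are all assumptions for illustration, and the target directory would need to sit on the file system under test.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def _worker(path, blocks, block_size):
    """Write `blocks` blocks to `path`, force them to disk, then read them back."""
    buf = os.urandom(block_size)
    written = 0
    with open(path, "wb") as f:
        for _ in range(blocks):
            written += f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # push the data past the buffer cache
    with open(path, "rb") as f:
        while f.read(block_size):
            pass
    return written


def generate_load(directory, workers=8, blocks=1024, block_size=64 * 1024):
    """Run `workers` concurrent writers in `directory`; return total bytes written."""
    paths = [os.path.join(directory, f"load.{i}") for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        totals = pool.map(_worker, paths,
                          [blocks] * workers, [block_size] * workers)
        return sum(totals)
```

To match the failing sequence described above, the load would be started first and the RAID-5 sets forced into reconstruction while it is still running.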
I'll include the latest crash output and dia output as replies to this
note.
T.R | Title | User | Personal Name | Date | Lines |
9308.1 | pointers to crash and dia files | DYPSS1::SCHAFER | Kalh�un! | Wed Mar 26 1997 21:11 | 9 |
since the crash and dia output are pretty long, i thought i'd just post
pointers:
DYPSS1::crash.abe
DYPSS1::dia.abe
DYPSS1 is 35223 for the name-impaired. cheers,
+b