T.R | Title | User | Personal Name | Date | Lines |
---|
912.1 | QAR on PHVQAR | MARVIN::COBB | Graham R. Cobb (Wide Area Comms.), REO2-G/H9, 830-3917 | Mon Apr 15 1991 15:57 | 12 |
| Enter Wave 1 (but not MCC) QARs on node PHVQAR::. Also there is a Wave 1
notes conference on MARVIN::DECNET-VAX_EXTENSIONS (press KP7 to add to your
notebook).
Did you try the commands before installing Wave 1?
This is almost certainly a Wave 1 problem (possibly in addition to an MCC
problem): MCC *shouldn't* be able to crash the system! By the way, you
should mention what sort of system PHZ5G8 is -- I presume it is a WANrouter
(OSIrouter)? Have you tried issuing the commands from NCL?
Graham
|
912.2 | Wave1 group thinks it's DECmcc | PARZVL::KENNEDY | Give me your watch & I'll tell you | Tue Apr 16 1991 10:01 | 29 |
| I did also enter this in PHVQAR, the response was that this was not likely to
be a Wave1 problem. We tried NCL and the same command did not crash the system.
Here's the response to the Wave1 QAR - any comments from the DECmcc group? I
do have a crash dump available.
From: MARVIN::GILLOTT "Mark Gillott, >>> RKG, 831-3172 <<< 16-Apr-1991 1056" 16-APR-1991 05:56:08.91
To: PARZVL::KENNEDY
CC: GILLOTT
Subj: QAR 00249 in the DNVEXT_US database has been CLOSED
If you are sure that the system crash is related to the issuing of the
"show node x ..." command to DECmcc (and it certainly looks like this is
the case), then this is not a Wave 1 problem.
At present ALL communication between a VMS host and an OSIRouter uses
standard Phase IV DECnet (we don't yet have a Phase V VMS NSP implementation
- unless you are actually using Wave 2?). Consequently the problem MUST be
related to DECmcc - as you have indicated issuing the same command to NCL
works perfectly.
Now it may be that DECmcc is using the logical link in a non-standard way,
which in turn is exposing a problem with the NSP/Session Control
implementation on the OSIrouter (or possibly exposing a problem with the
Phase IV DECnet-VMS implementation). If you can prove that this is the
case, you should enter a QAR into the OSIrouter QAR database
(WANLAD::OSI500_V10_QAR).
|
912.3 | Looking for common factors | TOOK::KOHLS | Ruth Kohls | Tue Apr 16 1991 11:47 | 16 |
| I'm looking for common factors between this and the crash mentioned in
passing in note 886.2, since the same message appears. I really don't
know what is relevant, so don't try and read anything into the questions!
So, what version of DNS are you using, and where is the DNS server?
Did you keep a log of all these installations, and/or did you see ANYthing
"unusual"? Is this the first time and first version of DECmcc you've
installed? (Was your system "clean" of all MCC stuff before the installation?)
Did you set up your namespace as documented in the DECmcc installation manual?
I'm passing this note on to any DECmcc development people I think might be
able to help. I do agree that MCC by itself ought not be able to crash
anything--I think it's bad combinations of factors, and we all need to
find out what the factors are.
Ruth
|
912.4 | | TOOK::J_HALPIN | | Tue Apr 16 1991 18:46 | 31 |
|
Well, this is a bizarre bug. I can reproduce MaryEllen's crash
on her system (VMS 5.4-1 & Wave 1) and on my workstation (VMS 5.4-1 &
Wave 2 but doing Phase IV style connects). The crash does not occur on
VMS V5.3 or VMS V5.4-1 systems running DECnet/VAX Phase IV.
This is the command that will do it every time:
MCC> SHOW NODE PHZ5G8 ROUTING CIRCUIT CSMACD-0 ADJACENCY RTG$007F -
MCC_> ALL ATTRIBUTES
NOTE: The ADJACENCY instance name is irrelevant, the crash happens on
any adjacency.
The DNA5 AM correctly returns the IDENTIFIER and STATUS partitions
(the only two defined for this entity), then there is a long pause
followed by the system crash. On the non-WAVE # systems, the same
long pause is there before the MCC prompt is returned.
I can issue individual requests for the IDENTIFIER and STATUS
partitions without any problems. So the crash is definitely related
to an ALL ATTRIBUTES request.
Could this have something to do with the REFERENCE Partition???
JimH
|
912.5 | Maybe an MCC/DNS interaction problem? | TOOK::GUERTIN | I do this for a living -- really | Wed Apr 17 1991 10:08 | 7 |
| RE:.4
If the problem is with the REFERENCE attributes then that would imply a
problem with the DNS Clerk. Can you try the same command with ALL
REFERENCE?
-Matt.
|
912.6 | SHOW ALL REFERENCE will do it! | PARZVL::KENNEDY | Give me your watch & I'll tell you | Fri Apr 19 1991 10:02 | 5 |
| Matt,
SHOW ALL REFERENCE will cause the crash.
_Mek
|
912.7 | There is a patch for a DNS caused crash for Vax/VMS | COOKIE::KITTELL | Richard - Architected Info Mgmt | Fri Apr 19 1991 11:38 | 13 |
|
This may not be relevant to Wave * systems, but we just crashed a VMS V5.4
system. I was involved in the crash analysis because my process was active,
running MCC_MAIN.
The crash was "Unexpected system service exception" and the site of the
faulting instruction was traced to DNS$SHARE.
At that point our system managers went "Aha! we've heard of a patch that
fixes a problem with DNS getting an error returning from a system service."
I was unshackled and allowed to leave the dungeon where people who crash
the production cluster are confined until they confess their sins. :-)
|
912.8 | The patch seems to do the trick | TOOK::GUERTIN | I do this for a living -- really | Fri Apr 19 1991 15:33 | 18 |
| Thanks, Richard.
This morning I ran SDA on the crash dump, and it is definitely crashing
in DNS$SHARE with an ACCVIO on address 00000008. The system I was looking
at is VMS V5.4-1. The ACCVIO ended up generating a SSRVEXCEPT. I'm
sure we're all seeing the same thing. I found the patch in the
DNS_PROGRAMMERs notes file, note 120.3 (hit KP7).
I just tried the patch on a system which was crashing on a
SHOW NODE .... ALL REFERENCE and it stopped crashing after the patch!
(The date of the SYS$SHARE:DNS$SHARE.EXE image was 8-OCT-1990.)
So, I'm going to assume (based on this empirical data) that this is
a DNS bug, and not investigate this one any further.
BTW: Has anyone seen this crash on anything other than a DNA5 NODE command?
-Matt.
|
912.9 | DNS patch worked for us! | PARZVL::KENNEDY | Give me your watch & I'll tell you | Thu Apr 25 1991 10:16 | 6 |
| Sorry for the delay, just wanted to confirm that the DNS patch fixed our
problems.
Thanks to Jim Halpin and everyone else who got this nailed down so quickly.
_Mek
|