Title: | NAS Message Queuing Bus |
Notice: | KITS/DOC, see 4.*; Entering QARs, see 9.1; Register in 10 |
Moderator: | PAMSRC::MARCUS EN |
Created: | Wed Feb 27 1991 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2898 |
Total number of notes: | 12363 |
2836.0. "Mysterious crashes (DmQ 3.2 OpenVMS VAX/Alpha)" by PLACEK::STEFANOWICZ () Thu Apr 03 1997 12:28
We have a serious problem at a customer site. The cause is difficult to
track down; it is only a guess that it might be DmQ, since the problem
appears in all components that use DmQ.
The customer reported a machine crash (VAX 7000, OpenVMS 6.2, DmQ 3.2 RT,
DECnet/OSI).
No dump is available, as the machine has 2GB RAM and they did not want to
lose disk space to dump storage ;-) We only know that the crash occurred in
NETACP, and that there were several crashes in our application processes.
We started looking in our development environment (AlphaStation
SCSI Cluster, OpenVMS v6.2, DmQ 3.2 Dev, TCP/IP connections) and found
similar problems (but without a whole-system crash).
We have only error.log and accounting.dat. Analysing them, we
found that our application processes report error 1036
(AST Fault) in EXECUTIVE mode. We do not explicitly use ASTs and
do not explicitly change modes. It may be relevant that we
call put_msg extensively until the quota is exceeded; this is
our simple application flow-control mechanism.
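For readers unfamiliar with this pattern, here is a minimal sketch of "send until the quota is exceeded" flow control. It is a self-contained simulation, not the DmQ API: `BoundedQueue`, `QuotaExceeded`, `get_msg` and `send_until_quota` are hypothetical names, and the `put_msg` method here only models a call that fails once a pending-message quota is reached.

```python
from collections import deque


class QuotaExceeded(Exception):
    """Raised by the hypothetical queue when its pending-message quota is full."""


class BoundedQueue:
    """Stand-in for a message queue with a fixed quota (an assumption, not DmQ)."""

    def __init__(self, quota):
        self.quota = quota
        self.pending = deque()

    def put_msg(self, msg):
        # Reject the message once the quota is reached.
        if len(self.pending) >= self.quota:
            raise QuotaExceeded
        self.pending.append(msg)

    def get_msg(self):
        # The receiver draining a message frees one slot of quota.
        return self.pending.popleft()


def send_until_quota(queue, messages):
    """Send messages in order and stop at the first quota error.

    Returns the number of messages the queue accepted.
    """
    sent = 0
    for msg in messages:
        try:
            queue.put_msg(msg)
        except QuotaExceeded:
            break  # Back off; the caller retries after the receiver drains.
        sent += 1
    return sent
```

In this sketch the sender treats the quota error as the signal to pause, so the receiver's drain rate is what throttles the sender.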
We have 2 hypotheses:
(1) The AST fault is caused, as the documentation suggests, by a stack
that is too small or corrupted. Too small is unlikely, as we use around
8k at most.
(2) There is a hidden error somewhere in the communication layer. We
also suspected RMS, since it runs in EXEC mode, but RMS is not used in
one component which also reports the error mentioned above.
Folks, we know this is quite a vague description. We just hope one of
you has already seen a similar situation.
Thanks for any help.
Artur
T.R | Title | User | Personal Name | Date | Lines |
---|
2836.1 | I know of only one instance of this... | KLOVIA::MICHELSEN | BEA/DEC MessageQ Engineering | Thu Apr 03 1997 16:16 | 12 |
| ...where a process that uses DmQ sets a DmQ timer and then does a ^Y STOP, which prevents
the USER mode exit handlers from running. When the EXEC mode DmQ timer AST goes off,
it lands off in space, causing ACCVIOs or ASTFAULTs. This has been fixed with
ECO 3247. However, since you say that NETACP is the one reporting the problem: I
know of no case in which DmQ corrupted a process that was never known to DmQ.
I think you are going to have to reconfigure the system to be able to save a crash
dump.
Marty
|