Title: | POLYCENTER Console Manager |
Notice: | Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS: |
Moderator: | CSC32::BUTTERWORTH |
Created: | Thu Aug 06 1992 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1541 |
Total number of notes: | 6564 |
1001.0. "Controller loops during reboot after 7620 crashes" by CSC32::TANTS () Fri Sep 22 1995 10:43
PCM 1.6
Recently a user has had a significant problem with an Alpha he is
monitoring with PCM. The machine has been crashing regularly. The
user has observed the following behavior about once in every 10
crashes, but it's serious enough that he's concerned. This has only
happened in connection with this one machine.
Symptoms: The Alpha 7620 crashes. He brings up a monitor session to
watch its console as it comes back up. At some point, the controller
process goes into a tight loop running CONSOLE$DAEMON. No further
console output makes it to his monitor session. Eventually, this
causes flow control to turn on at his terminal server (which can't
deliver the information), and in one case this caused the cluster state
transition to freeze and his entire cluster to hang.
He realizes that the flow control and cluster hang parts of this are
timing flukes - the loop has started at different stages of the startup.
He's more interested in whether the tight loop is happening elsewhere
and what can be done about it (as fixing that avoids the rest of the
problem entirely).
So the questions are:
1) Has anyone seen anything like this? If so, what did you find?
2) Any suggestions on things for him to watch or do next time it happens?
Becki
SYSMGT-C