
Conference csc32::consolemanager

Title:POLYCENTER Console Manager
Notice:Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS:
Moderator:CSC32::BUTTERWORTH
Created:Thu Aug 06 1992
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1541
Total number of notes:6564

1001.0. "Controller loops during reboot after 7620 crashes" by CSC32::TANTS () Fri Sep 22 1995 10:43

    PCM 1.6
    Recently a user has had a significant problem with an Alpha he is
    monitoring with PCM.  The machine has been crashing regularly.  The
    user has observed the following behavior in about 1 of every 10
    crashes, and it's serious enough that he's concerned.  This has only
    happened in connection with this one machine.
    
    Symptoms:  The Alpha 7620 crashes.  He brings up a monitor session to
    watch its console as it comes back up.  At some point, the controller
    process goes into a tight loop running CONSOLE$DAEMON.  No further
    console output makes it to his monitor session.  Eventually, this
    causes flow control to switch on at his terminal server (which can't
    deliver the data), and in one case this caused the cluster state
    transition to freeze and his entire cluster to hang.
    
    He realizes that the flow control and cluster hang parts of this are
    timing flukes - this has happened in different stages of the startup.
    He's more interested in whether the tight loop part is happening
    elsewhere and what can be done about it (as fixing that would avoid the
    rest of the problem entirely).
    
    So the questions are thus:  
    1) Has anyone seen anything like this?  If so, what did you find? 
    2) Any suggestions on things for him to watch or do next time it happens?
    
    Becki 
    SYSMGT-C
                                                                     