
Conference csc32::consolemanager

Title:POLYCENTER Console Manager
Notice:Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS:
Moderator:CSC32::BUTTERWORTH
Created:Thu Aug 06 1992
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1541
Total number of notes:6564

1498.0. "VAX 7740 OPA0: has problems when..." by CHOWDA::GLICKMAN (writing from Newport,RI) Tue Mar 11 1997 14:52

    
    I had a problem with a console connect yesterday with one of the nodes
    (VAX 7740) that is connected to PCM.  An operator had connected to
    this node, logged on to OPA0:, and was doing a repetitive SHOW QUEUE.
    Eventually we could not connect to OPA0: and had to reboot the
    7740 to get the connection back.  I found this note in the 7000 notes
    conference and I added a reply there.  I also thought I might try in
    here.  Would the symptoms described below have anything to do with
    the operator being logged on to OPA0: from PCM and staying logged on
    for a while doing a repetitive task?
    
    Appreciating any comments.
    
              <<< PROXY::$7$DUA113:[NOTES$LIBRARY]LASER.NOTE;1 >>>
                                   -< laser >-
================================================================================
Note 1125.0  OPA0 Problems on many systems after update to 4.2 firmware  1 reply
CSC32::TANTS                                         42 lines  10-APR-1996 12:12
--------------------------------------------------------------------------------
    A Mission Critical customer is having a problem with OPA0 on his
    7000's.  His configuration:
    
    sys   os     firm   opa0          term svr to pcm
    7640  5.5-2  4.2    no response   trm232
    7820  6.1    4.2    no response   trm220
    7730  6.1    4.2    no response   trm220
    7650  6.1    4.2    no response   trm220
    7640  6.1    4.2    no response   trm232
    9210  5.5-2         ok            trm220
    7820  6.1    4.2    *             trm220
    7730  6.1    4.2    ok            trm222
    
    In the cases of the no response systems noted above, we get no response
    on OPA0 from either PCM or (using an a/b switch) the direct connect
    terminals available to the system.  OPA0 is logging errors (about
    1/second) and the Device Status is either TIM, INT, ONLINE, TIMEOUT or
    TIM, INT, ONLINE, BSY.  The PCM system has been
    stopped/restarted/rebooted several times since this first appeared,
    with no effect.  In one case, there was an owner of the OPA0 device
    - we stopped the process successfully, but still get no reaction from
    the system.  I've eliminated a problem with the terminal server based
    on the fact that we have two different ones in play here.
    
    The system marked with the * above was in the no response state last
    week, but crashed (due to something else) over the weekend.  When it
    started to come back up, it was still unresponsive.  They power-cycled
    the machine and it came up fine.  It has not had the problem recur
    since then.  This led me to the firmware, since that is the only thing
    that would be affected by the power cycle that wouldn't be affected by
    the reboot.
    
    Is this a known problem?  Is there any fix for it (other than shutting
    down and doing an init, which is not an option for this customer
    currently)?  Anything else I should look at?  What further information
    will I need to provide if I have to escalate this (which seems fairly
    likely at this point)?
    
    Thanks!
    Becki Tants
    Mission Critical Support Team
    csc32::tants
================================================================================
Note 1125.1  OPA0 Problems on many systems after update to 4.2 firmware   1 of 1
CHOWDA::GLICKMAN "writing from Newport,RI"           34 lines  11-MAR-1997 11:48
                      -< Another site with this problem >-
--------------------------------------------------------------------------------
>>>    In the cases of the no response systems noted above, we get no response
>>>    on OPA0 from either PCM or (using an a/b switch) the direct connect
>>>    terminals available to the system.  OPA0 is logging errors (about
>>>    1/second) and the Device Status is either TIM, INT, ONLINE, TIMEOUT or
>>>    TIM, INT, ONLINE, BSY.

	On a VAX 7740 running OpenVMS 6.1 VAX we experienced a similar problem
yesterday.  All of the above symptoms were there.  BTW, we run PCM 1.6-300 on
OpenVMS 6.2 AXP.

>>>    In one case, there was an owner of the OPA0 device - we stopped the
>>>    process successfully, but still get no reaction from
>>>    the system.
	
       We also had an operator logged onto the VAX from the PCM, and stopping
that process made no difference either.  Errors kept accumulating in the
device error count, but not in the error log.

	We also run a logging program, and when I looked at the operator's
log for that time period yesterday he was running a repetitive SHOW QUEUE
command and, in the middle of that, got a %RMS-W-TMO, timeout period expired
error message, which kept repeating (I guess until we stopped the process).

	We eventually did a shutdown and reboot.  Once the machine was
shut down the operator was able to do a Ctrl-P.
    
>>>    Is this a known problem?  Is there any fix for it (other than shutting
>>>    down and doing an init, which is not an option for this customer
>>>    currently)?  Anything else I should look at?  What further information
>>>    will I need to provide if I have to escalate this (which seems fairly
>>>    likely at this point)?
 
	Any status on this problem as of today?   
 
T.R     Title  User                Personal Name                  Date                   Lines
1498.1         CSC32::BUTTERWORTH  Gun Control is a steady hand.  Thu Mar 13 1997 11:21  9
    The key is the device status field. If you go into SDA and do a SHOW
    DEV OPA0: what do you see? In the example posted in -1 there's no
    question that OPA0 is hung as the INT and TIM bits are set. 
    
    Lastly, all of this is ringing a bell. I thought there was a blitz out
    on this, so check the time critical database.
    
    Regs,
      Dan
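
    For anyone scripting a check around this, here is a minimal sketch of the
    rule of thumb in the reply above: treat OPA0 as hung when both the INT and
    TIM flags show up in the device status. It works only on the status text
    as quoted in this thread ("TIM, INT, ONLINE, TIMEOUT" and "TIM, INT,
    ONLINE, BSY"); the function names are hypothetical, and how you capture
    the status text from SDA is left out on purpose.

```python
def parse_status(text):
    """Split a status string such as 'TIM, INT, ONLINE, BSY' into a set of flag names."""
    return {flag.strip().upper() for flag in text.split(",") if flag.strip()}


def looks_hung(text):
    """Return True when both INT and TIM are present.

    Assumption: this encodes only the heuristic stated in the thread
    (INT and TIM both set means the console device is hung); it is not
    a complete interpretation of the device status field.
    """
    return {"INT", "TIM"} <= parse_status(text)


if __name__ == "__main__":
    # The two status strings reported in note 1125.0.
    for status in ("TIM, INT, ONLINE, TIMEOUT", "TIM, INT, ONLINE, BSY"):
        print(status, "->", "hung" if looks_hung(status) else "not flagged")
```

    A status of just "ONLINE" would not be flagged by this check, since
    neither INT nor TIM is present.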