Title: | POLYCENTER Console Manager |
Notice: | Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS: |
Moderator: | CSC32::BUTTERWORTH |
|
Created: | Thu Aug 06 1992 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1541 |
Total number of notes: | 6564 |
1498.0. "VAX 7740 OPA0: has problems when..." by CHOWDA::GLICKMAN (writing from Newport,RI) Tue Mar 11 1997 14:52
I had a problem with a console connection yesterday on one of the nodes
(a VAX 7740) that is connected to PCM. An operator had connected to
this node, logged on to OPA0:, and was doing a repetitive SHOW QUEUE.
Eventually we could not connect to OPA0: and had to reboot the
7740 to get the connection back. I found this note in the 7000 notes
conference and added a reply there. I also thought I might try in
here. Would the symptoms described below have anything to do with
the operator being logged on to OPA0: from PCM and staying logged on for
a while doing a repetitive task?
I'd appreciate any comments.
<<< PROXY::$7$DUA113:[NOTES$LIBRARY]LASER.NOTE;1 >>>
-< laser >-
================================================================================
Note 1125.0 OPA0 Problems on many systems after update to 4.2 firmware 1 reply
CSC32::TANTS 42 lines 10-APR-1996 12:12
--------------------------------------------------------------------------------
A Mission Critical customer is having a problem with OPA0 on his
7000's. His configuration:
sys    os      firm   opa0          term svr to pcm
7640   5.5-2   4.2    no response   trm232
7820   6.1     4.2    no response   trm220
7730   6.1     4.2    no response   trm220
7650   6.1     4.2    no response   trm220
7640   6.1     4.2    no response   trm232
9210   5.5-2          ok            trm220
7820   6.1     4.2    *             trm220
7730   6.1     4.2    ok            trm222
In the cases of the no response systems noted above, we get no response
on OPA0 from either PCM or (using an a/b switch) the direct connect
terminals available to the system. OPA0 is logging errors (about
1/second) and the Device Status is either TIM, INT, ONLINE, TIMEOUT or
TIM, INT, ONLINE, BSY. The PCM system has been
stopped/restarted/rebooted several times since this first appeared, with
no effect. In one case, there was an owner of the OPA0 device
- we stopped the process successfully, but still got no reaction from
the system. I've eliminated the terminal server as the cause, based
on the fact that we have two different ones in play here.
The system marked with the * above was in the no response state last
week, but crashed (due to something else) over the weekend. When it
started to come back up, it was still unresponsive. They power-cycled
the machine and it came up fine. The problem has not recurred
since then. This led me to the firmware, since that is the only thing
that would be affected by the power cycle that wouldn't be affected by
the reboot.
Is this a known problem? Is there any fix for it (other than shutting
down and doing an init, which is not an option for this customer
currently)? Anything else I should look at? What further information
will I need to provide if I have to escalate this (which seems fairly
likely at this point)?
Thanks!
Becki Tants
Mission Critical Support Team
csc32::tants
================================================================================
Note 1125.1 OPA0 Problems on many systems after update to 4.2 firmware 1 of 1
CHOWDA::GLICKMAN "writing from Newport,RI" 34 lines 11-MAR-1997 11:48
-< Another site with this problem >-
--------------------------------------------------------------------------------
>>> In the cases of the no response systems noted above, we get no response
>>> on OPA0 from either PCM or (using an a/b switch) the direct connect
>>> terminals available to the system. OPA0 is logging errors (about
>>> 1/second) and the Device Status is either TIM, INT, ONLINE, TIMEOUT or
>>> TIM, INT, ONLINE, BSY.
On a VAX 7740 running OpenVMS 6.1 VAX we experienced a similar problem
yesterday. All of the above symptoms were there. BTW, PCM is 1.6-300,
running on OpenVMS 6.2 AXP.
>>> In one case, there was an owner of the OPA0 device - we stopped the
>>> process successfully, but still get no reaction from
>>> the system.
We also had an operator logged on to the VAX from PCM, and stopping
the process made no difference here either. Errors kept accumulating in
the device error count, but not in the error log.
We also run a logging program, and when I looked at the operator's
log for that time period yesterday, he was running a repetitive SHOW QUEUE
command and in the middle of that got a %RMS-W-TMO, timeout period expired
error message, which kept repeating (I guess until we stopped the process).
We eventually did a shutdown and reboot. Once the machine was
shut down, the operator was able to do a Ctrl-P.
>>> Is this a known problem? Is there any fix for it (other than shutting
>>> down and doing an init, which is not an option for this customer
>>> currently)? Anything else I should look at? What further information
>>> will I need to provide if I have to escalate this (which seems fairly
>>> likely at this point)?
Any status on this problem as of today?
T.R | Title | User | Personal Name | Date | Lines |
---|
1498.1 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Thu Mar 13 1997 11:21 | 9 |
| The key is the device status field. If you go into SDA and do a SHOW
DEV OPA0:, what do you see? In the example posted in -1 there's no
question that OPA0 is hung, as the INT and TIM bits are set.
Lastly, all of this is ringing a bell. I thought there was a blitz out
on this, so check the time critical database.
Regs,
Dan
|
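For anyone who wants to automate the check described in .1, here is a minimal Python sketch. It assumes the device status has already been captured as a comma-separated string of flag names, the way it is written in note 1125.0; the real SDA display may be laid out differently. The function names parse_status and looks_hung are hypothetical, not part of PCM or VMS.

```python
def parse_status(text):
    """Split a comma-separated status string into a set of upper-case flag names."""
    return {flag.strip().upper() for flag in text.split(",") if flag.strip()}


def looks_hung(text):
    """Return True when both INT and TIM are set.

    This is the condition the reply in .1 treats as a hung OPA0.
    The input format is an assumption based on how the notes quote it.
    """
    return {"INT", "TIM"} <= parse_status(text)


if __name__ == "__main__":
    # The first two strings are the status values quoted in note 1125.0;
    # the third is a made-up healthy case for contrast.
    for status in ("TIM, INT, ONLINE, TIMEOUT", "TIM, INT, ONLINE, BSY", "ONLINE"):
        verdict = "INT and TIM set, likely hung" if looks_hung(status) else "not flagged"
        print(f"{status}: {verdict}")
```

Both status strings reported in 1125.0 would be flagged by this check. It only tests the two bits named in .1 and says nothing about the cause.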