T.R | Title | User | Personal Name | Date | Lines |
---|
856.1 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Thu Jul 06 1995 20:02 | 10 |
| Geert,
Assuming these crashes are access violations or some such, we would
want process dumps of the controllers. To do this you are going to have
to deinstall the images and modify the CONSOLE$STARTUP procedure so
that it doesn't reinstall CONSOLE$IMAGE:CONSOLE$DAEMON.EXE. Let it
crash and make sure the customer recovers the dumps and logfiles from
console$tmp.
Regards,
Dan
|
856.2 | | BACHUS::WILLEMSG | Geert Willems MCS-Belgium | Mon Jul 10 1995 15:58 | 18 |
|
Hi Dan,
First feedback from the customer.
He suspected a PDP that was connected to PCM.
So, he first disabled the PDP in his PCM database.
He included this PDP in his backup PCM database (another management
system, which is in fact a backup for the first) and he also
changed the device_type from LA210 to VT300. So far everything
is still running. Can the crash be provoked by the device_type?
I don't think so, but ... !
We haven't deinstalled the image yet, but he will do so if the
problem happens again.
Rgds,
Geert
|
856.3 | yes, it's an accvio | BACHUS::WILLEMSG | Geert Willems MCS-Belgium | Tue Jul 11 1995 15:02 | 195 |
|
Hi Dan,Phil,Simon,
The customer had the CONSOLE CTRL xx crash again.
This time it was Console Ctrl 03.
This was found in the CONTROLLER_03.LOG file :
type CONTROLLER_03.LOG;1
-----------------------
$!
$! This command procedure is always run when anybody on the entire system
$! logs in. It is equivalent to LOGIN.COM except that the instructions
$! contained herein are executed everytime anyone on the VMS system
$! logs in to their account.
$!
$! For interactive processes, turn on Control T, and set the terminal type
$!
$ mode = f$mode()
$ tt_devname = f$trnlnm("TT")
$ session_mgr_login = (mode .eqs. "INTERACTIVE") .and. -
(f$locate("WSA",tt_devname) .ne. f$len(tt_devname))
$ session_detached_process = (mode .eqs. "INTERACTIVE") .and. -
(f$locate("MBA",tt_devname) .ne. f$len(tt_devname))
$ unknown_devtyp = (mode .eqs. "INTERACTIVE") .and. -
(f$getdvi("sys$command","devtype") .eq. 0)
$!
$ if (mode .eqs. "INTERACTIVE") .and. unknown_devtyp .and. .not. -
(session_mgr_login .or. session_detached_process)
$ endif
$!
$ if (mode .eqs. "INTERACTIVE") .and. .not. -
(session_mgr_login .or. session_detached_process)
$ endif
$!
$! MicroVAX Support Removed from OpenVMS Alpha
$!
$! Place your site-specific LOGIN commands below
$!
$ !
$ ! Start a Child Controller process, name_num 3, child_num 3
$ !
$ CHILD :== $CONSOLE$IMAGE:CONSOLE$DAEMON.EXE
$ CHILD "child" 3
POLYCENTER Console Manager
Console Controller Daemon Version V1.6-100
Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
Read error on Local socket CONSOLE_CTRL_NETBR4
Read error on Local socket CONSOLE_CTRL_NETBR3
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=00000000,
PC=0005016C, PS=0000001B
Improperly handled condition, image exit forced.
Signal arguments: Number = 00000005
Name = 0000000C
00000004
00000000
0005016C
0000001B
Register dump:
R0 = 0000000000000001 R1 = 0000000000000008 R2 = 00000000000124B0
R3 = 00000000001B0D04 R4 = 00000000001B0354 R5 = 00000000000301D0
R6 = 00000000001B0354 R7 = 0000000000061ADA R8 = 0000000000000003
R9 = 000000007FF9C410 R10 = 000000007FF9D198 R11 = 000000007FFBE3E0
R12 = 0000000000000000 R13 = FFFFFFFF8083C3A8 R14 = 0000000000000000
R15 = 0000000500000000 R16 = 0000000000000001 R17 = 0000000000000000
R18 = 0000000000000000 R19 = 00000000001C2984 R20 = 0000000000012623
R21 = 000000007FB86828 R22 = 15A8002D40800001 R23 = 001D2170F00D0000
R24 = 00000000F00D0000 R25 = 0000000000000001 R26 = FFFFFFFF80075DA0
R27 = FFFFFFFF80838120 R28 = 0000000000050140 R29 = 000000007F959060
SP = 000000007F959060 PC = 000000000005016C PS = 200000000000001B
SYSTEM job terminated at 10-JUL-1995 20:04:37.76
Accounting information:
Buffered I/O count: 198298 Peak working set size: 4512
Direct I/O count: 103082 Peak page file size: 19520
Page faults: 2982 Mounted volumes: 0
Charged CPU time: 0 00:07:15.04 Elapsed time: 4 04:21:29.38
Status after Console Ctrl 03 crashed:
CONS STAT/ALL
=============
------SYSTEM------ ---PID--- STATE -BYTES- -LINES- EVENTS ------USER------
1 ALFA01 000001DB LYE 105.5K 2.2K 8
2 BCCNO3 000001DB LYE 419 10 419
3 BDCNCC 000001DB LYE 138.6K 3.5K 32
4 BOPRO1 000001DB LYE 9.9K 249 10
5 BRSSAP 000001DB LYE 199.9K 4.7K 8
6 BRUOA2 000001DB LYE 281.3K 8.3K 30
7 CDVMV1 000001DB LYE 6.52M 173.5K 3.5K
8 COV01 000001DB LYE 91.4K 2.3K 4
9 COV02 000001DB LYE 9.11M 205.2K 1.6K COV_OPER
10 COV03 000001DB LYE 3.96M 91.0K 4.7K COV_OPER
11 COV04 000001DB LYE 8.80M 184.8K 766 COV_OPER
12 COV05 000001DB LYE 2.0K 39 6 COV_OPER
13 C_BOSNCC 000001DB LYE 32.9K 1.2K 11
14 C_FNXFE2 000001DB LYE 39.7K 870 17
15 C_RCTNCC 000001DB LN- 289 6 1 NCC_OPER
16 DEVNCC 000001DB LYE 11.0K 367 16
17 EBW 00000000 LN- 0 0 0
18 FE_C 000001DC LYE 6.02M 208 4
19 FE_G 000001DC LYE 0 0 0
20 FE_M 000001DC LYE 4.65M 869 8
21 FE_W 000001DC LYE 5.29M 178 4
22 FE_Z 000001DC LYE 0 0 0
23 FNXFE3 000001DC LYE 0 0 0
24 FNXFE5 000001DC LYE 0 0 0
25 HSC000 000001DC LYE 270 11 1
26 HSC001 000001DC LYE 0 0 0
27 HSJ02 000001DC LYE 0 0 0
28 HSJ04 000001DC LYE 0 0 0
29 HSJ07 000001DC LYE 0 0 0
30 HSJ08 000001DC LYE 0 0 0
31 INFOSE 000001DC LYE 0 0 0
32 NETBR1 000001DC LYE 127.5K 2.2K 28
33 NETBR2 000001DD LYE 157.7K 2.5K 11
34 NETBR3 000001DD LYE 85.9K 1.8K 33
35 NETBR4 000001DD LYE 453.0K 13.2K 51
36 PCMAXA 00000000 LN- 0 0 0
37 PCMAXB 000001DD PYE 70.0K 2.1K 1.0K
38 R204A 000001DD LYE 0 0 0
39 SWIFTE 000001DD LYE 2.98M 92.1K 0
40 SWIFTQ 000001DD LYE 0 0 0
41 SWIFTR 000001DD LYE 3.23M 86.2K 2
42 XSERVA 000001DD LYE 5.8K 137 4
43 XSERVB 00000000 LN- 0 0 0
CONS STAT/SYSTEM=NETBR4
=======================
System ............: NETBR4
Enabled ...........: Yes
Line status .......: OK
In use by .........:
Parent pid ........: 000001DD
Child Index .......: 3
Connection type ...: LAT
Logging device ....: 58% Full
Last Archive ......: None Performed
Lines of data .....: 13.2K Bytes: 453.0K
Event total .......: 51
PCMAXB_SYSTEM> : 0 Min: 3 Warn: 5 Clr: 43 Ind: 0
CONS STAT
=========
POLYCENTER Console Manager Summary
Totals
Configured Systems: 43 User disabled: 4
Active Systems : 39 (D:000 P:001 L:042 T:000) Unreachable: 000
Active Users : 1 (Connect/Monitor: 000 C3: 001 Event sources: 003)
CM pid ........: 000001D9 V1.6-100 Uptime: 4 16:32:52
ENS pid .......: 000001DA V1.6-100 Uptime: 4 16:32:52
Total bytes ...: 52.44M (0) Ave bps: 129.43
Total lines ...: 881.6K (0) Ave lpm: 130.55
Total events ..: 12261 (0) Ave epm: 1.82
Total actions .: 1500 (0) Ave aph: 13.33
Active actions : 6 Failed actions : 19
Crit: 514 Maj: 662 Min: 7863 Warn: 2437 Clr: 785 Ind: 0
He forgot to save a copy of the daemon logfile. Sorry.
We restarted Console Manager without console$daemon.exe
and console$control.exe installed in memory.
Where will the dump be created?
Do I have to open an IPMT (this is a serious problem for the
customer in his production environment)?
Is there anything else that you need, such as logfiles?
Thanks for your help so far.
Rgds,
Geert
|
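The CONS STAT/ALL listing in 856.3 can be read as a map from each console to the controller process that serves it. The following is a minimal Python sketch, not part of the original thread, that groups a few rows copied from that listing by the PID column. The function name `group_by_pid` and the choice of sample rows are assumptions made for illustration only.

```python
from collections import defaultdict

# A few rows copied from the CONS STAT/ALL listing in 856.3:
# index, system name, PID, state, bytes, lines, events.
SAMPLE = """\
 1 ALFA01 000001DB LYE 105.5K 2.2K 8
17 EBW 00000000 LN- 0 0 0
18 FE_C 000001DC LYE 6.02M 208 4
34 NETBR3 000001DD LYE 85.9K 1.8K 33
35 NETBR4 000001DD LYE 453.0K 13.2K 51
37 PCMAXB 000001DD PYE 70.0K 2.1K 1.0K
"""


def group_by_pid(listing):
    """Group system names by the PID shown in the third column."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Skip headers, separators, and anything that is not a data row.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        system, pid = fields[1], fields[2]
        groups[pid].append(system)
    return dict(groups)


if __name__ == "__main__":
    for pid, systems in sorted(group_by_pid(SAMPLE).items()):
        print(pid, ", ".join(systems))
```

In the listing, both sockets named in the read errors (NETBR3 and NETBR4) share PID 000001DD, and the CONS STAT/SYSTEM=NETBR4 output reports that PID with Child Index 3, which matches the controller that crashed.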
856.4 | | ZEDAR::simon | Simon Jackson 830 x3879 | Tue Jul 11 1995 18:05 | 6 |
| Geert,
please do log an IPMT. We are in the middle of transferring
the support of PCM to a group in Israel, so we need to make sure
problems are tracked.
Cheers Simon...
|
856.5 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Tue Jul 11 1995 23:09 | 4 |
| The dump *should* be in the CONSOLE$TMP directory.
Regs,
Dan
|
856.6 | PCM support in Israel ? And engineering ? | BACHUS::WILLEMSG | Geert Willems MCS-Belgium | Wed Jul 12 1995 08:47 | 12 |
|
Hi Simon,
Israel ????
Who will do the support ?
Do you move to Israel ? What's going on guys ?
Rgds,
Geert
|
856.7 | | OPG::PHILIP | And through the square window... | Wed Jul 12 1995 10:25 | 27 |
| Geert and anybody else that is interested....
Product support and maintenance is moving to Israel,
like within the next two weeks.
Simon and I don't work for Engineering, we work for GPS
(was OMS, IM&T, I.S or whatever). The work we do on PCM
was funded by NSM Engineering.
Due to the supposedly imminent release of the PEM product,
no further major functional enhancements to PCM are planned
(this may change in the future). Because of this, it is
cheaper for NSM to have PCM support done from Israel.
This is a move we fully support as it is more cost effective
for Digital.
As a result of all this, there is a possibility that Simon
and I will be leaving Digital in the near future, as we really
don't have anything more to contribute to the company's bottom
line.
If you have any questions about this transition or the future
of PCM, please talk to product management as they will be able
to align this move with NSM's future strategy.
Cheers,
Phil
|
856.8 | names of future PCM support people ? | BACHUS::WILLEMSG | Geert Willems MCS-Belgium | Tue Jul 18 1995 09:07 | 11 |
|
Hi,
Still waiting on a dump from the customer.
Phil, can you give me the names of the PCM support persons
in Israel ?
Thanks & Rgds,
Geert
|
856.9 | | 54625::WILLEMS | Johan Willems @BRO DTN 856-8739 | Wed Aug 09 1995 08:50 | 11 |
| Dan,
The controller process crashed again. All information has been saved in
a saveset. The failing controller process DID NOT create a dump file
although the daemon image was not installed.
I will open an IPMT case for Geert (who is ill at the moment).
Kind regards,
Johan
|
856.10 | | 29067::BUTTERWORTH | Gun Control is a steady hand. | Wed Aug 09 1995 14:29 | 3 |
| Where is the saveset?
Regs,
Dan
|
856.11 | Saveset location | 54625::WILLEMS | Johan Willems @BRO DTN 856-8739 | Thu Aug 10 1995 06:27 | 9 |
| Dan,
The saveset is available at :
BRSDVP::PCMCRASH090895.BCK
Kind regards,
Johan
|
856.12 | | 29067::BUTTERWORTH | Gun Control is a steady hand. | Fri Aug 11 1995 13:58 | 4 |
| I'm copying it now.
Regs,
Dan
|
856.13 | | 29067::BUTTERWORTH | Gun Control is a steady hand. | Fri Aug 11 1995 20:13 | 22 |
| Johan,
I have analyzed the available data and the control 1 process had
socket read errors on each socket and then died trying to remove an
entry from an internal queue. I cannot tell what this queue was without
a process dump. What we need to do is
DEFINE/SYSTEM CONSOLE$DEBUG TRUE
and then comment out the following line in CONSOLE$STARTUP.COM
by placing an ! in front, i.e., make it look like the line below (note
that I didn't include all of the line):
!Install ADD Console$image:Console$Daemon.exe .............
Now restart the software and we should get a process dump when it
happens again.
Regards,
dan
|
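The edit Dan describes in 856.13 amounts to turning one command line of the startup procedure into a DCL comment. The Python sketch below is only an illustration of that transformation on a list of text lines; on the real system the line would simply be changed in an editor. The helper name `comment_out_daemon_install` and the sample procedure lines are hypothetical, and the sample INSTALL line omits the qualifiers, which the thread does not show in full.

```python
def comment_out_daemon_install(lines):
    """Return a copy of `lines` with the daemon INSTALL ADD command commented out.

    A DCL command line starts with "$"; putting "!" right after it makes the
    rest of the line a comment. Lines that are already comments are left alone.
    """
    result = []
    for line in lines:
        words = line[1:].upper().split() if line.startswith("$") else []
        is_install = words[:2] == ["INSTALL", "ADD"]
        if is_install and any("CONSOLE$DAEMON.EXE" in word for word in words):
            result.append("$!" + line[1:])
        else:
            result.append(line)
    return result


# Hypothetical fragment of a startup procedure, for illustration only.
SAMPLE = [
    "$ set noverify",
    "$ Install ADD Console$Image:Console$Daemon.exe",
    "$ exit",
]

if __name__ == "__main__":
    for line in comment_out_daemon_install(SAMPLE):
        print(line)
```

Running the helper a second time leaves the output unchanged, because a line that already begins with `$!` no longer parses as an INSTALL command.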
856.14 | no dump was created | 54625::WILLEMSG | Geert Willems MCS-Belgium | Wed Aug 16 1995 05:45 | 21 |
|
Hi Dan,
I'm back. I just spoke to the customer. We already did the
$!!!Install ADD Console$Image:console$Daemon.exe /Open/Share/
But no dump was created in the console$tmp ! Why ?
Do we need to define the console$debug logical also to have this dump
file ? I thought that the $!!!Install ADD Console$Image:console$Daemon.exe
was enough .
At the moment they have around 35 to 40 systems connected, and more
than a month passed between the last crash and the one before it.
That is a long time to leave console$debug enabled. What should we
look for, given that there is sometimes a lot of traffic?
We are dealing with a huge banking environment.
Thanks & Rgds,
Geert
|
856.15 | | 29067::BUTTERWORTH | Gun Control is a steady hand. | Thu Aug 17 1995 14:14 | 16 |
| > I'm back. I just spoke the customer. We already did the
> $!!!Install ADD Console$Image:console$Daemon.exe /Open/Share/
> But no dump was created in the console$tmp ! Why ?
Well I checked the code and the flag is not set on the $CREPRC call!
What I will have to do is patch it. Even if it's an AXP I can use
a VAX to patch/absolute the AXP image.
By doing this, we won't have to turn on debug!
I'll patch it and copy it to BACHUS and send mail.
Regs,
Dan
|
856.16 | another customer has the same problem (VAX) | 54625::WILLEMSG | Geert Willems MCS-Belgium | Fri Aug 18 1995 04:19 | 82 |
|
Hi Dan,
I have another customer who also has the CONSOLE CTRL crash problem.
This time it's on a VAX. I will ask him to install the ECO kit, but we know
that this doesn't solve the problem!
Rgds,
Geert
This message is what I have got from the customer :
Today we noticed the disappearance of a console controller process, "Console
Ctrl 01". The process dumped last Friday.
In the logfile CONTROLLER_02.LOG we see:
$ set noverify
POLYCENTER Console Manager
Console Controller Daemon Version V1.6-100
Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNPR1
Read error on Local socket CONSOLE_CTRL_SUNUK
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNPRO
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNDEV
%SYSTEM-F-ACCVIO, access violation, reason mask=01, virtual address=38313029,
PC=00052C88, PSL=03C00000
Improperly handled condition, image exit forced.
Signal arguments Stack contents
Number = 00000005 00000000
Name = 0000000C 00000000
00000001 200C0000
38313029 7FE9BDB4
00052C88 7FE9BDA0
03C00000 000B6633
00210420
00001420
00000001
38313030
Register dump
R0 = 38313020 R1 = 827EDAD0 R2 = 00210420 R3 = 0006F608
R4 = 7FE9BD00 R5 = 7FFE5EBC R6 = 00000000 R7 = 00000001
R8 = 7FFECA48 R9 = 7FFECC50 R10= 7FFED7D4 R11= 7FFE2BDC
AP = 7FE9BD38 FP = 7FE9BCF8 SP = 7FE9BD74 PC = 00052C88
PSL= 03C00000
SYSTEM job terminated at 11-AUG-1995 20:34:24.90
Accounting information:
Buffered I/O count: 18049 Peak working set size: 2562
Direct I/O count: 11973 Peak page file size: 7314
Page faults: 13604 Mounted volumes: 0
Charged CPU time: 0 00:09:03.48 Elapsed time: 0 04:20:16.83
After restarting PCM completely, some consoles are again
unreachable (6 gray icons in the C3 interface: ITSHAM, ITSHAL, ITSFE1,
ITSFE2, SABLE1 & SABLE2).
In CONTROLLER_02.LOG;1 we see:
$ set noverify
POLYCENTER Console Manager
Console Controller Daemon Version V1.6-100
Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
Read error on Local socket CONSOLE_CTRL_SUNCH
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNPRO
Read error on Local socket CONSOLE_CTRL_SUNUK2
|
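Each controller log in this thread shows a run of "Read error on Local socket" messages before the access violation. A quick tally per socket can show whether one console dominates. The Python sketch below, which is not part of the original thread, counts the messages from the first CONTROLLER_02.LOG excerpt in 856.16; the function name `count_read_errors` is an assumption for illustration.

```python
import re
from collections import Counter

# Read-error lines copied from the first CONTROLLER_02.LOG excerpt in 856.16.
LOG = """\
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNPR1
Read error on Local socket CONSOLE_CTRL_SUNUK
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNPRO
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNDEV
"""

READ_ERROR = re.compile(r"^Read error on Local socket CONSOLE_CTRL_(\S+)$")


def count_read_errors(log_text):
    """Count read-error messages per console name found in a controller log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = READ_ERROR.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for name, count in count_read_errors(LOG).most_common():
        print(name, count)
```

In this excerpt SUNDEV accounts for four of the seven messages. That alone does not identify a cause, but it may be worth comparing against the logs from later crashes.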
856.17 | | 54625::WILLEMSG | Geert Willems MCS-Belgium | Fri Aug 18 1995 10:35 | 65 |
|
Hi Dan,
It happened a second time with the other customer (VAX).
So, I will need the image for both AXP and VAX.
This is the feedback he gave me:
------------------------------------------------------------------------------
When doing a sho sys I only find following processes ...
000000A4 Console Daemon HIB 6 5080 0 00:00:06.57 9525 274
000000A7 Console Notify HIB 6 1099 0 00:00:05.65 15101 164
000000AA Console Ctrl 01 HIB 8 18740 0 00:01:43.80 40241 592
The console$tmp:CONTROLLER_02.LOG;1 file looks like this :
$ set noverify
POLYCENTER Console Manager
Console Controller Daemon Version V1.6-100
Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
Read error on Local socket CONSOLE_CTRL_SUNDEV
Read error on Local socket CONSOLE_CTRL_SUNPRO
%SYSTEM-F-ACCVIO, access violation, reason mask=01, virtual address=38313029,
PC=00052C88, PSL=03C00000
Improperly handled condition, image exit forced.
Signal arguments Stack contents
Number = 00000005 00000000
Name = 0000000C 00000000
00000001 200C0000
38313029 7FE9BDB4
00052C88 7FE9BDA0
03C00000 000B6633
00220E08
00001420
00000001
38313030
Register dump
R0 = 38313020 R1 = 8287CB90 R2 = 00220E08 R3 = 0006F608
R4 = 7FE9BD00 R5 = 7FFE5EBC R6 = 00000000 R7 = 00000001
R8 = 7FFECA48 R9 = 7FFECC50 R10= 7FFED7D4 R11= 7FFE2BDC
AP = 7FE9BD38 FP = 7FE9BCF8 SP = 7FE9BD74 PC = 00052C88
PSL= 03C00000
SYSTEM job terminated at 18-AUG-1995 12:25:27.52
Accounting information:
Buffered I/O count: 3104 Peak working set size: 2540
Direct I/O count: 5026 Peak page file size: 7314
Page faults: 12556 Mounted volumes: 0
Charged CPU time: 0 00:03:08.29 Elapsed time: 0 01:50:37.93
It does not look good, does it ????
P.S. Why do we have those problems (now) ??? Was it perhaps because PCM was
archiving ???
-------------------------------------------------------------------------------
|
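Both VAX crashes (856.16 and 856.17) report the same PC and the same faulting virtual address. One observation the thread itself does not make, offered here only as a hypothesis: every byte of that address is a printable ASCII character, which can happen when a pointer has been overwritten by text. The Python sketch below extracts the fields from the ACCVIO message and decodes the address as little-endian bytes (the VAX is little-endian). The function names `parse_accvio` and `as_little_endian_ascii` are assumptions for illustration, and nothing in the thread confirms this reading.

```python
import re

# ACCVIO message copied from the VAX log in 856.17.
SAMPLE = (
    "%SYSTEM-F-ACCVIO, access violation, reason mask=01, "
    "virtual address=38313029,\nPC=00052C88, PSL=03C00000"
)

ACCVIO = re.compile(
    r"reason mask=([0-9A-F]+),\s*virtual address=([0-9A-F]+),\s*PC\s*=\s*([0-9A-F]+)"
)


def parse_accvio(text):
    """Return reason mask, virtual address and PC as integers, or None."""
    match = ACCVIO.search(text)
    if match is None:
        return None
    mask, address, pc = (int(group, 16) for group in match.groups())
    return {"reason_mask": mask, "virtual_address": address, "pc": pc}


def as_little_endian_ascii(value):
    """Render a 32-bit value as 4 little-endian bytes; '.' marks non-printables."""
    raw = value.to_bytes(4, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    info = parse_accvio(SAMPLE)
    print(hex(info["pc"]), repr(as_little_endian_ascii(info["virtual_address"])))
```

Applied to the values in the dump, the faulting address 38313029 decodes to the characters `)018`, R0 (38313020) to a space followed by `018`, and the last stack longword (38313030) to `0018`. A process dump would be needed to tell whether this is a coincidence.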
856.18 | are patched images ready ? | 54625::WILLEMSG | Geert Willems MCS-Belgium | Thu Sep 21 1995 05:56 | 25 |
|
Hi Dan,
After 5 weeks' absence, I'm back at the office.
The last time you replied, you were working on the $CREPRC (debug flag).
Do you have the patched images (for VAX and AXP)?
Johan Willems did the follow-up in my absence, but he didn't get any
feedback.
Dan, do you still have the PCMCRASH090895.BCK file that you copied?
I need it, because Johan's file has already been deleted and PCM
engineering in Israel needs it to analyze the problem. Can you copy it
to BRSDVP"":: ?
Will PCM engineering in Israel find anything in this PCMCRASH090895.BCK
file, given that it only contains logfiles and no dump file?
What's your advice?
Thanks for the help.
Rgds,
Geert
|