T.R | Title | User | Personal Name | Date | Lines |
---|
953.1 | more info on previous entry | 42326::HEGARTYP | Virtually Real | Thu Aug 31 1995 10:29 | 101 |
| Hi,
the customer has just given me some more info, which I've copied word for
word in case it is of any use:
in case it helps, here is some additional information I have been
able to locate. Also, I have found yet another problem with the C3
program. I was running C3, stopped the console manager (CONS SHUT), all
the C3 icons went grey, and then restarted the console manager. The
icons eventually went back to white - and stayed white. Even though
events were occurring (as confirmed by using 'monitor interface'), the
icons did not change colour. I don't think that the eventlist received
any events either. When I stopped and restarted C3 (and the eventlist)
all went back to how it was supposed to be.
Anyway... having defined CONSOLE$DEBUG to be ENS, this is an extract
from the file, CONSOLE$TMP:CONSOLE$ENS_DAEMON.LOG. It may suggest why
the multi-line events stop working.
------
CONSOLE$ENS_DAEMON: Event received from Source - PTools - DSK011
Details are:
S.No. : 4236, 811 (277611307)
Type : 1
System : MISFS1
SubSystem : Power Tools (Disk)
Source : Console
Class : Power Tools
Info : I/O operation rate is high for disk
Posting event to Monitor : 1020400
Posting event to C3 : 982264
Sending to Action : Multi-Line Window, Userdata : 132.146.146.54 0
0 tcpip
Finding suitable process name for action...
Process name for action will be : Multi-Line 001
Eventport name for action will be : Multi-Line__001
Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
Log file name for action will be : CONSOLE$TMP:Multi-Line__001.LOG
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
Finding suitable process name for action...
Process name for action will be : Multi-Line 001
Eventport name for action will be : Multi-Line__001
Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
Log file name for action will be : CONSOLE$TMP:Multi-Line__001.LOG
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
Sending to Action : Multi-Line Window, Userdata : rtdvxt 0 0 tcpip
Finding suitable process name for action...
Process name for action will be : Multi-Line 001
Eventport name for action will be : Multi-Line__001
Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
Log file name for action will be : CONSOLE$TMP:Multi-Line__001.LOG
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
Finding suitable process name for action...
Process name for action will be : Multi-Line 001
Eventport name for action will be : Multi-Line__001
Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
Log file name for action will be : CONSOLE$TMP:Multi-Line__001.LOG
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
Finding suitable process name for action...
Process name for action will be : Multi-Line 001
Eventport name for action will be : Multi-Line__001
Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
Log file name for action will be : CONSOLE$TMP:Multi-Line__001.LOG
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
Sending to Action : Multi-Line Window, Userdata : net5 0 0 tcpip
CONSOLE$ENS_DAEMON: Event processing complete
-----
Also, as if we didn't have enough problems, I was monitoring the
interface of one system, while having the eventlist in the background.
At some point, while the monitor was running, it displayed (in the
lower four lines of the display)...
Read error on local socket ENS_MONITOR
Event Notification Service not running, event update disabled
This did not seem to affect the 'monitor interface' and the event list
didn't say anything, but a short while after (2-3 mins), the event list
came up with the 'connection lost' type error. I restarted PCM
(CONSOLE$STARTUP) yesterday, and while the system WAS dropping (and
remaking) the connection to ENS on a regular basis (every minute or
so), it seems largely OK this morning.
I will keep looking at various performance settings, but can't see
anything being exhausted yet.
that's all folks! (so far)
Paul.
UK CSC
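
A rough way to tally the listener failures in an ENS daemon log like the
extract above, offered only as a sketch. It assumes the log wraps the event
port name onto the line after the "Failed for event Port" message, as it
does in the extract; the function name and the sample text are illustrative
and not part of PCM.

```python
import re
from collections import Counter


def count_listener_failures(log_text):
    """Count 'CMCreateEventListener Failed' messages per event port name."""
    # The port name is wrapped onto the following line, so flatten
    # all line breaks to single spaces before matching.
    flat = re.sub(r"\s*\n\s*", " ", log_text)
    ports = re.findall(
        r"CMCreateEventListener Failed for event Port (\S+)", flat
    )
    return Counter(ports)


# Two failure records copied from the extract in this note.
sample = """\
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
"""

failures = count_listener_failures(sample)
for port, count in failures.items():
    print(port, count)
```

Run against a whole log file, a count that keeps climbing for the same
port name would suggest the same dispatch is failing on every event.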
|
953.2 | | ZENDIA::DBIGELOW | Innovate, Integrate, Evaporate | Thu Aug 31 1995 10:38 | 32 |
| > o Event list and multi-line displays detach from ens (event list often
> reconnects then disconnects then...etc)
> o Event through a filter using multi-line dispatch, this does not always
> work. It seems to work once but when he closes the window it does not
> come back when new events occur (sometimes it does but mostly it
> doesn't)
These 2 problems seem to be one and the same. Unfortunately, I don't
know what the problem is.
> o The C3 interface often hangs completely, leaving just an empty
> window, i.e. the X-window fails to refresh at all. If he starts a new
> C3 session, it runs but the old one will not process window messages.
This is because it cannot connect. The easiest way is to just stop the
process. Here's another suggestion. Shut down PCM, delete all the files
in console$tmp and then restart PCM. More often than not this will cure
the problem of the C3 not being able to connect to the PCM processes.
> o Some users get stack dumps when they try to connect to systems, the
> customer can reproduce this and he gets a process dump. Not all users
> see this even though from the VMS system, they are all set up the same.
> o He has problems with archiving, this is where a 'console archive all'
> command sometimes hangs when archiving some systems, whereas 'console
> archive x' does not hang!
Archiving is a known problem.
dave
|
953.3 | | 42498::GAMET | | Thu Aug 31 1995 10:59 | 11 |
| Dave,
are you saying that archiving in general is a known problem, or
that this aspect of using it is prone to known problems ?
Thanks for your note, however, as the customer runs 24hrs,
restarting PCM whenever there is a problem seems a little extreme.
Ross Watson
Digital Onsite Support
British Telecom
|
953.4 | | ZENDIA::DBIGELOW | Innovate, Integrate, Evaporate | Thu Aug 31 1995 11:23 | 5 |
| Ross,
it would appear that is a known problem.
Dave
|
953.5 | | 42498::GAMET | | Thu Aug 31 1995 11:36 | 4 |
| Thanks, I've just paged back through the last few months' notes and seen
the note referring to bytlm, so I'll explore down that path for a while.
Ross
|
953.6 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Wed Sep 06 1995 15:57 | 9 |
| Per your debug log it really appears that your loss of connection to
ENS and ENS dying is a resource issue. Do you have the logfile output for
when ENS actually died? You probably can't leave CONSOLE$DEBUG set to
ENS as the debug log will get quite large but if you can at least leave it
defined to TRUE we will at least have an output file in CONSOLE$TMP
that should have the final status of the process.
Regs,
Dan
|
953.7 | Still no luck | 42498::GAMET | | Thu Sep 14 1995 06:25 | 33 |
| Hi,
after setting CONSOLE$DEBUG to TRUE, I received the following in the
ENS_DAEMON log file...
CONSOLE$ENS_DAEMON: eventport_error Callback
errno_val : 0
msg1 : Read error on Local socket ENS
msg2 : error 0
CONSOLE$ENS_DAEMON: eventport_error Callback
errno_val : 0
msg1 : Read error on Local socket ENS_MONITOR
msg2 : error 0
CONSOLE$ENS_DAEMON: eventport_error Callback
errno_val : 0
msg1 : Read error on Local socket ENS_C3
msg2 : error 0
CONSOLE$ENS_DAEMON: eventport_error Callback
errno_val : 0
msg1 : Read error on Local socket ENS_C3
msg2 : error 0
CONSOLE$ENS_DAEMON: eventport_error Callback
errno_val : 0
msg1 : Read error on Local socket ENS_MONITOR
msg2 : error 0
etc...
etc...
Also, what resources would be good to look at ? I've used AMDS/PSDC
etc, but can't see any shortages. Any ideas ?
Ross
|
953.8 | | 42498::GAMET | | Thu Sep 14 1995 06:30 | 23 |
| Also, going back to the archiving problem, is this also a "known
problem" ?
(00:00) $ console archive /keep=24 /noconfirm CPU298
POLYCENTER Console Manager
Archive facility Version V1.6-110
Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
Starting archive procedure for system CPU298
Timefile Pass 1: Working ... done.
Timefile Pass 2: Working ... done.
Logfile Pass 1: Working ... done.
Logfile Pass 2: Working ... done.
Eventfile Pass 1: Working ... done.
Updating Pass 1: Working ... done.
Archive procedure for system CPU298 completed successfully
IPCAstRtn: status is 44 (UNHANDLED completion?)
ConnID is 1235992
(00:01) $
This happens quite a bit.
Ross
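
One way to read the "status is 44" line, as a sketch only. It assumes the
value is a decimal OpenVMS condition value, which the output does not
confirm. Under that assumption the standard field layout applies: bits 0-2
are the severity, bits 3-15 the message number, bits 16-27 the facility
number. The function name is illustrative.

```python
SEVERITY_NAMES = {
    0: "warning",
    1: "success",
    2: "error",
    3: "informational",
    4: "severe",
}


def decode_condition(value):
    """Split a 32-bit condition value into its standard fields."""
    severity = value & 0x7                   # bits 0-2
    message_number = (value >> 3) & 0x1FFF   # bits 3-15
    facility = (value >> 16) & 0xFFF         # bits 16-27
    return {
        "severity": SEVERITY_NAMES.get(severity, "reserved"),
        "message_number": message_number,
        "facility": facility,
    }


# Assumption: the 44 printed by IPCAstRtn is decimal.
fields = decode_condition(44)
print(fields)
```

Read this way, 44 is a severe status from facility 0 (the system
facility), which would be consistent with the I/O being aborted rather
than completing normally. If the value is actually hexadecimal the
decoding differs, so F$MESSAGE on the site system is the better check.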
|
953.9 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Thu Sep 14 1995 12:45 | 15 |
| I looked back at your original note with the debug log showing the
failure to create the event ports. These ports are nothing more than
mailboxes so our problems revolve around them. Let's do the following,
$ DEFINE/SYSTEM CONSOLE$DEBUG IPC
restart the software, recreate a few of the problems and then shut down
PCM and post the resulting log files. Turning on debug for IPC will
give us system service status returns on the crembx and various I/Os.
Regs,
Dan
|
953.10 | Eventlist problems caused by ENS exhausting its BYTLM | 42441::JUDD | Geoff Judd. UK TSC. Viables, Basingstoke | Fri Sep 15 1995 07:40 | 13 |
| It has been found that the ENS processes are getting their byte count quota
reduced down to 5000 bytes after a few hours. This is not enough to create any
more Eventlist windows. The cause appears to be ENS trying to create
eventlist windows on displays that do not exist. The configuration file has
dispatches of the eventlist action to 10 different displays of which about 6 do
not exist. Each time an event occurs a process is created for each of these
displays but the process dies when it is found the display doesn't exist. The
creation of each eventlist process decrements the BYTLM of the ENS process by
approximately 10000 bytes (for the creation of mailboxes I presume). When the
eventlist process dies the BYTLM should be given back but it appears that this
does not always happen when there are multiple non-existent displays.
Geoff Judd.
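
To put rough numbers on the above, here is a back-of-envelope sketch. The
10000 bytes per process, the 6 missing displays and the 5000-byte floor
come from this note. The starting quota of 100000 and the assumption that
exactly one allocation fails to come back per event are hypothetical,
since the real leak is only intermittent.

```python
def peak_charge(dead_displays, bytes_per_process):
    """Bytes charged at once if every dead display gets a process."""
    return dead_displays * bytes_per_process


def events_until_exhausted(start_quota, leak_per_event, floor):
    """Events needed before the remaining quota is at or below floor."""
    if leak_per_event <= 0:
        raise ValueError("leak_per_event must be positive")
    shortfall = start_quota - floor
    if shortfall <= 0:
        return 0
    # Ceiling division: a partial step still needs one more event.
    return -(-shortfall // leak_per_event)


BYTES_PER_PROCESS = 10000   # approximate figure from this note
DEAD_DISPLAYS = 6           # "about 6" from this note
FLOOR = 5000                # quota level observed after a few hours
START_QUOTA = 100000        # hypothetical; not the site's real BYTLM

print(peak_charge(DEAD_DISPLAYS, BYTES_PER_PROCESS))
print(events_until_exhausted(START_QUOTA, BYTES_PER_PROCESS, FLOOR))
```

Even with only one unreturned allocation per event, a quota of that size
would be gone after a handful of events, which fits the "few hours"
timescale on a busy system.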
|
953.11 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Fri Sep 15 1995 12:10 | 6 |
| Geoff,
Do you see a bunch of mailboxes being left around? Specifically,
permanent ones that have no channels assigned?
Regs,
Dan
|
953.12 | | 42498::GAMET | | Fri Sep 15 1995 12:20 | 10 |
|
On the customer site, I have removed all action dispatches that use a
device that might not be switched on, and asked the remainder to leave
their displays switched on. The "Console Notify" process does not
appear to lose its byte count quota.
I have not yet seen any recurrence of the 'detach-reconnect' error but
will confirm (or deny) this after a few more days.
Ross
|
953.13 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Fri Sep 15 1995 12:36 | 4 |
| I can set up a test scenario easily enough for the byte-count issue.
Will keep you posted.
Dan
|
953.14 | Workaround seems ok for one problem | 42498::GAMET | | Tue Sep 26 1995 12:30 | 9 |
| Hi,
since the last post, I have not seen any recurrence of the
connect/detach problem. I did see one node with a falling byte count
quota, but I removed a few more multi-line dispatches (for users who
probably forgot to leave equipment on) and the quota now remains
stable.
Ross
|