Conference csc32::consolemanager

Title: POLYCENTER Console Manager
Notice: Kits, Scans, Docs on CSC32:: as PCM$KITS:, PCM$DOCS:, PCM$SCANS:
Moderator: CSC32::BUTTERWORTH
Created: Thu Aug 06 1992
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1541
Total number of notes: 6564

953.0. "PCM V1.6-110 various failures" by 42326::HEGARTYP (Virtually Real) Wed Aug 30 1995 05:52

    Hello PCMers,
    
    I have a customer who had the RWMBX problem on PCM V1.5. I was
    advised to send him the V1.6 and ECO kit, which I have done, and he has
    installed it (he is now on PCM V1.6-110). The RWMBX problem appears to
    be fixed, but he now reports many other problems which are making the
    PCM systems (3 systems) almost unusable. Not all problems occur on all
    his PCM machines, i.e. one might have two and another only one, but he
    thinks the 3 systems are set up the same.
    
    The problems are as follows and in his words, so hopefully someone
    understands what he is talking about:
                     
    o Event list and multi-line displays detach from ens (event list often
    reconnects then disconnects then...etc)
    
    o Event through a filter using multi-line dispatch, this does not always
    work. It seems to work once but when he closes the window it does not
    come back when new events occur (sometimes it does but mostly it
    doesn't)
    
    o The C3 interface often hangs completely, leaving just an empty
    window, i.e. the X-window fails to refresh at all. If he starts a new
    C3 session, it runs but the old one will not process window messages.
    
    o Some users get stack dumps when they try to connect to systems, the
    customer can reproduce this and he gets a process dump. Not all users
    see this even though from the VMS system, they are all set up the same.
    
    o He has problems with archiving, this is where a 'console archive all'
    command sometimes hangs when archiving some systems, whereas 'console
    archive x' does not hang!
    
    He says a reboot will clear all the above problems but then they all
    come back sooner or later. If anyone has seen these problems (if they
    are problems) before, or knows how I can go about troubleshooting any of
    them, I would be very grateful. For example, are there any specific
    quotas that should be checked? Thanks in advance,
    
    Paul
    
    UK CSC
                      
953.1. "more info on previous entry" by 42326::HEGARTYP (Virtually Real) Thu Aug 31 1995 10:29 (101 lines)
    Hi,
    
    the customer has just given me some more info. which is word for word
    if it is of any use:
    
       in case it helps, here is some additional information I have been
    able to locate. Also, I have found yet another problem with the C3
    program. I was running C3, stopped the console manager (CONS SHUT), all
    the C3 icons went grey, and then restarted the console manager. The
    icons eventually went back to white - and stayed white. Even though
    events were occurring (as confirmed by using 'monitor interface'), the
    icons did not change colour. I don't think that the eventlist received
    any events either. When I stopped and restarted C3 (and the eventlist)
    all went back to how it was supposed to be.
    
    Anyway... having defined CONSOLE$DEBUG to be ENS, this is an extract
    from the file, CONSOLE$TMP:CONSOLE$ENS_DAEMON.LOG. It may suggest why
    the multi-line events stop working.
    
    ------
    CONSOLE$ENS_DAEMON: Event received from Source - PTools - DSK011
          Details are:
          S.No.     : 4236, 811 (277611307)
          Type      : 1
          System    : MISFS1
          SubSystem : Power Tools (Disk)
          Source    : Console
          Class     : Power Tools
          Info      : I/O operation rate is high for disk
        Posting event to Monitor : 1020400
        Posting event to C3      : 982264
       Sending to Action   : Multi-Line Window, Userdata : 132.146.146.54 0
    0 tcpip
        Finding suitable process name for action...
    Process name for action will be      : Multi-Line  001
    Eventport name for action will be    : Multi-Line__001
    Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
    Log file name for action will be     : CONSOLE$TMP:Multi-Line__001.LOG
    CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
    Multi-Line__001
    execute failed, removing running entry
        Finding suitable process name for action...
    Process name for action will be      : Multi-Line  001
    Eventport name for action will be    : Multi-Line__001
    Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
    Log file name for action will be     : CONSOLE$TMP:Multi-Line__001.LOG
    CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
    Multi-Line__001
    execute failed, removing running entry
       Sending to Action   : Multi-Line Window, Userdata : rtdvxt 0 0 tcpip
        Finding suitable process name for action...
    Process name for action will be      : Multi-Line  001
    Eventport name for action will be    : Multi-Line__001
    Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
    Log file name for action will be     : CONSOLE$TMP:Multi-Line__001.LOG
    CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
    Multi-Line__001
    execute failed, removing running entry
        Finding suitable process name for action...
    Process name for action will be      : Multi-Line  001
    Eventport name for action will be    : Multi-Line__001
    Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
    Log file name for action will be     : CONSOLE$TMP:Multi-Line__001.LOG
    CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
    Multi-Line__001
    execute failed, removing running entry
        Finding suitable process name for action...
    Process name for action will be      : Multi-Line  001
    Eventport name for action will be    : Multi-Line__001
    Command file name for action will be : CONSOLE$TMP:Multi-Line__001.COM
    Log file name for action will be     : CONSOLE$TMP:Multi-Line__001.LOG
    CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
    Multi-Line__001
    execute failed, removing running entry
       Sending to Action   : Multi-Line Window, Userdata : net5 0 0 tcpip
    CONSOLE$ENS_DAEMON: Event processing complete
    -----
    
    Also, as if we didn't have enough problems, I was monitoring the
    interface of one system, while having the eventlist in the background.
    At some point, while the monitor was running, it displayed (in the
    lower four lines of the display)...
    
    Read error on local socket ENS_MONITOR
    Event Notification Service not running, event update disabled
    
    This did not seem to affect the 'monitor interface' and the event list
    didn't say anything, but a short while after (2-3 mins), the event list
    came up with the 'connection lost' type error. I restarted PCM
    (CONSOLE$STARTUP) yesterday, and while the system WAS dropping (and
    remaking) the connection to ENS on a regular basis (every minute or
    so), it seems largely OK this morning.
    
    I will keep looking at various performance settings, but can't see
    anything being exhausted yet.
    
    that's all folks! (so far)
    
    Paul.
    
    UK CSC
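
The repeated failures in the extract above can be tallied with a short script.
This is only a sketch in Python, not part of PCM: the function name
count_listener_failures is made up for illustration, and it assumes the port
name may wrap onto the following line, as it does in the posted extract.

```python
import re
from collections import Counter

# Matches the failure line; the port name is captured if it is on the same line.
FAIL_RE = re.compile(r"CMCreateEventListener Failed for event Port\s*(\S*)")


def count_listener_failures(log_text):
    """Count CMCreateEventListener failures per event port name."""
    counts = Counter()
    lines = [line.strip() for line in log_text.splitlines()]
    for i, line in enumerate(lines):
        match = FAIL_RE.search(line)
        if not match:
            continue
        port = match.group(1)
        # In the posted extract the port name is wrapped onto the next line.
        if not port and i + 1 < len(lines):
            port = lines[i + 1]
        counts[port or "<unknown>"] += 1
    return counts


# Two failure entries copied from the extract above.
SAMPLE = """\
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
CONSOLE$ENS_DAEMON: CMCreateEventListener Failed for event Port
Multi-Line__001
execute failed, removing running entry
"""

if __name__ == "__main__":
    for port, n in count_listener_failures(SAMPLE).items():
        print(f"{port}: {n} failure(s)")
```

Run against a full CONSOLE$ENS_DAEMON.LOG, a count that keeps climbing for the
same port would suggest the listener is failing on every event rather than
intermittently.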
953.2. by ZENDIA::DBIGELOW (Innovate, Integrate, Evaporate) Thu Aug 31 1995 10:38 (32 lines)
>    o Event list and multi-line displays detach from ens (event list often
>    reconnects then disconnects then...etc)
    
>    o Event through a filter using multi-line dispatch, this does not always
>    work. It seems to work once but when he closes the window it does not
>    come back when new events occur (sometimes it does but mostly it
>    doesn't)
    
    These 2 problems seem to be one and the same. Unfortunately, I don't
    know what the problem is. 
        
>    o The C3 interface often hangs completely, leaving just an empty
>    window, i.e. the X-window fails to refresh at all. If he starts a new
>    C3 session, it runs but the old one will not process window messages.
    
    This is because it cannot connect. The easiest way is to just stop the
    process. Here's another suggestion. Shut down PCM, delete all the files
    in console$tmp and then restart PCM. More often than not this will cure
    the problem of the C3 not being able to connect to the PCM processes.
    
>    o Some users get stack dumps when they try to connect to systems, the
>    customer can reproduce this and he gets a process dump. Not all users
>    see this even though from the VMS system, they are all set up the same.
    
>    o He has problems with archiving, this is where a 'console archive all'
>    command sometimes hangs when archiving some systems, whereas 'console
>    archive x' does not hang!
    
    Archiving is a known problem. 
    
    
    dave
953.3. by 42498::GAMET Thu Aug 31 1995 10:59 (11 lines)
    Dave,
    
        are you saying that archiving in general is a known problem, or
    that this aspect of using it is prone to known problems ? 
    
        Thanks for your note, however, as the customer runs 24hrs,
    restarting PCM whenever there is a problem seems a little extreme.
    
    Ross Watson
    Digital Onsite Support
    British Telecom
953.4. by ZENDIA::DBIGELOW (Innovate, Integrate, Evaporate) Thu Aug 31 1995 11:23 (5 lines)
    Ross,
    
       it would appear that is a known problem.
    
    Dave
953.5. by 42498::GAMET Thu Aug 31 1995 11:36 (4 lines)
    Thanks, I've just paged back through the last few months' notes and seen
    the note referring to bytlm, so I'll explore down that path for a while.
    
    Ross
953.6. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Wed Sep 06 1995 15:57 (9 lines)
    Per your debug log, it really appears that your loss of connection to
    ENS and ENS dying is a resource issue. Do you have the logfile output for
    when ENS actually died? You probably can't leave CONSOLE$DEBUG set to
    ENS, as the debug log will get quite large, but if you can leave it
    defined to TRUE we will at least have an output file in CONSOLE$TMP
    that should have the final status of the process.
    
    Regs,
      Dan
953.7. "Still no luck" by 42498::GAMET Thu Sep 14 1995 06:25 (33 lines)
    Hi,
    
      after setting CONSOLE$DEBUG to true, I received the following in the
    ENS_DAEMON log file...
    
    CONSOLE$ENS_DAEMON: eventport_error Callback
        errno_val : 0
             msg1 : Read error on Local socket ENS
             msg2 : error 0
    CONSOLE$ENS_DAEMON: eventport_error Callback
        errno_val : 0
             msg1 : Read error on Local socket ENS_MONITOR
             msg2 : error 0
    CONSOLE$ENS_DAEMON: eventport_error Callback
        errno_val : 0
             msg1 : Read error on Local socket ENS_C3
             msg2 : error 0
    CONSOLE$ENS_DAEMON: eventport_error Callback
        errno_val : 0
             msg1 : Read error on Local socket ENS_C3
             msg2 : error 0
    CONSOLE$ENS_DAEMON: eventport_error Callback
        errno_val : 0
             msg1 : Read error on Local socket ENS_MONITOR
             msg2 : error 0
    
    etc...
    etc...
    
    Also, what resources would be good to look at ? I've used AMDS/PSDC
    etc, but can't see any shortages. Any ideas ?
    
    Ross
953.8. by 42498::GAMET Thu Sep 14 1995 06:30 (23 lines)
    Also, going back to the archiving problem, is this also a "known
    problem" ?
    
    (00:00) $ console archive /keep=24 /noconfirm CPU298
    POLYCENTER Console Manager
    Archive facility Version V1.6-110
    Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
    
    Starting archive procedure for system CPU298
    Timefile  Pass 1: Working ... done.
    Timefile  Pass 2: Working ... done.
    Logfile   Pass 1: Working ... done.
    Logfile   Pass 2: Working ... done.
    Eventfile Pass 1: Working ... done.
    Updating  Pass 1: Working ... done.
    Archive procedure for system CPU298 completed successfully
    IPCAstRtn: status is 44 (UNHANDLED completion?)
    ConnID is 1235992
    (00:01) $
    
    This happens quite a bit.
    
    Ross
953.9. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Thu Sep 14 1995 12:45 (15 lines)
    I looked back at your original note with the debug log showing the
    failure to create the event ports. These ports are nothing more than
    mailboxes so our problems revolve around them. Let's do the following,
    
    $ DEFINE/SYSTEM CONSOLE$DEBUG IPC
    
    restart the software, recreate a few of the problems and then shut down
    PCM and post the resulting log files. Turning on debug for IPC will
    give us system service status returns on the crembx and various I/Os.
    
    
    Regs,
      Dan
    
    
953.10. "Eventlist problems caused by ENS exhausting its BYTLM" by 42441::JUDD (Geoff Judd. UK TSC. Viables, Basingstoke) Fri Sep 15 1995 07:40 (13 lines)
It has been found that the ENS processes are getting their byte count quota
reduced to 5000 bytes after a few hours. This is not enough to create any
more Eventlist windows. The cause appears to be ENS trying to create
eventlist windows on displays that do not exist. The configuration file has
dispatches of the eventlist action to 10 different displays, of which about 6 do
not exist. Each time an event occurs, a process is created for each of these
displays, but the process dies when it finds that the display doesn't exist. The
creation of each eventlist process decrements the BYTLM of the ENS process by
approximately 10000 bytes (for the creation of mailboxes, I presume). When the
eventlist process dies, the BYTLM should be given back, but it appears that this
does not always happen when there are multiple non-existent displays.

Geoff Judd. 
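
To put rough numbers on this, the arithmetic can be sketched in a few lines of
Python. The 10000-byte cost per process is the approximate figure quoted above;
the starting quota of 105000 bytes is purely hypothetical (the real value was
not posted), and the function name is made up for illustration.

```python
def leaks_until_exhausted(start_quota, cost_per_process):
    """Return (leaks, remaining) for a quota that is never given back.

    leaks     -- how many unreturned allocations fit into the quota
    remaining -- bytes left afterwards, too few to create one more process
    """
    if cost_per_process <= 0:
        raise ValueError("cost_per_process must be positive")
    return divmod(start_quota, cost_per_process)


if __name__ == "__main__":
    # Hypothetical starting quota; approximate per-process cost from the note.
    leaks, remaining = leaks_until_exhausted(105000, 10000)
    print(f"{leaks} leaked allocations leave {remaining} bytes")
```

Under these assumed figures, only ten unreturned allocations would be needed
before no further eventlist process could be created, so with about 6
non-existent displays per event the quota could run out after very few events
if the bytes are not given back.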
953.11. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Fri Sep 15 1995 12:10 (6 lines)
    Geoff,
      Do you see a bunch of mailboxes being left around? Specifically,
    permanent ones that have no channels assigned? 
    
    Regs,
      Dan
953.12. by 42498::GAMET Fri Sep 15 1995 12:20 (10 lines)
    
    On the customer site, I have removed all action dispatches that use a
    device that might not be switched on, and asked the remaining users to
    leave their displays switched on. The "Console Notify" process does not
    appear to lose its byte count quota.
    
    I have not yet seen any re-occurence of the 'detach-reconnect' error but
    will confirm (or deny) this after a few more days.
    
    Ross
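
One way to spot dispatch targets that are switched off is to try a TCP
connection to each display before leaving it in the configuration. The Python
below is a sketch under assumptions, not a PCM facility: it assumes the
Userdata string seen in the ENS log (e.g. "rtdvxt 0 0 tcpip") is node, server
number, screen number and transport, and it relies on the usual X11 convention
of the server listening on TCP port 6000 plus the server number. All function
names are made up for illustration.

```python
import socket


def parse_userdata(userdata):
    """Split a Userdata string such as 'rtdvxt 0 0 tcpip'.

    Assumed field order (not confirmed by the PCM documentation):
    node, server number, screen number, transport.
    """
    node, server, screen, transport = userdata.split()
    return node, int(server), int(screen), transport


def x11_tcp_port(server):
    """X servers conventionally listen on TCP port 6000 + server number."""
    return 6000 + server


def display_reachable(userdata, timeout=3.0):
    """Return True if a TCP connection to the display's X port succeeds."""
    node, server, _screen, transport = parse_userdata(userdata)
    if transport.lower() != "tcpip":
        raise ValueError("only the tcpip transport is handled in this sketch")
    try:
        with socket.create_connection((node, x11_tcp_port(server)),
                                      timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
```

For example, display_reachable("rtdvxt 0 0 tcpip") would return False while
that terminal is powered off. A successful connection only shows that
something is listening on the port; it does not prove the X server will accept
the eventlist window.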
953.13. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Fri Sep 15 1995 12:36 (4 lines)
    I can set up a test scenario easily enough for the byte-count issue.
    Will keep you posted.
    
    Dan
953.14. "Workaround seems ok for one problem" by 42498::GAMET Tue Sep 26 1995 12:30 (9 lines)
    Hi,
    
    since the last post, I have not seen any recurrence of the
    connect/detach problem. I did see one node with a failing byte count
    quota, but I removed a few more multi-line dispatches (for users who
    probably forgot to leave equipment on) and the quota now remains
    stable.
    
    Ross