
Conference csc32::consolemanager

Title:POLYCENTER Console Manager
Notice:Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS:
Moderator:CSC32::BUTTERWORTH
Created:Thu Aug 06 1992
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1541
Total number of notes:6564

480.0. "Issues about DEFMBXQUOTA ?" by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Wed Nov 16 1994 20:15

Hi !

On the VAX/VMS platform (V1.5 ECO-1) a customer ran into the following problem.
He created a pseudo terminal that executes a command file which only does a
$ CONSOLE WATCH/ALL /OUT=TT

After a few lines of output arrived in quick succession, the 'Console Ctrl 01'
process went into 'RWMBX'.
I reproduced this configuration in our office, and found that raising
DEFMBXQUOTA solves the problem (at least for a while).

Are there any recommended parameter settings for using the WATCH interface ?

			Thanks, Tom.
480.1. "Loop" by ZENDIA::DBIGELOW (Innovate, Integrate, Evaporate) Wed Nov 16 1994 21:21
Tom,

   If I read you correctly, the customer is in an
infinite loop. As the console data comes out of the
pseudoterminal, it gets put back out to the
pseudoterminal which gets put out to the pseudoterminal
which gets put out to ... and so forth.

You could suggest to the customer that he/she replace 
the "ALL" with a list of systems. In that way, they are
not watching themselves (the pseudo terminal).

Dave

480.2. "They actually use a list of systems ..." by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Thu Nov 17 1994 10:49
Dave,

sorry for that misleading information. In fact they do use a list of systems
along with the 'console watch' - command. I have only a single system configured
which will be watched and output through the pseudo terminal.

This "system" is in fact connected to a DECserver port (port #1) which itself is
connected to another port of that DECserver (port #2).
All I do is copy small text files to port #2, which simulates console
input to port #1.
It only takes a few lines of text (e.g. 20 to 30 lines) until the control
daemon falls into RWMBX.
I raised BYTLM to 300000 and PGFLQUOTA to 200000 for the account CM runs
under. DEFMBXQUOTA is currently 10000 (the default of 1056 didn't work at all).

Since the customer absolutely wants to watch the incoming data he is very 
concerned about that problem.

			Any help is strongly appreciated,
				regards, Tom.
480.3. by OPG::PHILIP (And through the square window...) Thu Nov 17 1994 11:19
Tom,

  My guess is that there is a condition where the watch process
  has received some data, which it is trying to log with PCM.
  However, the watch process has had its mailbox fill up, so
  PCM has hung in RWMBX. Because PCM is in this state, it
  cannot service the data that the watch interface has put into
  the pseudo-terminal, so it's all in a deadlock and hung up!

  In this situation, raising quotas will only postpone the
  inevitable.

  I am not sure that I would want to support this kind of "abuse"
  of the software.

  I am also not sure what the customer is trying to do here!! Why
  do they want to log their console output twice (once normally
  via the console connection and a second time via the "watch")?

  Please explain what your customer is trying to do, we may be
  able to come up with a more efficient mechanism for them.

Cheers,
Phil
480.4. by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Thu Nov 17 1994 13:01
Phil, 

Thanks for your quick answer.
I just did verify what the customer really wants to do:

There are workstations in his cluster, which do not have any physical console
connection. The idea is to capture the console data of these stations via pseudo
devices.
  Redirecting the console output to a decserver port would be a rather costly
solution.

	Do you have any simpler ideas ?
	Thanks, Tom.
480.5. by OPG::PHILIP (And through the square window...) Thu Nov 17 1994 13:35
Tom,

  I am sorry, I am a little confused!! How was the customer
  getting the output into a pseudo-terminal from the
  workstations if they didn't want to use a DECserver?

Cheers,
Phil

480.6. "some more confusion ..." by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Thu Nov 17 1994 13:58
The way it worked up to VCS V1.4 was rather simple:

The OPCOM-messages of these systems will show up on other cluster members as
well.
(e.g. 
%%%%%%%%%%%  OPCOM   9-NOV-1994 11:43:51.50  %%%%%%%%%%%
Message from user AUDIT$SERVER on VNORHM			)

These messages were filtered according to the node name, and then directed
to a pseudo terminal. So in fact this is not the 'real' console output, but
it may be enough for tracing most of the system events.

The only thing the customer wants, is to record a very limited set of events
(or OPCOM-messages) and feed them to a dummy system, which has its own icon
and does react on events (by changing its color).

I'm getting more and more convinced that the simplest solution would be to
set up action routines for the 'really' connected systems, which just take the
events and put them to the appropriate FTA device of our dummy system.

			cheers, Tom.
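
The node-name filtering described in this reply can be sketched roughly as follows. This is only an illustration in Python, not the customer's actual procedure: the helper name `filter_by_node` is hypothetical, and it assumes that every message starts with a `%%%` banner line followed by a "Message from user ... on NODE" line, as in the sample quoted above.

```python
import re

# Matches the "Message from user <USER> on <NODE>" line that follows an
# OPCOM banner, as in the sample quoted in the note above.
MESSAGE_FROM = re.compile(r"^Message from user (\S+) on (\S+)")


def filter_by_node(lines, wanted_nodes):
    """Yield (node, text) for OPCOM message lines from the wanted nodes.

    Hypothetical helper: a line starting with '%%%' begins a new message,
    and every non-empty line after 'Message from user ...' is treated as
    part of that message until the next banner.
    """
    wanted = {name.upper() for name in wanted_nodes}
    node = None
    for line in lines:
        line = line.rstrip()
        if line.startswith("%%%"):
            node = None  # new banner: originating node not yet known
            continue
        match = MESSAGE_FROM.match(line)
        if match:
            node = match.group(2).upper()
        if node in wanted and line:
            yield node, line
```

The lines this yields are what would then be written to the pseudo terminal; anything from other cluster members is dropped before it reaches the dummy system.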
480.7. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Thu Nov 17 1994 19:12
    I guess I don't understand why hooking up the real consoles of these
    workstations is so costly? DECservers are cheap these days and so
    is DECconnect cable, and bear in mind that a node that uses a
    pseudo-terminal still requires a license. If you're already paying for
    a license then for my money I'd rather spend a little money and
    hook up the real consoles!
    
    Regards,
       Dan
480.8. "same as monitoring peripheral devices ..." by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Fri Nov 18 1994 10:28
Dan,

you are perfectly right. But they also want to monitor X.25 routers and items
which really do not have any physical console line.
 So, in my opinion, this should be possible the same way as applies to
peripheral devices.
In fact we do nothing else than what John Becker did suggest in your TIMA
article:
"[PLY_CM] Monitoring A Peripheral Device for PCM Event Notification"

It does work perfectly, unless there are a number of lines coming in in short
sequence. It looks like the pseudo terminal is not able to keep up with the
speed of incoming data. This will result in hanging the control daemon into the
RWMBX-deadlock. Since this does not only affect that specific pseudo terminal,
but the whole PCM-interface, this should be considered as a serious bug, which
at least two of our customers are really concerned about.

				regards, Tom.
480.9. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Fri Nov 18 1994 17:58
>It does work perfectly, unless there are a number of lines coming in in short
>sequence. It looks like the pseudo terminal is not able to keep up with the
>speed of incoming data. This will result in hanging the control daemon into the
>RWMBX-deadlock. Since this does not only affect that specific pseudo terminal,
>but the whole PCM-interface, this should be considered as a serious bug, which
>at least two of our customers are really concerned about.

    I agree in that this particular problem needs to be analyzed and fixed.
    I just wanted to understand the real need here hence the reply in -2.
    
    I want to see if I can reproduce this one.
    
    Regs,
      Dan
480.10. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Fri Nov 18 1994 20:45
    >you are perfectly right. But they also want to monitor X.25 routers and
    >items which really do not have any physical console line.
    > So, in my opinion, this should be possible the same way as applies to
    >peripheral devices.
    
    I just reread this and I have to ask the question: Are there
    applications that the customer currently uses to talk to these
    routers and other boxes that don't have a console? Example: Our own
    LPS20's don't have a physical console but there is an application 
    called LPS$CONSOLE that is used to interface with these systems, as in 
    effect they have a soft console. The right way to monitor LPS20's
    is not to use a pseudo terminal with a watch command but rather a 
    pseudo-terminal that performs MCR LPS$CONSOLE. This is the way it
    was done with VCS. If these boxes do indeed have a "console" application
    similar to LPS$CONSOLE then use that instead of the WATCH interface.
    
    And Phil's "guess" as to what is happening is right on. I just
    reproduced this and here is what has happened:
    
    The pseudo-terminal process that is running the watch command is 
    attempting to write a line of data to TT:, which is of course the
    FTA device itself. The controller daemon is attempting to place a
    message into the mailbox that was created for the watch image 
    to read from, and this mailbox is full as the watch process hasn't been
    able to process the messages quickly enough. This puts the controller
    into RWMBX, which means it can't read messages from the pseudo terminal
    that the watch command is trying to output, nor can it read any messages
    from any other node that it happens to be controlling!
    
    I don't think
    this would happen if the pseudo-node that is running watch was under 
    control of a different daemon than that which is running the node
    or nodes we are trying to watch. This would remove the circularity of
    the I/O situation described above. Bear in mind you would never want to 
    use nodename ALL as you are now watching the pseudo-node(s) that is/are 
    also running the WATCH command!!! So as Dave was alluding to earlier,
    you have created an infinite loop if you use nodename ALL. This would
    also make the proposed workaround of forcing the pseudo-node(s) running
    WATCH to a different child daemon than the nodes you're WATCHing
    impossible to achieve.
    
    
    The scary part is I understand what I just wrote .....;-}
    
    Regs,
       Dan
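
The circular wait described in this reply can be illustrated with a toy model. This is a sketch under stated assumptions, not PCM code: it assumes the controller is single-threaded and handles work strictly in arrival order, that a write to a full mailbox blocks it (RWMBX), and that the WATCH process blocks on each write to its terminal until the daemon owning the pseudo-terminal reads it. The function name `simulate_burst` and the capacities (counted in messages rather than bytes) are hypothetical.

```python
from collections import deque


def simulate_burst(burst, mailbox_capacity, same_daemon=True):
    """Toy model of the circular wait; returns "deadlock" or "ok".

    burst            -- number of console lines arriving at once
    mailbox_capacity -- messages the WATCH mailbox can hold (hypothetical unit)
    same_daemon      -- True if the daemon feeding the mailbox also owns the
                        pseudo-terminal that WATCH writes to
    """
    work = deque(("console", i) for i in range(burst))  # controller's queue
    mailbox = deque()        # controller -> WATCH process
    watch_waiting = False    # WATCH is blocked on a pseudo-terminal write

    while work or mailbox or watch_waiting:
        progressed = False

        # WATCH process: take one message and write it to TT (the FTA device).
        if mailbox and not watch_waiting:
            line = mailbox.popleft()
            if same_daemon:
                work.append(("pty", line))   # queued behind the console burst
                watch_waiting = True
            # A separate daemon would read the terminal right away, so the
            # write completes and WATCH never stalls.
            progressed = True

        # Controller: handle the next item, unless the mailbox is full.
        if work:
            kind, line = work[0]
            if kind == "pty":
                work.popleft()
                watch_waiting = False        # terminal write completed
                progressed = True
            elif len(mailbox) < mailbox_capacity:
                work.popleft()
                mailbox.append(line)
                progressed = True
            # Otherwise the controller is stuck in the mailbox write (RWMBX).

        if not progressed:
            return "deadlock"
    return "ok"
```

In this model `simulate_burst(30, 10)` deadlocks while `simulate_burst(5, 10)` does not, which matches the observation that a larger mailbox quota only postpones the hang, and `simulate_burst(30, 2, same_daemon=False)` completes because the terminal is drained by a daemon that is never blocked on the mailbox.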
480.11. by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Mon Nov 21 1994 10:05
Hi Dan,

Your 'scary' description of what happens seems to summarize this problem
exactly. 

>    I just reread this and I have to ask the question: Are there
>    applications that the customer currently uses to talk to these
>    routers and other boxes that don't have a console?

They don't have anything like that yet. As I mentioned in one of the replies
before, they only want to extract specific operator messages from one system
and feed them into some application (like pseudo terminal), making another box
in the C3-interface react on these messages.

>    Bear in mind you would never want to 
>    use nodename ALL as you are now watching the pseudo-node(s) that is/are 
>    also running the WATCH command!!!

O.K. Please forget about 'nodename ALL', since this was only a mistake in the
first note, and neither I nor the customer used it that way. In fact we
are both specifying one single system only.

>    I don't think
>    this would happen if the pseudo-node that is running watch was under 
>    control of a different daemon than that which is running the node
>    or nodes we are trying to watch.

I agree. This sounds pretty logical. But I suspect this means not to use the
'watch'-command at all, and write an own daemon instead. I would be thankful
about any ideas how this could be managed.

		Thanks for your investigations so far,

                    regards, Tom.
480.12. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Tue Nov 22 1994 20:50
    >But I suspect this means not to use the 'watch'-command at all, and write
    > an own daemon instead. I would be thankful about any ideas how this 
    >could be managed.
    
    I'm not sure what you mean by "write an own Daemon". If you mean write
    your own console controller daemon I would say that it is unnecessary
    and difficult at best. When PCM starts up the database is read and
    the parent daemon will start up one child controller daemon for
    each 16 systems in the database. Now this value can be adjusted with
    the config editor but *IT IS TOTALLY UNSUPPORTED TO DO SO*. The bottom
    line is if you change it and something breaks don't expect it to be
    fixed. The magic CC Editor commands are SET/SHOW HIDDEN.
    
    A question: You say you are looking for certain opcom messages. Are
    these opcom messages somehow related to the routers and other boxes
    you are trying to manage?
    
    Regs,
      Dan
480.13. by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Wed Nov 23 1994 13:09
Hi Dan !

I did use the *strictly unsupported* method to create one child controller
process per system for my test configuration (1 real system, 1 pseudo terminal)
and it really did work around the RWMBX problem (as expected).
 
 Though this certainly means a lot of overhead, it could be used for a
workaround at the customer's site as well. Since we only have to make sure, that
the pseudo terminal and the system it is watching are using different daemons,
we could use a more efficient way to accomplish this.
So, what are the criteria for assigning systems to a daemon ?
(Order of appearance in the configuration script, alphabetical order, ...)

				Thanks, Tom.
 
480.14. by OPG::PHILIP (And through the square window...) Wed Nov 23 1994 13:45
Tom,

  The systems get assigned to daemons in the order they appear
  in the database, disabled systems are skipped when the assignment
  is done.

  There certainly is nothing wrong with setting hosts per controller
  to 1; however, it will eat up a lot of process slots.

  I still don't understand why you need to create this "feedback" loop
  though.

Cheers,
Phil
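
The assignment rule stated in this reply (database order, disabled systems skipped) can be sketched as below. This is only an illustration in Python: it assumes each child daemon is filled with consecutive systems up to the hosts-per-controller value before the next daemon is started, which the thread implies but does not spell out, and the names `assign_to_daemons` and `share_a_daemon` are hypothetical.

```python
def assign_to_daemons(systems, hosts_per_controller=16):
    """Group enabled systems into child daemons in database order.

    systems -- list of (name, enabled) pairs in database order
    Returns a list of lists of names, one inner list per child daemon.
    Assumption: daemons are filled consecutively, 16 systems by default.
    """
    if hosts_per_controller < 1:
        raise ValueError("hosts_per_controller must be at least 1")
    # Disabled systems are skipped before the assignment is done.
    enabled = [name for name, is_enabled in systems if is_enabled]
    return [enabled[i:i + hosts_per_controller]
            for i in range(0, len(enabled), hosts_per_controller)]


def share_a_daemon(systems, a, b, hosts_per_controller=16):
    """Return True if systems a and b end up under the same child daemon."""
    for group in assign_to_daemons(systems, hosts_per_controller):
        if a in group and b in group:
            return True
    return False
```

Under these assumptions, a two-entry test configuration (one real system, one pseudo terminal) shares a daemon at the default of 16 and is split apart at 1. With the default left alone, the same separation would need the two entries to sit at least 16 enabled positions apart in the database.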
480.15. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Wed Nov 23 1994 20:30
    >  I still don't understand why you need to create this "feedback" loop
    >  though.
    
    Phil,
      Here's why I think he wishes to do this. Sites have always wanted
    event notification on peripherals such that the C3 icon for the
    peripherals changes color just as a "regular" service node would.
    Let's say we have a cluster with a TA90 tape drive with device name
    $1$MUA0:. We could create a pseudo-node called TA90 and a scan profile
    to search for strings with $1$mua0 as the opcom messages that are sent
    to nodes in a cluster would of course contain the device name. So
    we use a pseudo-node and its scan profile to search for
    strings generated by opcom that concern the tape drive. 
    Unless the PCM engine is part of the same cluster that it monitors or
    is the standalone node that owns the tape drive, we can't just do a 
    spawn command and repl/enable. We'll be forced to use WATCH to get the 
    messages. The only other option I can think of is a DECnet task-to-task
    setup which would be kind of messy and could generate *lots* of
    ethernet traffic if something really weird happens that causes a flurry
    of messages.
    
    Regs,
      Dan
    from one of the nodes in the cluster
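
The pseudo-node idea in this reply amounts to a case-insensitive string search over incoming message lines. A minimal sketch in Python, assuming a scan profile can be approximated by a list of plain substrings per pseudo-node; the table `SCAN_PROFILES` and the function `triggered_nodes` are hypothetical and do not reflect PCM's actual scan profile format.

```python
# Hypothetical table: pseudo-node name -> strings its scan profile looks for.
SCAN_PROFILES = {
    "TA90": ["$1$MUA0"],
}


def triggered_nodes(line, profiles=SCAN_PROFILES):
    """Return the pseudo-nodes whose search strings occur in the line.

    Matching ignores case, so '$1$mua0' and '$1$MUA0' are treated alike.
    """
    text = line.lower()
    return [node for node, needles in profiles.items()
            if any(needle.lower() in text for needle in needles)]
```

Each pseudo-node returned here is one whose C3 icon would be expected to change color for that line.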
480.16. by OPG::PHILIP (And through the square window...) Wed Nov 23 1994 21:39
Dan,

  I think I see now, so, if you were to define the "subsystem" field
  correctly for each event and we then used that to create a "hierarchical"
  C3 display such that you could "zoom" into a system and see a bunch of icons 
  representing each of the subsystems, would that do what you and your
  customers want? 

Cheers,
Phil
480.17. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Thu Nov 24 1994 00:41
    I think that would be a *GREAT* use for the sub-system field but some 
    customers have already started using it for other logical groupings of
    events. For example, VMSCluster CNXMAN messages can be placed into a
    "Cluster" subsystem. I can see somebody griping about it. How about
    a peripheral field ala the VCS C3 and using that to trigger the icon
    color?
    
    Dan
480.18. by OPG::PHILIP (And through the square window...) Thu Nov 24 1994 09:54
Dan,

  I don't want to add new fields to the event records at this stage for the
  next release, so I can't put in a peripheral field.

  The intention of the subsystem field was in readiness for what I proposed,
  I have asked Dave to put this in the C3 already but he has a lot of other
  stuff to do, so it's just a matter of time and priorities.

  How anyone can think that a "cluster" is a subsystem is beyond me!!

Cheers,
Phil

480.19. ".15 describes exactly what we were looking for ..." by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Thu Nov 24 1994 12:55
One example of a customer's configuration:

6 systems
4 HSCs
1 translan bridge

(All of the above are physically connected to PCM.)

 1 X.25-router (--> = 1 pseudo terminal)
19 Satellite VAXstations   (--> = 1 pseudo terminal)

So one pseudo terminal is used for event notification of all the satellites.
All we want to do is trigger on a few events, like "lost connection to ...",
"established connection to ...".

To accomplish this, we need to watch one of the bootnodes in the pseudo terminal
(without logging data to file again) - and trigger on the specified events.

In the meantime I did use the suggestion to create one controller process per
configured system as a workaround. This seems to work pretty well (at least for
that limited number of nodes.) 
 However, I think this would be worth being addressed in future releases.

			Thanks & regards, Tom. 
480.20. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Mon Nov 28 1994 18:32
    >However, I think this would be worth being addressed in future releases.
    
    I'm not sure I agree after seeing the real configuration that you
    posted in -1. I see an ulterior motive on the customer's part, and that
    is he wants event notification on 19 nodes without having to buy
    19 licenses and hooking them up the right way. 
    
    Phil?
    
    Regs,
      Dan
    
    
480.21. by OPG::PHILIP (And through the square window...) Mon Nov 28 1994 21:14
  I'm with you on this one Dan, the customer should really be doing this the 
  "right" way, that is with separate licences and real connections.

  Cheers,
  Phil
480.22. by OPCO::TSG_SJM (Coming live to you from Rosebery) Tue Nov 29 1994 03:31
    FWIW..I was having nightly problems with processes in RWMBX. I have
    used the "unsupported fix", and changed the number of systems per
    control process down to 5, and haven't had a problem since. Or am I just
    covering something up?
    
    Thanks
    Steve
480.23. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Tue Nov 29 1994 18:55
    >or am I just covering something up?
    
    Part of me says yes, part of me says no. There probably is a design
    limitation that is rearing its ugly head when you have a reasonably
    busy system and each controller is handling 16 nodes. Decrease the
    amount of work on each controller and no more problem. Is it a
    bug/design limitation and the code should be changed, or do
    we really need to reevaluate the amount of work each controller
    should have to do? I really haven't made up my mind on this one!
    
    Regs,
      Dan
480.24. by OPG::PHILIP (And through the square window...) Tue Nov 29 1994 20:15
  There were some tradeoffs made when we ported our code from ULTRIX to
  OpenVMS, these were based on the time it would take to port the code as
  opposed to the efficiency of the result. Based upon our experiences V2.0
  *should* be a lot less memory hungry, more resilient and hopefully faster.

  So, the bottom line is that you are covering up some of our design
  deficiencies, lowering the hosts per controller from the default of 16 should
  be no problem if you have enough system resources (memory etc) for all those
  extra processes which will get created. Beware though that raising the hosts
  per process above 16 could have dramatic effects based upon open file limits
  etc. 

Cheers,
Phil
480.25. "RE- .20, .21" by VNOTSC::KARASEK (Thomas KARASEK @AUI) Wed Nov 30 1994 11:36
Hi Phil, Dan !

I don't quite agree on that, since the RWMBX problem is not due to the fact
that events for more than one node are scanned, but to the usage of the
'CONSOLE WATCH' command.

However, I don't think this is a license violation either, since all the
information is derived from a physically connected (and fully licensed) system.
We could also set up whatever event scan and action routine we want on that
system, without violating license agreements.
 In fact we are consuming 1 additional license for the pseudo terminal, on which
the specific events are actually scanned.

				regards, Tom.
480.26. by CSC32::BUTTERWORTH (Gun Control is a steady hand.) Wed Nov 30 1994 19:17
    >I don't quite agree on that, since the RWMBX-problem is not due to the
    >fact, that events for more than one node are scanned, but to the usage of
    >the 'CONSOLE WATCH' - command.
    
    That wasn't the point actually. I've had RWMBX problems even when a
    site isn't doing the WATCH trick and it was due to a similar problem
    and typically seen on busy systems.
    
    >However, I don't think this is a license violation either, since all
    >the information is derived from a physically connected (and fully licensed)
    >system.
    
    I never said it was a license violation but rather a way to get around
    buying more licenses and hooking these systems up the right way.
    
    >We could also set up whatever event scan and action routine we want on
    >that system, without violating license agreements.
    
    No argument here at all. This is a question of what the customer's real
    motive is.
    
    > In fact we are consuming 1 additional license for the pseudo terminal,
    >on which the specific events are actually scanned.
    
    I'm well aware of this.
    
    
    Regs,
      Dan