
Conference csc32::consolemanager

Title:	POLYCENTER Console Manager
Notice:	Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS:
Moderator:	CSC32::BUTTERWORTH

Created:	Thu Aug 06 1992
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	1541
Total number of notes:	6564

480.0. "Issues about DEFMBXQUOTA ?" by DECAUX::VNATIG::KARASEK (Thomas KARASEK @AUI) Wed Nov 16 1994 20:15

Hi !

On the VAX/VMS-platform (V1.5 ECO-1) a customer ran into the following problem.
He did create a pseudo terminal executing a command file which only does a
$ CONSOLE WATCH/ALL /OUT=TT

After a few lines of output arriving in consequence, the 'Console Ctrl 01' -
process went into 'RWMBX'.
I reproduced this configuration in our office, and found out that raising
DEFMBXQUOTA would solve the problem (at least for a while).

Are there any recommended parameter settings for using the WATCH interface ?

			Thanks, Tom.

T.R	Title	User	Personal Name	Date	Lines
480.1	Loop	ZENDIA::DBIGELOW	Innovate, Integrate, Evaporate	`Wed Nov 16 1994 21:21`	14
	Tom,
	If I read you correctly, the customer is in an infinite loop. As the console data comes out of the pseudoterminal, it gets put back out to the pseudoterminal which gets put out to the pseudoterminal which gets put out to ... and so forth.
	You could suggest to the customer that he/she replace the "ALL" with a list of systems. That way, they are not watching themselves (the pseudo terminal).
	Dave
480.2	They actually use a list of systems ...	DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Thu Nov 17 1994 10:49`	20
	Dave,
	sorry for that misleading information. In fact they do use a list of systems along with the 'console watch' command.
	I have only a single system configured, which will be watched and output through the pseudo terminal. This "system" is in fact connected to a DECserver port (port #1), which itself is connected to another port of that DECserver (port #2). All I do is copy small text files to port #2, which simulates console input to port #1. It only takes a few lines of text (e.g. 20 to 30 lines) until the control daemon falls into RWMBX.
	I did raise bytlm to 300000 and pagfilquota to 200000 for the account that CM runs under. DEFMBXQUOTA is currently 10000 (the default of 1056 didn't work at all).
	Since the customer absolutely wants to watch the incoming data, he is very concerned about this problem.
	Any help is greatly appreciated,
	regards, Tom.
480.3		OPG::PHILIP	And through the square window...	`Thu Nov 17 1994 11:19`	24
	Tom,
	My guess is that there is a condition where the watch process has received some data, which it is trying to log with PCM. However, the watch process has had its mailbox fill up, so PCM has hung in RWMBX. Because PCM is in this state, it cannot service the data that the watch interface has put into the pseudo-terminal, so it's all in a deadlock and hung up!!!!
	In this situation, raising quotas will only postpone the inevitable.
	I am not sure that I would want to support this kind of "abuse" of the software. I am also not sure what the customer is trying to do here!! Why do they want to log their console output twice (once normally via the console connection and a second time via the "watch")?
	Please explain what your customer is trying to do; we may be able to come up with a more efficient mechanism for them.
	Cheers, Phil
480.4		DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Thu Nov 17 1994 13:01`	13
	Phil,
	Thanks for your quick answer. I just verified what the customer really wants to do: There are workstations in his cluster which do not have any physical console connection. The idea is to capture the console data of these stations via pseudo devices. Redirecting the console output to a DECserver port would be a rather costly solution.
	Do you have any simpler ideas?
	Thanks, Tom.
480.5		OPG::PHILIP	And through the square window...	`Thu Nov 17 1994 13:35`	9
	Tom,
	I am sorry, I am a little confused!! How was the customer getting the output into a pseudo-terminal from the workstations if they didn't want to use a DECserver?
	Cheers, Phil
480.6	some more confusion ...	DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Thu Nov 17 1994 13:58`	21
	The way it worked up to VCS V1.4 was rather simple: The OPCOM messages of these systems will show up on other cluster members as well, e.g.
	%%%%%%%%%%% OPCOM 9-NOV-1994 11:43:51.50 %%%%%%%%%%%
	Message from user AUDIT$SERVER on VNORHM
	These messages were filtered according to the nodename, and then directed to a pseudo terminal. So in fact this is not the 'real' console output, but it may be enough for tracing most of the system events.
	The only thing the customer wants is to record a very limited set of events (or OPCOM messages) and feed them to a dummy system, which has its own icon and does react on events (by changing its color).
	I'm getting more and more convinced that the simplest solution would be to set up action routines for 'really' connected systems, which just take the events and put them to the appropriate FTA device of our dummy system.
	cheers, Tom.
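	[Editor's sketch] The nodename filtering described in .6 could look roughly like the following. This is only an illustration in Python, not what VCS or PCM actually did: the function names are made up, the banner and "Message from user ... on NODE" layout is taken from the single example in .6, and the second sample message is invented.

```python
import re

# Layout assumed from the one OPCOM example quoted in reply .6.
BANNER = re.compile(r"^%+\s+OPCOM\s+.*%+\s*$")
ORIGIN = re.compile(r"Message from user \S+ on (\S+)")


def split_messages(lines):
    """Group raw lines into messages; each message starts at an OPCOM banner."""
    messages, current = [], None
    for line in lines:
        if BANNER.match(line):
            current = [line]
            messages.append(current)
        elif current is not None:
            current.append(line)
    return messages


def messages_from(lines, node):
    """Return only the messages whose origin line names the given node."""
    selected = []
    for message in split_messages(lines):
        for line in message[1:]:
            found = ORIGIN.search(line)
            if found and found.group(1).upper() == node.upper():
                selected.append(message)
                break
    return selected


# First message is the example from .6; the second one is invented.
sample = [
    "%%%%%%%%%%% OPCOM 9-NOV-1994 11:43:51.50 %%%%%%%%%%%",
    "Message from user AUDIT$SERVER on VNORHM",
    "%%%%%%%%%%% OPCOM 9-NOV-1994 11:44:02.10 %%%%%%%%%%%",
    "Message from user SYSTEM on OTHER1",
]

for message in messages_from(sample, "VNORHM"):
    print("\n".join(message))
```

	The selected messages would then be written to the FTA device of the dummy system; that part is left out because it depends on the site's setup.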
480.7		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Thu Nov 17 1994 19:12`	9
	I guess I don't understand why hooking up the real consoles of these workstations is so costly? DECservers are cheap these days and so is DECconnect cable, and bear in mind that a node that uses a pseudo-terminal still requires a license. If you're already paying for a license then for my money I'd rather spend a little money and hook up the real consoles!
	Regards, Dan
480.8	same as monitoring peripheral devices ...	DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Fri Nov 18 1994 10:28`	18
	Dan,
	you are perfectly right. But they also want to monitor X.25 routers and items which really do not have any physical console line.
	So, in my opinion, this should be possible the same way as applies to peripheral devices.
	In fact we do nothing else than what John Becker did suggest in your TIMA article: "[PLY_CM] Monitoring A Peripheral Device for PCM Event Notification"
	It does work perfectly, unless there are a number of lines coming in in short sequence. It looks like the pseudo terminal is not able to keep up with the speed of incoming data. This will result in hanging the control daemon in the RWMBX deadlock. Since this does not only affect that specific pseudo terminal, but the whole PCM interface, this should be considered a serious bug, which at least two of our customers are really concerned about.
	regards, Tom.
480.9		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Fri Nov 18 1994 17:58`	14
	>It does work perfectly, unless there are a number of lines coming in in short
	>sequence. It looks like the pseudo terminal is not able to keep up with the
	>speed of incoming data. This will result in hanging the control daemon in the
	>RWMBX deadlock. Since this does not only affect that specific pseudo terminal,
	>but the whole PCM interface, this should be considered a serious bug, which
	>at least two of our customers are really concerned about.
	I agree that this particular problem needs to be analyzed and fixed. I just wanted to understand the real need here, hence the reply in -2. I want to see if I can reproduce this one.
	Regs, Dan
480.10		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Fri Nov 18 1994 20:45`	44
	>you are perfectly right. But they also want to monitor X.25 routers and
	>items which really do not have any physical console line.
	> So, in my opinion, this should be possible the same way as applies to
	>peripheral devices.
	I just reread this and I have to ask the question: Are there applications that the customer currently uses to talk to these routers and other boxes that don't have a console? Example: Our own LPS20's don't have a physical console but there is an application called LPS$CONSOLE that is used to interface with these systems, as in effect they have a soft console. The right way to monitor LPS20's is not to use a pseudo-terminal with a watch command but rather a pseudo-terminal that performs MCR LPS$CONSOLE. This is the way it was done with VCS. If these boxes do indeed have a "console" application similar to LPS$CONSOLE then use that instead of the WATCH interface.
	And Phil's "guess" as to what is happening is right on. I just reproduced this and here is what has happened: The pseudo-terminal process that is running the watch command is attempting to write a line of data to TT:, which is of course the FTA device itself. The controller daemon is attempting to place a message into the mailbox that was created for the watch image to read from, and this mailbox is full as the watch process hasn't been able to process the messages quickly enough. This puts the controller into RWMBX, which means it can't read messages from the pseudo-terminal that the watch command is trying to output, nor can it read any messages from any other node that it happens to be controlling!
	I don't think this would happen if the pseudo-node that is running watch was under control of a different daemon than that which is running the node or nodes we are trying to watch. This would remove the circularity of the I/O situation described above.
	Bear in mind you would never want to use nodename ALL, as you are then watching the pseudo-node(s) that is/are also running the WATCH command!!! So as Dave was alluding to earlier, you have created an infinite loop if you use nodename ALL. This would also make the proposed workaround of forcing the pseudo-node(s) running WATCH to a different child daemon than the nodes you're WATCHing impossible to achieve.
	The scary part is I understand what I just wrote .....;-}
	Regs, Dan
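	[Editor's sketch] The circular wait Dan describes in .10 can be played through with a toy model. The Python below is only an illustration under stated assumptions, not PCM's actual behaviour: time advances in ticks, the daemon does one unit of work per tick and takes console input from the watched node before draining the pseudo-terminal, the watch process moves one line per tick, and the capacities are counted in lines rather than bytes. All names and numbers are made up.

```python
from collections import deque


def simulate(burst, mailbox_cap, pty_cap, shared_daemon, max_ticks=100_000):
    """Return 'deadlock' or 'drained' for a burst of console lines.

    shared_daemon=True  : one daemon controls the watched node AND the pseudo-node
    shared_daemon=False : a second daemon drains the pseudo-terminal
    """
    incoming = burst   # lines still to arrive from the watched node
    mailbox = deque()  # daemon -> watch process (bounded, cf. DEFMBXQUOTA)
    pty = deque()      # watch process -> daemon controlling the pseudo-node
    held = None        # line the watch process has read but not yet written

    for _ in range(max_ticks):
        progressed = False

        # Daemon for the watched node: blocking write into the watch mailbox.
        blocked = False
        did_console_work = False
        if incoming > 0:
            if len(mailbox) < mailbox_cap:
                mailbox.append(incoming)
                incoming -= 1
                progressed = did_console_work = True
            else:
                blocked = True  # corresponds to the RWMBX state

        # Draining the pseudo-terminal: the same daemon can only do it when it
        # is neither blocked nor busy; a separate daemon always can.
        can_drain = (not blocked and not did_console_work) if shared_daemon else True
        if can_drain and pty:
            pty.popleft()
            progressed = True

        # Watch process: read one line from the mailbox, write it to the pty.
        if held is None and mailbox:
            held = mailbox.popleft()
            progressed = True
        if held is not None and len(pty) < pty_cap:
            pty.append(held)
            held = None
            progressed = True

        if incoming == 0 and not mailbox and not pty and held is None:
            return "drained"
        if not progressed:
            return "deadlock"
    return "timeout"


print(simulate(10, mailbox_cap=2, pty_cap=3, shared_daemon=True))
print(simulate(10, mailbox_cap=20, pty_cap=3, shared_daemon=True))
print(simulate(100, mailbox_cap=20, pty_cap=3, shared_daemon=True))
print(simulate(10, mailbox_cap=2, pty_cap=3, shared_daemon=False))
```

	In this model a shared daemon hangs as soon as a burst exceeds what the mailbox and the pseudo-terminal can buffer between them, a larger mailbox only moves that threshold (Phil's "postpone the inevitable" in .3), and a separate daemon for the pseudo-node never hangs.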
480.11		DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Mon Nov 21 1994 10:05`	34
	Hi Dan,
	Your 'scary' description of what happens seems to summarize this problem exactly.
	> I just reread this and I have to ask the question: Are there
	> applications that the customer currently uses to talk to these
	> routers and other boxes that don't have a console.
	They don't have anything like that yet. As I mentioned in one of the replies before, they only want to extract specific operator messages from one system and feed them into some application (like a pseudo terminal), making another box in the C3 interface react to these messages.
	> Bear in mind you would never want to
	> use nodename ALL as you are now watching the pseudo-node(s) that is/are
	> also running the WATCH command!!!
	O.K. Please forget about 'nodename ALL', since this was only a mistake in the first note, and neither I nor the customer used it that way. In fact we are both specifying one single system only.
	> I don't think
	> this would happen if the pseudo-node that is running watch was under
	> control of a different daemon than that which is running the node
	> or nodes we are trying to watch.
	I agree. This sounds pretty logical. But I suspect this means not to use the 'watch' command at all, and write an own daemon instead. I would be thankful for any ideas on how this could be managed.
	Thanks for your investigations so far,
	regards, Tom.
480.12		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Tue Nov 22 1994 20:50`	19
	>But I suspect this means not to use the 'watch' command at all, and write
	> an own daemon instead. I would be thankful for any ideas on how this
	>could be managed.
	I'm not sure what you mean by "write an own daemon". If you mean write your own console controller daemon, I would say that it is unnecessary and difficult at best. When PCM starts up, the database is read and the parent daemon will start up one child controller daemon for each 16 systems in the database. Now this value can be adjusted with the config editor but IT IS TOTALLY UNSUPPORTED TO DO SO. The bottom line is if you change it and something breaks, don't expect it to be fixed. The magic CC Editor commands are SET/SHOW HIDDEN.
	A question: You say you are looking for certain opcom messages. Are these opcom messages somehow related to the routers and other boxes you are trying to manage?
	Regs, Dan
480.13		DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Wed Nov 23 1994 13:09`	15
	Hi Dan !
	I did use the strictly unsupported method to create one child controller process per system for my test configuration (1 real system, 1 pseudo terminal) and it really did work around the RWMBX problem (as expected). Though this certainly means a lot of overhead, it could be used as a workaround at the customer's site as well.
	Since we only have to make sure that the pseudo terminal and the system it is watching are using different daemons, we could use a more efficient way to accomplish this. So, what are the criteria for relating the systems to a daemon? (Order of appearance in the configuration script, alphabetical order, ...)
	Thanks, Tom.
480.14		OPG::PHILIP	And through the square window...	`Wed Nov 23 1994 13:45`	14
	Tom,
	The systems get assigned to daemons in the order they appear in the database; disabled systems are skipped when the assignment is done.
	There certainly is nothing wrong with setting hosts per controller to 1; however, it will eat up a lot of process slots.
	I still don't understand why you need to create this "feedback" loop though.
	Cheers, Phil
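	[Editor's sketch] Taken together, .12 and .14 suggest a simple way to predict which systems end up under the same child daemon. The Python below assumes the enabled systems are cut into consecutive groups in database order; the thread states the order and the group size of 16 but not the exact grouping, so treat that as an assumption. The system names and helper functions are hypothetical.

```python
def assign_to_daemons(systems, hosts_per_controller=16):
    """Group enabled systems, in database order, one group per child daemon.

    systems is a list of (name, enabled) pairs in database order.
    """
    enabled = [name for name, is_enabled in systems if is_enabled]
    return [enabled[i:i + hosts_per_controller]
            for i in range(0, len(enabled), hosts_per_controller)]


def share_a_daemon(systems, a, b, hosts_per_controller=16):
    """True if systems a and b would be handled by the same child daemon."""
    return any(a in group and b in group
               for group in assign_to_daemons(systems, hosts_per_controller))


# Hypothetical database: one real system, one disabled entry, one pseudo-node.
database = [("BOOTNODE", True), ("SPARE", False), ("WATCHER", True)]

print(assign_to_daemons(database))
print(share_a_daemon(database, "BOOTNODE", "WATCHER"))
print(share_a_daemon(database, "BOOTNODE", "WATCHER", hosts_per_controller=1))
```

	Under this reading, the number of child daemons is the number of enabled systems divided by hosts per controller, rounded up, which is where the cost in process slots comes from.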
480.15		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Wed Nov 23 1994 20:30`	24
	> I still don't understand why you need to create this "feedback" loop
	> though.
	Phil,
	Here's why I think he wishes to do this. Sites have always wanted event notification on peripherals such that the C3 icon for the peripheral changes color just as a "regular" service node would. Let's say we have a cluster with a TA90 tape drive with device name $1$MUA0:. We could create a pseudo-node called TA90 and a scan profile to search for strings with $1$mua0, as the opcom messages that are sent to nodes in a cluster would of course contain the device name. So we use a pseudo-node and its scan profile to search for strings generated by opcom that concern the tape drive. Unless the PCM engine is part of the same cluster that it monitors or is the standalone node that owns the tape drive, we can't just do a spawn command and repl/enable. We'll be forced to use WATCH to get the messages. The only other option I can think of is a DECnet task-to-task setup, which would be kind of messy and could generate lots of ethernet traffic if something really weird happens that causes a flurry of messages.
	Regs, Dan
	from one of the nodes inthe cluste
480.16		OPG::PHILIP	And through the square window...	`Wed Nov 23 1994 21:39`	10
	Dan, I think I see now, so, if you were to define the "subsystem" field correctly for each event and we then used that to create a "hierarchical" C3 display such that you could "zoom" into a system and see a bunch of icons representing each of the subsystems, would that do what you and your customers want? Cheers, Phil
480.17		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Thu Nov 24 1994 00:41`	8
	I think that would be a GREAT use for the sub-system field but some customers have already started using it for other logical groupings of events. For example, VMScluster CNXMAN messages can be placed into a "Cluster" subsystem. I can see somebody griping about it. How about a peripheral field a la the VCS C3 and using that to trigger the icon color?
	Dan
480.18		OPG::PHILIP	And through the square window...	`Thu Nov 24 1994 09:54`	14
	Dan,
	I don't want to add new fields to the event records at this stage for the next release, so I can't put in a peripheral field.
	The intention of the subsystem field was in readiness for what I proposed. I have asked Dave to put this in the C3 already but he has a lot of other stuff to do, so it's just a matter of time and priorities.
	How anyone can think that a "cluster" is a subsystem is beyond me!!
	Cheers, Phil
480.19	.15 describes exactly what we were looking for ...	DECAUX::VNATIG::KARASEK	Thomas KARASEK @AUI	`Thu Nov 24 1994 12:55`	24
	One example of a customer's configuration:
	   6 systems
	   4 HSCs
	   1 translan bridge
	   (All of the above are physically connected to PCM.)
	   1 X.25-router (--> = 1 pseudo terminal)
	  19 Satellite VAXstations (--> = 1 pseudo terminal)
	So one pseudo terminal is used for event notification of all the satellites. All we want to do is trigger on a few events, like "lost connection to ...", "established connection to ...".
	To accomplish this, we need to watch one of the bootnodes in the pseudo terminal (without logging data to file again) and trigger on the specified events.
	In the meantime I did use the suggestion to create one controller process per configured system as a workaround. This seems to work pretty well (at least for that limited number of nodes).
	However, I think this would be worth being addressed in future releases.
	Thanks & regards, Tom.
480.20		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Mon Nov 28 1994 18:32`	13
	>However, I think this would be worth being addressed in future releases.
	I'm not sure I agree after seeing the real configuration that you posted in -1. I see an ulterior motive on the customer's part, and that is he wants event notification on 19 nodes without having to buy 19 licenses and hooking them up the right way.
	Phil?
	Regs, Dan
480.21		OPG::PHILIP	And through the square window...	`Mon Nov 28 1994 21:14`	6
	I'm with you on this one Dan, the customer should really be doing this the "right" way, that is with separate licences and real connections.
	Cheers, Phil
480.22		OPCO::TSG_SJM	Coming live to you from Rosebery	`Tue Nov 29 1994 03:31`	7
	FWIW.. I was having nightly problems with processes in RWMBX. I have used the "unsupported fix" and changed the number of systems per control process down to 5, and haven't had a problem since. Or am I just covering something up?
	Thanks, Steve
480.23		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Tue Nov 29 1994 18:55`	12
	>or am I just covering something up.
	Part of me says yes, part of me says no. There probably is a design limitation that is rearing its ugly head when you have a reasonably busy system and each controller is handling 16 nodes. Decrease the amount of work on each controller and no more problem. Is it a bug/design limitation and the code should be changed, or do we really need to reevaluate the amount of work each controller should have to do? I really haven't made up my mind on this one!
	Regs, Dan
480.24		OPG::PHILIP	And through the square window...	`Tue Nov 29 1994 20:15`	15
	There were some tradeoffs made when we ported our code from ULTRIX to OpenVMS; these were based on the time it would take to port the code as opposed to the efficiency of the result. Based upon our experiences, V2.0 should be a lot less memory hungry, more resilient and hopefully faster.
	So, the bottom line is that you are covering up some of our design deficiencies. Lowering the hosts per controller from the default of 16 should be no problem if you have enough system resources (memory etc.) for all those extra processes which will get created. Beware though that raising the hosts per process above 16 could have dramatic effects based upon open file limits etc.
	Cheers, Phil
480.25	RE- .20, .21	VNOTSC::KARASEK	Thomas KARASEK @AUI	`Wed Nov 30 1994 11:36`	14
	Hi Phil, Dan !
	I don't quite agree on that, since the RWMBX problem is not due to the fact that events for more than one node are scanned, but to the usage of the 'CONSOLE WATCH' command.
	However, I don't think this is a license violation either, since all the information is derived from a physically connected (and fully licensed) system. We could also set up whatever event scan and action routine we want on that system without overstepping license agreements.
	In fact we are consuming 1 additional license for the pseudo terminal, on which the specific events are actually scanned.
	regards, Tom.
480.26		CSC32::BUTTERWORTH	Gun Control is a steady hand.	`Wed Nov 30 1994 19:17`	29
	>I don't quite agree on that, since the RWMBX problem is not due to the
	>fact that events for more than one node are scanned, but to the usage of
	>the 'CONSOLE WATCH' command.
	That wasn't the point actually. I've had RWMBX problems even when a site isn't doing the WATCH trick; it was due to a similar problem and typically seen on busy systems.
	>However, I don't think this is a license violation either, since all
	>the information is derived from a physically connected (and fully licensed)
	>system.
	I never said it was a license violation but rather a way to get around buying more licenses and hooking these systems up the right way.
	>We could also set up whatever event scan and action routine we want on
	>that system without overstepping license agreements.
	No argument here at all. This is a question of what the customer's real motive is.
	> In fact we are consuming 1 additional license for the pseudo terminal,
	>on which the specific events are actually scanned.
	I'm well aware of this.
	Regs, Dan