
Conference csc32::consolemanager

Title:	POLYCENTER Console Manager
Notice:	Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS:
Moderator:	CSC32::BUTTERWORTH

Created:	Thu Aug 06 1992
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	1541
Total number of notes:	6564

999.0. "Memory Fault - core dump on monitor interface" by 52246::ALBERTO (Alberto Yanguas. Systems Integration Spain) Fri Sep 22 1995 04:10

Hello,

I'm involved in a customer management platform implementation, which includes
PCM.

I have installed PCM V1.6 ECO1 on Digital UNIX V3.2

The monitor interface (launched either from c3 or with console -c/-m) has
a persistent problem.

A few seconds after the monitor interface starts, it crashes with
"memory fault - core dumped".

It usually happens when exercising the log review by pressing the advance
and back keys. At other times it has happened a few seconds after connecting to
the system console line. Sometimes it has even happened without doing anything,
i.e. just invoking console -m and waiting for the core dump.

I have cleaned out all the log files for the console lines, to start the log
files from scratch, and have restarted PCM and rebooted the system many times.

The problem is still happening.

Could you please tell me how to investigate the problem?
Is there any information that I can save for you to analyze the problem?

Thank you in advance for your support

Regards
Alberto Yanguas
Systems Integration - Spain

T.R	Title	User	Personal Name	Date	Lines
999.1	Any help please!?	52246::ALBERTO	Alberto Yanguas. Systems Integration Spain	`Wed Sep 27 1995 09:09`	38
	Hello,

	Please, I need help. Could anybody answer .0?

	I have seen the following behaviour:

	After a core dump of the monitor/connect interface, I connect to the same system from another terminal. Each character typed in the connect interface is echoed twice. But the managed system responds correctly to the typed command, and through the monitor interface the log looks correct (i.e. without the double echo).

	After a second core dump, I enter the connect interface again and each character that I type is echoed three times, but the managed system still receives it only once, as it interprets commands correctly.

	Restarting the console daemons clears the problem, but the customer still has a monitor/connect interface that is inoperative.

	Another observation:

	- Start Console Manager
	- invoke a connection: console -c monti2
	- just wait and see...
	- after some minutes the connection interface core dumps
	- shut down Console Manager

	In the log file /var/opt/console/tmp/CONSOLE_CHILD_001.LOG you can read:

	Write error on Local socket /var/opt/console/tmp/CONSOLE_CTRL_MONTI2 Bad file number

	Please, could you give me any advice on what to do?

	Regards
	Alberto
999.2	Seems same problem on note 800	52246::ALBERTO	Alberto Yanguas. Systems Integration Spain	`Wed Sep 27 1995 09:55`	19
	Hi again,

	The problem described here in .0 is quite similar to the one described in note 800, which was apparently solved with a newer field test kit.

	The kits that I have installed were taken from PCMSVR::DISK$PCM:[PCM] and are the following:

	DCROSF161.TAR;1      7520/7521    21-JUL-1995 14:40:25.00
	PCM_V16_OSF.TAR;1    10620/10620  21-JUN-1995 11:40:59.00

	Are these the final SSB kits? Is there any additional patch?

	Rgds
	Alberto

	P.S. Please, any answer from anybody?
999.3		ZENDIA::DBIGELOW	Innovate, Integrate, Evaporate	`Wed Sep 27 1995 12:36`	24
	Alberto,

	You can try the following:

	1. Shut down PCM. Make sure there are no lingering processes hanging around after you shut down PCM:

	   ps -wax | grep console

	   dxconsole daemons are not part of PCM, so do not kill them.

	2. Delete all files in /var/opt/console/tmp

	3. Restart PCM.

	I don't know why you are getting the characters echoed 2-3 times. If it is still doing this, then look into the settings on the terminal server. Make sure that local echo is not set.

	You appear to be using the latest kit version.

	Dave
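The check in step 1 of .3 and the directory named in step 2 can be sketched as two small shell helpers. This is only a sketch: the function names `lingering_console_procs` and `leftover_tmp_files` are hypothetical and not part of PCM, and the default directory is simply the path quoted in this topic. Nothing here stops PCM, kills a process, or deletes a file; it only reports what is still there.

```shell
#!/bin/sh
# Hypothetical helpers, not part of PCM.

# Read a process listing on stdin and keep only the lines that mention
# "console" but are neither dxconsole processes nor the grep itself.
lingering_console_procs() {
    grep console | grep -v dxconsole | grep -v grep || true
}

# Print the regular files still present in the PCM temporary directory.
# The default is the path quoted in this topic; pass another to override.
leftover_tmp_files() {
    dir=${1:-/var/opt/console/tmp}
    [ -d "$dir" ] || return 0
    find "$dir" -type f -print
}
```

After shutting PCM down, `ps -wax | lingering_console_procs` should ideally print nothing, and `leftover_tmp_files` lists what step 2 would remove.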
999.4	Is AlphaServer 2100 supported hardware ?	52246::ALBERTO	Alberto Yanguas. Systems Integration Spain	`Mon Oct 02 1995 11:58`	17
	Thanks for your suggestions, but that only clears the problem until it happens again. It happens only with some console lines. The rest appear to be stable and reliable, although no extensive testing has been done yet.

	Another issue: the customer's management platform (including PCM) is running on an AlphaServer 2100 4/275. This type of system does not appear on the list of supported hardware in the SPD! Is the 2100 really not supported for running PCM software?

	Rgds
	Alberto
999.5		ZENDIA::DBIGELOW	Innovate, Integrate, Evaporate	`Mon Oct 02 1995 15:16`	8
	Alberto,

	Also, make sure that the terminal server's software is up to date.

	Regarding the 2100, PCM was probably released before the 2100 was. It should work just fine on a 2100.

	Dave
999.6	Problem increase !	52246::ALBERTO	Alberto Yanguas. Systems Integration Spain	`Wed Oct 04 1995 08:26`	27
	The problem still persists and it seems to be growing. Although I will upgrade to the latest DEC UNIX V3.2C and will also convince the customer to upgrade the terminal servers to the latest revision, I think the problem may be somewhere else.

	The console manager daemons work fine. There are about 20 systems defined, using telnet connections, and the console output appears correctly in the log files and I can execute console -x without any problems.

	BUT THE PROBLEM IS that I cannot maintain a CONNECT/MONITOR session for more than a minute without the process crashing with a core dump. Even if I just invoke console -m (and type a show systems), sometimes within a few seconds (without doing anything else) the program core dumps.

	Is it possible to check whether I've got the right programs? For example, by checking the checksums of the programs involved when invoking console -m or console -c?

	Are the core files that I have saved of any use?

	Hoping for any additional help.

	regards
	Alberto
999.7	Another one here...could it be resources??	29067::SCHLABS		`Thu Oct 12 1995 12:42`	8
	I am talking with a gentleman who is seeing the same thing, but only on one system. He gets the "memory fault, core dumped" on the connect. This is a heavily loaded Oracle server. Could resources be a problem?? If so, what should we look at?

	thanks,
	greg
999.8	Problem solved !	52246::ALBERTO	Alberto Yanguas. S.I. Spain	`Mon Oct 16 1995 11:20`	42
	After some extensive testing I think I've isolated when this failure occurs.

	The console_conmon program appears to core dump (less than one minute after it has been invoked) when either or both of the following situations occur:

	a) There are no log files (*.EVENTS, *.LOG, *.TIMES) for a configured system.
	   This may happen if the files are deleted or moved to another directory,
	   or when a console link is never established for a newly configured system
	   (log files are not created until a console link is initially established).

	b) There is a system with a telnet connection (I never tested with LAT) whose
	   terminal server is unknown (i.e. not in /etc/hosts).

	This was my intentional fault, trying to emulate VCS functionality to have a peripheral icon on the C3 display. I defined a system with a connection to a dummy server name.
	To avoid that, I have redefined the connection as a pseudo-terminal and have a convenient script that loops to avoid link disconnections.

	Question: Is there a more direct way to represent 'peripheral' or other kinds of icons without them being a configured system? (Oh... dear VCS...!)

	After I created empty log files for the systems whose console link does not work yet, and stopped using a dummy server for the connection information, the core dump problem seems to have gone away, and a monitor session has lasted several hours without any core dump.

	I'm able to reproduce the core dump if either or both of the above situations occur.

	The environment is DEC UNIX V3.2C and PCM V1.6 + ECO 1.

	I think these problems/workarounds should be investigated, or at least documented as a restriction.

	Best Regards
	Alberto
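The two conditions in .8 and the empty-log-file workaround can be checked with a few shell helpers. This is a rough sketch under stated assumptions, not a PCM tool: the function names are hypothetical, the log files are assumed to be named `<system>.EVENTS`, `<system>.LOG` and `<system>.TIMES` in a single directory that you pass in (the topic gives only the suffixes, not the directory or exact naming), and the host check is a plain text match against a hosts file that does not consult any other name service.

```shell
#!/bin/sh
# Hypothetical helpers, not part of PCM.
# ASSUMPTION: log files are named <system>.<suffix> inside one directory.

# Print the path of every expected log file that does not exist.
# Usage: missing_log_files LOGDIR SYSTEM [SYSTEM...]
missing_log_files() {
    logdir=$1
    shift
    for system in "$@"; do
        for suffix in EVENTS LOG TIMES; do
            [ -e "$logdir/$system.$suffix" ] || echo "$logdir/$system.$suffix"
        done
    done
}

# Create an empty file for each missing log file (the workaround from .8).
# Existing files are never touched.
create_missing_log_files() {
    missing_log_files "$@" | while IFS= read -r path; do
        : > "$path"
    done
}

# Succeed if NAME appears as a host name or alias in the hosts file.
# Usage: host_in_hosts_file NAME [HOSTS_FILE]
host_in_hosts_file() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                       # drop comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

For example, `missing_log_files /path/to/logs MONTI2` lists what is absent for one system, and `host_in_hosts_file some_server || echo "unknown terminal server"` flags a connection like the dummy server described in .8. The log directory and server name here are placeholders.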
999.9		29067::BUTTERWORTH	Gun Control is a steady hand.	`Tue Oct 17 1995 17:34`	33
	>a) There are no log files (*.EVENTS, *.LOG, *.TIMES) for a configured system.
	>   This may happen if the files are deleted or moved to another directory,
	>   or when a console link is never established for a newly configured system
	>   (log files are not created until a console link is initially established).

	>b) There is a system with a telnet connection (I never tested with LAT) whose
	>   terminal server is unknown (i.e. not in /etc/hosts).

	As you have correctly surmised, the files aren't created until we have successfully connected to a node at least once. Since item "b" indicates that you had intentionally configured a system that could never be connected to, it caused the symptom in item "a". The point is that item "a" is the *real* problem and this is not the way it's supposed to work. It's broken!

	>   This was my intentional fault, trying to emulate VCS functionality to have
	>   a peripheral icon on the C3 display. I defined a system with a connection
	>   to a dummy server name.
	>   To avoid that, I have redefined the connection as a pseudo-terminal and have
	>   a convenient script that loops to avoid link disconnections.

	>   Question: Is there a more direct way to represent 'peripheral' or other
	>   kinds of icons without them being a configured system? (Oh... dear VCS...!)

	If I understand your question, the answer is yes. Turn on Edit mode in the C3, place the pointer in the background of the C3 and hold down MB3. You can then create a peripheral icon from that menu.

	Regards,
	Dan
999.10	Don't you think is a bug ?	52246::ALBERTO	Alberto Yanguas. S.I. Spain	`Mon Oct 30 1995 12:42`	45
	>>a) There are no log files (*.EVENTS, *.LOG, *.TIMES) for a configured system.
	>>   This may happen if the files are deleted or moved to another directory,
	>>   or when a console link is never established for a newly configured system
	>>   (log files are not created until a console link is initially established).
	>
	>>b) There is a system with a telnet connection (I never tested with LAT) whose
	>>   terminal server is unknown (i.e. not in /etc/hosts).
	>
	> As you have correctly surmised, the files aren't created until we have
	> successfully connected to a node at least once. Since item "b"
	> indicates that you had intentionally configured a system that could
	> never be connected to, it caused the symptom in item "a". The
	> point is that item "a" is the *real* problem and this is not the way it's
	> supposed to work. It's broken!
	>

	OK, but imagine that you add a new system with valid server connection data, then you do a console reconfiguration, but the server is unreachable. Until the server is reachable you have situation a), and there is a probability that a console monitor user gets a core dump and leaves the product in a not very reliable state until a full restart of the daemons.

	Also imagine that you decide to change the location of the log files, re-edit the configuration file and restart PCM without moving the previous log files. Then all 'unreachable' console connections will have missing log files.

	I think the above is 'the way some people are supposed to work' (at least me), although these situations do not happen every day.

	I think that the console monitor program should not core dump when there are missing log files. If a patch to avoid this core dump is not going to be worked on, at least some kind of 'restrictions or known bugs' documentation could be produced, to avoid 'frustration' for other people who might do the same things.

	(It took almost one month for this to be resolved, and I spent many days testing for this problem, and upgrading from UNIX V3.2B to V3.2C and reinstalling products, because 'this was mandatory' according to my local CSC support in order to be able to escalate the problem...!)

	Regards
	Alberto
999.11	<	29067::BUTTERWORTH	Gun Control is a steady hand.	`Mon Oct 30 1995 17:03`	7
	Alberto, I think you misunderstood me. I said "it's broken" in -2 which means it is a bug. Regards, dan
999.12		52246::ALBERTO	Alberto Yanguas. S.I. Spain	`Tue Oct 31 1995 14:37`	6
	Sorry, Dan. I really misunderstood you. Thank you for your support. Regards Alberto