
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

1226.0. "Alarm notification stops for domain" by MFRNW1::SCHUSTER (Karl Schuster @MFR Network Services) Wed Jul 10 1991 06:48

I run BMS V1.1 and SNMP AM V1.0 on VMS V5.4, with the Alarms patch for BMS
installed.

I have set up 2 independent domains (with no parent-child relation). The
first domain is for DECnet, with about 40 rules at a 5-minute polling
interval; the second domain is for SNMP, with about 20 rules at a 1-minute
polling interval. Polling runs in batch, with 2 independent batch processes
(1 for each domain). The processes are restarted every morning.

Alarm notifications on the map work fine for about 2-3 days, and then I
get these error messages in the Iconic Maps (which are always active):

	Notification being stopped for Domain ...
	unexpected condition returned to notification FM
	exception encountered - event manager reported a lost event

( The DECnet event process is not active ).

The messages appear again as soon as I restart the Maps.

After a system reboot, alarming works again for about 2-3 days.

Is this a bug, or a problem with Quotas, Sysgen Parameters, ... ?

Regards,
Karl Schuster

T.R	Title	User	Personal Name	Date	Lines
1226.1	bug in 1.1 kit	TOOK::CALLANDER	Jill Callander DTN 226-5316	Fri Jul 12 1991 09:11	12
	We recently found a bug in the notification system that occurs when
	the event manager gets an event overflow condition. The AMs and FMs
	use different internal codes for this "condition", which causes
	problems for the PMs.

	The only thing that concerns me about your seeing it is that you
	didn't mention getting a large number of events. Do you have any
	guesstimate of how many alarms have fired (in total) since the PM
	was started?

	Also, could you try starting the FCL and leaving it up for a few days
	with the two NOTIFY DOMAIN xxx commands running, and see what happens
	there as well (do they die at the same time if started at the same
	time)?

	Thanks for the input; more when we know something.
1226.2	more info .....	MFRNW1::SCHUSTER	Karl Schuster @MFR Network Services	Mon Jul 15 1991 09:42	14
	Just some more detailed info about the problem:

	1. We don't use DECnet event logging at all, only synchronous polling
	   with alarm rules.
	2. We get a lot of alarms fired, because some components are often
	   offline (in total we have > 10000 alarms fired within 3 days).
	3. Today another error occurred:

		notification being stopped for domain ...
		unknown DNS error

	Karl
1226.3	question on batch	TOOK::CALLANDER	Jill Callander DTN 226-5316	Mon Jul 15 1991 15:36	32
	When a process makes a request of the event manager, that request
	remains active until it is explicitly terminated. Killing a batch job
	leaves the request active with no recipient for the data, which
	causes the manager to eventually run out of memory and terminate. To
	stop the event manager you must terminate ALL mcc sessions that are
	running, including those in batch.

	During regular operation of mcc, please do not use "stop..." or
	"delete/ent..." commands to terminate mcc batch jobs, especially if
	they are event sources or requestors. I am not certain that this is
	what you are doing, but it will cause the overall results that you
	are seeing (this has been discussed at length in a number of other
	notes in the conference).

	If you want to run an alarms batch job and restart it every morning,
	try something like:

	.COM
		enable rules
		show mcc 0 all char, at start=(when you want to terminate procedure)
		disable rules
		exit mcc

	Then run this .COM from inside a DCL procedure that just keeps
	looping around running it (an example of a procedure to do this can,
	I believe, be found using keyword TOOLS).

	Could you try killing all mcc sessions (including batch jobs), then
	restarting your rules and event manager, and see if your problem goes
	away?

	thanks
	jill
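To illustrate the failure mode described in 1226.3, here is a toy Python model; it is not DECmcc code. The `EventManager` class, its method names, the `alarms_batch` requester, and the capacity figure are all hypothetical, chosen only to show why a request that is never terminated eventually exhausts the manager, while one that is terminated cleanly does not.

```python
from collections import deque


class EventManagerOutOfMemory(Exception):
    """Raised when buffered, undelivered events exceed the capacity."""


class EventManager:
    """Toy model: each outstanding request buffers events until drained."""

    def __init__(self, capacity):
        self.capacity = capacity   # hypothetical memory limit, in events
        self.requests = {}         # requester name -> pending events

    def request_events(self, requester):
        self.requests[requester] = deque()

    def terminate_request(self, requester):
        # What a clean shutdown (e.g. disabling rules, then exiting) achieves.
        self.requests.pop(requester, None)

    def post(self, event):
        # Every active request gets a copy, whether or not anyone is reading.
        for pending in self.requests.values():
            pending.append(event)
        if sum(len(q) for q in self.requests.values()) > self.capacity:
            raise EventManagerOutOfMemory("lost event: buffer exhausted")

    def drain(self, requester):
        pending = self.requests[requester]
        events = list(pending)
        pending.clear()
        return events


def simulate(clean_shutdown, events_after_stop=100, capacity=50):
    """Stop the requester cleanly or abruptly, then keep posting events."""
    em = EventManager(capacity)
    em.request_events("alarms_batch")
    em.post("alarm fired")
    em.drain("alarms_batch")       # the requester is alive and reading
    if clean_shutdown:
        em.terminate_request("alarms_batch")
    # After an abrupt kill the request stays registered but nobody drains it.
    try:
        for i in range(events_after_stop):
            em.post(f"alarm {i}")
    except EventManagerOutOfMemory:
        return "event manager failed"
    return "event manager healthy"
```

Calling `simulate(clean_shutdown=False)` models a batch job killed with its request still outstanding; `simulate(clean_shutdown=True)` models one that terminates its request first.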
1226.4	more info	MFRNW1::SCHUSTER	Karl Schuster @MFR Network Services	Thu Jul 18 1991 09:30	17
	Normally we do not $stop/id or $stop/entry the alarm batch job, but
	it might have happened a few times during testing.

	The alarm batch job looks like this:

		enable rules
		show mcc 0 alarm rule * all counter, in domain xyz, to file=xyz.log, at every 01:00:00 until 23:00:00

	The job is restarted for the next day at 05:00:00. The IMPM windows
	remain permanently open.

	What we did NOT do in the .com file is:

		disable rules

	I will change the .com file and add "disable rules", and also
	terminate the Map windows before starting and stopping alarm
	processes.

	More info next week.
	Karl
1226.5	Disable is Automatic with normal shut-down	TOOK::ORENSTEIN		Thu Jul 18 1991 10:28	12
	Is there a difference between MCC executing the command EXIT and MCC
	running out of commands to execute (the end of the com file)? I don't
	believe so. If there is no difference, then the DISABLE RULES is done
	automatically.

	The MCC exit handler automatically sends an Alert to all threads;
	this includes the threads in ALARMS that have the outstanding
	GETEVENT. Once a thread is notified, it becomes disabled (and hence
	the rule becomes disabled).

	aud...
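The exit-handler behaviour described in 1226.5 can be pictured with a small Python sketch. This is only an analogy using Python threads, not the MCC implementation: the `Rule` class, `rule_thread`, and `run_and_exit` are hypothetical names, and a `threading.Event` stands in for the Alert sent to each thread that is waiting on its outstanding event request.

```python
import queue
import threading


class Rule:
    """Hypothetical alarm rule whose thread waits for events."""

    def __init__(self, name):
        self.name = name
        self.enabled = True
        self.events = queue.Queue()


def rule_thread(rule, alert):
    # Stand-in for a thread with an outstanding event request: it waits
    # for events until it is alerted, then marks its rule disabled.
    while not alert.is_set():
        try:
            rule.events.get(timeout=0.05)
        except queue.Empty:
            continue
    rule.enabled = False


def run_and_exit(rule_names):
    """Start one thread per rule, then do what the exit handler would do."""
    alert = threading.Event()
    rules = [Rule(name) for name in rule_names]
    threads = [
        threading.Thread(target=rule_thread, args=(rule, alert))
        for rule in rules
    ]
    for thread in threads:
        thread.start()
    alert.set()                    # exit handler: alert every thread
    for thread in threads:
        thread.join()              # each thread notices the alert and stops
    return rules
```

In this model a normal exit leaves every rule disabled without an explicit disable step, which is the behaviour 1226.5 expects from a normal shut-down.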
1226.6	o.k.	MFRNW1::SCHUSTER	Karl Schuster @MFR Network Services	Mon Jul 29 1991 04:22	4
	Now that I DISABLE the rules before the image exit, the error has not
	occurred for a week and a half. It seems to be the solution.

	Thanks,
	Karl
1226.7	another : Notification stopped ...	HLRG02::SYSTEM	Incredible but . . . not true .	Wed Jul 22 1992 08:55	24
	Hi,

	A customer has:

	- VMS V5.4-3
	- DECmcc V1.1.0

	He creates some alarm rules and enables them. When he enables
	notification, he gets the following error:

		Notification being stopped for domain ...
		%MCC-E-UNSUPP-OP, unsupported operation.

	And in the window where he starts mcc he gets the following:

		%SYSTEM-F-ACCVIO access violation, reason mask=00, virtual address=0000000C, PC=000F0980, PSL=03C00004

	Can anybody give me some help?

	Regards,
	/-/ Henk.
1226.8	any DNA5 entities in the domain	TOOK::CALLANDER	MCC = My Constant Companion	Fri Jul 24 1992 12:48	6
	There were a few problems with DNA5 handling... without getting into
	the details, first let's find out if they were using any. If not, I
	would appreciate a list of the contents of the domain.

	thanks
	jill