
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

1153.0. "Memory leak problem" by CCIIS1::ROGGEBAND (_ Philippe _) Mon Jun 17 1991 09:39

    We are running into some problems with the home-brewed PING AM when
    setting up alarm rules. There seems to be a memory leak somewhere, and
    I can't work out where it comes from.
    
    The AM was written using the design framework.
    
    The way the AM works is :
    
    From the main thread :
    
    - Create a thread to send an ICMP echo request, and "read" the echo
    reply (which may never come back)... After every I/O call :
    
    Test the status, check the IOSB and check mcc_thread_test_alert, and if
    necessary, clean up (deassign channel) and exit.
    
    - Create a thread which sets up a timer....
    
    - mcc_thread_join_Any
    
    Depending on which thread finishes  : mcc_thread_send_alert to the
    other one (so it knows it can stop working), and mcc_thread_delete it !
    Is there anything else I should do to insure 
    
    When I perform "SHOW ALL STATUS" op's, I don't run into problems, but
    perhaps I haven't done enough of those. When I set up an alarm rule,
    the following may happen :
    
    - C warns me about a memory allocation failure (happened once or twice)
    
    - If I try to look at the rules status :
    
    MCC> show mcc 0 alarms rule * all status, in domain test_icmp
    
    MCC 0 ALARMS RULE ping_divers_01 AT 17-JUN-1991 13:57:53 Status
    
    Examination of attributes shows: 
    			State = 
    %NONAME-E-NOMSG, Message number 0326D17A
    %NONAME-E-NOMSG, Message number 0326D162 
    
    MCC> show mcc 0 alarms rule * all status, in domain test_icmp 
    %NONAME-E-NOMSG, Message number 0326D17A
    %LIB-F-INSVIRMEM, insufficient virtual memory
    %NONAME-F-NOMSG, Message number 0326E44C
    
    - Most common is this one :
    
    Alarms internal error (Rule Driver): Could not delete callargs->p_hand
    le :  52876210
    
    - I also got something like "failed to encode ICMP package" a few times.
    
    I noticed that when I enable notification, the last notification I
    receive just before all hell breaks loose is about 3 ~ 4 minutes old.
    Could it be that I've jammed up the event manager (if it's used between
    the Alarms FM and notification services...)
    
    I noticed the design framework uses calloc to obtain memory but
    releases it via an mcc_free routine. Why is this ? Is there a way I can
    monitor internal memory usage ? Does the mcc_free routine return a
    status ?
    
    Any directions to look into will be greatly appreciated.
    
    Amicalement,
    
    Philippe.
    
    

T.R	Title	User	Personal Name	Date	Lines
1153.1	Design Framework use of calloc'	NANOVX::ROBERTS	Keith Roberts - DECmcc Toolkit Team	`Mon Jun 17 1991 09:56`	22
    As far as the Design Framework goes, it should always be using
    mcc_calloc and mcc_free. If it is using only 'calloc', then it's a bug.
    I'll look over the code to make sure, and QAR the Toolkit if I find any
    problems.

    The 'mcc_free' routine does not return any status. 8(

    If you are developing on VMS, you can use 'LIB$SHOW_VM' to determine
    the amount of memory allocated at any given time, but you can get odd
    results when threads are running (because memory can be allocated or
    freed by other threads of execution).

    Are you deleting your threads? If you forget this, the thread context
    blocks and the thread stacks are left around; the stacks can be huge!

    I think the 'Alarms internal error (Rule Driver)...' has been corrected
    for the next release of Alarms.

    Keith
    DECmcc Toolkit
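
On the question in 1153.0 about monitoring memory usage: besides 'LIB$SHOW_VM', one portable way to spot a leak is to route allocations through a counting wrapper and check how many blocks are still outstanding. The following is only a minimal sketch in C under that assumption. The names tracked_calloc, tracked_free and tracked_outstanding are hypothetical and are not DECmcc routines; the sketch does not model mcc_calloc or mcc_free, and the counter is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helpers, not part of the DECmcc toolkit. They only count
 * blocks handed out and returned through this pair of wrappers.
 * The counter has no lock, so the sketch is valid for a single
 * allocating thread only; a threaded AM would need a mutex around it. */
static long live_blocks = 0;

void *tracked_calloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p != NULL)
        live_blocks++;          /* count successful allocations only */
    return p;
}

void tracked_free(void *p)
{
    if (p == NULL)
        return;                 /* freeing NULL is a no-op, as with free() */
    assert(live_blocks > 0);    /* more frees than allocations is a bug */
    live_blocks--;
    free(p);
}

long tracked_outstanding(void)
{
    return live_blocks;
}
```

Calling tracked_outstanding() before and after one complete request (for example one rule evaluation) should give the same number; a count that keeps growing points at the allocation that is never released.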
1153.2	Memory ok , but notif overruns	CCIIS1::ROGGEBAND	_ Philippe _	`Mon Jun 17 1991 13:06`	19
    Thanks for the quick response, Keith,

    I did have a loose thread hanging around. My understanding was that
    threads were deleted when they terminated ? Adding in a systematic
    mcc_thread_delete solved the problem. I have also replaced the "calloc"
    calls with mcc_calloc for esthetic purposes.

    I still have occasional problems with notifications: it seems notif
    services run into some kind of data overrun if too many rules keep
    firing at short intervals (~ 60 - 90 per minute on a single entity).
    The message is quite clear and says that notification is aborted and
    terminated for my domain.

    Is there a known limit to the number of notifications which can be
    handled in a given time on a suitably configured VS3100 ?

    Philippe.
1153.3	Not sure of any Notification/Events limits	NANOVX::ROBERTS	Keith Roberts - DECmcc Toolkit Team	`Mon Jun 17 1991 14:05`	23
    I don't know any hard-and-fast rules for the number of events per
    minute on a VS3100, but the Event Manager Pool is a fixed size and can
    become full. The size of the Event Reports can limit the number of
    events which fit into the Event Pool.

    When I was working on the Alarms Team, I found I got Event Lost
    messages at about 300 - 400 Events 'put' but not 'got'. This was 6
    months ago, and things could have changed.

    With regard to deleting threads, there are a few rules.

    >>> If you have a child thread with no parent (i.e., a detached
        thread), the thread must delete itself when it is done. It must
        get its own thread id with 'mcc_thread_get_self', then call
        'mcc_thread_delete'.

    >>> When a parent thread is joining with a child thread, the child
        thread must not delete itself; the parent must delete the child
        when the join completes. If the child deleted itself and then
        returned, the parent would receive an 'existence' error for the
        thread-status argument of the join routine.

    Keith Roberts
    DECmcc Toolkit Team
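
The "worker versus timer" pattern from 1153.0, combined with the rule that the parent cleans up joined children, can be sketched with POSIX threads. This is an analogy only, not the DECmcc thread API: a condition variable stands in for mcc_thread_join_any, a stop flag stands in for mcc_thread_send_alert, and pthread_join is where the parent reclaims each child. All names (struct race, run_race, and so on) are hypothetical, and the echo I/O is replaced by a timed wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

enum { NONE = 0, WORKER = 1, TIMER = 2 };

/* Hypothetical shared state; none of these names come from DECmcc. */
struct race {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int first_done;       /* NONE until one thread finishes */
    int stop;             /* set by the parent: stand-in for an alert */
    int reply_after_ms;   /* negative: the reply never arrives */
    int timeout_ms;
};

/* Called with r->lock held. Sleeps until stop is set or ms elapse. */
static void wait_for_stop(struct race *r, int ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += ms / 1000;
    deadline.tv_nsec += (long)(ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    while (!r->stop)
        if (pthread_cond_timedwait(&r->changed, &r->lock, &deadline) != 0)
            break;        /* deadline reached (or an error): give up */
}

/* Called with r->lock held. Records the winner and wakes the parent. */
static void finish(struct race *r, int who)
{
    if (r->first_done == NONE)
        r->first_done = who;
    pthread_cond_broadcast(&r->changed);
}

static void *worker(void *arg)   /* stand-in for the echo request/reply */
{
    struct race *r = arg;
    pthread_mutex_lock(&r->lock);
    if (r->reply_after_ms < 0)
        while (!r->stop)
            pthread_cond_wait(&r->changed, &r->lock);
    else
        wait_for_stop(r, r->reply_after_ms);
    finish(r, WORKER);
    pthread_mutex_unlock(&r->lock);
    return NULL;
}

static void *timer(void *arg)
{
    struct race *r = arg;
    pthread_mutex_lock(&r->lock);
    wait_for_stop(r, r->timeout_ms);
    finish(r, TIMER);
    pthread_mutex_unlock(&r->lock);
    return NULL;
}

static void tell_to_stop(struct race *r)
{
    pthread_mutex_lock(&r->lock);
    r->stop = 1;
    pthread_cond_broadcast(&r->changed);
    pthread_mutex_unlock(&r->lock);
}

/* Returns WORKER or TIMER, whichever finished first, or -1 on failure. */
int run_race(int reply_after_ms, int timeout_ms)
{
    struct race r;
    pthread_t w, t;
    int started;

    pthread_mutex_init(&r.lock, NULL);
    pthread_cond_init(&r.changed, NULL);
    r.first_done = NONE;
    r.stop = 0;
    r.reply_after_ms = reply_after_ms;
    r.timeout_ms = timeout_ms;

    started = (pthread_create(&w, NULL, worker, &r) == 0);
    if (started && pthread_create(&t, NULL, timer, &r) == 0) {
        pthread_mutex_lock(&r.lock);
        while (r.first_done == NONE)      /* the "join any" step */
            pthread_cond_wait(&r.changed, &r.lock);
        pthread_mutex_unlock(&r.lock);
        tell_to_stop(&r);                 /* alert the loser */
        pthread_join(t, NULL);            /* parent reclaims the timer */
    } else {
        r.first_done = -1;
        if (started)
            tell_to_stop(&r);
    }
    if (started)
        pthread_join(w, NULL);            /* parent reclaims the worker */
    pthread_cond_destroy(&r.changed);
    pthread_mutex_destroy(&r.lock);
    return r.first_done;
}
```

For example, run_race(-1, 50) models an echo reply that never comes back with a 50 ms timer, and returns TIMER. Both children are joined on every path, which is the POSIX counterpart of the "loose thread" fix reported in 1153.2.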
1153.4	bug in 1.1 notification FM response code	TOOK::CALLANDER	Jill Callander DTN 226-5316	`Mon Jun 17 1991 17:34`	11
    We recently found a bug in the notification FM/system et al. The
    notification FM was expecting the no_more response to be one code,
    while most of the other modules used a different code. The code we got
    back was therefore unexpected, hence the bad error message and fatal
    handling.

    We also found another problem where we were returning the wrong CVR in
    an overrun condition (exception instead of response), and that also
    caused a bad error message and notification to abort.

    These will both be fixed in the first 1.2 release.

    jill
1153.5		CCIIS1::ROGGEBAND	_ Philippe _	`Tue Jun 18 1991 04:47`	12
    Thanks for the responses, Jill & Keith,

    Keith, you mention something about the alarms rule driver being fixed
    in the next release :

        Alarms internal error (Rule Driver): Could not delete callargs->p_hand
        le : 52876210

    Can you tell me what it's due to, as it seems to resurface... Is there
    a workaround ? Or is it something in my code ?

    PhR.
1153.6	Alarms Internal Error (Rule Driver)	NANOVX::ROBERTS	Keith Roberts - DECmcc Toolkit Team	`Tue Jun 18 1991 09:55`	4
    The 'Alarms Internal Error (Rule Driver)' has something to do with
    Events (I think) - I'll ask the Alarms PL to answer this one.

    Keith
1153.7	Sorry about that...	WAKEME::ANIL		`Tue Jun 18 1991 11:46`	30
    Hi Philippe,

    RE:
        Alarms internal error (Rule Driver): Could not delete
        callargs->p_handle : 52876210

    If you look into MCC_MSG.H you will find that the number 52876210
    translates to MCC_S_INV_HANDLE_STATE. In the rule driver I had some
    printfs that write the above message to the screen when the rule
    driver detects a bug. What this means is that you are witnessing a bug.

    Ed Hronik, who was working on alarms for a short time, was able to
    trace this bug to the event manager. When a thread term alert is
    issued as a side effect of issuing the EXIT command, the event manager
    would delete its data structure without waiting for Alarms to send a
    handle cancel request. Alarms would then try to send the request to
    cancel the call, and the event manager would indicate that the handle
    is not valid any more! Now from Alarms' perspective the handle is
    valid!

    We found the source of this bug after V1.1 shipped. The next release
    will have the bug fix. Yes, it was my mistake not to put the printf
    statements under debug logical control.

    I don't have any workaround for this problem. Other than seeing the
    above error message, are you experiencing any other side effects?

    Thanks,
    - Anil Navkal
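
The decimal status in the rule driver message and the hexadecimal "Message number" values in 1153.0 can be compared by splitting them into fields. The sketch below assumes they are ordinary 32-bit VMS condition values (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27); that assumption is not confirmed anywhere in this thread, and decode_condition and severity_letter are hypothetical helper names.

```c
#include <stdint.h>

/* Assumed layout of a 32-bit VMS condition value:
 *   bits 0-2    severity
 *   bits 3-15   message number
 *   bits 16-27  facility number
 */
struct cond_fields {
    unsigned severity;
    unsigned message;
    unsigned facility;
};

struct cond_fields decode_condition(uint32_t value)
{
    struct cond_fields f;
    f.severity = value & 0x7u;
    f.message  = (value >> 3) & 0x1FFFu;
    f.facility = (value >> 16) & 0xFFFu;
    return f;
}

/* Severity codes 0..4 map to the letter shown in a %FAC-x-IDENT prefix. */
char severity_letter(unsigned severity)
{
    static const char letters[] = { 'W', 'S', 'E', 'I', 'F' };
    return severity < 5 ? letters[severity] : '?';
}
```

Under that assumption, 52876210 is hexadecimal 0326D3B2, which decodes to facility 326 (hex) with severity E. That is the same facility as the 0326D17A and 0326E44C message numbers in the base note, whose decoded severities (E and F) match the letters printed in the %NONAME-E-NOMSG and %NONAME-F-NOMSG lines.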