Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

1153.0. "Memory leak problem" by CCIIS1::ROGGEBAND (_ Philippe _) Mon Jun 17 1991 09:39

    We are running into some problems with the home-brewed PING AM when
    setting up alarm rules. There seems to be a memory leak somewhere, and
    I can't work out where it comes from.
    
    The AM was written using the design framework.
    
    The way the AM works is :
    
    From the main thread :
    
    - Create a thread to send an ICMP echo request, and "read" the echo
    reply (which may never come back)... After every I/O call :
    
    Test the status, check the IOSB and check mcc_thread_test_alert, and if
    necessary, clean up (deassign channel) and exit.
    
    - Create a thread which sets up a timer....
    
    - mcc_thread_join_Any
    
    Depending on which thread finishes  : mcc_thread_send_alert to the
    other one (so it knows it can stop working), and mcc_thread_delete it !
    Is there anything else I should do to insure 
    
    When I perform "SHOW ALL STATUS" op's, I don't run into problems, but
    perhaps I haven't done enough of those. When I set up an alarm rule,
    the following may happen :
    
    - C warns me about a memory allocation failure (happened once or twice)
    
    - If I try to look at the rules status :
    
    MCC> show mcc 0 alarms rule * all status, in domain test_icmp
    
    MCC 0 ALARMS RULE ping_divers_01 AT 17-JUN-1991 13:57:53 Status
    
    Examination of attributes shows: 
    			State = 
    %NONAME-E-NOMSG, Message number 0326D17A 
    %NONAME-E-NOMSG, Message number 0326D162 
    
    MCC> show mcc 0 alarms rule * all status, in domain test_icmp 
    %NONAME-E-NOMSG, Message number 0326D17A 
    %LIB-F-INSVIRMEM, insufficient virtual memory
    %NONAME-F-NOMSG, Message number 0326E44C
    
    - Most common is this one :
    
    Alarms internal error (Rule Driver): Could not delete callargs->p_handle :  52876210
    
    - I also got something like "failed to encode ICMP package" a few times.
    
    I noticed that when I enable notification, the last notification I
    receive just before all hell breaks loose is about 3 to 4 minutes old.
    Could it be that I've jammed up the event manager (if it's used between
    the Alarms FM and notification services...)?
    
    I noticed the design framework uses calloc to obtain memory but
    releases it via an mcc_free routine. Why is this? Is there a way I can
    monitor internal memory usage? Does the mcc_free routine return a
    status?
    
    Any directions to look into will be greatly appreciated.
    
    Amicalement,
    
    Philippe.
    
    
T.R  Title  User  Personal Name  Date  Lines
1153.1  Design Framework use of calloc  NANOVX::ROBERTS  Keith Roberts - DECmcc Toolkit Team  Mon Jun 17 1991 09:56  22
As far as the Design Framework goes, it should always be using:

	mcc_calloc  &  mcc_free

If it is using only 'calloc', then it's a bug.  I'll look over the code to
make sure - and QAR the Toolkit if I find any problems.

The 'mcc_free' routine does not return any status.  8(

If you are developing on VMS, you can use 'LIB$SHOW_VM' to determine
the amount of memory allocated at any given time -- but you can get odd
results when threads are running (because memory can be allocated/freed
by other threads of execution).
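
Off VMS, or when you only care about your own module's allocations, a rough
alternative is to count blocks yourself. The following is a minimal sketch in
present-day C11, not toolkit code: trk_calloc, trk_free and trk_live_blocks
are hypothetical names, and they wrap the plain C library calloc/free rather
than mcc_calloc/mcc_free. The counter is atomic so that concurrent threads do
not corrupt it.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: these are NOT part of the DECmcc toolkit. */

/* Number of blocks handed out by trk_calloc and not yet released. */
static atomic_long trk_live = 0;

/* Allocate zeroed memory and count the block if the allocation succeeded. */
void *trk_calloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p != NULL)
        atomic_fetch_add(&trk_live, 1);
    return p;
}

/* Release a block obtained from trk_calloc; NULL is ignored, as with free. */
void trk_free(void *p)
{
    if (p == NULL)
        return;
    free(p);
    atomic_fetch_sub(&trk_live, 1);
}

/* Blocks currently outstanding; a steadily growing value suggests a leak. */
long trk_live_blocks(void)
{
    return atomic_load(&trk_live);
}
```

Logging trk_live_blocks() before and after each directive gives a quick
indication of whether a request leaves blocks behind. It says nothing about
thread stacks or context blocks, which are not allocated through these
wrappers.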

Are you deleting your threads?  If you forget this, the thread context blocks
and the thread stacks are left around; the stacks can be huge!

I think the 'Alarms internal error (Rule Driver)...' has been corrected
for the next release of Alarms.

Keith
DECmcc Toolkit
1153.2  Memory ok, but notif overruns  CCIIS1::ROGGEBAND  _ Philippe _  Mon Jun 17 1991 13:06  19
    Thanks for the quick response, Keith,
    
    I did have a loose thread hanging around. My understanding was that
    threads were deleted when they terminated ? Adding in a systematic
    mcc_thread_delete solved the problem.
    
    I have also replaced the "calloc" calls with mcc_calloc for esthetic
    purposes.
    
    I still have an occasional problem with notifications: it seems notif
    services run into some kind of data overrun if too many rules keep
    firing at short intervals (~ 60 - 90 per minute on a single entity).
    The message is quite clear and says that notification is aborted and
    terminated for my domain. 
    
    Is there a known limit to the number of notifications which can be
    handled in a given time on a suitably configured VS3100 ?
    
    Philippe.
1153.3  Not sure of any Notification/Events limits  NANOVX::ROBERTS  Keith Roberts - DECmcc Toolkit Team  Mon Jun 17 1991 14:05  23
I don't know any hard-and-fast rules for the number of events per minute
on a vs3100 - but the Event Manager Pool is a fixed size and can become full.

The size of the Event Reports can limit the number of events which fit into
the Event Pool.  When I was working on the Alarms Team, I found I got Event
Lost messages at about 300 - 400 Events 'put' but not 'got'.  This was 6
months ago - and things could have changed.

With regards to deleting threads - there are a few rules.

>>> If you have a child thread with no parent (i.e., a detached thread), the
    thread must delete itself when it is done.  It must get its own thread
    id with 'mcc_thread_get_self' then call 'mcc_thread_delete'.

>>> When a parent thread is joining with a child thread, the child thread must
    not delete itself - the parent must delete the child when the join
    completes.

    If the child deleted itself then returned, the parent would receive an
    'existence' error for the thread-status argument of the join routine.
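
For readers who know POSIX threads better than the MCC thread routines, the
same two ownership rules can be sketched with pthreads. This is only an
analogy, not DECmcc code: run_joined, run_detached and the two child functions
are hypothetical names, and one difference matters. In current POSIX,
pthread_join itself reclaims the child, whereas the rule above says the MCC
parent still has to call 'mcc_thread_delete' after the join completes.

```c
#include <pthread.h>
#include <stddef.h>

/* Child that a parent will join: it does its work and simply returns. */
static void *joined_child(void *arg)
{
    int *result = arg;
    *result = 42;                 /* stand-in for real work */
    return NULL;
}

/* Child that nobody will join: it gives up its own thread resources. */
static void *detached_child(void *arg)
{
    (void)arg;
    pthread_detach(pthread_self());   /* the child looks up its own id */
    return NULL;
}

/* Joined case: the parent owns the cleanup. Returns 0 on success. */
int run_joined(int *result)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, joined_child, result);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);   /* waits for and reclaims the child */
}

/* Detached case: start the thread and never join it. Returns 0 on success. */
int run_detached(void)
{
    pthread_t tid;
    return pthread_create(&tid, NULL, detached_child, NULL);
}
```

Mixing the two cases is the failure described above: joining a thread that has
already disposed of itself is an error in either API.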

Keith Roberts
DECmcc Toolkit Team
1153.4  bug in 1.1 notification FM response code  TOOK::CALLANDER  Jill Callander DTN 226-5316  Mon Jun 17 1991 17:34  11
We recently found a bug in the notification FM/system et al. The notification
FM was expecting the no_more response to be one code, while most of the other
modules used a different code. The code we got back was therefore unexpected,
hence the bad error message and fatal handling.

We also found another problem where we were returning the wrong CVR in an
overrun condition (exception instead of response), and that also caused a
bad error message and notification to abort. These will both be fixed in
the first 1.2 release.

jill
1153.5  CCIIS1::ROGGEBAND  _ Philippe _  Tue Jun 18 1991 04:47  12
    Thanks for the responses, Jill & Keith,
    
    Keith, you mention something about the alarms rule driver being fixed
    in the next release : 
    
    > Alarms internal error (Rule Driver): Could not delete callargs->p_handle :  52876210
    
    Can you tell me what it's due to, as it seems to resurface... Is there
    a workaround ? Or is it something in my code ?
    
    PhR.
1153.6  Alarms Internal Error (Rule Driver)  NANOVX::ROBERTS  Keith Roberts - DECmcc Toolkit Team  Tue Jun 18 1991 09:55  4
The 'Alarms Internal Error (Rule Driver)' has something to do with Events
(I think) - I'll ask the Alarms PL to answer this one.

Keith
1153.7  Sorry about that...  WAKEME::ANIL  Tue Jun 18 1991 11:46  30
  Hi Philippe, 

  RE:

 > Alarms internal error (Rule Driver): Could not delete callargs->p_handle :  52876210
  	
 If you look into MCC_MSG.H you will find that the number 52876210 translates to
 MCC_S_INV_HANDLE_STATE. In the rule driver I had some printfs that write the
 above message to the screen when the rule driver detects a bug. What this means
 is that you are witnessing a bug. Ed Hronik, who was working on Alarms for a short
 time, was able to trace this bug to the Event Manager.
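
 As an aside, the number can be related to the other messages in .0 without
 opening the header. Read as decimal, 52876210 is hex 0326D3B2, which shares
 its upper bits with the 0326D17A and 0326E44C codes shown there. The sketch
 below splits a value using the standard VMS condition value layout; the
 function names are hypothetical and nothing here comes from MCC_MSG.H.

```c
#include <stdint.h>

/* Field layout of a 32-bit VMS condition value:
 *   bits 0-2    severity (0 warning, 1 success, 2 error,
 *               3 informational, 4 severe/fatal)
 *   bits 3-15   message number
 *   bits 16-27  facility number
 * The function names below are illustrative only.
 */
unsigned cond_severity(uint32_t v) { return v & 0x7u; }
unsigned cond_msg_no(uint32_t v)   { return (v >> 3) & 0x1FFFu; }
unsigned cond_facility(uint32_t v) { return (v >> 16) & 0xFFFu; }

/* One-letter severity code as it appears in %FAC-x-IDENT messages. */
char cond_severity_letter(uint32_t v)
{
    static const char letters[] = "WSEIF";
    unsigned s = cond_severity(v);
    return s < 5 ? letters[s] : '?';
}
```

 Applied to the values in this topic, the severity letters agree with what was
 displayed: E for 0326D17A and F for 0326E44C. The handle status 52876210
 decodes to the same facility number with severity E.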
  
 When a thread termination alert is issued as a side effect of the EXIT command,
 the Event Manager would delete its data structure without waiting for Alarms to
 send a handle cancel request. Alarms would then try to send the request to cancel
 the call, and the Event Manager would indicate that the handle is no longer valid!
 
	
 Now from Alarms' perspective the handle is valid! We found the source of this bug
 after V1.1 shipped. The next release will have the bug fix.

 Yes, it was my mistake not to put the printf statements under debug logical
 control. I don't have any workaround for this problem. Other than seeing
 the above error message, are you experiencing any other side effects?

 Thanks,

 - Anil Navkal