Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

3591.0. "ILV decoding causes memory exception" by CCIIS1::ROGGEBAND (_ Philippe _) Wed Aug 19 1992 10:47

    Hi,

    I'm encountering the following problem when decoding an ILV buffer :

    Context : DECmcc T1.2.7 , Ultrix 

    The attribute I'm decoding is passed in an attribList as an argument of
    a SET directive. It is defined in MSL as a SET OF RECORD; the RECORD
    contains two Unsigned Integer fields.

    The AM copies the in_p buffer to a message queue and passes it to an
    external application which uses callable MCC to decode the buffer.

    This is the ILV dump of the In_P buffer :

    [ 0 ] (
        [ 1 ] (
            [ 6039 ] 5
                [ 1 ]  34  --  4
                [ 3 ]  00
                [ 4 ] (
                    [ 1 ] (
                        [ 1 ]  02
                        [ 2 ]  02
                    )
                )
            )
        )
    )


    This dump matches the MSL codes and values I specified in the SET
    command.

    In the application which receives the buffer from the AM, I build an
    MCC descriptor pointing to the receiving buffer : Here is the dump of
    the descriptor. 

    mcc_b_flags     = 0
    mcc_b_class     = 1
    mcc_b_ver       = 1
    mcc_a_link      = 0
    mcc_l_id        = 0
    mcc_w_maxstrlen = 34
    mcc_w_curlen    = 34
    mcc_b_dtype     = 0
    mcc_l_dt        = 127
    mcc_a_pointer   = 268744206

    
    Symptom :

    I use the following calls :

    mcc_ilv_get_param_begin 	/* Open ILV buffer */

    mcc_ilv_get_id 		/* Position context on Attrb List */

    mcc_ilv_get_cons_begin	/* Open Attrib list with ILV Mode
    				   set 	to MCC_K_ILV_LIST_VALUE */

    mcc_ilv_fnd_id ( with IDcode = 6039) /* Position context on
    					attribute wanted */

    NOTE : all above calls return status = MCC_S_NORMAL

    mcc_ilv_get_cons_begin	/* Open SET OF construction with
    				    ILV mode set to MCC_K_NATIVE_VALUE */

    this call causes the process to exit with the following message :

    Exception: Invalid memory address (dce / thd)

    I have a hunch that my ILV context block may have been corrupted, but as
    it is an undocumented opaque structure, I'm not sure where to look.
    I tried replacing the call to mcc_ilv_fnd_id with a call to
    mcc_ilv_get_id (as I only have one attribute, I got MCC_S_NORMAL
    with IDcode = 6039), but the same problem happened.

    I also tried inserting a call to mcc_list_get_datatype after the
    mcc_ilv_fnd_id; that causes the process to crash as well.

    I've dumped the buffer (using both ILV_dump and a printf just to check)
    once the external application got it, and it is strictly identical to
    the buffer the AM received. 

    Any ideas on where to look? 

    Regards,

    PhR.

T.R  Title  User  Personal Name  Date  Lines
3591.1  V1.2 is out...  TOOK::MINTZ  Erik Mintz, dtn 226-5033  Wed Aug 19 1992 12:42  3
Perhaps someone will have some ideas on this, but we don't have
anyone assigned to track problems in the field test code any more.

3591.2  URGENT: Same problem with V1.2  CCIIS1::ROGGEBAND  _ Philippe _  Wed Sep 09 1992 11:36  7
    Hello,
    
    We've now upgraded to V1.2, we still have the problem described in the
    base note. Any idea on where to look, additional traces we could turn
    on etc.. ?
    
    PhR.

3591.3  I'll enter a qar  TOOK::KOHLS  Ruth Kohls  Wed Sep 09 1992 12:57  26
Hello.

I will enter a qar and take a look at it, but I have other urgent work ahead 
of it.  In the meantime, please confirm that the 5 after the [ 6039 ]
is a misprint.  It should be a (.  

Regards,

Ruth Kohls


    [ 0 ] (
        [ 1 ] (
            [ 6039 ] 5
                [ 1 ]  34  --  4
                [ 3 ]  00
                [ 4 ] (
                    [ 1 ] (
                        [ 1 ]  02
                        [ 2 ]  02
                    )
                )
            )
        )
    )

3591.4  qar 3415  TOOK::KOHLS  Ruth Kohls  Wed Sep 09 1992 13:05  1
entered as qar 3415 against the exec in mcc_internal. RK

3591.5  CCIIS1::ROGGEBAND  _ Philippe _  Thu Sep 10 1992 06:41  10
    Re: .3:
    
    Yep, it's a typo! On French keyboards, 5 = Shift + '('.
    
    By the way, I've never used QAR, how do I get informed on the status of
    the QAR ?
    
    Thanks for your time,
    
    Philippe.

3591.6  Answer usually posted  TOOK::MINTZ  LKG2-2 near pole X3, cube 6072, dtn 226-5033  Thu Sep 10 1992 08:50  7
When a QAR from a note is answered, the answer is usually cross
posted to the original note.

To check directly you would need a QAR system account  (don't bother
at the moment; we may be switching QAR systems soon).

-- Erik

3591.7  a thought on the problem  TOOK::KOHLS  Ruth Kohls  Mon Sep 14 1992 11:54  6
Just a thought.  Have you tried increasing the STACK memory that your
application has available?  Those ILV calls need an additional context area,
which is on the stack, for the list mode manipulations.

Ruth K.

3591.8  CCIIS1::ROGGEBAND  _ Philippe _  Fri Sep 18 1992 08:00  9
    Ruth,
    
    I asked my customer to increase the stack size. He set it to the
    maximum value using the "limit" command. I am not an expert on how to
    set stack sizes, is this the way to do it ?
    
    The problem is still there....
    
    PhR.

3591.9  Probably  TOOK::KOHLS  Ruth Kohls  Mon Sep 21 1992 15:54  19
>    I asked my customer to increase the stack size. He set it to the
>    maximum value using the "limit" command. I am not an expert on how to
>    set stack sizes, is this the way to do it ?

I think yes; it's a csh command. Look in the man pages for more. The expert 
I asked said they should use "setrlimit" (in my ancient Ultrix32 manuals, 
these are in volume 2, system calls).  To me, it looks like the difference is 
a system call (setrlimit) vs. a csh command (limit) with similar 
effects but different scope of application.

The scope may be a factor.

My expert also says, the application may have other structures that 
are getting in the way of increasing stack (such as shared memory).
 
I will keep looking and asking.

Ruth K.

3591.10pb exist in "callable" mcc, not DECmcc MMTAEC::WEBERTue Sep 22 1992 10:2822
  
    I have also seen this problem.
    The funny thing is that if the ***ilv_*** instructions are part of a DECmcc
    MM and the Stack size defined for this MM, then the execution is OK.
    The same code, when being linked to "callable mcc" fails.

    As I understand it, when this instruction is executed in the
    MM, the thread which executes it has been created with an adequate
    stack size.
    When running as a separate program, the thread that executes the
    code is that of the main program, that is, the initial thread (or that
    of the main program). There is no multithreading in that part of
    their code. 
    So the solution would be to dimension the thread for the main program, so
    I've told them to try the setrlimit service.
        
    One possible alternate solution that I see is to run the decode/encode ilv
    part of the program in a brand new (created) thread with an adequate
    stack size. Correct ?
    
    Florence
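
The alternative raised above, running the decode step in a newly created
thread with its own stack size, might be sketched as follows. This is only
an illustration under stated assumptions: it uses POSIX threads as a
stand-in (the 1992 system used DCE threads, whose calls differ), and
decode_worker, run_decode_with_stack and BIG_STRING_BYTES are hypothetical
names that make no MCC calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define BIG_STRING_BYTES 65536   /* assumed size of one large string buffer */

/* Hypothetical decode step: keeps one large buffer on the thread's stack
 * and reports through arg (an int *) that it could touch all of it. */
static void *decode_worker(void *arg)
{
    char scratch[BIG_STRING_BYTES];

    memset(scratch, 0, sizeof scratch);
    *(int *)arg = (scratch[sizeof scratch - 1] == 0);
    return NULL;
}

/* Run decode_worker in a new thread whose stack is stack_bytes large.
 * Returns 0 on success, otherwise the error number from the pthread call. */
int run_decode_with_stack(size_t stack_bytes, int *ok)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    assert(ok != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, decode_worker, ok);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

A caller would pass a size comfortably larger than the buffers the worker
declares, for example run_decode_with_stack(1024 * 1024, &ok), and check
both the return value and ok.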

3591.11  Yes, we got that far.  TOOK::KOHLS  Ruth Kohls  Wed Sep 23 1992 11:20  67
Florence,
  
>    I have also seen this problem.
>    The funny thing is that if the ***ilv_*** instructions are part of a DECmcc
>    MM and the Stack size defined for this MM, then the execution is OK.
>    The same code, when being linked to "callable mcc" fails.

Thank you for entering the discussion! 
ILV is designed to use only the memory given to it by the caller or the
stack.  If the caller gives permission, it may re-allocate the data buffers.
The ILV calls reported in .0 do not involve  the latter sort of buffer,
they involve moving context pointers and allocating additional, temporary,
context areas on the stack.  This, and the reported error message, is why 
I believe that stack memory in the single threaded application is the problem.

>    What I understand of it, is that when this instruction is executed in the
>    MM, the thread which executes has been created with an accurate
>    stack size.

>    When running as a separate program, the thread that executes the
>    code is that of the main program, that is, the initial thread (or that
>    of the main program). There is no multithreading in that part of
>    their code. 
>    So the solution would be to dimension the thread for the main program, so
>    I've told them to try the setrlimit service.
 
Looking in the man pages at setrlimit and sbrk, I think that using the
setrlimit function in the single-threaded application will increase the 
stack that that program is _allowed_ to use.  I think that the brk or sbrk
functions must then be called to actually add to the stack.

You might also try using the csh service "limit" to increase the stack
size granted to all processes and their children, before testing. However, I
think this has been tried. (see the previous replies).

I am NOT sure, this is my interpretation of the man pages.  I hope someone
will either confirm or correct me.
       
>    One possible alternate solution that I see is to run the decode/encode ilv
>    part of the program in a brand new (created) thread with an accurate
>    stack size. Correct ?
    
Probably, but it ought not be necessary. 

I would look at the current stack size and at what is on the stack now.

(The structures that ILV needs are not large, and it doesn't need that many.  
In the examples given in .0, the additional ILV context area size requirement 
is certainly less than 1 K, and I would not be surprised to find that 
ILV's additional stack requirement was around 1/2 K.) 

What I would look for is large data structures, such as maximum sized
Latin1Strings (65535 bytes), allocated on the stack.

Also, a core dump of the problem to look at would be nice. To get that:

setenv MCC_LOG 0x40000 
and then create the crash.
(note, that is Hex 40 000)

Please, do not post a core dump in this notes file, and do not mail me a core 
dump!  Send me mail saying where I can copy it from!

Here's hoping we're on the right track.

Ruth Kohls
(TOOK::KOHLS)

3591.12  setrlimit solved the problem  TAEC::WEBER  Fri Sep 25 1992 13:13  12
    Ruth,

    Thanks for your clear explanations. It is possible that they allocate
    Latin1Strings on the stack.
    The use of the setrlimit service did solve their problem.
    About the core dump, if it is useful for you, I can see with
    P. Roggeband how you can get one (he is geographically near the
    customer, I am not;
    and also you'll be sure I will not mail you a core !!!!     :-) ).
    
    Florence
    
3591.13  OK, but I'm not closing the QAR yet  TOOK::KOHLS  Ruth Kohls  Tue Sep 29 1992 11:41  22
Hello, Florence,

>    The use of the setrlimit service did solve their problem.
Thank you, that's good to know.

>    About the core dump, if it is usefull for you, I can see with
>    P. Roggeband how you can get one (he is geographically near the
>    customer, I am not;
I mailed him a request for a core dump last week. I'm not sure how much use
it would be now--your customer could probably better use the time and space
figuring out why the stack size was a problem. (Could be anything from the 
kernel configuration to simple over-use of stack.)


>	and also you'll be sure I will not mail you a core !!!!     :-) ).

Well, I was pretty sure about you... (;-))

Thanks again, 

Ruth K.

3591.14  CCIIS1::ROGGEBAND  _ Philippe _  Wed Dec 16 1992 05:39  3
    The problem is back ! What is the status of this QAR ?
    
    PhR.

3591.15  TOOK::SWIST  Jim Swist LKG2-2/T2 DTN 226-7102  Wed Dec 16 1992 16:07  9
The fact that setrlimit was needed to get around the problem in V1.2, and that
this is the default stack rather than a thread stack, indicates enormous stack
consumption, probably associated with placing MCC_T_Latin1string datatypes on
the stack (these are defined as char[65536]).

It would appear that some slight increase in V1.3 stack usage has brought the
problem back.  I think at this point the source program should be changed to
fix the problem.
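
One way such a source change might look is to move the large string buffers
from the stack to the heap. The sketch below is an assumption-laden
illustration, not MCC code: big_string is a hypothetical stand-in for a
64 KB string type, and big_string_new is an invented helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BIG_STRING_BYTES 65536   /* assumed size, matching char[65536] */

/* Hypothetical stand-in for a large fixed-size string type. */
typedef struct {
    char text[BIG_STRING_BYTES];
} big_string;

/*
 * Allocate a big_string on the heap and copy src into it (truncating if
 * needed). Declaring "big_string s;" as a local variable instead would
 * cost 64 KB of stack in every active call that does so.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
big_string *big_string_new(const char *src)
{
    big_string *s;

    assert(src != NULL);

    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;

    strncpy(s->text, src, sizeof s->text - 1);
    s->text[sizeof s->text - 1] = '\0';   /* always NUL-terminated */
    return s;
}
```

With this shape, only a pointer lives on the stack, so the stack cost of a
call no longer depends on how many large strings it handles.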