Conference turris::languages

Title: Languages
Notice: Speaking In Tongues
Moderator: TLE::TOKLAS::FELDMAN
Created: Sat Jan 25 1986
Last Modified: Wed May 21 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 394
Total number of notes: 2683

304.0. "VAX Procedure Calling Standard Questions" by DSM::SCHWARTZ (DSM Engineering) Wed Mar 13 1991 18:03

I have a question for the various language compiler folks regarding
the VAX Procedure Calling Standard (VPCS) and function values.

The VPCS states:

==========================================================================

A function value is returned in register R0 if its data type can be
represented in 32 bits, or in R0 and R1 if its data type can be
represented in 64 bits, provided the data type is not a string data type.
                        ------------------------------------------------

In all other cases (the function value needs more than 64 bits, the data
type is a string, the size of the value can vary from call to call,
and so on) the actual argument list and the formal argument list
are shifted one entry. The new, first entry (4(AP)) is reserved for
the function value.

==========================================================================
  
I have determined empirically that you can call external functions
which return the address of a descriptor, the address of an ASCIZ string,
or the address of any arbitrary data structure as a function value in R0.
If the VPCS allows you to return an address as a function value in R0,
then why exclude the address of a string descriptor, or an ASCIZ string?
The VAX C compiler is quite happy to let you do this, and in fact does not
use 4(AP) to return the function value as the VPCS states!

I have not had a chance to try any other compilers.

I do not want to violate the VPCS for external calls. However, my layered
product (VAX DSM) includes an external function calling facility.  VAX DSM
has always closely adhered to the VPCS for call-out to external routines
(and call-in from external routines). We have received customer requests
for enhancements to allow returning strings as external routine function
values. This is particularly important now that there is so much code
being written in C.

Is it safe for us to assume that the language compilers will always
return function value string addresses (ASCIZ and descriptors) in R0?
This seems to be the case with VAX C and the MIT Xwindows C binding
at least.

What can we say in our documentation about the VPCS, that VAX DSM
(and everyone else apparently) is bending the rules?

Am I interpreting the VPCS incorrectly?  It seems rather explicit
with regard to returning strings.

Thanks in advance,

        David Schwartz
        Project Leader
        VAX DSM Engineering

T.R  Title  User  Personal Name  Date  Lines
304.1. "Ok to return address, I think" by MOIRA::FAIMAN (light upon the figured leaf) Wed Mar 13 1991 22:08 (21 lines)
    Without digging out the standard, it would appear to me that it
    would be valid to return an *address* (even the address of a string
    or descriptor) in R0, and that the cases where the extra parameter
    is required are those where the string *value* is returned.  Then,
    the calling routine must allocate a buffer for the data and pass the
    buffer address to the called routine, which must copy the data into
    the buffer. 
    
    However, if a called routine is going to return an address in R0,
    then it must be the address of permanently valid data.  For example,
    it might be the address of a read-only descriptor that described a
    read-only string, but not the address of a descriptor that described
    a string in global storage, since the returned data might then
    become invalid when that global data was modified.
    
    In general, I believe that the normal way of returning string data
    is "by descriptor, write only", where the argument might be an
    explicit argument, or might be a "hidden first argument" function
    return value.
    
    	-Neil
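
The distinction drawn in this reply, returning an *address* versus returning a string *value* through a caller-supplied buffer, can be sketched in C. This is only an illustration: the function names `greeting_address` and `greeting_value` are hypothetical, and the "hidden first argument" of the calling standard is written out as an explicit parameter because C source has no way to make it implicit.

```c
#include <stddef.h>
#include <string.h>

/* Returns the ADDRESS of a string. The data is read-only and lives for the
 * whole program, so the address stays valid after the call returns. */
const char *greeting_address(void)
{
    static const char text[] = "Hello";
    return text;
}

/* Returns the string VALUE. The caller allocates the buffer and passes its
 * address and size; the callee copies the data in, truncating if needed.
 * Returns the number of characters stored, not counting the trailing NUL. */
size_t greeting_value(char *result, size_t result_size)
{
    const char *src = greeting_address();
    size_t n = strlen(src);

    if (result_size == 0)
        return 0;                 /* nowhere to put even the terminator */
    if (n >= result_size)
        n = result_size - 1;      /* truncate, leaving room for the NUL */
    memcpy(result, src, n);
    result[n] = '\0';
    return n;
}
```

With a 4-byte buffer, `greeting_value` stores `"Hel"` and returns 3, while `greeting_address` hands back a pointer the caller must never write through or free.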

304.2. "VAXC, VCPS, etc." by LENO::GRIER (mjg's holistic computing agency) Thu Mar 14 1991 16:15 (26 lines)
   VAX C doesn't really make much effort at all to follow the letter or
the spirit of the VPCS, but if you're careful enough you can do it.  It
follows the way the original C implementations were done, for better or
worse.

   It's not "normal" in callable routines to return the address of a
descriptor.  If the descriptor is dynamically allocated (the descriptor
itself, not the data!) how do you deallocate it?  You'd have to provide
a function to deallocate the descriptor when the caller is done with it,
or clearly document how to call LIB$FREE_VM or something to deallocate it.
Either way, it's messy.

   You'll find that generated code for languages like Pascal and Ada
will use 4(ap) to store the address of large structured return values.

   The "correct" thing to do, which is what people expect in most cases,
is to pass in the address of a string descriptor (allocated by the
caller, not the callee) into which the data is copied.  If you're feeling
low-level, you should specify that it should be a fixed length string
descriptor, in which case you just look at the first longword as the 
length of the buffer and the second as the address.  If you're feeling
high-level and friendly, you should use LIB$SCOPY_DX or another LIB$ string
function to move the data into the user's buffer so that the correct
semantics around truncation, allocation and format are followed.

					-mjg
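
A minimal sketch of the "low-level" variant, where the callee copies into a fixed-length buffer described by the caller. The struct `fixed_desc` and the function `copy_to_fixed` are hypothetical portable stand-ins, not the definitions from the VMS headers; on a VAX the pointer is 32 bits, so the real descriptor is two longwords. The length field is declared as a 16-bit word. Padding a short result with blanks is an assumption about what fixed-length string semantics require.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-length string descriptor. */
struct fixed_desc {
    uint16_t length;    /* size of the caller's buffer, in bytes (a word) */
    uint8_t  dtype;     /* data type code, unused in this sketch */
    uint8_t  dclass;    /* descriptor class code, unused in this sketch */
    char    *pointer;   /* address of the caller's buffer */
};

/* Copy src into the caller's buffer, truncating a long source and
 * blank-padding a short one. Returns 1 if the whole source fit and
 * 0 if it had to be truncated. */
int copy_to_fixed(const struct fixed_desc *dst, const char *src, size_t src_len)
{
    size_t n = src_len < dst->length ? src_len : dst->length;

    memcpy(dst->pointer, src, n);
    memset(dst->pointer + n, ' ', dst->length - n);
    return src_len <= dst->length;
}
```

A caller would fill in `length` and `pointer` for its own buffer, call `copy_to_fixed`, and check the return value to learn whether truncation occurred.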

304.3. by JAMMER::JACK (Marty Jack) Thu Mar 14 1991 16:50 (2 lines)
>descriptor, in which case you just look at the first longword as the 
    Important note:  That should be word not longword.

304.4. "Function Versus Procedure?" by DSM::SCHWARTZ (DSM Engineering) Thu Mar 14 1991 17:51 (17 lines)
The technique VAX DSM currently uses for passing strings as "output"
parameters is to pass an empty dynamic string descriptor and have
the user call LIB$SCOPY_DXDX (etc.) to fill it in.  VAX DSM copies the
string to internal storage and calls LIB$SFREE1_DD.

This method is fine for external procedure calls.  However, .0 was
concerned with external function calls, where the function returns
a value (as opposed to a VMS status code as the "procedure value").

Re .2: Under what conditions will the Pascal compiler use 4(AP) to
return a function value?  I was unable to cause this to happen.

Sounds like it's OK to violate the VPCS as long as you are calling
external routines written in C.

	David

304.5. "Ok..." by LENO::GRIER (mjg's holistic computing agency) Fri Mar 15 1991 00:01 (41 lines)
Re: Pascal:

   I believe if you just have it return a structure larger than 4 or 8
bytes it will pass it as a fake first parameter.

   Let's see...

MODULE X;

    TYPE
        BigString = VARYING [65535] OF CHAR;

    [GLOBAL] FUNCTION ReturnABigString (A : INTEGER) : BigString;

        VAR
            Indx : INTEGER;
            t : BigString VALUE ZERO;

        BEGIN
            FOR Indx := 1 TO A DO
                t := t + 'X';

            ReturnABigString := t;
        END;

END.

   That passes back the result in 4(AP).  (When you read the machine code,
remember that R12 = AP)

Re: reading the word:

   Is that correct, for utilities which expect a fixed length string
descriptor?  Maybe learning by example is wrong, but I seem to recall
seeing code examples where the first longword is assumed to be the entire
length, rather than just the first word.  (Of course, you're right, you
should just use the first word so that you're compatible with RTL string
descriptors where the address points to the real start of data.  I can't
think of where I've seen specific examples of this.)

					-mjg

304.6. by TLE::BRETT Fri Mar 15 1991 07:34 (7 lines)
    You may have seen it in internal code produced by the Ada compiler.  We
    use normal CLASS S/CLASS SB descriptors for imported or exported
    intervals, but internally when passing long strings we use a modified
    style of the descriptor without class or dtype field, with the width
    spread right across the first longword.
    
    /Bevin

304.7. by OZROCK::MCGINTY (Truffle prefers vi) Mon Mar 18 1991 16:57 (12 lines)
    The first longword in a string descriptor contains the length, in a
    word, followed by two bytes containing the class and type of the string.

	 -------------------------------
	| class | type  |    length     |
	|-------------------------------|
	|            address            |
	 -------------------------------

    The definitions of the fields and constants are contained in $dscdef
    in sys$library:starlet.mlb.
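
The layout in the diagram can be checked with a little arithmetic. The sketch below packs the three fields into a 32-bit value the way the diagram shows them (length in the low-order word, then type, then class) and unpacks them again. The helper names are hypothetical, and the codes 14 and 1 used in the usage note are the values of DSC$K_DTYPE_T and DSC$K_CLASS_S as I recall them, so check them against $DSCDEF before relying on them.

```c
#include <stdint.h>

/* Build the first longword of a descriptor from its three fields. */
uint32_t first_longword(uint16_t length, uint8_t dtype, uint8_t dclass)
{
    return (uint32_t)length
         | ((uint32_t)dtype  << 16)
         | ((uint32_t)dclass << 24);
}

/* Extract the length: only the low-order word of the longword. */
uint16_t desc_length(uint32_t longword)
{
    return (uint16_t)(longword & 0xFFFFu);
}

/* Extract the data type code: bits 16..23. */
uint8_t desc_dtype(uint32_t longword)
{
    return (uint8_t)((longword >> 16) & 0xFFu);
}

/* Extract the descriptor class code: bits 24..31. */
uint8_t desc_class(uint32_t longword)
{
    return (uint8_t)(longword >> 24);
}
```

For a 5-byte string with type 14 and class 1, `first_longword` yields 0x010E0005. Code that treats the whole longword as the length would read that as a length of more than 17 million bytes. The two readings agree only when the type and class bytes are both zero.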

304.8. "In certain places..." by LENO::GRIER (mjg's holistic computing agency) Tue Mar 19 1991 21:30 (19 lines)
Re: .6:

   Yup, that's the structure, but any code which doesn't layer on the
RTL (i.e. system services, pretty much any inner-mode code,) just treats
string descriptors ala..

DESC:	.LONG 5
	.ADDRESS Y

Y:	.ASCII /Hello/

   Without any cares in the world about classes or types or whatever.

   User-mode code should *always* use the LIB$/STR$ stuff to work with
descriptors to be friendly, but there's a large body of code which most VMS
folks have to interact with which doesn't have any more complex notion
of descriptors than the MACRO-32 I outlined above.

					-mjg

304.9. "The original question" by DSM::SCHWARTZ (DSM Engineering) Wed Mar 20 1991 11:37 (6 lines)
We seem to have forgotten my original question which asks how the
various language compilers return strings as external function values,
or receive strings as external function values.

Is C the only language that "violates" the calling standard in this respect?

304.10. by TLE::BRETT Wed Mar 20 1991 13:20 (19 lines)
    This is a very tricky area of the calling std.
    
    It is NOT correct to describe VAX C as "violating" it.
    
    The VAX Calling Std basically allows you to return string results in
    one way - by reference via 4(ap).  There is NO standard way of
    returning a string result whose size is not already known to the
    caller.
    
    Ada and PL/I return such strings by having the caller pass in some form
    of descriptor, and by either heap allocating the result or fiddling
    around so that the sp is not reset by the return, and have the called
    routine fill in the descriptor saying where the result is.
    
    C never returns strings, it returns "char *"'s POINTERS TO STRINGS [and
    to the disgust of programmers interested in re-entrancy these are often
    pointers to OWN storage].
    
    /Bevin
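
The re-entrancy complaint about pointers to OWN storage is easy to reproduce. In this sketch the function `format_id` is hypothetical; it stands for any C routine that formats into a static buffer and returns its address.

```c
#include <stdio.h>

/* Formats an identifier into static (OWN) storage and returns its address.
 * Every call reuses the same buffer, so the pointer from an earlier call
 * sees the text written by a later one. */
const char *format_id(int id)
{
    static char buffer[16];

    snprintf(buffer, sizeof buffer, "ID-%d", id);
    return buffer;
}
```

If a caller saves the result of `format_id(1)` and then calls `format_id(2)`, both pointers are equal and both now read `"ID-2"`. A caller that needs the first value has to copy it before making the second call, and two threads or an AST routine calling the function concurrently can corrupt each other's result.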

304.11. "Well, in C there is..." by LENO::GRIER (mjg's holistic computing agency) Wed Mar 20 1991 19:20 (26 lines)
Re: no standard way to return arbitrary-length strings:

   In C, part of the implied semantics are that the strings are null-
terminated, so returning the address of the first character of the string
is completely reasonable.  In a C-only environment.

   The VPCS is truly minimal, and there's a degree of following it "to the
letter" and "in spirit".  With the exception of requiring certain descriptor
formats and how the parameters *are* passed (i.e. not in registers for CALLS
and CALLG linkages,) the "to the letter" interpretation of the VPCS says
very little.

   The utility is in following it in spirit.  C isn't very friendly towards
following the VPCS in spirit, even if the flexibility of the language does
permit following it to the letter.  Other VMS languages tend to generate
code which does a fair job of producing entry points which follow both the
spirit, as well as the letter, of the law.  (Ada, Pascal, COBOL and BASIC
to name a couple.  All the C comments apply equally to BLISS, because they
have roughly equivalent views of the world in terms of calling in the
CALLS/CALLG sense...)

   My personal rule-of-thumb is that if SDL can quickly and easily generate
an accurate description of the interface which is usable without trickery
from those languages, then you're following the VPCS in spirit.

					-mjg

304.12. "don't forget dynamic descriptors" by SAUTER::SAUTER (John Sauter) Thu Mar 21 1991 11:20 (7 lines)
    re: .10
    
    I believe the calling standard allows string results to be returned
    by descriptor, also using 4(ap).  If the caller doesn't know what
    the length of the string will be then he can pass a dynamic
    descriptor.
        John Sauter
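
The dynamic-descriptor idea can be modelled in portable C, assuming the caller starts with an empty descriptor and the callee allocates storage for whatever length the result turns out to have. The names `dyn_desc`, `dyn_copy`, and `dyn_free` are hypothetical; on VMS the equivalent work is done by the run-time library string routines mentioned elsewhere in this topic, and `malloc`/`free` here merely stand in for them.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dynamic string descriptor. */
struct dyn_desc {
    uint16_t length;    /* current string length in bytes */
    uint8_t  dtype;     /* data type code, unused in this sketch */
    uint8_t  dclass;    /* descriptor class code, unused in this sketch */
    char    *pointer;   /* heap storage owned by the descriptor, or NULL */
};

/* Callee side: replace the descriptor's contents with a copy of src.
 * Returns 1 on success, 0 if src is too long or allocation fails. */
int dyn_copy(struct dyn_desc *dst, const char *src, size_t src_len)
{
    char *fresh = NULL;

    if (src_len > UINT16_MAX)
        return 0;                   /* length must fit in a word */
    if (src_len > 0) {
        fresh = malloc(src_len);
        if (fresh == NULL)
            return 0;
        memcpy(fresh, src, src_len);
    }
    free(dst->pointer);             /* release any previous contents */
    dst->pointer = fresh;
    dst->length = (uint16_t)src_len;
    return 1;
}

/* Caller side: release the storage once the string has been consumed. */
void dyn_free(struct dyn_desc *d)
{
    free(d->pointer);
    d->pointer = NULL;
    d->length = 0;
}
```

The caller initializes the descriptor with a zero length and a null pointer, passes its address, copies the returned text wherever it needs it, and then calls `dyn_free`. Ownership is unambiguous, which is the point raised against returning the address of a callee-allocated descriptor.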

304.13. "Aside: CLASS=0, DTYPE=0 *is* defined." by SKYLRK::WHEELERL (Lloyd Wheeler) Thu Mar 21 1991 12:38 (20 lines)
    Re .6, .7, .8: (and a couple others)
    
    To close on the implied "hack" of using the first *long*word of a
    descriptor as the length:
    
    I remember reading in the VPCCHS that DTYPE and CLASS fields of zero
    *do* have a defined meaning.  If these fields are zero, the called
    routine is supposed to assume that the argument data is of the
    "correct" type.  (See the description of DSC$K_DTYPE_Z.)
    
    In practice, at least for the RTL routines (and, in a funny, NUL-padded
    way, for the System Services) this means that the argument will be
    treated as a static string.
    
    Of course, using the *entire* longword as a length field would be a
    violation of the VPCCHS.  (And, of course, this makes no difference if
    a single product controls the code used in both the caller and the
    called routines.)
    
    Lloyd

304.14. "Thanks" by DSM::SCHWARTZ (DSM Engineering) Mon Mar 25 1991 14:45 (73 lines)
RE: .11

>   My personal rule-of-thumb is that if SDL can quickly and easily generate
> an accurate description of the interface which is usable without trickery
> from those languages, then you're following the VPCS in spirit.

I tried some experiments with SDL before entering the base note.
DESCRIPTOR and RTL_STR_DESC are not supported as RETURN keywords:

	MODULE test;

	ENTRY test1 RETURNS DESCRIPTOR;

	ENTRY test2 RETURNS RTL_STR_DESC;

	END_MODULE;

$ sdl/lang=pascal TEST1
%SDL-E-UNDEFSYM, Undefined symbol DESCRIPTOR [Line 3]
Error on line 5 column 32:  Replaced reserved-word "rtl_str_desc" with
        identifier
%SDL-E-INVNAME, Item name is invalid
%PLI-F-ERROR, PL/I ERROR condition.
-SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=00000004, PC=0009A06A, PSL=03C000A4
$

However, SDL was perfectly happy with the following:

	MODULE test;

	ENTRY test1 RETURNS CHARACTER;

	ENTRY test2 RETURNS CHARACTER VARYING;

	ENTRY test3 RETURNS ADDRESS;

	ENTRY test4 RETURNS OCTAWORD;

	END_MODULE;

$ sdl/lang=pascal TEST2

MODULE TEST2 ;

[HIDDEN] TYPE   (**** Pre-declared data types ****)
        $OCTA = [OCTA,UNSAFE] RECORD
                L0,L1,L2:UNSIGNED; L3:INTEGER; END;
        $DEFTYP = [UNSAFE] INTEGER;
        $DEFPTR = [UNSAFE] ^$DEFTYP;

(*** MODULE test ***)

[HIDDEN] TYPE   (**** SDL-Generated type names ****)
        test$$typ1 = VARYING [1] OF CHAR;

[ASYNCHRONOUS] FUNCTION test1 : CHAR; EXTERNAL;

[ASYNCHRONOUS] FUNCTION test2 : test$$typ1; EXTERNAL;

[ASYNCHRONOUS] FUNCTION test3 : $DEFPTR; EXTERNAL;

[ASYNCHRONOUS] FUNCTION test4 : $OCTA; EXTERNAL;

END.

Thanks for all the feedback.  The bottom line appears to be that if a
language (compiler) doesn't enforce the VPCS on external calls, and our
customers want to call external functions that don't adhere to the VPCS,
then VAX DSM should not prevent them from doing so!

	David