
Conference smurf::unix_objsym

Title:Digital UNIX Object File/Symbol Table Notes Conference
Moderator:SMURF::LOWELL
Created:Mon Nov 25 1996
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:71
Total number of notes:314

64.0. "Subsystem control of INIT/FINI Order" by AOSG::LOWELL () Wed Feb 12 1997 14:47

        Proposal for Subsystem Control of INIT/FINI
                 Order for Digital UNIX

                   Randy Van Lowell
                   Randy Van Meyers

                     February 1997



1. Problem



The current mechanism for recognizing initialization (INIT) and
termination (FINI) routines and establishing their execution order
within an executable or shared library does not take into account
dependencies between subsystems and user-written INIT/FINI
routines. 

In particular, the DECC++ compiler needs to generate INIT routines
which run prior to user-defined INIT routines. The DECPASCAL
compiler may also depend on the ability to control INIT execution
order. 

The linker (ld) currently identifies INIT routines by recognizing
names which start with the prefix "__init_". Ld determines the
execution order for the INIT routines it recognizes by the order they
are encountered within an object's external symbol table and the
ordering of objects on the command line. In V4.0, ld also began to
take into account the ordering of archive libraries on the command
line. The INIT routines from each archive are executed in the reverse
order of their occurrence on the command line. For example:


      ld x.o y.o z.o libabc.a libdef.a

      INIT order:
                    libdef.a
                    libabc.a
                    x.o
                    y.o
                    z.o


The problem faced by C++ is that compiler-generated INIT routines
and user-written INIT routines may both occur in each .o file. The
ordering described above does not provide a method for ensuring
that all compiler-generated INIT routines are run before any
user-written INIT routines. 

Proposal

  1.The linker will recognize new prefixes for
    subsystem-generated INIT/FINI routines. 


           __INIT_ALPHANAME
           __FINI_ALPHANAME


  2.INIT routines recognized with the "__INIT_" prefix will
    always run prior to any INIT routines recognized with the
    "__init_" prefix within the same executable or shared library.
    Likewise, FINI routines recognized with the "__FINI_" prefix
    will always run after FINI routines recognized with the
    "__fini_" prefix. 

  3.The linker's "-no_prefix_recognition" switch will not apply
    to subsystem-generated INIT routines. 

  4.INIT routines added by hand with the linker's "-init" switch
    will be grouped with the user-written INIT routines.
    Subsystem-generated INIT routines will be executed before
    any INIT routines added by hand. Likewise, subsystem-generated
    FINI routines will be executed after any FINI routines added
    by hand. 

  5.INIT routines added by ld for exception-handling, speculative
    execution and thread-local storage will still run prior to all
    other INIT routines. The associated FINI routines will
    continue to run last. 

  6.All routines recognized with the "__INIT_" prefix will be
    executed in alphabetic order, and all routines recognized with
    the "__FINI_" prefix will be executed in reverse alphabetic
    order. 

  7.Although any combination of characters in ALPHANAME could
    be used to control the initialization order of dependent
    subsystems, the following convention must be adhered to. It
    ensures that, for any established set of ALPHANAMEs, a new
    ALPHANAME can always be inserted at any point in the
    order. 

    A variable-length hex-string will be encoded in the
    ALPHANAME of subsystem-generated INIT and FINI
    routines. The hex-string will contain one or more hex-digits
    followed by an underscore (_). Hex-digits include only the
    characters: '0' through '9' and 'A' through 'F'. Furthermore,
    the last hex-digit in the string cannot be an 'F'. This ensures
    that any given order remains accessible for insertion at any
    point. 

         Example:     __INIT_8_cxx_vtable_int
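
The naming convention in item 7 can be checked mechanically. The sketch
below is one possible reading of the rule, not part of the proposal: the
function name is_valid_subsystem_name is hypothetical, and it assumes the
hex-string immediately follows the "__INIT_" or "__FINI_" prefix, as in
the example above.

```python
import re

# One or more uppercase hex digits, the last of which is not 'F', then '_'.
SUBSYSTEM_NAME = re.compile(r"__(INIT|FINI)_[0-9A-F]*[0-9A-E]_")


def is_valid_subsystem_name(name):
    """True if `name` starts with a subsystem prefix and a well-formed hex-string."""
    return SUBSYSTEM_NAME.match(name) is not None


print(is_valid_subsystem_name("__INIT_8_cxx_vtable_int"))
```

Under this reading, a name whose hex-string ends in 'F', uses lowercase
hex digits, or lacks the trailing underscore is rejected.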

Discussion

The linker will determine the order of subsystem-generated INIT
and FINI routines by lexically sorting the portions of the names
following the prefix. 
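
A minimal sketch of the resulting order within one executable or shared
library follows. It rests on several assumptions: Python's character-by-
character string comparison stands in for the linker's lexical sort, user
routines are kept in the order given (the real order is determined as
described in the Problem section), and the routines ld adds itself for
exception handling, speculative execution and thread-local storage are
left out. The function names and the sample name __INIT_40_pascal_rtl are
hypothetical.

```python
def init_order(names):
    """Run order for INIT routines within one executable or shared library."""
    subsystem = [n for n in names if n.startswith("__INIT_")]
    user = [n for n in names if n.startswith("__init_")]
    # Subsystem routines first, sorted on the text after the prefix;
    # user routines keep the order in which they were given.
    return sorted(subsystem, key=lambda n: n[len("__INIT_"):]) + user


def fini_order(names):
    """Run order for FINI routines: user routines first, subsystem ones last."""
    subsystem = [n for n in names if n.startswith("__FINI_")]
    user = [n for n in names if n.startswith("__fini_")]
    # Subsystem FINI routines run in reverse sorted order.
    return user + sorted(subsystem, key=lambda n: n[len("__FINI_"):], reverse=True)


print(init_order(["__init_user", "__INIT_8_cxx_vtable_int", "__INIT_40_pascal_rtl"]))
```

Note that the sort is on characters, not numeric values: in this sketch
"40_" sorts ahead of "8_" because the character '4' precedes '8'.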

This new INIT/FINI recognition does not affect execution order of
INIT or FINI routines outside of the executable or shared library.
INIT and FINI routines will still be run as a group for each
executable and shared library. The relative order of each group of
INIT or FINI routines is still determined by the loader's assessment
of recorded shared library dependencies. 

The new INIT/FINI recognition will be documented (unlike its VMS
counterpart). ISVs and end users may exploit this new feature;
however, we'll include cautionary statements in the documentation
warning against undefined results that may occur due to improper
ordering of INIT routines. 

The DUDE group, which owns the mechanism, will assign sequence
IDs upon request. 

Ron Brender's explanation of the variable-length hex strings: 

        Given any two existing "codes", you need to be able to
        allocate a code that precedes both, comes between them,
        or follows both in the sorting order. 

    This is really quite easy. Without loss of generality, assume
    we have an alphabet of three characters for names: two digits
    "0" and "1" and one non-digit such as "_"; moreover, let that
    be the lexical order (i.e., 0, 1, _). Let us define a scheme for
    allocating digit strings, bearing in mind that every digit string
    will be followed by _ (which may occur in a variety of
    non-digit flavors). 

        - Let the first digit string be 0_ 
        - A string that precedes 0_ is 00_ 
        - A string that follows 0_ is 10_ 

    First, let's show that you never want to assign a digit string that
    ends in a 1. Suppose you did: it would have the form 

            <prefix>1_

    Any longer string, such as <prefix>10_ or <prefix>11_,
    precedes <prefix>1_. There remains the possibility that there
    is a shorter string that will collate after <prefix>1_, but that is
    only possible if the <prefix> itself contains a zero. Most
    importantly, there are at most a finite and bounded number of
    possible such shorter strings -- contrary to our
    goal/requirement to be completely open-ended. 

    It follows that every assigned sequence ends in 0_, that is, has
    the form 

            <prefix>0_

    and, that it is always possible to create a predecessor,
    <prefix>00_, and a successor, <prefix>10_ for any such string.

    Next, suppose N and M are any two adjacent digit strings in
    the current set of assignments (adjacent means that there is no
    other string in the set that is between N and M). The worst
    case occurs when the pair of strings is already "almost equal" as
    in either 

            <prefix1>00_
            <prefix1>0_

    or 

            <prefix2>0_
            <prefix2>10_

    But we see that even in these cases it is possible to generate new
    strings that precede, are between, or follow both elements of
    the pair. For example: 

            Previous        New
            --------        ---

                            <prefix1>000_
            <prefix1>00_
                            <prefix1>010_
            <prefix1>0_
                            <prefix1>10_

    or

                            <prefix2>00_
            <prefix2>0_
                            <prefix2>100_
            <prefix2>10_
                            <prefix2>110_

    QED 

    This scheme may feel odd, but it really isn't as long as you
    think of the digits as "just some characters" rather than as
    "numbers". 

    As presented using just 0s and 1s, it may appear that code
    strings will grow rapidly in length. But this problem is much
    reduced by using hexadecimal (in which case the critical rule
    is that no string can end in an F). 
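
The allocation scheme described above can be sketched as follows. This is
an illustration under stated assumptions, not the procedure the DUDE
group uses: the function names code_before and code_after are
hypothetical, codes are passed without their trailing underscore, and
ordinary ASCII comparison is assumed (where '_' sorts after '0'-'9' and
'A'-'F'). Passing digits="01" gives the two-digit alphabet used in the
explanation.

```python
HEX_DIGITS = "0123456789ABCDEF"  # ASCII order; '_' (0x5F) sorts after all of them


def code_before(code, digits=HEX_DIGITS):
    """Return a code that sorts before `code`: append the lowest digit."""
    return code + digits[0]


def code_after(code, upper=None, digits=HEX_DIGITS):
    """Return a code that sorts after `code` and, if given, before `upper`.

    Codes are compared with the '_' terminator appended, as they would be
    when the full routine names are sorted.
    """
    last = digits.index(code[-1])
    if last == len(digits) - 1:
        raise ValueError("a code must not end in the highest digit")
    if upper is not None and not code + "_" < upper + "_":
        raise ValueError("`code` must sort before `upper`")
    # Bump the last digit, then pad with lowest digits until below `upper`.
    stem = code[:-1] + digits[last + 1]
    pad = 1
    while upper is not None and stem + digits[0] * pad + "_" >= upper + "_":
        pad += 1
    return stem + digits[0] * pad


print(code_before("0", digits="01"), code_after("0", digits="01"))
print(code_after("8", upper="9"))
```

Because the bumped digit is always followed by at least one lowest digit,
a generated code never ends in the highest digit, which is the property
that keeps every gap open for later insertions.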

T.R   Title     User           Personal Name  Date                   Lines
64.1  approved  SMURF::LOWELL                 Tue Feb 25 1997 15:35  4

This proposal was reviewed and approved at the 2/24/97 OF/STWG
meeting.

OF/STWG ECO ID=12