[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference turris::alpha_callstd

Title:ALPHA Calling Standard
Notice:Digital Restricted and Confidential
Moderator:BEGIN::MDAVISR
Created:Thu Jun 29 1989
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:113
Total number of notes:847

109.0. "What alignment is assumed for global/external variables?" by GEMGRP::BRENDER (Ron Brender) Tue Jan 28 1997 13:32

Subj:   Calling standard ECOs for Unix and NT

         <<< TURRIS::DISK$NOTES_PACK:[NOTES$LIBRARY]CXXC_BUGS.NOTE;5 >>>
                                 -< CXXC_BUGS >-
================================================================================
Note 4084.4  decc_bugs 1256: Alignment difference between acc and DEC C   4 of 4
DECC::OUELLETTE                                      10 lines  24-JAN-1997 18:13
--------------------------------------------------------------------------------
This aught to be in the Unix and Windows NT calling standards,
but is (as far as I can tell) not there.  This appears to be
as the result of an ECO which fell through the cracks.
The C and C++ compilers have been (for quite some time now)
generating code that assumes that globals are QUAD aligned.
The code would really suck if this assumption weren't met.
I believe that fix is to ECO the QUAD alignment requirements
into the calling standard post haste.

Roland.

T.RTitleUserPersonal
Name
DateLines
109.1Where does it say we assume quad alignment?GEMGRP::BRENDERRon BrenderTue Jan 28 1997 13:3824
Summarizing several mail exchanges between Kent Glossop and myself...

--------------------------------------------------------------------------------
From:   SMOP::glossop  "Kent Glossop  24 Jan 1997 10:46:09 -0500"
To:     Bill,Ron
CC:
Subj:   I found several old discussions searching through my mail...

Unfortunately, I don't see any references to any place this is actually
documented.  You might ask Mike Rickabaugh, because I have a vague
recollection that it was determined to be a more system-type item
(since assembly programmers needed to obey the rules too.)

Kent
--------------------------------------------------------------------------------
From:   GEMEVN::BRENDER "Ron 603-881-2088; DTN 381-2088  24-Jan-1997 1036 -0400"
To:     SMOP::glossop
CC:     NOYCE,BRENDER
Subj:   RE: Thinking about it a bit more

I just did a search of the .SDML so I can confirm that such a rule is
definitely not in the calling standard.

I really doubt that this ever got formalized...
109.2Assembler generated globals !quad alignedSMURF::GAFJerry Feldman, Unix Dev. Environment, DTN:381-2970Tue Jan 28 1997 15:2213
    ASAXP aligns variables on natural boundaries, eg. quad on quad, long on
    long, etc. I am wondering if we already have a potential problem where
    a variable is defined in an assembler program and used in a C or C++
    program. Let's say a character array on a byte boundary. 
    
    I have 2 concerns, the first is the above case where a C program using
    data defined in an assembly module, the other being backward
    compatibility. By changing the assembler to align globals on a quad
    boundary would cause some errors where a programmer may have created a
    structure which relied on the natural alignments, which is well
    documented in the asaxp help file.
    
    
109.3It's not just performanceWIBBIN::NOYCEPulling weeds, pickin&#039; stonesTue Jan 28 1997 15:2917
For the following source:
	extern char c;
	int fetch() { return c; }

I believe GEM feels entitled to generate

FETCH:	LDAH	R0, h^c(R31)
	LDQ_U	R0, l^c(R0)
	AND	R0, 0xFF, R0
	RET	R26

Notice that this code fetches the *wrong byte* if
c is not quadword-aligned.

Finding an actual failing case is probably harder,
but it's easy to find cases that cause unnecessary
alignment traps.
109.4definitely not just performance...GEMEVN::GLOSSOPOnly the paranoid surviveTue Jan 28 1997 15:3713
To re-enforce this, the final peepholer "known bits analysis" will
be making the presumption that the low 3 bits of global symbols are
exactly 000 (or more specifically, that statically allocated symbols
have trusted alignment, where globals are a special case where
the minimum alignment happens to be quad.)  i.e. code that is generated
that might do things like blbs on a register value loaded with a symbol
address would be removed as dead code, inserts/extracts with known bits
for the middle operand can be converted to directly use literal operands,
etc.

(This code is temporarily disabled for Unix in bl33 - it's already there
on NT in bl34 - but I expect it will be getting re-enabled for bl36
for Unix.)
109.5STAR::BENSONMy other fiddle is a StradWed Jan 29 1997 09:4510
    Is there a reason to believe this won't be a problem on VMS as well,
    once compilers are on newer GEM BLs? I don't recall any external 
    alignment guidelines. Or is the VMS GEM just more conservative?
    
    My first reaction to .4 is that this should be tied to OPT level, and
    possibly off by default. My second reaction is that those kinds of 
    optimizations which really require cross-module knowledge are more
    safely left to OM..
    
    Tom
109.6No problem, VMS just gets slower codeWIBBIN::NOYCEPulling weeds, pickin&#039; stonesWed Jan 29 1997 14:184
> Or is the VMS GEM just more conservative?

Yes, on VMS a global is allowed to be allocated on an arbitrary
byte boundary, and GEM knows it.
109.7GEMEVN::GLOSSOPOnly the paranoid surviveWed Jan 29 1997 16:57118
> Or is the VMS GEM just more conservative?

Yes, VMS is more conservative, given a number of different factors
(pre-existing Macro interactions with compiled code, Bliss layout
expectations, etc.)

Something I found searching old mail that might be related to this topic
was that  DECwindows ran into a two-shorts-packed-to-quad problem back
in 1991 or early 1992.  There were some mail exchanges about quad aligning
data on VMS, but it was decided this wasn't a practical presumption
for compilers to make about externally allocated data.  Note that DEC C/VMS
does allocate data on strong boundaries - appears to be octaword - but
compilers only presume the natural alignment for data not allocated
by the compilation unit.

On Unix and NT, compiler-allocated global storage was allocated to quadwords
since FRS for both systems (some pre-FRS acc compilers only did long alignment
for ints, if I remember correctly.)  The primary reason for this originally
was to avoid architectural interaction problems (e.g. a volatile and
a non-volatile data item packed to the same quadword, where the non-volatile
one wound up being accessed via ldq_u/stq_u, overwriting the volatile one.)

Furthermore, since compilers have always quad aligned data on these platforms,
and didn't want to sacrifice performance, they used knowledge about alignment
for externals as well.  This happens both for presuming "just" probably
quad aligned (generating ldq/bic 255/stq) for storing 0 to a global character,
and in various addressing calculations (where it is not just a matter
of taking alignment faults.)

NT has since changed to longword granularity, but the generated code still
uses quad alignment presumptions in various ways, which would make it
incompatible to back off to shorter alignments.

Also, note that there are really two distinct issues (apart from
the documentation issue):

    - Are all global values (required to be) quadword aligned?

      The answer is yes, unless a special compiler option is used
      to defeat this behavior (NT only).  (The rule here is any module
      that defines data using this option can only have references
      from modules that also use the option - e.g. a whole application.)

    - Are all globals padded to a quadword?

      This is closely related, but it really boils down to whether or
      not you trust your data declarations.  (For example, one module
      might declare char foo[1], really meaning "the start of an array",
      while another might declare it as char foo[8].)  Currently,
      the compilers will not (as far as I know) take advantage
      of the padding unless the storage is actually declared
      in the current module.  e.g.

	char c; /* = 10; */
	int foo()
	{
	  c = 0;
	}

      will generate a stq if the initializer is present, and
      a ldq/bic/stq if not (ldl/bic/stl for NT at the moment).

There appear to be several "unresolved" threads.

For NT, there is a thread in decwet::nt-perf-war (note 41.*) that talks
some about quad vs. less-than-quad alignment.  There was a proposal
to back off from quad, but since existing generated code would break
or get alignment faults, this didn't happen.

For Unix, this topic came up (I thought) about a year ago, and (I thought)
it was resolved that this was a "system issue" and should be documented
in the assembly programmers manual or a similar place.  (A search of
a relatively current one didn't find anything, however.)  Unfortunately,
I don't see any old mail that talks about documentation, only that
"it is the property" (both acc and DEC C have the same behavior in this
area on Unix.)

Summary of current situation for both Unix and WNT:

    - Compilers are quad aligning, and presuming quad alignment for,
      global data

    - It doesn't appear to be documented in the appropriate places
      (calling standard and/or system documentation)

    - Compilers are not presuming padding for global data (e.g.
      stq without merge to store to global character)

Personally, it seems to me that we're paying the storage price, but not
fully exploiting the opportunity (by generating code presuming alignment
that would generate alignment faults, but not presuming the padding
to go with it.)

My recommendation is:

    - The calling standards should state that the default operating
      environment is that all global data must be quadword aligned
      and padded (reflecting all existing HLL compiled code.)
      This should also be documented in the assembly language guide
      for Unix.

    - The compilers should move to presuming that externally declared
      global data is padded at non-zero opt levels, unless some option
      is specified stating that this is a bad presumption (could be
      K&R mode or a new -unpadded_externals option.)

      (Basically, this is the case where we suffered because the architecture
      didn't have small accesses, used alignment to fix the problem, but then
      haven't completely exploited the knowledge that alignment has made
      available.)

(If we were to introduce a new OS now that only ran on ev56+ that didn't
have old object interaction requirements, globals/externals should only
be naturally aligned, presuming the OS didn't otherwise care for compatibility
issues.  However, that would also require that the OS and compilation system
honor byte granularity for all references outside of a given object -
i.e. all code would need to be generated using sb/sw where appropriate -
no expanded stq_u or stl.)
109.8call for an NT voteDECCXL::OUELLETTEWed Jan 29 1997 19:4211
There is a documentation deadline for the VC5.0 kit.
It is Friday 31 January.
If we would like to see this documented in the main Visual C documentation
some time in the next two years, we need to tell Abe Klagsbrun by then.
[Possiblly delivering an updated calling standard and assembler documentation.]

Is it possible to line all these NT ducks up on Thursday?
We can presumably take the Unix ducks at a more leisurely pace.

Your friendly trouble maker,
Roland.
109.9Please vote re following proposed ECOGEMGRP::BRENDERRon BrenderFri Feb 07 1997 09:5314
Well, I wasn't able to respond to Roland's last minute heads up in .-1, but
I am trying to follow up and resolve this topic. Following is the proposed ECO,
for both WNT and UNIX, to take care of it. The only point that isn't covered
in the earlier discussion in this thread concerns the global alignment for
128-bit floating and floating complex, which I left at octaword.

Under the assumption that this item is non-controvertial (often a surprisingly
damning assertion!) I will set a relatively short ballot period.

>>>>>     Please cast your ballots in this conference no later than      <<<<<
>>>>>                                                                    <<<<<
>>>>>			Monday, 17 February                              <<<<<

The proposed ECO text follows in the next reply...
109.10Proposed ECOGEMGRP::BRENDERRon BrenderFri Feb 07 1997 09:5446
			  WNT Call Standard ECO 7
			 UNIX Call Standard ECO 5
				regarding
			Alignment of Global Variables


Abstract
--------

The alignment of global variables is specified as being at least quadword.


Proposal
--------

In the WNT Call Standard, Section 4.2.1, page 4-10 (Jan 1997)...
In the UNIX Call Standard, Section 4.2.1, page 4-14 (Mar 1996)...

    ...replace the second paragraph with the following:

	To avoid such performance degradation, all data values on Alpha
	systems should be naturally aligned. Moreover, global data values
	should be at least quadword aligned (IEEE floating extended X and
	IEEE floating extended precision complex data should be octaword
	aligned whether global or not). Table 4-3 shows the data
	alignment requirements for non-global data.

    Note: for the UNIX document, the parenthetical clause begins

	...(extended precision real and extended precision complex data...

    to match other document terminology conventions.


Discussion
----------

The justification for this change is well set forth in ALPHA_CALLSTD
note stream 109, especially 109.7. Note in particular that this requirement
is already implemented and assumed in practice so that this ECO is just a
codification of existing practice rather than a change. That discussion did
not mention extended floating data; however, there seems no reason to weaken
the already existing octaword alignment for extended floating. (See
ALPHA_CALLSTD 76.11, and to a lessor extent, 72.13 topic 2, regarding
alignment for extended floating [aka X-floating]).
109.11Not just performance (see .3, .4)WIBBIN::NOYCEPulling weeds, pickin&#039; stonesFri Feb 07 1997 10:2811
NO.

The text says "to improve performance, you SHOULD align globals."
This is not strong enough, and gives the wrong reasons.

Compilers assume global data is aligned on these platforms, so that
they don't need to generate slow sequences to cope with granularity
issues for neighboring data.  If the assumption is wrong, the compiled
code will get wrong answers.

Therefore, globals MUST be aligned, to ensure correct results.
109.12Where is NT today?DECWET::MVBMonty VanderBiltFri Feb 07 1997 10:5218
RE .7

NT has since changed to longword granularity, but the generated code still
uses quad alignment presumptions in various ways, which would make it
incompatible to back off to shorter alignments.

...

For NT, there is a thread in decwet::nt-perf-war (note 41.*) that talks
some about quad vs. less-than-quad alignment.  There was a proposal
to back off from quad, but since existing generated code would break
or get alignment faults, this didn't happen.

======

On the surface these two statements seem contradictory. What is the actual
situation with NT today. I've read perf-war note 41 and talked to some folks
here but haven't gotten a clear understanding.
109.13allocation alignment != granularityWIBBIN::NOYCEPulling weeds, pickin&#039; stonesFri Feb 07 1997 11:067
NT still uses quadword-aligned allocation.

NT still presumes longword granularity of multiprocessor access.
That is, it presumes that if two threads updated data in different
aligned longwords, the hardware keeps the threads separate; if two
threads update data in the same aligned longword, it's up to the
software to coordinate the updates.
109.14GEMEVN::GLOSSOPOnly the paranoid surviveFri Feb 07 1997 11:2370
I don't understand what you think is contradictory.

    - The default granularity is long, which means that you can only modify
      a longword potentially containing real data if it is in fact modified
      by the original code.

    - The presumption for alignment of globals is quad, which means that
      sequences like ldq/merge/stq can be used for cases where granularity
      isn't violated.  (Note that granularity only applies to stores,
      not loads, so basically any useful transforms that can use quad
      alignment for globals might do so.)

Neither of these contradicts any text you quoted.

For example, take the following cases:

// load can be a ldq - loads aren't sensitive to granularity.  Would get
// an alignment fault if storage was only longword aligned.
//
extern char c;
int cval() {
    return c;
}

// This generates a ldq/stq due to quad alignment of globals, in spite
// of granularity, since both longwords of the target are being stored.
// Again, this will trigger alignment faults.
//
struct A { int a; int b; };
extern A first, second;
void assign() {
    second = first;
}

// This generates a ldah/ldq_u/extbl.  The extbl does not include any
// bias that would be required if the global were unaligned (would
// require an extra lda instruction to materialize the address.)
// This case would start silently getting the wrong answer if s were
// allocated to a boundary weaker than a quadword.
//
extern unsigned char s[100];
int sval(int i) {
    return s[i];
}

?cval@@YAHXZ::
        ldah    t0, h^?c@@3DA(zero)                     ; t0, h^?c@@3DA(zero)
        ldq     t0, l^?c@@3DA(t0)                       ; t0, l^?c@@3DA(t0)
        sll     t0, 56, v0                              ; t0, 56, v0
        sra     v0, 56, v0                              ; v0, 56, v0
        ret     ra                                      ; ra
        nop                                             ;
        nop                                             ;
        nop                                             ;

?assign@@YAXXZ::
        ldah    t0, h^?first@@3UA@@A(zero)              ; t0, h^?first@@3UA@@A(zero)
        ldah    t1, h^?second@@3UA@@A(zero)             ; t1, h^?second@@3UA@@A(zero)
        ldq     t0, l^?first@@3UA@@A(t0)                ; t0, l^?first@@3UA@@A(t0)
        stq     t0, l^?second@@3UA@@A(t1)               ; t0, l^?second@@3UA@@A(t1)
        ret     ra                                      ; ra
        nop                                             ;
        nop                                             ;
        nop                                             ;

?sval@@YAHH@Z::
        ldah    a0, h^?s@@3PAEA(a0)                     ; a0, h^?s@@3PAEA(a0)
        ldq_u   t0, l^?s@@3PAEA(a0)                     ; t0, l^?s@@3PAEA(a0)
        extbl   t0, a0, v0                              ; t0, a0, v0
        ret     ra                                      ; ra
109.15Need to update Assembler manualSTEVEN::hobbsSteven HobbsFri Feb 07 1997 11:2918
In other places of the calling standard we outlaw similar behavior for
standard interfaces and calls between languages but we allow that
behavior to be used in nonstandard, language-specific conventions
(eg. record layout conventions in section 4.2.3).  In this particular
case the rule can be used by object module optimization tools so the
rule cannot be broken even for language-specific conventions.

The assembler and the BLISS language have syntax that allows external
definitions that force nonaligned global variables.  These assembler
and BLISS programs will now become illegal and we should document this
restriction in the Digital Unix Assembly Language Programmer's Guide
(and the corresponding Alpha NT manual, if any).  (It would probably
be good to document this rule in any future update of the BLISS User's
Guide and to modify the BLISS compiler to give an error message on
illegal GLOBAL BIND declarations ;-).  I do not believe that any other
language can force a violation of this rule (although Digital Pascal
does have syntax that tells a compiler to assume that this rule has
been broken for particular externals).
109.16DECC::OUELLETTEFri Feb 07 1997 11:4311
I'm not so sure that GEM aligns long doubles to more than QUAD or makes
other assumptions about them.  Comments from GEM and/or Fortran please?
My C/C++ backround on NT (where long double was an alias for double)
kept me from learning these details.  Also Paul Winalski may have some
insight about the linker and GEM OM NTCOFF about this.

Not that I'm entitled to vote on this...  but beyond Bill's MUST
detail and getting the IEEE X stuff right, this looks good and is
what I wanted.

R.
109.17More assembler issuesSMURF::GAFJerry Feldman, Unix Dev. Environment, DTN:381-2970Fri Feb 07 1997 13:2140
    WRT: Steve Hobbs' statement in .15. 
    The NT Assembler document is it's help file. The help file is currently
    being revised for VC++5.0. The wording regarding alignments in the
    asaxp help file is identical to the wording in the DUX Assembly
    Language Programmers' Guide. 
    
    The assembler issues are:
    First, the assembly language programmer has full control over
    alignment, and can place a quad integer on a byte boundary if that is
    desired.
    
    The default alignment for all locally defined variables, whether local
    or global is based upon the natural alignment of the type. 
    
    My concern is more for existing assembly language programs that may
    make assumptions based upon the historical alignment. 
    
    Also, the assembler does not associate symbols and data. In a data
    section, the assembler could force a public symbol to be quad aligned.
    But, where NT allows data to be interspresed with code in the text
    section, there could be some interesting problems:
        .globl fubar2
    	.globl fubar
    	.ent   fubar
    fubar:	#let's asume that we force entry points to be aligned
    	<some code>
    	# We are on a word boundary
    fubar2:	#If we align quad here, 
    	        #what happens if the code in fubar falls through to fubar2.
                #The assembler would generate a nop prior to fubar2.
    	
    I don't really see any showstopping issues here, nor do I see any
    difficulty implmementing this ECO in the assembler. The issue is that
    this has the effect of causing some behavioral changes in the generated
    code which may be undesirable in some low level, high performance
    code. 
    	<more code>
    
    
    
109.18GEMEVN::GLOSSOPOnly the paranoid surviveFri Feb 07 1997 16:1124
Let's be clear here - we are writing down things the compilers already
do and expect.  Quad alignment is not new behavior or a new expectation
for the compilers.  The only proposed new behavior was to actually use
knowledge of padding to avoid loading/storing the padding for updates
of small globals.

Note that nothing prevents "private agreements" between assembler modules,
just like private linkages are allowed between routines.

(i.e., this is a "standard agreement constraint", like standard parameter
list layout, that must be obeyed if you expect standard code to interact,
rather than an "invariant" requirement like the contents of the sp and gp
registers.)

Just like non-standard linkage declarations are provided for "consenting
modules", the proposal was that the unaligned attribute be the equivalent
for global storage on the C side.

Note that I don't think the assembler should change at all, unless it
is to add a warning that a global symbol's low-order bits won't be <000>.
(i.e. I don't believe you should do any automatic alignment that isn't
done already, though others might believe that should happen.  Of course,
I'm still someone that thinks of assemblers as symbolic machine code,
not something that does scheduling for you, etc.)
109.19OK, revised ECO proposal changing "should" to "must"GEMGRP::BRENDERRon BrenderFri Feb 07 1997 16:5946
			  WNT Call Standard ECO 7
			 UNIX Call Standard ECO 5
				regarding
			Alignment of Global Variables


Abstract
--------

The alignment of global variables is specified as being at least quadword.


Proposal
--------

In the WNT Call Standard, Section 4.2.1, page 4-10 (Jan 1997)...
In the UNIX Call Standard, Section 4.2.1, page 4-14 (Mar 1996)...

    ...replace the second paragraph with the following:

	To avoid such performance degradation, all data values on Alpha
	systems must be naturally aligned. Moreover, global data values
	must be at least quadword aligned (IEEE floating extended X and
	IEEE floating extended precision complex data must be octaword
	aligned, whether global or not). Table 4-3 shows the data
	alignment requirements for non-global data.

    Note: for the UNIX document, the parenthetical clause begins

	...(extended precision real and extended precision complex data...

    to match other document terminology conventions.


Discussion
----------

The justification for this change is well set forth in ALPHA_CALLSTD
note stream 109, especially 109.7. Note in particular that this requirement
is already implemented and assumed in practice so that this ECO is just a
codification of existing practice rather than a change. That discussion did
not mention extended floating data; however, there seems no reason to weaken
the already existing octaword alignment for extended floating. (See
ALPHA_CALLSTD 76.11, and to a lessor extent, 72.13 topic 2, regarding
alignment for extended floating [aka X-floating]).
109.20ECO should be "invariant" ruleSTEVEN::hobbsSteven HobbsFri Feb 07 1997 20:0239
In .18 Kent Glossop says:.

> Note that nothing prevents "private agreements" between assembler modules,
> just like private linkages are allowed between routines.

> (i.e., this is a "standard agreement constraint", like standard parameter
> list layout, that must be obeyed if you expect standard code to interact,
> rather than an "invariant" requirement like the contents of the sp and gp
> registers.)

> Just like non-standard linkage declarations are provided for "consenting
> modules", the proposal was that the unaligned attribute be the equivalent
> for global storage on the C side.

The calling standard has not been very clear when "private agreements"
allow "consenting modules" to break the rules and when the standard
specifies an "invariant" requirement that may never be broken.

I disagree with Kent in that I believe that this ECO is specifying an
"invariant" that may not be broken by consenting modules.  Kent
mentions that compilers already follow this rule and the change will
allow compilers to take advantage of it.  I also believe that the
wording of this ECO will allow NTOM to generate code that assumes
certain alignments when optimizing object modules and that the linker
and loader will also be allowed to assume certain alignments when
setting up references to independently loaded DLLs.  If the NTOM, the
linker, and the image activator (loader?) are allowed to take
advantage of these rules then separate assembly language modules may
not violate the rule when defining shared global symbols.

If only compilers are required to assume these rules have been
followed, and if assembly code (and assembly code combined with
globals, but not externals, from consenting BLISS modules :-) is
allowed to violate the rules then we must specify in the calling
standard that object optimizers, linkers, loaders, and image
activators may not assume that the rule is followed.

I prefer no restrictions on which software components may assume the
rules are followed.
109.21I believe "Excel"/existing practise has a veto to invariantGEMEVN::GLOSSOPOnly the paranoid surviveFri Feb 07 1997 22:5374
Maybe we should be more clear about "private agreement" vs "invariant",
since some prior problems have been directly related to which category
things are in (for example, the issue around NT linkages and transfer-
point available "volatile" registers.)

Personally, I believe the statements at the the beginning of the calling
standard make it quite clear that unless things are *explicitly* stated
to be invariant, they are merely cooperative convention.

    "Compiler writers are encouraged to make optimal use of such
    optimizations as appropriate while always ensuring that procedures
    outside the compilation unit can proceed as if the letter
    of the standard were met."

which I would take to mean that anything that is not *explicitly* stated
as an invariant, isn't.

(Remarkably like like the 10th amendment: The powers not delegated
to the System by the Calling Standard, not prohibited by it
to the Applications, are reserved to the Applications respectively,
or to the coders. :-) )

i.e. if you want a byte-aligned global, or a procedure with 20 parameters
in registers, or some other truly bizarre thing, that's *your* business.

Personally, I don't believe this ECO can fall in the invariant category
because there are significant applications that are built (NT) that are
of the consent variety that don't fit the quadword model.  If the image
activator (for example) were modified to presume this as an invariant,
a real, existing (important) application would break, which I consider
a totally unacceptable change in the current defacto rules.

(From a number of perspectives, I believe it is desirable to have as few
invariant rules as possible to limit constraints on future transformations
and people attempting to make fast special-purpose code.  If we were starting
from scratch, the architecture would have byte/word ops, and there wouldn't
be any global alignment restrictions.  But we aren't, so it is worth codifying
the implications of the rules that were initially implemented to allow them
to be used to maximum advantage - within the bounds that currently exist.)

FWIW - I believe the only things that are really invariants are run-time
things that can be determined if exception goes off at an arbitrary point,
and "link-time" items that can be determined from analysis of the code
and associated procedure descriptors.

    - sp validity
    - gp validity (Unix)
    - r28 killed for external calls
    - stack probing
    - accurate PD information and conformance (e.g. NT reverse executability)
      (note that the implementation actions for things are normally "as if")
    - [if ECOed] complete PD coverage for code

plus and object language rules.

Note that the calling standard does not prohibit any of the following (as
far as I know):

    - unrelocated branches (i.e. materializing an offset and jumping PC-rel)
    - placing data in the code stream
    - self-modifying code
    - using code as data

etc.  One can argue that there should be a set of "realistic" rules to allow
post-processing, but I think that is distinctly different from the set
of rules that the loader and run-time system *must* follow, which is that
*only* the invariants must be valid.  (For example, I was told that the FX!32
interpreter uses one of the items in the above bullet list, and other apps use
at least one of the other 3.  I do NOT believe that the loader/run-time system
is allowed to retro-actively invalidate these.  I DO believe that a set
of consent rules for post-processing is eminently reasonable, and that set
can be more restrictive than currently implemented, with the statement that
you may need to rebuild and re-examine your code for suitability for post-
processing.)
109.22YES.WIDTH::MDAVISMark Davis - compiler maniacMon Feb 10 1997 10:2250
The intent and the wording are fine.


Comments on some other postings:

.3:
> For the following source:
> 	extern char c;
> 	int fetch() { return c; }
> 
> I believe GEM feels entitled to generate
> 
> FETCH:	LDAH	R0, h^c(R31)
> 	LDQ_U	R0, l^c(R0)
> 	AND	R0, 0xFF, R0
> 	RET	R26

Why does GEM use ldq_u instead of ldq if it knows the address is
aligned?  It would be safer to use ldq, because it would always get 
the correct value (via kernel fixup if the address is unaligned).
This would fix some of the possible mistakes from assuming alignedness -
it would NOT fix expressions using the Address itself, such as
	1 & (&c)
which will be assumed to be 0.


.7 and others:  Kent asks for assurance of padding in addition to 
quadword alignment.  I believe this is a Language issue, not a calling
standard issue.  E.g., in C, if I say:
	extern char c;
can I match that with
	char c[8];  ??

It is up to the language (and the compiler strictness) to determine 
whether these are the same, or "illegal".

If this is illegal, then the (new) calling std rule of quadword alignment
of globals will prevent different globals from overlapping in a 
quadword.  Hmm, OK, I see the problem; without a padding rule, you can
pack a LOCAL into the same quadword as a global:

	char c;
	static char d;	// shares quadword with c


.17: Assembler changes:  The assembler shouldn't do any automatic
alignment for globals.  The only "changes" should be:

	1. document the calling std rule
	2. maybe issue a warning if global isn't quad aligned.
109.23Users need freedom to pessimize local dataWIBBIN::NOYCEPulling weeds, pickin&#039; stonesMon Feb 10 1997 11:1114
You changed too many SHOULDs to MUSTs.  The text should :-) read

	To avoid such performance degradation, all data values on Alpha
	systems SHOULD be naturally aligned. Moreover, global data values
	MUST be at least quadword aligned ...

I don't know offhand what we should say about X_float.  If we ever get
hardware support for it, it's likely to trap if they're not octa-aligned.
But I don't know if that's enough to say they MUST be octa-aligned, or
only SHOULD be.  If the hardware were simply to ignore address<3>, we would
have a reason to say MUST, but that seems unlikely to me.

I'm pretty sure that GEM and the math library do not currently support
X_complex, though there's been a (tiny) bit of call for it.
109.24DECCXL::OUELLETTEMon Feb 10 1997 11:283
The part about xfloat that I was confused on was the argument passing
part where only QUAD alignment is required (how do you get local IEEEX's
on the stack OCTA aligned then?)
109.25SMURF::RICKABAUGHMike Rickabaugh Quo flamma est?Mon Feb 10 1997 13:1110
I have little problem with this ECO (with Bill's last wording) mainly
because we've been doing it since V1.2.

The exception to that is Xfloat.  All link-time commons are quadword
aligned independent of the type of common.  So any Xfloat common isn't
currently aligned to octaword.

The UNIX linker and loader would need to address this to conform.

-mike
109.26Ok with asaxpSMURF::GAFJerry Feldman, Unix Dev. Environment, DTN:381-2970Mon Feb 10 1997 17:154
    I also support this ECO. The assembler does not need to force
    alignments on programmers other than what is currently document. It is,
    and should be the code writer's responsibility to comply with the
    calling standard. 
109.27Calling std nec./not suff, for paddingGEMEVN::GLOSSOPOnly the paranoid surviveMon Feb 10 1997 17:4744
> I believe GEM feels entitled to generate
> 
> FETCH:	LDAH	R0, h^c(R31)
> 	LDQ_U	R0, l^c(R0)
> 	AND	R0, 0xFF, R0
> 	RET	R26

> Why does GEM use ldq_u instead of ldq if it knows the address is
> aligned?

In general, I wouldn't expect that it *probably* would in this case.
However, "probably" is different that "will".  (Given cost-based
pattern selection, different contexts might yield results.  There is
a slight bias for ldq over ldq_u given scheduling impacts, etc., however
there definitely hasn't been any attempt to ensure that there are always
lower-cost aligned quad load patterns, since ldq_u doesn't cost any
more at the processor level to execute than a ldq.)

>  It would be safer to use ldq, because it would always get 
> the correct value (via kernel fixup if the address is unaligned).

GEM has no concept of "safer" - either it *is* safe (and violations are
considered bugs) or it isn't.  If it is safe, it may happen as a result
of pattern costing or other transforms...

(This has come up repeatedly with GEM - there's a big difference between
saying something "probably won't" occur due to costing, and *definitely*
won't - the latter requiring pattern tests in the code generator and guards
elsewhere to ensure that inappropriate patterns/transforms don't get selected -
e.g. the work that has been done to support the two "flavors" of volatile,
byte and long granularity, etc..)

(A variety of things have happened because we try to write fairly general
transforms.  One example that comes to mind from final was general loop
rotation code that tried to remove unconditional branches that rotated
a procedure entry with no prolog code into the middle of a while loop...)

> .7 and others:  Kent asks for assurance of padding in addition to 
> quadword alignment.  I believe this is a Language issue, not a calling
> standard issue.

It's both - the calling standard part is necessary, but not sufficient,
to allow this optimization (which should probably be treated in the same
category as -ansi_args on Unix - maybe -ansi_globals?)
109.28YesGEMGRP::GROVEThu Feb 13 1997 10:122
    Yes, with Bill's most recent amendments for "should" and "must".
    /Rich
109.29STEVEN::hobbsSteven HobbsThu Feb 13 1997 13:2814
I will vote Yes if:

(1) X-float and other octaword aligned data *must* be quad aligned but
*should* be octa aligned.  We have existing Unix objects that have
X-floats that are not quad aligned.  We should not obsolete these
existing object files without a good reason.  If someone believes we
will add a load-octa instruction to Alpha then that is a good reason
to change my request to require octa alignment and to require these
existing objects to be recompiled before they will run on the future
system.

(2) The rule applies only to standard interfaces.  A nonstandard
interface (written in assembler or Visual C with the special Excel
switch) is allowed to use nonstandard alignments.
109.30DECWET::GLOSSONThu Feb 13 1997 14:392
I'll vote yes as well, with the must/should
replacement as suggested.
109.31ECO draft 3 -- and division of the questionGEMGRP::BRENDERRon BrenderThu Feb 13 1997 14:4776
OK, I'm going to exercize a little moderatorial (that's not a word, but you
know what I mean) discretion and divide the question in order to make progress.

Q1: Require (at least) quadword alignment for globals

Q2: Reduce the required alignment for 128-bit extended floating point (aka
    X-floating) to quadword (at least for globals?).

Following is the revised ECO (one more time), hopefully with the right number
of "should"s and "must"s in the right places. It is otherwise worded as it
was before because the octaword requirement is not new with this ECO, it merely
follows from what was previously approved in ECOs 38/VMS, 1/UNIX and 1/WNT
(see Note 76).

Because there appears to be no controversy regarding Question 1, I am going
to continue this ballot until the previously announced closure date of
Monday, 17 February. Please vote (and I will so interpret votes), if you
have not already, as not prejudicing the octaword alignment question one way
or the other. I will start a new note on that topic (if necessary).

The question of 128-alignment for commons has been placed on the
Agenda of the UNIX Object File and Symbol Table Working Group (OFSTWG) for its
next meeting on Monday. (All other sections, including TLS sections, already
provide octaword alignment so only commons are at issue.) If that group
should decide to change the UNIX rules to provide octaword alignment for
commons, then we can consider Question 2 to be "resolved".

Here is the now thrice revised ECO...


			  WNT Call Standard ECO 7
			 UNIX Call Standard ECO 5
				regarding
			Alignment of Global Variables


Abstract
--------

The alignment of global variables is specified as being at least quadword.


Proposal
--------

In the WNT Call Standard, Section 4.2.1, page 4-10 (Jan 1997)...
In the UNIX Call Standard, Section 4.2.1, page 4-14 (Mar 1996)...

    ...replace the second paragraph with the following:

	To avoid such performance degradation, all data values on Alpha
	systems should be naturally aligned. Moreover, global data values
	must be at least quadword aligned (IEEE floating extended X and
	IEEE floating extended precision complex data must be octaword
	aligned, whether global or not). Table 4-3 shows the data
	alignment requirements for non-global data.

    Note: for the UNIX document, the parenthetical clause begins

	...(extended precision real and extended precision complex data...

    to match other document terminology conventions.


Discussion
----------

The justification for this change is well set forth in ALPHA_CALLSTD
note stream 109, especially 109.7. Note in particular that this requirement
is already implemented and assumed in practice so that this ECO is just a
codification of existing practice rather than a change. That discussion did
not mention extended floating data, so this ECO is written in a manner that
is consistent with the prior 128-bit extended floating ECO.

(A suggestion to relax the octaword alignment requirement for extended
floating may be considered as a separate ECO.)
109.32Octaword alignment is SHOULD, not MUST, for X_floatWIBBIN::NOYCEPulling weeds, pickin&#039; stonesThu Feb 13 1997 15:4820
> It is otherwise worded as it
> was before because the octaword requirement is not new with this ECO, it merely
> follows from what was previously approved in ECOs 38/VMS, 1/UNIX and 1/WNT
> (see Note 76).

>  Moreover, global data values
>	must be at least quadword aligned (IEEE floating extended X and
>	IEEE floating extended precision complex data must be octaword
>	aligned, whether global or not).

Note 76 says that "natural alignment" for X_float and X_complex is octaword.
And it reiterates that data *should* be naturally aligned.  Nowhere that I
can find does it say that X_float *must* be naturally aligned -- in fact, it
discusses the fact that X_float members of structs passed by value might not
be naturally aligned.  I think the sentence quoted above, requiring all X_floats
to be octaword aligned, is a new requirement, and I disagree with it.

The requirement we're trying to add is that global data, of whatever type,
*must* be at least quadword aligned.  Let's stick to that question (on which
I vote YES).
109.33STEVEN::hobbsSteven HobbsSat Feb 15 1997 17:0012
I vote NO for the two reasons given in .29.  My vote will change to
YES if we drop the mandatory requirement for octaword alignment of
X-floats and if we change the words "global data values must be at
least quadword aligned" to instead read "global data values shared
across a standard interface must be at least quadword aligned".

We have existing object code files that violate these rules
(eg. Fortran REAL*16 variables on Unix and EXCEL on NT).  The current
ECO allows changes to ld, loader, and the NT equivalents that could
break these existing objects.  I will vote yes if these existing
objects will continue to be supported by the linker and image
activator.
109.34Octaword alignment news from UNIX OF/STWGGEMGRP::BRENDERRon BrenderMon Feb 17 1997 14:0622
The question of octaword alignment for globals was considered at the UNIX
Object File and Symbol Table Working Group this morning. After a brief
discussion, the group endorsed the following suggestion from Mike Rickabaugh:

    Commons that are at least 16-bytes in size, will be octaword (16-byte)
    aligned; others will continue to be quadword aligned.

The rational is as follows:

    Many C programs continue to use the common model for external/global
    variables (rather than the ref/def model), so that there are often
    a large number of commons in a UNIX link -- one for each external
    variable! For the large proportion of externals that are for scalars,
    increasing the alignment from quadword to octaword will increase
    the storage required for no benefit. (Recall that all Alpha scalars
    other than extended floating are a quadword or less in size.) The
    proposal assures octaword alignment when it might matter, but does
    not impose extra storage requirements for typical smaller commons
    (ints, longs, pointers, and so on).

Note that no recompilation is required to achieve this increased alignment --
only relinking.
109.35Final ECO TextFLYBA::BRENDERRon BrenderTue Feb 18 1997 11:2048
			  WNT Call Standard ECO 7
			 UNIX Call Standard ECO 5
				regarding
			Alignment of Global Variables


Abstract
--------

The alignment of global variables is specified as being (at least) quadword.


Proposal
--------

In the WNT Call Standard, Section 4.2.1, page 4-10 (Jan 1997)...
In the UNIX Call Standard, Section 4.2.1, page 4-14 (Mar 1996)...

    ...replace the second paragraph with the following:

	To avoid such performance degradation, all data values on Alpha
	systems should be naturally aligned. Moreover, global data values
	shared across a standard call must be quadword aligned. Table 4-3
	shows the data alignment requirements for non-global data.


Discussion
----------

The justification for this change is well set forth in ALPHA_CALLSTD
note stream 109, especially 109.7. Note in particular that this requirement
is already implemented and assumed in practice so that this ECO is just a
codification of existing practice rather than a change.

That discussion did not mention extended floating data, and this proposal
originally extrapolated the octaword recommendation for extended floating
in general to a "must requirement" for global data in particular. That
extrapolation has generated controversy (see 109.25, .29, .32, .33). I
originally intended to separate the question for extended floating by
retaining the octaword requirement in the ECO, but have now concluded that
it is cleaner and more appropriate to take it out of this ECO so that
this ECO can stand fully supported. (See note 111, where the alignment
requirement of global extended should now be considered.)

This final draft does respond to Steve's other concern by including the
phrase "shared across a standard call" ("standard call" is a defined term in
the call standard, which "standard interface" is not).
109.36APPROVEDFLYBA::BRENDERRon BrenderTue Feb 18 1997 11:266
Having been accepted by the Calling Standard Committee and there being no
issues or suggestions posted (after separating the matter of extended floating
alignment), this ECO is

                                APPROVED

109.37Post deadline nit pickingSTEVEN::hobbsSteven HobbsWed Feb 19 1997 11:4226
I would like to suggest what should be an editorial change to the
wording of this ECO.  I suggest that in the second sentence that
"global data values" be replaced with "global symbols".

Consider the following ANSI Fortran:

	REAL B
	DOUBLE PRECISION A,C
	COMMON /X/ A,B,C

If we align the global symbol X then the ANSI Fortran standard states
that the global data value represented by local symbol C is not
aligned.  I believe that we intended to align the location referenced
by the global symbol and not the components of a global structure.
The original wording outlaws certain ANSI Fortran COMMON blocks (such
as the above) as well as PACKED data structures in languages like
Pascal and Ada.

The alternative is keep the current wording and to specify that
certain COMMON blocks and PACKED global data structures with
misaligned components are limited to private use within one language
and are not supported across standard calls.  I can accept either
this alternative or the suggested editorial change but if we take this
alternative approach then we should include a note in the standard to
make it clear that such standard language elements are not supported
across standard calls.
109.38Suggested textGEMGRP::BRENDERRon BrenderFri Feb 21 1997 16:469
re .37: I think the following text should do the trick...

        To avoid such performance degradation, all data values on Alpha
        systems should be naturally aligned. Moreover, the base address
	of a global variable or aggregate (but not necessarily the
	components of the aggregate)
        shared across a standard call must be quadword aligned. Table 4-3
        shows the data alignment requirements for non-global data.

109.39WNT Call Std Working Draft X1.10 AvailableGEMGRP::BRENDERRon BrenderTue Feb 25 1997 15:195
Working draft X1.10 (with change bars) of the Alpha WNT Calling Standard, which
incorporates ECOs 6 through 8 is available in

        GEMGRP::GEM2$:<BRENDER.PUBLIC>ALPHA-WNT-CALL-STD-X110-970225.PS