Conference turris::languages

Title: Languages
Notice: Speaking In Tongues
Moderator: TLE::TOKLAS::FELDMAN
Created: Sat Jan 25 1986
Last Modified: Wed May 21 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 394
Total number of notes: 2683

134.0. "Terminology for elements of programs" by TOKLAS::FELDMAN (PDS, our next success) Wed Apr 01 1987 18:51

    We need to develop terminology to describe the different sorts of
    elements that comprise a program.  So far, we have the following: 
    
    	1. FILE
    	2. COMPILATION UNIT (module, library unit)
    	3. SUBUNIT (routine, procedure, function, task, package, generic unit)
    	4. OBJECT (variable, type, constant, common block, subtype, etc.)
    
    Note that there is some overlap -- in some languages, a routine
    is also a compilation unit, for example.  Let's ignore this issue
    for the moment, and concentrate on the terminology.
    
    I feel comfortable with the first two terms; they are both fairly
    concrete and established.  I'm less comfortable with subunit, and
    not at all comfortable with object.  
    
    Are there any suggestions for better terms?  Are there any elements
    that aren't included in the above list?
    
       Gary
    
    PS The motivation for making these distinctions is as follows. 
    We're building a mechanism for associating tagged comments with
    code elements, and we need to describe the restrictions on the tags.
    For example, it may make sense to have a modification history for
    a file or a compilation unit, but probably not on a variable.
    Likewise, it makes sense to have an invariant condition on a variable
    (such as "I is always even") but not on a file.

134.1. "Let's muddy the waters..." by YIPPEE::DELISLE Wed Apr 01 1987 20:36, 14 lines

    We use still another term: subsystem.  A subsystem is one or more
    files and/or compilation units assigned to a single programmer for
    coding.  It is a "black box" that performs one or more related tasks
    for a program, such as file I/O.  Suppose your program needs to
    talk to a listing file, a journal file, and a shadow file.  Assign
    all these activities to the FIO (file I/O) subsystem of the program.
    The programmer in charge of same will then provide entry points
    such as FIO_OPEN_LIST_FILE, FIO_WRITE_LIST_FILE, etc.  These routines
    are the only place you will ever see calls to routines starting
    with SYS$ or LIB$, etc.  It makes a program very modular and easy
    to change.
    
    Define and assign subsystems, and away you go!
    Uncle Ray 
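
    A minimal sketch of the subsystem idea, written in Python rather than
    any of the languages discussed here.  The function names mirror
    FIO_OPEN_LIST_FILE and FIO_WRITE_LIST_FILE; the in-memory stream is an
    assumption standing in for the SYS$ and LIB$ calls, which the rest of
    the program never sees.

```python
import io

# Private state of the hypothetical FIO subsystem; callers never touch it.
_list_file = None


def fio_open_list_file(stream=None):
    """Open the listing file. An in-memory stream stands in for the system service."""
    global _list_file
    _list_file = stream if stream is not None else io.StringIO()
    return _list_file


def fio_write_list_file(text):
    """Write one line to the listing file through the subsystem's only write path."""
    if _list_file is None:
        raise RuntimeError("list file is not open")
    _list_file.write(text + "\n")
```

    Swapping the storage mechanism then means changing these two routines
    and nothing else, which is the modularity the note is after.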

134.2. by TLE::BRETT Wed Apr 01 1987 21:21, 92 lines

    Curiously enough, I am least happy with the first two.  I don't
    understand what files have to do with programs, apart from the fact
    that in the 1950's people decided that files would be a convenient
    place to store their program!  If you are going to include them
    for that reason I don't see why you don't include line printers
    as a place where people decided it would be convenient to print
    them...
    
    Modules were largely an invention of the macro assembler people,
    who needed a shell to wrap around the string of unrelated pieces
    in a typical chunk of assembly code.  Pascal modules and programs
    are much closer to packages, except that they are only allowed at
    the outermost level.  If we had been designing a DEBUGGER now we
    would never put this concept in, and I don't know why our tools
    keep on propagating this fundamental mistake.
    
    In Ada, compilation units are a SUBSET of the "subunits", not a
    separate class.  This is also true in FORTRAN.
    
    
    Enough of this background, what are the fundamental concepts in
    main-stream programming languages today?
    
    
    Essentially you have values, sets of values, and containers for
    values.
    
    	Values are typically 

		atomic:	points (only have equal/not-equal operations)
    
    			enumerations {including character} (also have
    				a less-than operation),
    
    			integers, floating point, fixed point
    				{binary and decimal}  (have various
    				arithmetic operations)       
    
    			algorithms (eg: procedures, routines,
    				LISP anythings!?)
                                       
    			labels (esp. Fortran and PL/I)
    
    		composite:
    			sets
    			arrays
    			records (unions and variants)
    			lists, queues, stacks, etc.
    			threads (VMS processes, Ada tasks)
    
    
    		reference:
    			access-types (Ada), pointers (Pascal),
    			aliases (Ada renamings)
                                  
    
    	Sets-of-values are usually called "types" or "classes".  Sometimes
    	some subsetting of a type's values is available (eg Ada's subtypes).
                 
    
    	Containers for values are sometimes "typed" meaning they can
    	only hold values of a particular "type".
    
    	Examples are Pascal variables,
    	Ada variables and constants {X : constant INTEGER := 10;}
    
    	In special cases they hold algorithms (Pascal procedure-parameters,
    	Ada generic-formal-subprograms).
    	                             
    
    
    So, summing up the above, the naming convention should be something
    along the lines of
    
    	Algorithm	(routine, procedure, function, task)
    	Value		(algorithm, integer, etc) 
    
    	Container	(package, variable, some parameters)
    
    	Type		(type)
	Subtype		(subsets of a type)
    
    	Alias		(some parameters, renamings)
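
    One way to make the categories above concrete is a small lookup table.
    This is only a sketch in Python: the enum and the table entries are
    illustrative, taken from the parenthesised examples in the list.
    "parameter" carries two categories because the list places "some
    parameters" under both Container and Alias.  The list also counts
    algorithms as values, which this flat table does not try to model.

```python
from enum import Enum


class Category(Enum):
    ALGORITHM = "algorithm"
    VALUE = "value"
    CONTAINER = "container"
    TYPE = "type"
    SUBTYPE = "subtype"
    ALIAS = "alias"


# Illustrative entries only; a real tool would populate this per language.
CONSTRUCT_CATEGORIES = {
    "routine": {Category.ALGORITHM},
    "procedure": {Category.ALGORITHM},
    "function": {Category.ALGORITHM},
    "task": {Category.ALGORITHM},
    "integer": {Category.VALUE},
    "package": {Category.CONTAINER},
    "variable": {Category.CONTAINER},
    "parameter": {Category.CONTAINER, Category.ALIAS},
    "type": {Category.TYPE},
    "subtype": {Category.SUBTYPE},
    "renaming": {Category.ALIAS},
}


def categories_of(construct):
    """Return the set of categories for a construct name, empty if unknown."""
    return set(CONSTRUCT_CATEGORIES.get(construct, ()))
```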
                                                                            
    
    /Bevin
    
    PS: I expect the above to be regarded as "too radical", but I seriously
    believe we need to breathe some rationality into the typical irrational
    categorisation that is used.  Ideally tools would be multi-lingual
    rather than common-lingual, so that each person could speak to them
    in the language terms (s)he is most used to.

134.3. "Keep those cards and letters coming" by TOKLAS::FELDMAN (PDS, our next success) Thu Apr 02 1987 11:56, 34 lines

    I guess I need to clarify .0 by stating that I'm not trying to build
    a model of program structure  (I wish I were -- it would be a good
    thing to do).  I'm trying to build a model that represents commenting
    styles in existing programs.  In other words, I need to describe
    the different places you might put a comment, and to distinguish
    such places (coarsely) according to what type of comment might belong
    there.
    
    Given that, the idea of subsystem in .1 is excellent, but we need to
    be careful in using it.  My preference is to reserve that term for
    concepts that do not currently map onto single concrete things.
    Specifically, I would use the term subsystem to refer to sets of
    compilation units or subsystems.  Unfortunately, we don't have any
    place to describe such collections, nor are we currently trying
    to provide such a place.  (Again, I think we should, and it would
    be a good thing, but it's not on the list.)
    
    Re: .2
    
    I don't see your proposal as radical at all; I'd love to use it.
    Unfortunately, it doesn't correspond to the problem I'm trying to
    solve.  For example, files have nothing whatsoever to do with the
    Fortran language, but they are used as a unit of organization of
    Fortran programs.  The comment block that might occur at the beginning
    of a Fortran source file isn't associated with any element of the
    Fortran language, but I still need to be able to talk about such
    comment blocks.
    
    I do think the proposal in .2 will help give some insight into the
    terminology we need.
    
    Please keep your ideas coming in.
    
       Gary

134.4. by BACH::VANROGGEN Fri Apr 03 1987 13:15, 8 lines

    I agree that #4, OBJECT, is definitely undesirable, since it has
    too many other denotations and connotations.
    
    Since you seem to be making the distinction on whether the "thing"
    is mutable (the first three are, but the fourth isn't), how about
    calling them ATOMs?
    
    			---Walter

134.5. "! coming up" by MODEL::YARBROUGH Fri Apr 03 1987 14:47, 10 lines

One way of looking at comments is as generalisations of whitespace. As far 
as a language compiler is concerned, the two are (usually) equivalent. It 
is only people, or processing programs with some entirely different extra-
linguistic semantic rules, that 'understand' comments. So a commented 
program contains two (or more) intermixed sets of semantics.

I am reminded of a "Grook" by Piet Hein, about people and mice living in 
a house, each living "inside the walls" that delimit the other's living 
space. The people have the larger rooms, but otherwise their view of the 
house is topologically the same.
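
The whitespace analogy can be checked directly in a language whose parser is
scriptable. The sketch below uses Python's standard `ast` and `tokenize`
modules; the sample source strings are made up for illustration. The parser
sees the same program with or without comments, while a separate pass can
recover the comments as the second set of semantics.

```python
import ast
import io
import tokenize

PLAIN = "total = price * quantity\n"
COMMENTED = (
    "# invariant: quantity is never negative\n"
    "total = price * quantity  # recomputed on each order\n"
)


def same_program(a, b):
    """True when the parser builds the same syntax tree for both sources."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


def comments_in(source):
    """Collect the comment tokens that the parser discards."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]
```

Here `same_program(PLAIN, COMMENTED)` holds even though `comments_in(COMMENTED)`
finds two comments, which is the "two intermixed sets of semantics" in
miniature.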

134.6. "How will terms be used?" by TLE::LUPTON Fri Apr 03 1987 18:00, 5 lines

    It isn't clear to me why you need this terminology.  Is this a way you
    wish to document something?  As I understand PDS, it will provide for
    attaching comments to items that SCA understands.  SCA has names for
    these items already; they are the legal values for the /SYMBOL
    qualifier. How will this new terminology be used?

134.7. by TOKLAS::FELDMAN (PDS, our next success) Fri Apr 03 1987 19:38, 7 lines

    SCA has names for the individual items, but not for the classes.
    I'd like to avoid stating that an invariant is only allowed on 
    {component, constant, argument, type, variable}, or that functional
    descriptions are allowed on {files, routines, functions, procedures,
    subroutines, tasks, modules, packages, programs}.
    
       Gary
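
    A rough sketch of what naming the classes buys, in Python.  The class
    names "data_item" and "code_unit" are placeholders invented for this
    example (the thread has not settled on terms), and the member lists are
    the two sets given above, in singular form.

```python
# Placeholder class names; the members come from the two sets in the note.
ELEMENT_CLASSES = {
    "data_item": frozenset(
        {"component", "constant", "argument", "type", "variable"}
    ),
    "code_unit": frozenset(
        {"file", "routine", "function", "procedure", "subroutine",
         "task", "module", "package", "program"}
    ),
}

# Each tag is restricted to one class instead of an enumerated list of kinds.
TAG_RULES = {
    "invariant": "data_item",
    "functional_description": "code_unit",
}


def tag_allowed(tag, symbol_kind):
    """True if a tagged comment of this kind may be attached to the symbol kind."""
    return symbol_kind in ELEMENT_CLASSES[TAG_RULES[tag]]
```

    With this shape, a rule reads "invariants go on data items" and the
    membership list is maintained in one place.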

134.8. "Some thoughts" by DENTON::AMARTIN (Alan H. Martin) Fri Apr 03 1987 19:43, 62 lines

Re .0:

I'd like to note that:

In (at least some) languages, you can have multiple files per "compilation
unit", multiple "compilation units" per file, or in the worst case,
files spanning "compilation units" and "compilation units" spanning
files.  Some examples would be MACRO, FORTRAN and BLISS.  Therefore,
if you want to be completely general, you cannot assume a hierarchy
between files and "compilation units".
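
A data model that avoids assuming a hierarchy has to treat the relation as
many-to-many. The following is a minimal Python sketch; the class name
`SourceMap`, its methods, and the file and unit names in the usage note are
all hypothetical.

```python
from collections import defaultdict


class SourceMap:
    """Many-to-many relation between source files and compilation units."""

    def __init__(self):
        self._units_by_file = defaultdict(set)
        self._files_by_unit = defaultdict(set)

    def add(self, file_name, unit_name):
        """Record that part of unit_name lives in file_name."""
        self._units_by_file[file_name].add(unit_name)
        self._files_by_unit[unit_name].add(file_name)

    def units_in(self, file_name):
        """Units that have at least some text in the given file."""
        return set(self._units_by_file.get(file_name, ()))

    def files_of(self, unit_name):
        """Files that contribute text to the given unit."""
        return set(self._files_by_unit.get(unit_name, ()))
```

For example, after `add("a.for", "MAIN")`, `add("a.for", "SUB1")` and
`add("b.for", "SUB1")`, the file `a.for` holds two units and the unit `SUB1`
spans two files, so neither side can be the parent of the other.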

FWIW, in Fortran-77, your "subunit" would be called a "program unit".
The four kinds of program units are main programs, SUBROUTINE subprograms,
FUNCTION subprograms and BLOCK DATA subprograms, if memory serves.

Since, at least in C, and possibly in other languages, an "object" occupies
space at runtime, it is probably not the best name for a concept which
is intended to encompass datatypes, and possibly other ethereal things.
I acknowledge that no scheme will be perfect because of the profusion
of names used in different languages, but "object" might be a particularly
sticky choice.

Perhaps peering into the reference for a sufficiently sophisticated
language would turn up a good word or two.

Re .1:

Not a bad definition for subsystem.  A related word is "component" (defined
in the Software Development Policies and Procedures manual).  I don't have
a copy at hand, but as an example, I once worked on a product with four
components - a compiler, an I/O system, a subroutine library and a debugger.

I endorse the definition of module from the VAX-11 Software Engineering
Manual:

"
The module should contain

			  THE FUNCTIONALITY,
		       THE WHOLE FUNCTIONALITY
		  AND NOTHING BUT THE FUNCTIONALITY!
"

I realize you are engaged in the design of a product, not something
merely for internal use.  However, any benefits you can glean from the
nomenclature of our internal process should be taken advantage of.

A third related word is "facility".  I'm not sure how "facility" and "component"
relate to each other, but they are both probably composed of modules.

Re .4:

Are you replying to .0, or what?

If you are talking about .0, I don't see how you think that files, modules
and routines are "mutable", yet variables aren't.  Are you making some kind
of editing, rather than runtime, distinction?

I have seen "atom" used as a synonym for "lexeme" and "token".  It may
not be wise to use that as a name for constructs which are denoted by
more than one lexeme.
				/AHM/THX

134.9. by BACH::VANROGGEN Sat Apr 04 1987 15:48, 22 lines

    Re: .8
    
    The distinction between run-time modifications and edit-time
    modifications of programs does capture some of the notion I was
    trying to convey, but it may be misleading too.
    
    Anyway, say we do think about making textual source changes to a
    program.  If we have a number represented by "234", and we add a
    "5" to it to get "2345", the number hasn't changed--we have a
    different number.  Similarly, if we have some variable named "FOO"
    and change some of its characters, the variable hasn't changed--
    instead we refer to a different variable.  On the other hand, the
    function that uses these numbers or variables has indeed changed.
    And this concept of atomicity isn't just restricted to tokens--
    it applies to syntax as well.  In a number of languages adding a
    pair of parentheses changes whether or not there is a procedure
    call.  The token or its referent may not have changed, but the
    procedure/program certainly has.  This change, btw, is independent
    of whether that procedure or file or whatever is named.
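
    Python happens to be one language where the parenthesis point can be
    observed mechanically, so here is a small sketch using its standard
    `ast` module.  The helper names and sample statements are invented for
    the example.

```python
import ast


def has_call(source):
    """True if the parsed source contains at least one call node."""
    return any(isinstance(node, ast.Call) for node in ast.walk(ast.parse(source)))


def literals(source):
    """Constant values appearing in the parsed source."""
    return [
        node.value
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Constant)
    ]
```

    `has_call("result = compute\n")` is false while
    `has_call("result = compute()\n")` is true: the token `compute` and its
    referent are the same, but the enclosing program is not.  Likewise
    `literals("x = 234\n")` and `literals("x = 2345\n")` yield two different
    numbers rather than one number that changed.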
    
    			---Walter
    

134.10. "books" by IPG::HAXBY (John Haxby -- Definitively Wrong) Sat Aug 01 1987 07:36, 29 lines

    There are some very precise descriptions of the components of programs,
    though with different names to the ones bandied about here.  The
    best description of a programming language EVER is that of Algol68
    -- in the "Revised Report on the Algorithmic Language Algol 68",
    van Wijngaarden et al., published by Springer-Verlag.  Unfortunately,
    it is difficult to read and very few people understand it, but it
    is very precise and unambiguous.
    
    Almost as good a description of program elements can be found in
    Barbara Liskov and John Guttag's book "Abstraction and Specification
    in Program Development" (I think I've got the title right); it's
    published by MIT Press.
         
    If you are looking for ways to describe any program, existing or
    not, I would suggest reading the first few chapters of the latter.
    If you want to find out how to *precisely* express something, look
    at the Algol 68 Report (published in 1976, in spite of the title).
                                                           
    
    For what it's worth, a 'FILE' is a poor descriptive term for anything.
    In some languages (eg C and Algol 68) it means a data type which
    is used for reading and/or writing bytes to a more-or-less permanent
    storage area provided by the operating system.  In other languages,
    a 'FILE' is merely a string with some syntax; in yet other languages
    you have 'file-names' and 'streams', which separately describe the
    two aspects of 'FILE'.
    
    							jch