
Conference vaxuum::document_ft

Title:DOCUMENT T1.0
Notice:**New notesfile (DOCUMENT.NOTE) now available (see note 897)**
Moderator:CLOSET::ADLER
Created:Mon Feb 09 1987
Last Modified:Thu Oct 31 1991
Last Successful Update:Fri Jun 06 1997
Number of topics:897
Total number of notes:4397

699.0. "Way to globally reset word-splitting defaults?" by BLURB::WHARTON () Wed Jul 22 1987 16:17






I am a new DOCUMENT user and am in the process of converting .RNO
files to .SDML files.  In general, I'm very pleased with DOCUMENT
features and output.  However, I've found that one problem, bad
word breaks, occurs far more often in DOCUMENT output than in output
from the tool I was using before.  I'm writing this note to find out
the best way to approach this problem.



My files contain many oddball words that should not be split.  These
oddball words include product names (for example, Rdb/VMS, DECnet/SNA,
VMS/SNA, CDD/Plus), user-defined names for CDD directories, database
entities, and program variables (for example, _CDD$TOP.DEPT10.ROGERS,
MIDDLE_INITIAL, INPUT_VAR), and language keywords (for example,
BATCH_UPDATE, READ_WRITE, READ_ONLY).  DOCUMENT seems to assume that
periods, underscores, and slashes have the same significance as a
hyphen; in other words, that these characters provide a dandy place to
split a word.  Is there any way to globally reset these software
assumptions as opposed to maintaining some sort of dictionary (tough
for user-defined words) or flagging each occurrence of a given word?



Related to the word-splitting problem is DOCUMENT's reluctance to
leave a line "too short," even to the point of producing output that
is unusable.  This may be a bug so I am including sample .SDML code
and output.  Please let me know if the following behavior is what I
should consistently expect from DOCUMENT and therefore must manage in
some way:

<LIST>(UNNUMBERED)
 .
 .
 .

<LE>
Assuming that CDD$DEFAULT is defined to be CDD$TOP.DEPT32.FIELDMAN,
creates a CDD directory with the full path name
CDD$TOP.DEPT32.FIELDMAN.PERSONNEL_TEST
 .
 .
 .
<ENDLIST>



Here is the output (doctype S.H).  The first and second lines of the
list element generate a "line too long" warning in the .LOG:

o  Assuming that CDD$DEFAULT is defined to be CDD$TOP.DEPT32.FIELDMAN,
   creates a CDD directory with the full path name CDD$TOP.DEPT32.FIELDMAN.
   TEST



I believe DOCUMENT software should be able to figure out how much
space it has left on a line and simply move an entire word to the next
line rather than extending words too far into the margin, or (for line
2) trashing part of the word (PERSONNEL_) in the output.  Maybe this
is appropriate behavior for processing text marked as an example or
figure.  The behavior seems inappropriate for processing lines of
general text in programming manuals.

699.1. "If there is a will there is a way." by VAXUUM::PELTZ (�lvynstar Dun�dain) Thu Jul 23 1987 14:41 (9 lines)
         
         There is a tag called <keep>, the argument to this tag
         is any word that you don't want to have hyphenated.  It
         takes the form of 
         
         <keep>(DONT.HYPHENATE_THIS$WORD)
         
         Chris

699.2. "TeX Hack" by COOKIE::WITHERS (Le plus ca change...) Thu Jul 23 1987 15:16 (11 lines)
    Since, stylistically, I loathe hyphenation, I put the following
    command at the top of every file:
    
    <INCLUDE_TEX_FILE>(NO_HYPHEN.TEX)
    
    and NO_HYPHEN.TEX contains:
    
\pretolerance=100000
    
    Regards,
    BobW
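
    A note on why this works, as a plain-TeX sketch rather than anything
    DOCUMENT-specific: TeX first tries to break each paragraph without
    automatic hyphenation and keeps that result if no line's badness
    exceeds \pretolerance. TeX treats a badness of 10000 as infinitely
    bad, so any setting of 10000 or more (including the 100000 above)
    means the first pass is almost always accepted and the hyphenating
    pass is rarely reached. The trade-off is looser word spacing.

```tex
% NO_HYPHEN.TEX, annotated (plain TeX; should behave like the value 100000 above)
% Accept practically any hyphenation-free paragraph, however loose
% the spacing, so TeX seldom falls back to the hyphenating second pass.
\pretolerance=10000
```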

699.3. "Brute force vs. the perfect answer" by BOOKIE::GENT (Party gone out of bounds -- B52's) Thu Jul 23 1987 15:31 (13 lines)
    I'm afraid neither reply .1 nor reply .2 really answers the original
    question. If you have 10-20 user-defined symbol names per page,
    it is simply not feasible to enclose them all in <KEEP> tags,
    and avoiding all hyphenation makes a bad situation worse.
    (Sorry guys.)
    
    I believe the questions are:
    
    1)  Is there a way to turn off hyphenation at underscores?
    
    2)  Is there a way to turn off hyphenation at embedded periods?
    
    --Andrew

699.4. "TAG$POST_TRANSLATE may help here" by CLOSET::ANKLAM () Mon Aug 10 1987 16:50 (57 lines)
    
    Andrew is (as usual) correct. We have two ways of controlling the
    hyphenation of words with underscores and slash characters (/).
    Providing special processing when a word has an embedded period
    is something we hadn't yet considered.
    
    Here's the way these are handled.
    
    1. By default, DOCUMENT treats underscore characters and slashes
       the same way that it treats explicit hyphens in a word. If a
       line contains a long string with a lot of _ or / in it, TeX
       may break the line after any _ or /, but will not
       introduce a hyphen character to show continuation. This was
       an intentional design decision based on our original specs.
    
    2. This behavior can be overridden in a DESIGN file if it is
       inappropriate for a specific type of document. Put the
       following in the DESIGN file:
    
       \def\_{\underscore}
       \def\slash{/}
    
       These could also be put in the DOC$LOCAL_ELEMENTS file to make
       the change system-wide.
    
    3. There is also a way to 'trap' a certain set of specific names
       and keep them from being hyphenated. This method involves the
       TAG$$LOCAL_STRINGS file. (This is not documented for customers;
       the interface is tricky and error-prone.) We have used this in
       previous versions of DOCUMENT and it is still in use in some
       places.
    
       After the tag translator finishes translating the tags in a file,
       but before the text formatter runs, a post-translate routine
       is invoked. This program fixes up special characters and makes
       sure that the text formatter is handed 'clean' text. It also
       looks for a file with the logical name TAG$$LOCAL_STRINGS. If
       this logical is defined, it is assumed to be a file that contains
       a set of strings to be translated and their outputs.
    
       If you look in DOC$LOCAL_FORMATS:CUP$LOCAL_STRINGS.TXT, you will
       see the list of product names that we look for and handle. In
       that file, you see things like:
    
       'MACRO-11'	'\:{MACRO--11}'		/* MACRO-11 */
    
       This input/output/comment sequence is read in during
       post-translation. Any occurrence of MACRO-11 (with a hyphen)
       is translated so that the hyphen becomes an en-dash (which looks
       better), and the shorthand \:{ } puts an hbox around the name so
       it will never be split across lines.
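
    As a rough plain-TeX illustration of the two mechanisms described
    above (not DOCUMENT's actual macros; \nobreakname and \breakafter
    are hypothetical names made up for this sketch): an \hbox keeps its
    contents on one line, and an empty \discretionary offers a break
    point that adds no hyphen character.

```tex
% Plain TeX sketch. \nobreakname and \breakafter are hypothetical
% helper names for illustration; they are not part of DOCUMENT.

% Keep a name in one piece: TeX never breaks a line inside an \hbox.
\def\nobreakname#1{\leavevmode\hbox{#1}}

% Print a character, then allow a line break after it with no hyphen
% inserted (all three parts of the discretionary are empty).
\def\breakafter#1{#1\discretionary{}{}{}}

\hsize=2.5in
The full path name is \nobreakname{CDD\$TOP.DEPT32.FIELDMAN}, and
READ\breakafter{\_}WRITE may split after the underscore.
\bye
```

    A name that is boxed this way and is wider than the remaining space
    moves to the next line whole; if it is wider than the line itself,
    TeX can only report an overfull box.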
    
    
    I hope that this helps enough to solve some of the problems in .0.
    
    
    patti 

699.5. "" by CRAYON::GENT (Party gone out of bounds -- B52's) Mon Aug 10 1987 17:13 (3 lines)
    Thanks Patti. That definitely helps.
    
    --Andrew