
Conference vaxuum::document_ft

Title:	DOCUMENT T1.0
Notice:	New notesfile (DOCUMENT.NOTE) now available (see note 897)
Moderator:	CLOSET::ADLER

Created:	Mon Feb 09 1987
Last Modified:	Thu Oct 31 1991
Last Successful Update:	Fri Jun 06 1997
Number of topics:	897
Total number of notes:	4397

699.0. "Way to globally reset word-splitting defaults?" by BLURB::WHARTON () Wed Jul 22 1987 15:17






I am a new DOCUMENT user and am in the process of converting .RNO
files to .SDML files.  In general, I'm very pleased with DOCUMENT
features and output.  However, I've found that one problem, bad word
breaks, occurs far more often in DOCUMENT output than in output from
the tool I was using before.  I'm writing this note to find out the
best way to approach this problem.



My files contain many oddball words that should not be split.  These
oddball words include product names (for example, Rdb/VMS, DECnet/SNA,
VMS/SNA, CDD/Plus); user-defined names for CDD directories, database
entities, and program variables (for example, _CDD$TOP.DEPT10.ROGERS,
MIDDLE_INITIAL, INPUT_VAR); and language keywords (for example,
BATCH_UPDATE, READ_WRITE, READ_ONLY).  DOCUMENT seems to assume that
periods, underscores, and slashes have the same significance as a
hyphen; in other words, that these characters provide a dandy place to
split a word.  Is there any way to globally reset these software
assumptions, as opposed to maintaining some sort of dictionary (tough
for user-defined words) or flagging each occurrence of a given word?



Related to the word-splitting problem is DOCUMENT's reluctance to
leave a line "too short," even to the point of producing output that
is unusable.  This may be a bug, so I am including sample .SDML code
and output.  Please let me know if the following behavior is what I
should consistently expect from DOCUMENT and therefore must manage in
some way:



<LIST>(UNNUMBERED)
 .
 .
 .

<LE>
Assuming that CDD$DEFAULT is defined to be CDD$TOP.DEPT32.FIELDMAN,
creates a CDD directory with the full path name
CDD$TOP.DEPT32.FIELDMAN.PERSONNEL_TEST
 .
 .
 .
<ENDLIST>



Here is the output (doctype S.H).  The first and second lines of the
list element generate a "line too long" warning in the .LOG:



o  Assuming that CDD$DEFAULT is defined to be CDD$TOP.DEPT32.FIELDMAN,
   creates a CDD directory with the full path name CDD$TOP.DEPT32.FIELDMAN.
   TEST



I believe DOCUMENT software should be able to figure out how much
space it has left on a line and simply move an entire word to the next
line rather than extending words too far into the margin or (for line
2) trashing part of the word (PERSONNEL_) in the output.  Maybe this
is appropriate behavior for processing text marked as an example or
figure.  The behavior seems inappropriate for processing lines of
general text in programming manuals.
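
The behavior requested above (measure the space left on a line and move
a whole word to the next line) can be illustrated with a minimal greedy
sketch.  This is only an illustration under stated assumptions: it is
not how DOCUMENT or TeX actually breaks lines, it measures width in
characters rather than in typeset units, and the name fill_lines is
hypothetical.

```python
def fill_lines(text: str, width: int) -> list[str]:
    """Greedy line fill that breaks only at spaces, never inside a word.

    A word longer than `width` is placed alone on its own line and
    allowed to overflow; it is never split or truncated.
    """
    lines: list[str] = []
    current = ""
    for word in text.split():
        if not current:
            # First word on a line always goes in, even if too long.
            current = word
        elif len(current) + 1 + len(word) <= width:
            # The word fits in the remaining space (plus one blank).
            current += " " + word
        else:
            # Not enough room: close this line, move the word down whole.
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    sample = ("creates a CDD directory with the full path name "
              "CDD$TOP.DEPT32.FIELDMAN.PERSONNEL_TEST")
    for line in fill_lines(sample, 60):
        print(line)
```

With a 60-character line, the path name moves to the second line intact
instead of being split at a period or underscore.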
T.R	Title	User	Personal Name	Date	Lines
699.1	If there is a will there is a way.	VAXUUM::PELTZ	�lvynstar Dun�dain	Thu Jul 23 1987 13:41	9
	There is a tag called <keep>; its argument is any word that you
	don't want to have hyphenated.  It takes the form:

	    <keep>(DONT.HYPHENATE_THIS$WORD)

	Chris
699.2	TeX Hack	COOKIE::WITHERS	Le plus ca change...	Thu Jul 23 1987 14:16	11
	Since, stylistically, I loathe hyphenation, I put the following
	command at the top of every file:

	    <INCLUDE_TEX_FILE>(NO_HYPHEN.TEX)

	and NO_HYPHEN.TEX contains:

	    \pretolerance=100000

	Regards,
	BobW
699.3	Brute force vs. the perfect answer	BOOKIE::GENT	Party gone out of bounds -- B52's	Thu Jul 23 1987 14:31	13
	I'm afraid neither reply .1 nor reply .2 really answers the
	original question.  If you have 10-20 user-defined symbol names
	per page, it is simply not feasible to enclose them all in <KEEP>
	tags, and avoiding all hyphenation makes a bad situation worse.
	(Sorry, guys.)

	I believe the questions are:

	1) Is there a way to turn off hyphenation at underscores?
	2) Is there a way to turn off hyphenation at embedded periods?

	--Andrew
699.4	TAG$POST_TRANSLATE may help here	CLOSET::ANKLAM		Mon Aug 10 1987 15:50	57
	Andrew is (as usual) correct.  We have two ways of controlling the
	hyphenation of words with underscores and slash characters (/).
	Providing special processing when a word has an embedded period is
	something we hadn't yet considered.  Here's the way these are
	handled.

	1. By default, DOCUMENT treats underscore characters and slashes
	   the same way that it treats explicit hyphens in a word.  If a
	   line contains a long string with a lot of _ or / in it, TeX
	   will break any line following the _ or the /, but will not
	   introduce a hyphen character to show continuation.  This was an
	   intentional design decision based on our original specs.

	2. This behavior can be overridden in a DESIGN file, if this
	   behavior is inappropriate for a specific type of document.  The
	   following should be put in the DESIGN file:

	       \def\_{\underscore}
	       \def\slash{/}

	   These could also be put in the DOC$LOCAL_ELEMENTS file to make
	   the change system-wide.

	3. There is also a way to 'trap' a certain set of specific names
	   and keep them from being hyphenated.  This method involves the
	   TAG$$LOCAL_STRINGS file.  (This is not documented for
	   customers; the interface is tricky and error-prone.)  We have
	   used this in previous versions of DOCUMENT and it is still in
	   use in some places.

	   After the tag translator finishes translating the tags in a
	   file, but before the text formatter runs, a post-translate
	   routine is invoked.  This program fixes up special characters
	   and makes sure that the text formatter is handed 'clean' text.
	   It also looks for a file with the logical name
	   TAG$$LOCAL_STRINGS.  If this logical is defined, it is assumed
	   to be a file that contains a set of strings to be translated
	   and their outputs.

	   If you look in DOC$LOCAL_FORMATS:CUP$LOCAL_STRINGS.TXT, you
	   will see the list of product names that we look for and handle.
	   In that file, you see things like:

	       'MACRO-11' '\:{MACRO--11}' /* MACRO-11 */

	   This input/output/comment sequence is read in during
	   post-translation.  Any occurrence of MACRO-11 (with a hyphen)
	   is translated so the hyphen is an en-dash (which looks better)
	   and it uses the shorthand \:{ } to put an hbox around it, so it
	   will never be split across lines.

	I hope that this helps enough to solve some of the problems in .0.

	patti
699.5		CRAYON::GENT	Party gone out of bounds -- B52's	Mon Aug 10 1987 16:13	3
	Thanks, Patti.  That definitely helps.

	--Andrew
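
The string-table substitution described in reply .4 can be sketched
roughly as follows.  This is a minimal sketch in Python, not the actual
post-translate routine: the entry syntax is inferred from the single
'MACRO-11' example quoted in .4, and the names ENTRY,
parse_strings_table, and post_translate are hypothetical.

```python
import re

# One table entry per line: 'input' 'output' /* optional comment */
# (Assumed format, inferred from the one example quoted in reply .4.)
ENTRY = re.compile(r"'([^']*)'\s+'([^']*)'\s*(?:/\*.*?\*/)?\s*$")


def parse_strings_table(text: str) -> list[tuple[str, str]]:
    """Read (input, output) pairs from the text of a strings table."""
    pairs: list[tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognized entry: {line!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


def post_translate(text: str, pairs: list[tuple[str, str]]) -> str:
    """Replace every listed input string with its output in one pass."""
    table = dict(pairs)
    if not table:
        return text
    # Try longer inputs first so a long name wins over one of its prefixes.
    inputs = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in inputs))
    # A single pass means already-substituted output is never rescanned.
    return pattern.sub(lambda m: table[m.group(0)], text)


if __name__ == "__main__":
    table_text = r"'MACRO-11' '\:{MACRO--11}' /* MACRO-11 */"
    pairs = parse_strings_table(table_text)
    print(post_translate("Programs written in MACRO-11 are supported.", pairs))
```

The single-pass replacement is a design choice of this sketch: it keeps
one entry's output from being matched again by another entry's input.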