Title: | DOCUMENT T1.0 |
Notice: | **New notesfile (DOCUMENT.NOTE) now available (see note 897)** |
Moderator: | CLOSET::ADLER |
|
Created: | Mon Feb 09 1987 |
Last Modified: | Thu Oct 31 1991 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 897 |
Total number of notes: | 4397 |
699.0. "Way to globally reset word-splitting defaults?" by BLURB::WHARTON () Wed Jul 22 1987 16:17
I am a new DOCUMENT user and am in the process of converting .RNO
files to .SDML files. In general, I'm very pleased with DOCUMENT
features and output. However, I've found that one problem, bad
word breaks, occurs far more often in DOCUMENT output than in output
from the tool I was using before. I'm writing this note to find out
the best way to approach this problem.
My files contain many oddball words that should not be split. These
oddball words include product names (for example, Rdb/VMS, DECnet/SNA,
VMS/SNA, CDD/Plus), user-defined names for CDD directories, database
entities, and program variables (for example, _CDD$TOP.DEPT10.ROGERS,
MIDDLE_INITIAL, INPUT_VAR), and language keywords (for example,
BATCH_UPDATE, READ_WRITE, READ_ONLY). DOCUMENT seems to assume that
periods, underscores, and slashes have the same significance as a
hyphen; in other words, that these characters provide a dandy place to
split a word. Is there any way to globally reset these software
assumptions as opposed to maintaining some sort of dictionary (tough
for user-defined words) or flagging each occurrence of a given word?
Related to the word-splitting problem is DOCUMENT's reluctance to
leave a line "too short," even to the point of producing output that
is unusable. This may be a bug so I am including sample .SDML code
and output. Please let me know if the following behavior is what I
should consistently expect from DOCUMENT and therefore must manage in
some way:
<LIST>(UNNUMBERED)
.
.
.
<LE>
Assuming that CDD$DEFAULT is defined to be CDD$TOP.DEPT32.FIELDMAN,
creates a CDD directory with the full path name
CDD$TOP.DEPT32.FIELDMAN.PERSONNEL_TEST
.
.
.
<ENDLIST>
Here is the output (doctype S.H). The first and second lines of the
list element generate a "line too long" warning in the .LOG:
o Assuming that CDD$DEFAULT is defined to be CDD$TOP.DEPT32.FIELDMAN,
creates a CDD directory with the full path name CDD$TOP.DEPT32.FIELDMAN.
TEST
I believe DOCUMENT software should be able to figure out how much
space it has left on a line and simply move an entire word to the next
line rather than extending words too far into the margin, or (for line
2) trashing part of the word (PERSONNEL_) in the output. Maybe this
is appropriate behavior for processing text marked as an example or
figure. The behavior seems inappropriate for processing lines of
general text in programming manuals.
T.R | Title | User | Personal Name | Date | Lines |
---|
699.1 | If there is a will there is a way. | VAXUUM::PELTZ | �lvynstar Dun�dain | Thu Jul 23 1987 14:41 | 9 |
|
There is a tag called <keep>, the argument to this tag
is any word that you don't want to have hyphenated. It
takes the form of
<keep>(DONT.HYPHENATE_THIS$WORD)
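When a file contains many such words, the tagging could in principle be scripted rather than done by hand. The following is only a rough sketch, not part of DOCUMENT: the function name wrap_in_keep and the pattern ODDBALL are hypothetical, and the sketch assumes that <KEEP> is accepted wherever the word appears in running text.

```python
import re

# Hypothetical pattern (an assumption, not a DOCUMENT rule): a run of
# letters, digits, or $ joined by at least one underscore, period, or
# slash, e.g. READ_ONLY, Rdb/VMS, or _CDD$TOP.DEPT10.ROGERS.
# The lookbehind skips tag names (after "<") and tag arguments that
# start right after "(", so text that is already tagged is left alone.
ODDBALL = re.compile(
    r"(?<![\w$(<])_?[A-Za-z0-9$]+(?:[._/][A-Za-z0-9$]+)+(?![\w$])"
)


def wrap_in_keep(text: str) -> str:
    """Wrap every word matched by ODDBALL in a <KEEP>(...) tag."""
    return ODDBALL.sub(lambda m: f"<KEEP>({m.group(0)})", text)


if __name__ == "__main__":
    print(wrap_in_keep("Set READ_ONLY for CDD$TOP.DEPT10.ROGERS."))
```

The pattern is deliberately crude: it would also wrap strings such as decimal numbers, file names, and "and/or", so the output needs to be reviewed before it is processed.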
Chris
|
699.2 | TeX Hack | COOKIE::WITHERS | Le plus ca change... | Thu Jul 23 1987 15:16 | 11 |
| Since, stylistically, I loathe hyphenation, I put the following
command at the top of every file:
<INCLUDE_TEX_FILE>(NO_HYPHEN.TEX)
and NO_HYPHEN.TEX contains:
\pretolerance=100000
Regards,
BobW
|
699.3 | Brute force vs. the perfect answer | BOOKIE::GENT | Party gone out of bounds -- B52's | Thu Jul 23 1987 15:31 | 13 |
| I'm afraid neither reply .1 nor reply .2 really answers the original
question. If you have 10-20 user-defined symbol names per page,
it is simply not feasible to enclose them all in <KEEP> tags,
while avoiding all hyphenation makes a bad situation worse.
(Sorry guys.)
I believe the questions are:
1) Is there a way to turn off hyphenation at underscores?
2) Is there a way to turn off hyphenation at embedded periods?
--Andrew
|
699.4 | TAG$POST_TRANSLATE may help here | CLOSET::ANKLAM | | Mon Aug 10 1987 16:50 | 57 |
|
Andrew is (as usual) correct. We have two ways of controlling the
hyphenation of words with underscores and slash characters (/).
Providing special processing when a word has an embedded period
is something we hadn't yet considered.
Here's the way these are handled.
1. By default, DOCUMENT treats underscore characters and slashes
the same way that it treats explicit hyphens in a word. If a
line contains a long string with a lot of _ or / in it, TeX
may break the line after any _ or /, but will not
introduce a hyphen character to show continuation. This was
an intentional design decision based on our original specs.
2. This default can be overridden in a DESIGN file if it is
inappropriate for a specific type of document.
The following should be put in the DESIGN file:
\def\_{\underscore}
\def\slash{/}
These could also be put in the DOC$LOCAL_ELEMENTS file to make
the change system-wide.
3. There is also a way to 'trap' a certain set of specific names
and keep them from being hyphenated. This method involves the
TAG$$LOCAL_STRINGS file. (This is not documented for customers;
the interface is tricky and error-prone.) We have used this in
previous versions of DOCUMENT and it is still in use in some
places.
After the tag translator finishes translating the tags in a file,
but before the text formatter runs, a post-translate routine
is invoked. This program fixes up special characters and makes
sure that the text formatter is handed 'clean' text. It also
looks for a file with the logical name TAG$$LOCAL_STRINGS. If
this logical is defined, it is assumed to be a file that contains
a set of strings to be translated and their outputs.
If you look in DOC$LOCAL_FORMATS:CUP$LOCAL_STRINGS.TXT, you will
see the list of product names that we look for and handle. In
that file, you see things like:
'MACRO-11' '\:{MACRO--11}' /* MACRO-11 */
This input/output/comment sequence is read in during
post-translation. Any occurrence of MACRO-11 (with a hyphen)
is translated so that the hyphen becomes an en-dash (which looks
better), and the shorthand \:{ } puts an hbox around it so that it
is never split across lines.
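The substitution step described above can be pictured roughly as follows. This is only an illustrative sketch, not the post-translate routine itself: the entry layout is inferred from the single example line shown, and the names load_strings and post_translate are hypothetical.

```python
import re

# Assumed entry layout, inferred from the one example line:
#   'input' 'output' /* comment */
ENTRY = re.compile(r"^\s*'([^']*)'\s+'([^']*)'")


def load_strings(table_text):
    """Return (input, output) pairs from the text of a strings file."""
    pairs = []
    for line in table_text.splitlines():
        match = ENTRY.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def post_translate(text, pairs):
    """Replace each input string with its output in a single pass."""
    if not pairs:
        return text
    table = dict(pairs)
    # Longest inputs first, so a short name cannot pre-empt a longer one.
    ordered = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    return pattern.sub(lambda m: table[m.group(0)], text)


SAMPLE_TABLE = r"""
'MACRO-11' '\:{MACRO--11}' /* MACRO-11 */
"""

if __name__ == "__main__":
    pairs = load_strings(SAMPLE_TABLE)
    print(post_translate("Programs written in MACRO-11 are supported.", pairs))
```

Doing the replacement in one pass means an output string is never rescanned, so a replacement cannot be matched again by another entry.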
I hope that this helps enough to solve some of the problems in .0.
patti
|
699.5 | | CRAYON::GENT | Party gone out of bounds -- B52's | Mon Aug 10 1987 17:13 | 3 |
| Thanks Patti. That definitely helps.
--Andrew
|