
Conference vaxuum::document_ft

Title: DOCUMENT T1.0
Notice: **New notesfile (DOCUMENT.NOTE) now available (see note 897)**
Moderator: CLOSET::ADLER
Created: Mon Feb 09 1987
Last Modified: Thu Oct 31 1991
Last Successful Update: Fri Jun 06 1997
Number of topics: 897
Total number of notes: 4397

675.0. "How do I automate the index process?" by NCADC1::PEREZ (The sensitivity of a dung beetle.) Wed Jul 15 1987 18:58

    I just came back from ?running? an informal Document seminar.  It
    was well received and I look forward [sic] to having our CPU absolutely
    buried by happy people documenting.
    
    HOWEVER, I told the folks that to do an index you had to manually put
    the <X> tag on each entry you wanted in the index, every time you
    wanted it referenced.  For example, if you wanted the word "HORSESHOE"
    indexed and it appeared on pages 4, 103, 157, and 322, you would have to put an
    entry at each of those points in the source.  If a page break occurred
    the reference would be incorrect in the index. 
    
    I D*** near got LYNCHED!!!  The general comment was "No wonder our
    documents have such lousy indexes, if it takes that much work."
    References were made to unnamed primitive products that allow the
    user to put in a list of references to be indexed that are then
    permuted and correctly correlated with the sources. 
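
    A minimal sketch of what "permuted" could mean here, written in Python
    rather than anything DOCUMENT provides.  Each phrase is rotated so that
    every word leads one entry, and the page list is carried along.  The
    function names and the page numbers are made up for illustration.

```python
def permute_entry(phrase):
    """Return every rotation of a phrase so that each word leads one entry."""
    words = phrase.split()
    entries = []
    for i in range(len(words)):
        head = " ".join(words[i:])   # words from position i onward lead the entry
        tail = " ".join(words[:i])   # the words that came before follow a comma
        entries.append(f"{head}, {tail}" if tail else head)
    return entries


def permuted_index(references):
    """Map each permuted entry to the sorted pages where its phrase occurs.

    `references` is a hypothetical {phrase: [page, ...]} mapping.
    """
    index = {}
    for phrase, pages in references.items():
        for entry in permute_entry(phrase):
            index.setdefault(entry, set()).update(pages)
    ordered = sorted(index.items(), key=lambda item: item[0].lower())
    return {entry: sorted(pages) for entry, pages in ordered}


if __name__ == "__main__":
    for entry, pages in permuted_index({"PRINT command": [4, 103]}).items():
        print(entry, pages)
```

    Rotation only reorders the words already in the phrase; it does not
    invent synonyms, so it is a partial answer at best.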
    
    Am I wrong?  Does Document do this (hope, hope)?  Is there some
    automatic way to get the references to index elements correct for all
    occurrences in a document?  Or, are we doomed to having 20 lines
    of index with only the initial reference for an 8000 line document? 

    Dave P
675.1. "Modification of <X> required/wished/called for?" by IJSAPL::KLERK (Theo de Klerk) Wed Jul 15 1987 19:35 (8 lines)
 One of those primitive products is Runoff, which accepts >string
 to enter "string" in the index. Perhaps, as a wish, <X>(text)
 should leave "text" in the paragraph as well as entering it
 in the index. It sounds like a not too difficult change to
 implement, though highly incompatible with existing documents,
 which would then suddenly show "text" twice in the final page output.

Theo
675.2. "I agree and disagree" by COOKIE::JOHNSTON Wed Jul 15 1987 19:56 (36 lines)
RE: .0

I couldn't let this one go by without the comment that it takes more 
than an automatic feature to make a truly useful index.  You must 
consider more than just where words appear on a given page.  You must 
also think about user concepts and the many different ways in which a 
user will try to look up information.  Take something simple like the 
print command. Users will look in the index for any or all of
the following leaders (or something similar to them):
 
        Commands 
        Command qualifiers
        Output
        PRINT command
        Printing
        Qualifiers

This is a really small example that doesn't even consider logical queue 
names, destinations, offsets, whatever.  You can't anticipate every 
option; and even if you could, you have to make conscious decisions 
about what to include or not because resources for paper, people, time, 
ad nauseam are always scarce.

I don't disagree with having more automatic features, but I think that 
only artificial intelligence can come close to writing as good an index 
as a human can.  And I think that no one should be allowed to write an 
index before having training in it; a poor index only makes a bad book 
worse.  The good news: a good index can go a long way towards improving 
the usability of a poorly organized book.

Indexes are being mandated corporate-wide; I don't envy the task of 
anyone who has to develop the tools or the training.


Rose

675.3. "Nevertheless..." by IJSAPL::KLERK (Theo de Klerk) Thu Jul 16 1987 03:55 (94 lines)
 Considering the .-1 remark, and realizing that when the same file is
 processed twice, <X>(text) might get expanded repeatedly, e.g.
 text<X>(text) --> texttext<X>(text).

 So, it might be better to use the Runoff approach and use something
 unique (like >> or @@) to prefix the word that needs to go into
 the index. A six-minute Scan hack follows. The commented-out part
 would expand the <X> tag; the currently active part uses the >> approach.

 I agree that I can do it by hand, but I don't like repeating myself by
 making <X> inserts which are identical to the word just written.

<X>(have fun)

Theo

-------------------------------------------index.scn---------------------

MODULE index_insert;

!++
!
! Author: Theo de Klerk    (IJSAPL::KLERK)
!
! Creation Date:   87 06 16
!
! Modified:
!
! Description:
!
!       This program converts strings in an SDML file 
!       <x>(text)                    --> text<x>(text)
!       <x>(text1 <subentry> text2)  --> text1<x>(text1 <subentry> text2)
!       >>text                       --> text<X>(text)
!--
    SET valid_chars ('!'..'~');

    TOKEN index_token CASELESS {'<X>'};
    TOKEN arg_start {'('};
    TOKEN arg_end {')'};

!...Replace >> by whatever you like as a flag for indexing
    TOKEN add_index {'>>' valid_chars...};
    
    !++
    !
    !    Replace                         With
    !
    !    <X>(text1 <subentry> text2)     text1<X>(text1 <subentry> text2)
    !
    !--
!... ***** DEACTIVATED ****
!    MACRO replace_index TRIGGER 
!               {index_token arg_start argument : FIND(')') arg_end};
!    
!    DECLARE n : INTEGER;
!    
!    n = INDEX( UPPER(argument), '<SUBENTRY>' );
!    IF n <> 0 
!    THEN ANSWER argument[1..n-1], '<X>(',argument,') '; 
!    ELSE ANSWER argument, '<X>(',argument,')';
!    END IF;
!    END MACRO;

    !++
    !
    !     Replace                      With
    !
    !      >>text                      text<X>(text)
    !
    !--

    MACRO add_index_macro TRIGGER 
              {text : add_index };
    
    ANSWER text[3..], '<X>(',text[3..],')';
    END MACRO;

    PROCEDURE main MAIN;

	DECLARE file_name : STRING;

	READ 
            PROMPT ('Enter Document file name (include type): ')
            file_name;

        START SCAN
	    INPUT FILE file_name 
	    OUTPUT FILE file_name;     

    END PROCEDURE /* main */;
 
END MODULE /* INDEX_INSERT */;
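
 For readers without Scan, here is a rough Python equivalent of the
 active >> macro above. It is a sketch, not a port: the name
 expand_flags is invented, and the character class [!-~] is my reading
 of the valid_chars set ('!'..'~'). Like the Scan token, it takes every
 printable non-blank character after >>, so trailing punctuation ends
 up inside the index entry.

```python
import re

# '>>' followed by one or more printable, non-blank ASCII characters,
# mirroring the add_index token in the Scan module.
FLAG = re.compile(r">>([!-~]+)")


def expand_flags(sdml):
    """Rewrite every '>>text' in an SDML string as 'text<X>(text)'."""
    return FLAG.sub(lambda m: f"{m.group(1)}<X>({m.group(1)})", sdml)


if __name__ == "__main__":
    print(expand_flags("Nail the >>HORSESHOE above the door."))
```

 Because the >> flag disappears once it has been expanded, running the
 conversion a second time over ordinary text leaves it unchanged, which
 avoids the repeated-expansion problem of the <X> approach.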

675.4. "No help today, but ..." by VAXUUM::DEVRIES (M.D. -- your Device Doctor) Thu Jul 16 1987 11:53 (27 lines)
    Somebody *could* write a search-and-replace kind of hack to let
    you search a file (or list of files) for a given word or phrase
    and then (1) duplicate that as the argument to <x>; (2) let you
    enter an argument or tag of your choosing; or (3) leave it alone.
    Such a thing could allow a "global search-and-replace" mode.  In
    conjunction with a global mode, it could also have a
    "search-and-delete" mode, so you could visit all the inserted tags
    and remove them where they weren't appropriate.  (And I hope the
    search algorithm would treat all space elements (space, tab, endline,
    maybe "\") as possible equivalents.)
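
    The global modes described above might look roughly like this in
    Python.  This is only a sketch under my own assumptions: the names
    tag_phrase and untag_entry are invented, "space elements" are taken
    to be whatever the regular expression \s matches (the "\" case is
    not handled), and the interactive choose-per-occurrence mode is
    left out.

```python
import re

# An existing index tag; its argument is assumed not to contain ')'.
TAG = r"<X>\([^)]*\)"


def _phrase(phrase):
    """Regex for a phrase in which any run of whitespace separates words."""
    return r"\s+".join(re.escape(word) for word in phrase.split())


def tag_phrase(text, phrase, entry=None):
    """Append <X>(entry) after each occurrence that is not already tagged."""
    label = phrase if entry is None else entry
    pattern = re.compile(
        rf"({TAG})|\b{_phrase(phrase)}\b(?!<X>\()", re.IGNORECASE
    )

    def replace(match):
        if match.group(1):            # existing tag: copy it through untouched
            return match.group(1)
        return f"{match.group(0)}<X>({label})"

    return pattern.sub(replace, text)


def untag_entry(text, entry):
    """Search-and-delete mode: remove every <X>(entry) tag."""
    return re.sub(rf"<X>\({_phrase(entry)}\)", "", text, flags=re.IGNORECASE)
```

    Matching existing tags first keeps the search from tagging the
    phrase inside an <X> argument, and skipping occurrences that already
    carry a tag makes a second run harmless.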
    
    The ultimate solution, of course, is a combination of training and 
    more creative tools.  The pressure to direct resources to this kind
    of development must come from management, as the mindset shifts
    from "it's the last thing you gotta throw in" to "it's as important as
    what you say in the book" to (eventually?) "it's something you plan
    up-front, like the outline (contents) of the book".  (And kudos
    to those who realize its importance today.)
    
    This change in outlook *will* come, and it won't be far away, 
    because the index and table of contents are likely to be the
    *primary* ways of traversing online documentation.  The
    flip-till-you-see-it method of information retrieval will not be
    acceptable with online documentation, at least not until new techniques
    and technologies are found.
    
    --Mark
675.5. "To be a TeXnician or a TeXwizard or a TeXlunie..." by IJSAPL::KLERK (Theo de Klerk) Thu Jul 16 1987 13:14 (6 lines)
 Come to think of it, the DECTeX equivalent to <X> could be a macro that
 does both text + \indexentry{text}  (or whatever it is - I did not look).
 If you know that, you could enter it in your local elements and have
 it solved...

Theo
675.6. "Intelligence, artificial or otherwise - we need it" by NCADC1::PEREZ (The sensitivity of a dung beetle.) Mon Jul 20 1987 23:07 (15 lines)
    This has been great, but I think my original question got lost.
    All I want is a simple way to have the indexing find all the
    references to a particular element and put them in the index.  For
    example, as in .2 with the print command -- once I've told whatever
    to find "printing" I would like to have the appropriate
    index and subindex entries set up with the page numbers found
    throughout the document.
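
    To make the request concrete, here is a minimal Python sketch of
    the lookup I have in mind.  It assumes the text is already split
    into pages, which in practice is only known after formatting, so it
    says nothing about how DOCUMENT itself would do this.  The name
    build_index and the term table are invented for the example.

```python
import re
from collections import defaultdict


def build_index(pages, terms):
    """Collect the page numbers on which each search phrase occurs.

    pages: list of page texts, page 1 first.
    terms: {search phrase: (index entry, subentry or None)}.
    Returns {(entry, subentry): [page numbers in ascending order]}.
    """
    index = defaultdict(set)
    for number, text in enumerate(pages, start=1):
        for phrase, heading in terms.items():
            # Any run of whitespace may separate the words of a phrase.
            words = r"\s+".join(re.escape(word) for word in phrase.split())
            if re.search(rf"\b{words}\b", text, re.IGNORECASE):
                index[heading].add(number)
    return {heading: sorted(numbers) for heading, numbers in index.items()}


if __name__ == "__main__":
    pages = ["Use the PRINT command.", "Nothing here.",
             "Printing to a queue; the print command again."]
    terms = {"print command": ("PRINT command", None),
             "printing": ("Printing", None),
             "queue": ("Printing", "queues")}
    for heading, numbers in build_index(pages, terms).items():
        print(heading, numbers)
```

    Several search phrases can feed the same entry or subentry, so the
    term table is where any human judgment about headings would go.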

    Currently, about all that happens is a quick and dirty TPU run through
    the document indexing the <head> and <subhead> entries.  I don't know
    about the rest of you, but I can't get anybody to rummage a 100
    page document looking for likely index elements, much less a 3"
    thick design document.
    
    Dave P
675.7. by AUTHOR::WELLCOME (Steve) Tue Jul 21 1987 10:11 (17 lines)
Re: .6
>    Currently, about all that happens is a quick and dirty TPU run through
>    the document indexing the <head> and <subhead> entries.  I don't know
>    about the rest of you, but I can't get anybody to rummage a 100
>    page document looking for likely index elements, much less a 3"
>    thick design document.

    I, for one, DO go through manuals looking for index elements.  It
    can get tedious, but I think it's the only way to get a really good
    index.  A good index is hard to write!  A good index involves a
    heck of a lot more than putting in <head> and <subhead> entries,
    or a bunch of keywords.  One has to enter the mind of the likely
    user of the manual and imagine what he is going to want to look
    up and how he's likely to try to do it.  All too many of the indexes
    this company puts out are egregiously bad.  An index needs to be
    WRITTEN, just the way any chapter or appendix in the manual needs
    to be written.  
675.8. "Automated indexing? No thank you!" by TLE::SAVAGE (Neil, @Spit Brook) Tue Jul 21 1987 10:25 (6 lines)
    I agree with Steve: An automated index feature would encourage careless
    indexing.  I would never use it or encourage any writer to use such
    a feature to index a book. 
    
    The solution is to index it as you write it; putting off the indexing
    until the end (as an afterthought) is bad practice!
675.9. "BETTER indexing, however it is to be done!" by GLINKA::GREENE Tue Jul 21 1987 11:54 (28 lines)
    Our CUSTOMERS (remember them?) find that our documentation (for
    both hardware and software) is one of our *real problem areas*.
    I refer here to customers who use our documentation manuals, not
    customers who will be using DOCUMENT -- although they certainly
    will have some of the same concerns, for both their own use and
    their readers.
    
    "Poor indexes" [or indices -- what's proper these days?] is the MOST
    common complaint about DEC documentation.
    
    I am not knowledgeable enough about the various options (past, present,
    or future) for creating an index to "vote."  But it seems essential
    that whatever tools can facilitate better indexing be created.
    
    If customers can't FIND what they need in the documentation, then
    it doesn't really matter if it is clearly written -- or even there
    at all!  As for the concern about how many ways one could try to
    look up "print,"  that IS a serious problem.  As the customers have
    complained (and no doubt many of us have experienced), "you have
    to know the correct term already before you can find it in the
    index..."
    
    DOCUMENT is great!  Let's not forget the *use* of the end product.
    
    	Penelope
    
    
675.10. by AUTHOR::WELLCOME (Steve) Tue Jul 21 1987 12:01 (7 lines)
    Re: .8
    I guess we all work differently; I generally save the index until 
    the end, because then I know exactly what has to go into it.  I find
    if I try to do it as I go along I have to rewrite and reorganize
    so much it ends up being more work.  The important thing is, if 
    one does do the index at the end, DON'T do it as an afterthought!  
    Leave plenty of time, whenever you do it.
675.11. "And that includes the ones from "tech writers"." by NCADC1::PEREZ (The sensitivity of a dung beetle.) Wed Jul 22 1987 21:45 (19 lines)
    I don't wanna start a holy war over whether or not you should do
    the index first, last, in the middle, etc.  Quite frankly, I don't
    give a damn.  

    <SET FLAME - FACE REALITY>
        
    However, I agree with .8 or .9 (whichever) -- our indexes are poor --
    which is exactly what I said in the original note.  It's fine to tell a
    bunch of programmers to "leave plenty of time to do the index" but
    SINCE THE WHOLE MANUAL IS CONSIDERED AN AFTERTHOUGHT AND WASTE OF TIME
    it's gonna be tough.  Without some automated way to do at least a
    mediocre index, there won't be ANY INDEX. 
    
    I haven't seen a project go out of here in two years with indexes
    on ANY OF THE DOCUMENTS (except for the "poor" ones I did from the
    heads and subheads).  The only reason there's a Table of Contents
    is because it's automated!
    
    D
675.12. "Clean one's own house first." by BUNSUP::LITTLE (Todd Little, NYA SWS, 323-4475) Sun Jul 26 1987 16:00 (15 lines)
    re: .11
    
    The following phrase that recently has meant much to me, is somewhat
    applicable:
    
    "Poor planning on your part doesn't necessarily constitute an emergency
     on my part."
    
    A better way to say what I'm getting at is: Don't complain about
    the tools and talk about religion when the problem is within your
    own realm to solve, and in the case of bad planning, should be solved.
    
    -tl
    An X-DCC who feels that 1 in 100 project managers/leaders knows
    what the word "plan" means.
675.13. "I don't make the news, I just report it." by NCADC1::PEREZ (The sensitivity of a dung beetle.) Tue Jul 28 1987 16:25 (13 lines)
     RE .12
    
>        A better way to say what I'm getting at is: Don't complain about
>    the tools and talk about religion when the problem is within your
>    own realm to solve, and in the case of bad planning, should be solved.

    I wasn't complaining about DOCUMENT.  I like DOCUMENT.  I was simply
    relaying a concern expressed by member(s) of the Sales Support staff.  

    I'd like INDEX to permute again, but other than that, I figure the
    developers will do what's best.
    
    D