Conference 7.286::digital

Title: The Digital way of working
Moderator: QUARK::LIONELON
Created: Fri Feb 14 1986
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 5321
Total number of notes: 139771

4510.0. "Yellow Pages of Internal URLs?" by tennis.ivo.dec.com::KAM (Kam WWSE 714/261.4133 DTN/535.4133 IVO) Thu Mar 28 1996 00:53

Does anyone have the ability to extract ALL the internal URLs from one
of the Internet servers?

I'm just returning from a two week trip in Asia.  As I start to root
about the network I notice a number of new URLs.  I don't have the time
to look at every conference, note, and reply to see what's available.

Do we have, or can we start, an official Yellow Pages for our internal
URLs?  I'm not including the external URLs, as I assume everything
external should be addressable via www.digital.com.  Am I correct in
this assumption?

We have the Easynotes_Conference to list all the available 
Notesfiles/Conferences.  Can we create some mechanism to keep track of
all the internal URLs?

If what I am looking for is not available, I would like to create
something that will list all the available internal URLs, whether
they're personal, private, or public URLs.

Any ideas on how we can coordinate this effort?  I assume all 
announcements should be coordinated via the INTERNET_TOOLS conference
or should we just continue to use the Easynotes_Conference?


	Regards,

	 kam

Also posted in the INTERNET_TOOLS and DIGITAL conferences.
    
T.R  Title  User  Personal Name  Date  Lines
4510.1. by HERON::KAISER Thu Mar 28 1996 02:49 (16 lines)
> Does anyone have the ability to extract ALL the internal URLs from one
> of the Internet servers?

As soon as Alta Vista is cloned for internal use.

> We have the Easynotes_Conference to list all the available
> Notesfiles/Conferences.

EASYNOTES_CONFERENCE doesn't list all available conferences, only the ones
that people have thought to announce there.  There are many conferences not
mentioned there.  But a good web spider will find all interlinked web
pages.  It'll still be possible to set up an isolated island of web pages,
but as soon as someone outside the island links to the island ... wham!
they're indexable.

___Pete
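
The "island" effect described in .1 can be sketched as plain graph reachability: a spider only finds pages it can reach by following links from its starting points. This is a minimal illustration, not how any particular spider is built; the page names and the link table are made up.

```python
def reachable(links, seeds):
    """Return every page reachable from the seed pages by following links."""
    found = set(seeds)
    stack = list(seeds)
    while stack:
        page = stack.pop()
        for target in links.get(page, ()):
            if target not in found:
                found.add(target)
                stack.append(target)
    return found


# Hypothetical link table: page name -> pages it links to.
links = {
    "home": ["eng", "sales"],
    "eng": ["home"],
    "island-a": ["island-b"],
    "island-b": ["island-a"],
}

before = reachable(links, ["home"])   # the island is invisible to the spider
links["sales"] = ["island-a"]         # someone outside links to the island
after = reachable(links, ["home"])    # now both island pages are found
```

A single inbound link is enough: once "sales" points at "island-a", everything the island links to internally becomes indexable as well.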
4510.2. by VANGA::KERRELL (salva res est) Thu Mar 28 1996 04:01 (6 lines)
re.1:

There's an internal search engine off Digital's internal home page - does this
not use a web spider to build the index?

Dave.
4510.3. by CIM::LOREN (Loren Konkus) Thu Mar 28 1996 05:56 (4 lines)
    I find that the AIT Announcement Server is pretty useful for finding
    internal stuff. See:
    
    	http://www-ad.mso.dec.com/announce/pa-toc.html
4510.4. "Digital's Internal World-Wide Web index (DWI)" by LGP30::FLEISCHER (without vision the people perish (DTN 227-3978, TAY1)) Thu Mar 28 1996 06:30 (20 lines)
re Note 4510.2 by VANGA::KERRELL:

> There's an internal search engine off Digital's internal home page - does this
> not use a web spider to build the index?
  
        I think you're thinking of the Digital Web Indexer,
        http://src-www.pa.dec.com/cgi-bin/dwi, which uses some of the
        same technology as Alta Vista.  

        However, the Digital Web Indexer does not use a spider but
        relies on distributed gatherer programs to send information
        to the index.  (This is similar to the Harvest architecture,
        and similar to the enterprise catalog server recently
        announced by Netscape.)

        Since it depends upon gatherer programs outside of the direct
        control of the maintainers of the index, coverage is
        inconsistent.

        Bob
4510.5. "personal spider / indexer ???" by SAYER::ELMORE (through the looking glass) Thu Mar 28 1996 17:14 (18 lines)
    I'd like a slight "spider" variation.  I would like to find a program
    that starts at a given Web page, or, optionally, takes your own
    hotlinks/bookmarks, and traverses and indexes every page linked from
    there.

    Ideally you could specify "how many levels deep" to go.

    I've seen [somewhere] some software that wakes up to look at
    hotlinks/bookmarks/history URLs to see if pages have been recently
    updated.  That's close, but I'm looking for an indexer too.  My
    bookmarks are already basically what I need, but I can never remember
    what bookmark contains what piece of information...therefore my
    [personal] need for the [personal] index.
    
    I'm sure I could write a spider script of sorts that follows URLs
    around, but not the indexer.
    
    --Steve
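
The personal spider/indexer asked for in .5 could look roughly like the sketch below: start from one page or a list of bookmarks, follow links at most a given number of levels deep, and build a word-to-URL index. All names here (PageParser, crawl_and_index, the fetch callback) are hypothetical, and the "index" is just a dictionary of words, far simpler than a real search engine.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class PageParser(HTMLParser):
    """Collects href targets and text content from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl_and_index(start_urls, fetch, max_depth=2):
    """Breadth-first crawl from start_urls, at most max_depth links deep.

    fetch(url) must return the page's HTML as a string, or None on failure.
    Returns a dict mapping each lowercased word to the set of URLs containing it.
    """
    index = defaultdict(set)
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
            index[word].add(url)
        if depth < max_depth:
            for href in parser.links:
                # Resolve relative links and drop "#fragment" suffixes.
                target = urldefrag(urljoin(url, href)).url
                if target not in seen:
                    seen.add(target)
                    queue.append((target, depth + 1))
    return index
```

The fetch function is passed in so the crawl logic can be tried against a dictionary of canned pages before pointing it at a network; a real one could wrap urllib.request.urlopen. Looking up a word in the returned index answers the "which bookmark had that?" question.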
4510.6. "Intranet Alta Vista Trial Offer!" by LJSRV2::POWELL Tue Apr 02 1996 11:33 (8 lines)
    You may have noticed that AltaVista is now under test internally.
    
    Try URL:   altavista.pa.dec.com/ and see what happens!
    I just noticed this entry this week, but don't know how long the test
    will run.  Looks like we're really going to make Alta Vista a product.
    
    Good luck!
    
4510.7. "yellow pages idea great" by SALES::ICS::DIRICO Fri May 10 1996 14:56 (13 lines)
    All of the inconsistent search stuff aside, since this web stuff took
    off quickly and now is quite large to pull in and
    control/maintain/organize...
    
    I love the idea of a yellow pages of intranet URLs.  I think as the web
    becomes a more vital way to communicate within the company as
    notesfiles/public directories/email decrease...the yellow pages is a
    key first step to build from.
    
    My first thought is that someone from Corporate Communications
    should publish this, but then again, maybe not.  Any other thoughts?
    
    Mary Beth
4510.8. by QUARK::LIONEL (Free advice is worth every cent) Fri May 10 1996 15:03 (5 lines)
Re: .7

See .3

		Steve
4510.9. by TENNIS::KAM (Kam WWSE 714/261.4133 DTN/535.4133 IVO) Fri May 10 1996 15:06 (5 lines)
    I'd like to see a yellow pages cuz I can't do a search if I don't know
    what phrase to supply the search engine.  I saw some URLs posted in a
    Notesfile.  I went to Altavista.pa.dec.com and searched for the
    information and it didn't find it.  Therefore, I'm missing some
    valuable information.
4510.10. "not quite mission-critical yet" by LGP30::FLEISCHER (without vision the people perish (DTN 227-3978, TAY1)) Fri May 10 1996 16:10 (10 lines)
re Note 4510.9 by TENNIS::KAM:

>     I went to Altavista.pa.dec.com and searched for the
>     information and it didn't find it.  

        I don't believe that this is maintained as a production
        system, and thus may not always be available, may not be
        updated very often (or at all), etc.

        Bob
4510.11. by QUARK::LIONEL (Free advice is worth every cent) Fri May 10 1996 17:04 (3 lines)
Kam, have you TRIED the AIT Announcement Server?

			Steve
4510.12. by TENNIS::KAM (Kam WWSE 714/261.4133 DTN/535.4133 IVO) Fri May 10 1996 17:46 (6 lines)
    I'm looking for a Digital ONLY Yellow Pages.  This Company has so much
    information that I don't want it cluttered with information outside
    this company.
    
    	Regards,
    
4510.13. by plugh.ibg.ljo.dec.com::needle (Money talks. Mine says "Good-Bye!") Fri May 10 1996 18:20 (6 lines)
The information at altavista.pa.dec.com is in beta test.  It's not maintained
and is not a public service yet.  When it does become public, it would be
reasonable to expect a service of the quality of altavista.digital.com for
the intranet.

j.
4510.14. "exactly what would you like to see?" by LGP30::FLEISCHER (without vision the people perish (DTN 227-3978, TAY1)) Fri May 10 1996 19:55 (68 lines)
re Note 4510.0 by tennis.ivo.dec.com::KAM:

> We have the Easynotes_Conference to list all the available 
> Notesfiles/Conferences.  Can we create some mechanism to keep track of
> all the internal URLs?

        What that mechanism might be depends upon what you mean by
        "all the internal URLs".

        Note that the Easynotes_Conference conference does not list
        all of the internal topics and replies, it just lists the
        conferences.

        We do have separate services that actually search the content
        of most of the conferences (e.g., Comet at
        http://encke.alf.dec.com/cgi/v4.2 ).

        You use the former (Easynotes_Conference) when looking for
        an appropriate conference.  It identifies conferences by
        overall topic.

        You use the latter (Comet) when looking for specific notes on
        a very specific subject, regardless of the conference
        containing them.

        I suspect that with the Web we need both kinds of service. 
        The nature of the Web makes the analogue of the former, an
        index of topical or thematic collections of pages, a little
        harder to define than does DEC Notes.  However, it probably
        should be an index of home pages (or what we called "front
        pages", as in the first page of a magazine or book) with a
        little description of the overall topic or theme of the
        service to which that page represents the entry.  This is
        what the Announcement Directory set out to be.

        The latter is simply AltaVista -- an index of all web pages
        (not password or otherwise protected) (note that the Comet
        URL listed above also provides an index of most Digital web
        pages).


> If what I am looking for is not available, I would like to create
> something that will list all the available internal URLs, whether
> they're personal, private, or public URLs.
  
        So the question remains:  do you want to index every page as
        an entry in this list, or do you want to list every
        *collection* of related pages (recognizing that some
        significant "collections" may only be one page)?

        The former is a bit easier to do -- it can be done
        automatically, which is what AltaVista does.

        The latter is harder because it requires, for now, human
        intelligence to select the things to be registered, either in
        the form of a central staff, or through conventions followed
        by all who publish on the internal network (e.g., registering
        your own collections).

        If you'd like to do the latter, and implement a more robust
        version of the Announcement Directory, I'd be glad to see you
        do it and I'd offer any help I can.  There's a product in
        there, I'm sure (a number of similar products have been
        announced).  But hurry -- we in the group of which I am a
        part are likely to get our notices this coming week.

        Bob
        [email protected]
4510.15. by QUARK::LIONEL (Free advice is worth every cent) Fri May 10 1996 22:10 (5 lines)
    Re: .12
    
    Ok, so now I KNOW you haven't looked at it.
    
    			Steve
4510.16. "http://www-ad.mso.dec.com/announce/pa-toc.html" by LGP30::FLEISCHER (without vision the people perish (DTN 227-3978, TAY1)) Sat May 11 1996 08:45 (28 lines)
follow-on to Note 4510.14:

        One obvious comparison to the Announcement Directory is the
        Yahoo service.  I hesitate to make this comparison because
        Yahoo has full-time people, essentially librarians working
        in cyberspace, who carefully construct and maintain a rich
        classification hierarchy.

        The Announcement Directory has no staff to do this, so the
        only classifications provided are those that can be
        automatically determined, e.g., whether a URL is owned by
        Digital or not, and whether it is external to Digital or on
        Digital's Intranet.  We can also do obvious sorts, such as by
        date and title.

        (There are opportunities for the application of advanced
        natural language processing techniques here.)

        It was hoped that the Digital community would provide
        informal maintenance of the entries (anybody can add, and
        actually anybody can delete and replace an entry).  To some
        extent this happens, but it is far from being as
        well-maintained as Yahoo.  (Nobody has it in their job
        description to maintain it.)  On the other hand, its content
        is probably as well-maintained as Easynotes_Conference.

        Bob
        [email protected]
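
The automatic classifications mentioned in .16 (Digital-owned or not, external or intranet) and the sorts by date and title can be sketched from the URL's host name alone. The domain rules below are an assumption made only for illustration (hosts under dec.com treated as intranet, hosts under digital.com as Digital but external); the Announcement Directory's actual rules are not given in this topic. The function names, titles and dates are likewise invented.

```python
from datetime import date
from urllib.parse import urlsplit

# Assumption for illustration only: hosts under dec.com are on the intranet,
# hosts under digital.com are Digital-owned but externally visible.
INTRANET_DOMAINS = ("dec.com",)
EXTERNAL_DIGITAL_DOMAINS = ("digital.com",)


def in_domain(host, domains):
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify(url):
    # Bare entries such as "altavista.pa.dec.com/" carry no scheme, so add one
    # before parsing; otherwise urlsplit() reports no hostname.
    if "://" not in url:
        url = "http://" + url
    host = (urlsplit(url).hostname or "").lower()
    if in_domain(host, INTRANET_DOMAINS):
        return "Digital, intranet"
    if in_domain(host, EXTERNAL_DIGITAL_DOMAINS):
        return "Digital, external"
    return "not Digital"


def sort_entries(entries):
    """Order entries by date added, then by title (case-insensitive)."""
    return sorted(entries, key=lambda e: (e["added"], e["title"].lower()))


# Hypothetical entries: the URLs appear in this topic, titles and dates are invented.
entries = [
    {"title": "Finding the Way",
     "url": "http://www.cio.com/WebMaster/0596_field.html",
     "added": date(1996, 5, 14)},
    {"title": "AltaVista",
     "url": "altavista.digital.com",
     "added": date(1996, 5, 14)},
    {"title": "Announcement Directory",
     "url": "http://www-ad.mso.dec.com/announce/pa-toc.html",
     "added": date(1996, 3, 28)},
]

for entry in sort_entries(entries):
    print(entry["added"], classify(entry["url"]), entry["title"])
```

Anything beyond this, such as grouping entries by subject, is where the human librarians (or the natural language processing mentioned above) would have to come in.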
4510.17. "some references" by LGP30::FLEISCHER (without vision the people perish (DTN 227-3978, TAY1)) Tue May 14 1996 12:49 (16 lines)
        More follow-on:

        Two articles have recently appeared on the Web that address
        aspects of this subject.  One is:

            http://www.cio.com/WebMaster/0596_field.html

        	-- "Finding the Way", by former DECie Tim Horgan

        Another is:

            http://gnn.com/wr/96/05/10/webarch/index.html

        	-- "Revenge of the Librarians"

        Bob