
Conference gyro::internet_tools

Title: Internet Tools
Notice: Report ALL NETSCAPE Problems directly to [email protected].rnet? Read note 448.L for beginner information.
Moderator: teco.mro.dec.com::tecotoo.mro.dec.com::mayer
Created: Fri Jun 25 1993
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 4714
Total number of notes: 40609

3807.0. "Format of robots.txt?" by WELKIN::ADOERFER (Hi-yo Server, away!) Tue Jul 02 1996 18:17

T.R    Title    User    Personal Name    Date    Lines
3807.1    CFSCTC::SMITH    Tom Smith MRO1-3/D12 dtn 297-4751    Tue Jul 02 1996 20:05    1
3807.2    A Standard for Robot Exclusion:    RANGER::WASSER    John A. Wasser    Mon Jul 08 1996 11:07    7
3807.3    what Content-type?    AUSS::GARSON    DECcharity Program Office    Thu Feb 13 1997 01:33    10
    re .*                                         
    
    The previously referenced standard (which seems to have moved anyway)
    does not specify what MIME Content-type is expected for robots.txt.
    
    I am using a cut-down HTTP server for a couple of legacy wrapper
    situations, and it hard-codes "text/html", whereas I would expect a
    full-function HTTP server to specify something else. Does anyone know
    what Digital's internal crawlers expect in the way of MIME Content-type?
    Will text/anything do? Do they even check the MIME Content-type?
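
    To make the "will text/anything do?" question concrete, here is a
    minimal sketch of how a crawler might sort the Content-type it gets
    back for robots.txt. The function names and the three categories are
    made up for illustration; nothing here describes what Digital's
    crawlers (or any other real crawler) actually do.

```python
def media_type(header_value):
    """Return the lower-cased type/subtype, dropping parameters such as charset."""
    return header_value.split(";", 1)[0].strip().lower()


def classify_robots_content_type(header_value):
    """Sort a Content-type header value into three hypothetical categories."""
    mt = media_type(header_value)
    if mt == "text/plain":
        return "expected"      # what one would hope a server sends
    if mt.startswith("text/"):
        return "other-text"    # e.g. a hard-coded text/html
    return "non-text"          # anything else


if __name__ == "__main__":
    for value in ("text/plain; charset=us-ascii", "text/html", "image/gif"):
        print(value, "->", classify_robots_content_type(value))
```

    Whether a crawler accepts "other-text", rejects it, or ignores the
    header altogether is a policy decision for the crawler's author.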
3807.4    I'd expect text/plain    HOUBA::MEHERS    Damian, http://bigbird.geo.dec.com/    Thu Feb 13 1997 02:48    1
    
3807.5    AUSS::GARSON    DECcharity Program Office    Tue Feb 18 1997 23:34    6
    re .3
    
    Partly answering my own question: the new draft of this standard (as
    yet unfinished) says explicitly that the Content-type must be
    "text/plain". It does not, however, say what happens if the
    Content-type is anything else.
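
    For a server that can be changed, serving robots.txt as "text/plain"
    is a small special case. The following is only a sketch using Python's
    standard http.server module; the robots.txt rules, the HTML body and
    the port number are placeholders, not taken from any real setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical robots.txt content, for illustration only.
ROBOTS_BODY = b"User-agent: *\nDisallow: /private/\n"
HTML_BODY = b"<html><body>wrapped legacy page</body></html>"


def content_type_for(path):
    """Pick the Content-type: text/plain for /robots.txt, text/html otherwise."""
    if path.split("?", 1)[0] == "/robots.txt":
        return "text/plain"
    return "text/html"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ctype = content_type_for(self.path)
        body = ROBOTS_BODY if ctype == "text/plain" else HTML_BODY
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the sketch server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

    Calling serve() and then fetching /robots.txt should show
    "Content-Type: text/plain" in the response headers.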
3807.5    AUSS::GARSON    DECcharity Program Office    Thu Feb 20 1997 01:30    8
    re .3
    
    A possible workaround that was suggested to me offline is to use a
    proxy server.
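
    One reading of the proxy workaround is to put a small proxy in front
    of the cut-down server and have it overwrite the Content-type for
    /robots.txt only. The sketch below assumes that reading; UPSTREAM,
    rewrite_headers and ProxyHandler are hypothetical names, and it
    handles plain GET requests only.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical address of the cut-down server that hard-codes text/html.
UPSTREAM = "http://127.0.0.1:8081"

# Headers the proxy regenerates itself, so upstream copies are dropped.
DROPPED = {"connection", "transfer-encoding", "content-length", "server", "date"}


def rewrite_headers(path, headers):
    """Copy (name, value) pairs, forcing text/plain for /robots.txt."""
    is_robots = path.split("?", 1)[0] == "/robots.txt"
    out = []
    for name, value in headers:
        lname = name.lower()
        if lname in DROPPED:
            continue
        if is_robots and lname == "content-type":
            continue  # replaced below
        out.append((name, value))
    if is_robots:
        out.append(("Content-Type", "text/plain"))
    return out


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            with urllib.request.urlopen(UPSTREAM + self.path) as upstream:
                status = upstream.status
                headers = list(upstream.headers.items())
                body = upstream.read()
        except urllib.error.HTTPError as err:
            self.send_error(err.code)  # pass upstream error status through
            return
        except urllib.error.URLError:
            self.send_error(502)       # upstream not reachable
            return
        self.send_response(status)
        for name, value in rewrite_headers(self.path, headers):
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the sketch proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

    The header rewriting is kept in its own function so that it can be
    checked without starting either server.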