
Conference noted::hackers_v1

Title: -={ H A C K E R S }=-
Notice: Write locked - see NOTED::HACKERS
Moderator: DIEHRD::MORRIS
Created: Thu Feb 20 1986
Last Modified: Mon Aug 03 1992
Last Successful Update: Fri Jun 06 1997
Number of topics: 680
Total number of notes: 5456

364.0. "RMS-F-KEY error on only one node" by FROST::HARRIMAN (No longer a 41 class part) Thu Dec 04 1986 10:07

    This conference seems like the best place to get ideas for going
    about solving this problem. I have not seen anything quite like
    it before.
    
    Here's the background:
    
       We are running a cluster with 3 785's and 1 750, 3 HSC50's and
    about 30 or so RA81's. We are running a rather large third-party
    software package which is written mostly in BASIC and C. It uses
    standard RMS. All nodes in the cluster have this image installed.
    Two of the 785's have 32meg, and the 750 and the other 785 have
    8 meg apiece.
    
    Here's the problem:
    
       The package runs fine on the two 32Meg 785's and the 750. On the
    other 785, however, a specific and repeatable failure occurs when
    trying to get records from sorted files. The files are of relative
    organization, and have many, many records in them.
    
       I have traced the path of the code as far as SYS$GET. On the
    working nodes, R0 contains RMS-S-SUCCESS on return. On the failing
    node, however, it contains RMS-F-KEY. It is looking for record
    number 1 in all cases. Note that I can run the very same image on
    each node, and it fails on only one of them.
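
    For anyone following along in the debugger, here is a minimal sketch
    of picking apart the longword that comes back in R0. It assumes only
    the standard VMS condition value layout (severity in bits 0-2,
    message number in bits 3-15, facility number in bits 16-27). The
    names decode_cond, severity_letter and cond_success are made up for
    this note; they are not from the package or from any DEC header.

```c
#include <stdint.h>

/* Fields of a 32-bit VMS condition value, as returned in R0. */
typedef struct {
    unsigned severity;   /* bits 0-2   */
    unsigned message;    /* bits 3-15  */
    unsigned facility;   /* bits 16-27 */
} cond_fields;

/* Hypothetical helper: split a condition value into its fields. */
cond_fields decode_cond(uint32_t r0)
{
    cond_fields f;
    f.severity = r0 & 0x7u;
    f.message  = (r0 >> 3) & 0x1FFFu;
    f.facility = (r0 >> 16) & 0xFFFu;
    return f;
}

/* Map the severity field to the letter shown in messages such as
   RMS-F-KEY: W(arning), S(uccess), E(rror), I(nformational), F(atal). */
const char *severity_letter(unsigned severity)
{
    switch (severity) {
    case 0:  return "W";
    case 1:  return "S";
    case 2:  return "E";
    case 3:  return "I";
    case 4:  return "F";
    default: return "?";   /* 5-7 are reserved */
    }
}

/* The low bit set means the call succeeded (S or I severity). */
int cond_success(uint32_t r0)
{
    return (r0 & 1u) != 0;
}
```

    Feeding it the R0 value from a working node and from the failing
    node at least confirms that both statuses carry the same facility
    number and that only the message and severity fields differ.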
    
    SYSGEN RMS parameters are identical on all nodes. Some of the
    memory-reliant SYSGEN parameters are, of course, smaller on the
    failing node, but also on the 750, and that works... Hardware has
    been considered, but the failing 785 is running newer rev boards
    than the other systems, and VAXsim doesn't see any errors. The
    system has had no BUGCHECKs in over six months, and this problem
    appeared within the past month, after the system was upgraded from
    a 780 to a 785. Compiling and linking on the failing node gives the
    same result. (Additional caveat there: an image compiled and linked
    on the failing node fails on the failing node, but it works on the
    other nodes!)
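
    The SYSGEN reasoning above can be written down as a rough sketch:
    a parameter stays a suspect only if the failing node's value matches
    none of the working nodes. This is just a heuristic, and everything
    here (param_row, is_suspect, collect_suspects) is a hypothetical
    name for illustration. It assumes the values have already been
    copied by hand out of each node's parameter listing.

```c
#include <stddef.h>

/* One parameter, with its value on the failing node and on each
   working node (hypothetical layout, filled in by hand). */
typedef struct {
    const char *name;
    long        failing;
    const long *working;
    size_t      nworking;
} param_row;

/* A parameter is cleared if any working node runs with the same
   value as the failing node; otherwise it remains a suspect. */
int is_suspect(const param_row *row)
{
    for (size_t i = 0; i < row->nworking; i++) {
        if (row->working[i] == row->failing)
            return 0;
    }
    return 1;
}

/* Store the names of suspect parameters in out[] (at most max_out
   of them) and return how many were stored. */
size_t collect_suspects(const param_row *rows, size_t nrows,
                        const char **out, size_t max_out)
{
    size_t found = 0;
    for (size_t i = 0; i < nrows; i++) {
        if (is_suspect(&rows[i]) && found < max_out)
            out[found++] = rows[i].name;
    }
    return found;
}
```

    By this test the memory-reliant parameters drop out, since the 750
    shares the smaller values and works.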
    
    I'm frankly stumped. The vendor doesn't even understand the problem,
    let alone know anything about it. None of the other systems people
    here have been able to come up with anything, and the problem has
    been put back in my lap. We can't upgrade to a newer version of
    VMS since the package "isn't certified" for use on a newer version.
    That means if I report this via an SPR it has a very good chance
    of not being answered satisfactorily. Besides, it's a third party
    software package, and even though SYS$GET belongs to DEC, I can't
    prove there's something wrong with SYS$GET...arghh.
    
    Do any of you eminently creative persons have any other ideas about
    how to go about finding/fixing this weirdness? I have run this
    problem through DEBUG many times, and I can reproduce it at will.
    I know what is going into SYS$GET and what is coming out of it. Any
    and all comments will be appreciated.
    
    /an_extremely_confused_paul
    
T.R  Title  User  Personal Name  Date  Lines
364.1  *** It's a 785 ! ***  VAXWRK::SARO  Nybbles 'n bits 'n bits 'n bits  Mon Dec 08 1986 11:23  3
    
    Seems like you haven't heard about 785's. More than likely it's
    a hardware problem (or is it firmware?).
364.2  Enlighten Me  FROST::HARRIMAN  No longer a 41 class part  Mon Dec 08 1986 12:47  7
>        Seems like you haven't heard about 785's. More than likely it's
>    a hardware problem (or is it firmware?).

 
    No, I hadn't. Could someone who is familiar with this problem give
    me a clue? Or a pointer?
    
364.3  I'll send mail to you  VAXWRK::SARO  Nybbles 'n bits 'n bits 'n bits  Mon Dec 08 1986 16:48  3
    I'm not sure how restricted (or unrestricted) that information
    is. I'll send you excerpts from the FPR (Nov '86).
 
364.4  *DTR* true  FROST::HARRIMAN  No longer a 41 class part  Mon Dec 08 1986 16:52  2
    Acknowledged and appreciated.