Title: | -={ H A C K E R S }=- |
Notice: | Write locked - see NOTED::HACKERS |
Moderator: | DIEHRD::MORRIS |
|
Created: | Thu Feb 20 1986 |
Last Modified: | Mon Aug 03 1992 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 680 |
Total number of notes: | 5456 |
364.0. "RMS-F-KEY error on only one node" by FROST::HARRIMAN (No longer a 41 class part) Thu Dec 04 1986 10:07
This conference seems like the best place to get ideas for solving
this problem; I have not seen anything quite like it before.
Here's the background:
We are running a cluster with 3 785's and 1 750, 3 HSC50's and
about 30 or so RA81's. We are running a rather large third-party
software package which is written mostly in BASIC and C. It uses
standard RMS. All nodes in the cluster have this image installed.
Two of the 785's have 32meg, and the 750 and the other 785 have
8 meg apiece.
Here's the problem:
The package runs fine on the two 32meg 785's and the 750. On the
other 785, however, a specific and repeatable failure occurs when
trying to get records from sorted files. The files are of relative
organization, and have many, many records in them.
I have traced the path of the code to SYS$GET. On the working
nodes, R0 contains RMS-S-SUCCESS. However on the failing node, it
contains RMS-F-KEY. It is looking for record number 1 in all cases.
One caveat: I can run the same image on every node, and it fails
on only one of them.
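For anyone comparing the two R0 values, here is a minimal sketch of how a 32-bit VMS condition value splits into its fields. The field layout (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27) is the standard condition-value format, and 0x00010001 is the normal RMS success status (RMS$_NORMAL). The helper names `decode_status` and `make_status` are hypothetical, Python is used purely for illustration, and the fatal status below is built from made-up fields rather than being the real RMS$_KEY value.

```python
# Sketch: split a VMS condition value (as returned in R0) into its fields.
#   bits 0-2    severity
#   bits 3-15   message number
#   bits 16-27  facility number
#   bits 28-31  control bits (ignored here)

SEVERITY = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}

RMS_NORMAL = 0x00010001  # RMS$_NORMAL: facility 1 (RMS), severity S


def decode_status(status):
    """Return the severity letter, success flag, message and facility numbers."""
    severity = status & 0x7
    return {
        "severity": SEVERITY.get(severity, "?"),
        "success": bool(status & 1),          # low bit set means success
        "message": (status >> 3) & 0x1FFF,
        "facility": (status >> 16) & 0xFFF,
    }


def make_status(facility, message, severity):
    """Hypothetical helper: assemble a condition value from its fields."""
    return (facility << 16) | (message << 3) | severity


if __name__ == "__main__":
    print(decode_status(RMS_NORMAL))
    # Illustrative fatal status in the RMS facility (message number is made up).
    print(decode_status(make_status(1, 100, 4)))
```

Decoding the value seen on the failing node this way confirms whether the facility really is RMS and whether the severity is fatal, which is what the -F- in RMS-F-KEY denotes.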
SYSGEN RMS parameters are identical on all nodes. Some of the
memory-reliant SYSGEN parameters are, of course, smaller on the
failing node, but also on the 750, and that works... Hardware has
been considered, but the failing 785 is running newer rev boards
than the other systems, and VAXsim doesn't see any errors. The system
has had no BUGCHECKs in over six months, and this problem first appeared
in the past month, after the system was upgraded from a 780 to a 785.
Compiling and linking on the failing node gives the same result.
(Additional caveat there: an image compiled and linked on the failing
node fails on that node but works on the other nodes!)
I'm frankly stumped. The vendor doesn't even understand the problem,
let alone know anything about it. None of the other systems people
here have been able to come up with anything, and the problem has
been put back in my lap. We can't upgrade to a newer version of
VMS since the package "isn't certified" for use on a newer version.
That means if I report this via an SPR it has a very good chance
of not being answered satisfactorily. Besides, it's a third party
software package, and even though SYS$GET belongs to DEC, I can't
prove there's something wrong with SYS$GET...arghh.
Do any of you eminently creative persons have any other ideas about
how to go about finding/fixing this weirdness? I have run this
problem under DEBUG many times, and I can reproduce it at will. I know what
is going into SYS$GET and what is coming out of it. Any and all
comments will be appreciated.
/an_extremely_confused_paul
T.R | Title | User | Personal Name | Date | Lines
364.1 | *** It's a 785 ! *** | VAXWRK::SARO | Nybbles 'n bits 'n bits 'n bits | Mon Dec 08 1986 11:23 | 3
Seems like you haven't heard about 785's. More than likely it's
a hardware problem (or is it firmware?).
364.2 | Enlighten Me | FROST::HARRIMAN | No longer a 41 class part | Mon Dec 08 1986 12:47 | 7
> Seems like you haven't heard about 785's. More than likely it's
> a hardware problem (or is it firmware?).
No I hadn't. Could someone who is familiar with this problem give
me a clue? Or a pointer?
364.3 | I'll send mail to you | VAXWRK::SARO | Nybbles 'n bits 'n bits 'n bits | Mon Dec 08 1986 16:48 | 3
I'm not sure how restricted (or unrestricted) that information
is. I'll send you excerpts from the FPR (Nov '86).
364.4 | *DTR* true | FROST::HARRIMAN | No longer a 41 class part | Mon Dec 08 1986 16:52 | 2
Acknowledged and appreciated.