T.R | Title | User | Personal Name | Date | Lines |
---|
640.1 | version of VMS ? | GALVIA::polomt.ilo.dec.com::duke | | Tue Feb 04 1997 13:24 | 10 |
| Alan,
I've copied the files over. I need to know the version of VMS that the
customer is using so that I can interpret the addresses in the DMP file.
Perhaps you can tell me what customer is having the problem. I've fixed
several server crashes recently and have given Reuters a patched image. I
need to check from the dump file whether that patch could affect this
problem.
thanks,
Ronan
|
640.2 | More info. | KERNEL::CROOKS | Nice work if you can get it.... | Wed Feb 05 1997 12:41 | 10 |
| Hi Ronan
Thanks for the quick response. Yes, I am aware of the recent fixes
for Reuters; this customer, though, is BP (Oil).
They are running OpenVMS v6.2 on a VAX 6000-510
hope this helps
Alan.
|
640.3 | | GALVIA::DUKE | Ronan | Mon Feb 10 1997 14:44 | 26 |
| Hi Alan,
I've been away the last couple of days so I haven't made a lot of progress.
It turns out that we haven't got a VAX machine in the plant here running OVMS
V6.2. What I'd prefer here is that the customer would take the server
patch that I've sent to Reuters. This patch is linked on V6.1 so it'll run on
the customer's system. That will enable me to correlate the dump addresses
with the linker map file.
Alternatively, I should be able to get V6.2 put up on a VAX over here in the
next day or 2.
The advantage of the first is that some of the fixes in the patch may help the
problem. On the other hand, the customer may not want to do that so I can do
the second.
What do you reckon ?
Ronan
ps From looking at the dump files so far, I'm not that hopeful that any of the
fixes so far will affect the problem. The user seems to have a server with a VAS
front end and another server at the rear. The crash seems to happen at a
different place each time. Can the customer give a general description of the
VTX set up ? You can mail me if they'd rather it not be discussed in a
notesfile. If the dump file + linker map file does not provide enough info,
I'll be looking for more specifics anyway in order to replicate the problem.
|
640.4 | BTW... | GALVIA::DUKE | Ronan | Mon Feb 10 1997 14:53 | 5 |
| the server patch is at:
HNGOVR::PUBU$:[duke]vtxsrv.exe;1
-ronan
|
640.5 | | WELKIN::ADOERFER | Hi-yo Server, away! | Mon Feb 10 1997 17:58 | 2 |
| If it helps, I have a small vax running 6.2, there may be enough room
to look at images if you need it.
|
640.6 | | KERNEL::CROOKS | Nice work if you can get it.... | Tue Feb 11 1997 10:46 | 12 |
| Hi Ronan,
I will suggest to the customer they install the patch but
let them know that there are no guarantees with it. For what
it's worth, it doesn't seem like they have had another occurrence
in the last week or 2 anyway....
will let you know how it goes
thanks for now
Alan.
|
640.7 | | GALVIA::DUKE | Ronan | Tue Feb 11 1997 11:11 | 6 |
| re .-2
Bill,
I need about 8k blocks to copy over the olbs, link them and create the map
file. That would be a big help if there's enough space on your system.
-Ronan
|
640.8 | | GALVIA::DUKE | Ronan | Wed Feb 12 1997 17:35 | 41 |
| Alan,
thanks to Bill, i was able to link the server on a V6.2 system.
i've analyzed the dump of 14-jan-1997
it appears that the server is getting a remote page by first sending a connect
message to a remote server. It has received the confirm message but this
message seems to be incorrect since the accvio happens when the server tries to
unload the confirm message:
GLOBAL ROUTINE SRV$UTL_UNLOAD_CONFIRM( sess:REF $SCB$ ): NOVALUE =
!++
!
! FUNCTIONAL DESCRIPTION:
!
! Unload the goods from a received confirm message.
cmd = .sess[SCB_COMMAND];
sess[SCB_PROTOCOL] = .cmd[VAP_CONF_PROTOCOL]; <-- crash
the cmd variable must be 0.
This is confirmed from the dump file which has
SCB_COMMAND to be 000000
SCB_COMMAND_LEN is 0.
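As a rough illustration only (this is not the VTX server source, and every name, offset and length below is hypothetical), the unload step could check the command buffer and its length before reading the protocol field, so that an empty confirm message is rejected instead of causing an accvio. A minimal Python sketch of that check:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical layout: where the protocol byte sits in a confirm message,
# and the shortest message that could contain it.
CONF_PROTOCOL_OFFSET = 0
MIN_CONFIRM_LEN = 1


@dataclass
class Session:
    """Stand-in for the session control block; field names are illustrative."""
    command: Optional[bytes] = None   # plays the role of SCB_COMMAND
    command_len: int = 0              # plays the role of SCB_COMMAND_LEN
    protocol: Optional[int] = None    # plays the role of SCB_PROTOCOL


def unload_confirm(sess: Session) -> bool:
    """Copy the protocol out of a confirm message, refusing empty ones.

    Returns True if the field was unloaded, False if the message was rejected.
    """
    cmd = sess.command
    if cmd is None or sess.command_len < MIN_CONFIRM_LEN or len(cmd) < MIN_CONFIRM_LEN:
        # The state seen in the dump: no buffer and a length of zero.
        return False
    sess.protocol = cmd[CONF_PROTOCOL_OFFSET]
    return True
```

A check like this would only stop the crash; it would not explain why the remote end produced an empty confirm message in the first place.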
The remote server/back-end application has the name CLVX.
Can you ask the customer what the name CLVX represents ? It will be in the
server startup file as
remote CLVX node::"nn="
It is either a server or a back-end application (ELK) of some sort. If it is
an ELK, it looks likely that the confirm message being sent to the server is not
being created correctly. If it is a server, then we have an interesting
problem, as other VTX servers should know how to create those messages correctly.
-Ronan
|
640.9 | reply of 9/4 | GALVIA::polomt.ilo.dec.com::duke | | Tue Apr 15 1997 17:58 | 82 |
| From: NAME: Alan Crooks
FUNC: Customer Services
TEL: 01256 373561 <CROOKS@A1CHEFS@RDGMTS@REO>
To: NAME: RONAN DUKE <"A1::RONAN DUKE"@MRGATE@ESSB@ESSB@GALVIA@KERNEL@CHEFS@MRGATE@RDGMTS@REO>
Hi Ronan,
Sorry it's been a bit quiet on this one but I've been out
of the office a lot and the customer doesn't chase on it.
To bring you up to date, I suggested access to their system but
that would not be allowed for CLVX (it would be for the others). The
CLVX system seems to be a law unto itself within BP, following is the
mail the customer sent me in response to the suggestion.
I'm not sure there's much we can do about it if they won't
upgrade the v4.1 system.... and although that's not the
perfect answer I think we should stand on that point for
now.
thanks for your help so far I will get back to you with any
progress.
cheers
Alan.
Mail from BP
------------------
Hi Alan,
I think there is almost no chance of getting the CLVX system to upgrade to
VTX V6.2. Over six months ago I tried to persuade them to do so for other
reasons (so that we could use TCP/IP for communications rather than DECnet)
but with no success. Unfortunately we seem to be one of the few sites within
our company that actually attempts to keep (most of) our software up to the
latest versions on VMS systems.
Is it our site or CLVX that Engineering would be wanting to dial in to? If
it's the latter we're almost certainly out of luck again. If it's the former,
can Engineering make use of AES? We no longer have modems attached to our
system for Digital to dial in: we've relied on AES for the last couple of
years.
By the way, we've had one server crash since you sent us the most recent image,
--------------
END
From: NAME: RONAN DUKE
FUNC: Corporate Engineering <RONAN DUKE@A1@ILO>
To: NAME: CROOKS
<"A1CHEFS::CROOKS"@MRGATE@RDGMTS@RDGMTS@CHEFS@KERNEL@GALVIA@MRGATE@ILOV05@ILO>
Hi Alan,
>
>Re your point about customer files, as logistically that would
>be a bit messy getting them across etc, would a dial-in
>achieve the same thing or do you need the files on your system
>to work with.
>
It's handier for me to have the files on my own system from the point of view
of debugging. However, I don't have a copy of a V4.1 server (I don't think
there was a V4.4 of VTX) - we only took over at V6.0 ! - so I think that that
would make debugging a waste of time.
Is it possible for the customer to upgrade the CLVX server ? There does seem
to be some problem with it as most of the crashes so far have been caused as a
result of messages sent back from it to the main server. Is there any reason
for it to be still at V4.1 ? I don't want to piss them off but we could say
that a V4.1 server is unsupported (actually it *is* unsupportable since I don't
have a copy of it). Theoretically, the V6.2 server should be able to
interoperate with a V4.1 server (in Digital (DIGITAL ?), there are still V3.0
servers working with V6.2 servers) but they may have found something specific.
For example, we know of a V6.2 to V5.1 interoperability problem with search
queries.
Ideally, I'd want them to upgrade CLVX to V6.2. I'd give that a fairly good
chance of fixing the problem. If the problem still persists, I could dial in
(I also have a cryptokey and could telnet in if they allowed that) and do some
debugging or, at least, get enough info to set up a similar environment here.
Ronan
|
640.10 | my reply of 15/4 | GALVIA::polomt.ilo.dec.com::duke | | Tue Apr 15 1997 18:29 | 42 |
| Hi Alan,
>
>Sorry it's been a bit quiet on this one but I've been out
>of the office a lot and the customer doesn't chase on it.
don't worry about that - i've been pretty busy and it suited me.
>
>To bring you up to date, I suggested access to their system but
>that would not be allowed for CLVX (it would be for the others). The
>CLVX system seems to be a law unto itself within BP, following is the
>mail the customer sent me in response to the suggestion.
>
>I'm not sure there's much we can do about it if they won't
>upgrade the v4.1 system.... and although that's not the
>perfect answer I think we should stand on that point for
>now.
i agree - all the data so far points to the CLVX system.
thinking about it some more, i'm not sure what good i can do by dialing into
their system, since the crash happens fairly irregularly. however, each time
there's a crash, the dump file gives me a bit more info so that helps and seems
to be all we can do for now.
>
>thanks for your help so far I will get back to you with any
>progress.
one other point - the customer mentioned another server crash with the new
image. i'd very much like to get a copy of the bugcheck dump file. i modified
the image to create more info for me (only in the area of unpacking messages
from CLVX) and I'd be very interested in seeing the results. can you see if
they've still got this file ? if not, can you ask the customer to send on any
further dumps ?
thanks,
Ronan
|