Title: | DEC TCP/IP Services for OpenVMS |
Notice: | Note 2-SSB Kits, 3-FT Kits, 4-Patch Info, 7-QAR System |
Moderator: | ucxaxp.ucx.lkg.dec.com::TIBBERT |
|
Created: | Thu Nov 17 1994 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 5568 |
Total number of notes: | 21492 |
5524.0. "SYSTEM-F-CONNECFAIL and PATHWORKS" by TLE::MICHAUD (Lisa Michaud, DTN 381-0879) Mon May 19 1997 16:10
I'm attempting to solve a problem that started a few months ago, before I
arrived here. As far as I know, nothing on the system changed. I have since
upgraded UCX to V4.1 ECO4 (OpenVMS/VAX is version 7.0), but the problem still
exists. I also changed the send/receive TCP quotas to 250000, because that was
a suggestion given in one of the Notes files I've been searching. Other notes
seem to suggest that it's either a UCX resource problem or a physical network
problem.
People are randomly losing connections to the PATHWORKS file server when doing
builds on their PCs. It's completely random as to when it happens, and it's
happening from multiple PCs running different versions of NT and W95. They
all use TCP/IP to connect to the file server.
Here is what's in the PATHWORKS file server log file:
15-May-97 11:28:21 (netio) Device bg, unit 51 (bg51): Network read AST returned
error status 8412 (%SYSTEM-F-CONNECFAIL, connect to network object timed-out or
failed)
And sometimes the errors are more extensive:
ssn_rqst_trans: IO$_WRITEVBLK channel 480 error 8428(%SYSTEM-F-LINKDISCON, network partner disconnected logical link)
shutdown_socket: IO$M_SHUTDOWN channel 480 error 20(%SYSTEM-F-BADPARAM, bad parameter value)
shutdown_socket: IO$_DEACCESS channel 480 error 8412(%SYSTEM-F-CONNECFAIL, connect to network object timed-out or failed)
ssn_rqst_trans: IO$_WRITEVBLK channel 576 error 8428(%SYSTEM-F-LINKDISCON, network partner disconnected logical link)
shutdown_socket: IO$M_SHUTDOWN channel 576 error 20(%SYSTEM-F-BADPARAM, bad parameter value)
shutdown_socket: IO$_DEACCESS channel 576 error 8412(%SYSTEM-F-CONNECFAIL, connect to network object timed-out or failed)
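When the log runs overnight, it can help to tally these entries per channel rather than read them by eye. The following is a minimal Python sketch, not a PATHWORKS tool: the entry layout is inferred only from the excerpt above, and the names `ENTRY`, `parse_entries`, and `tally_by_channel` are made up for illustration.

```python
import re
from collections import Counter

# Entries copied from the log excerpt above, one entry per line.
LOG = """\
ssn_rqst_trans: IO$_WRITEVBLK channel 480 error 8428(%SYSTEM-F-LINKDISCON, network partner disconnected logical link)
shutdown_socket: IO$M_SHUTDOWN channel 480 error 20(%SYSTEM-F-BADPARAM, bad parameter value)
shutdown_socket: IO$_DEACCESS channel 480 error 8412(%SYSTEM-F-CONNECFAIL, connect to network object timed-out or failed)
ssn_rqst_trans: IO$_WRITEVBLK channel 576 error 8428(%SYSTEM-F-LINKDISCON, network partner disconnected logical link)
shutdown_socket: IO$M_SHUTDOWN channel 576 error 20(%SYSTEM-F-BADPARAM, bad parameter value)
shutdown_socket: IO$_DEACCESS channel 576 error 8412(%SYSTEM-F-CONNECFAIL, connect to network object timed-out or failed)
"""

# Assumed layout: routine, I/O function, channel number, numeric status,
# then the symbolic status name. Everything needed sits before the first
# comma, so the pattern also works when the message text is wrapped.
ENTRY = re.compile(r"(\w+): (IO\$\w+) channel (\d+) error (\d+)\(%(SYSTEM-F-\w+)")


def parse_entries(text):
    """Return (routine, io_function, channel, status_code, status_name) tuples."""
    return [
        (m.group(1), m.group(2), int(m.group(3)), int(m.group(4)), m.group(5))
        for m in ENTRY.finditer(text)
    ]


def tally_by_channel(text):
    """Count how often each status name appears on each channel."""
    counts = Counter()
    for _routine, _io, channel, _code, name in parse_entries(text):
        counts[(channel, name)] += 1
    return counts


if __name__ == "__main__":
    for (channel, name), n in sorted(tally_by_channel(LOG).items()):
        print(f"channel {channel}: {name} x{n}")
```

In this excerpt both channels show the same three-step sequence (LINKDISCON on the write, then BADPARAM and CONNECFAIL during shutdown), which suggests the later two errors are follow-on effects of the disconnect rather than separate faults.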
Is there a way to tell if this is some sort of resource problem, or a physical
network problem? It's possible that someone changed something on the system
before I arrived, either with a SYSGEN or UCX parameter. Here's some UCX info
from the system:
Communication Parameters
Local host: murtl Domain: zko.dec.com
Cluster timer: 5
Maximum Current Peak
Interfaces 20 2 2
Device_sockets 300 21 21
Routes 65535 13 13
Services 200 0 1
Proxies 58
Type: Ethernet Free Maximum Max Bytes Minimum Min Bytes
Large buffers 20 200 377600 10 18880
Small buffers 150 1000 256000 50 12800
IRPs 20 200
Non UCX buffers 10
Remote Terminal
Large buffers: 10
UCBs: 4
Virtual term: disabled
MBUF Summary
                      Small_static  Large_static  Small_dynamic  Large_dynamic
Total buffers                   50            10             50              0
Free                             1             8             33              0
Busy
  Data                           0             2              0              0
  Header                         5             0              6              0
  Socket                        16             0              5              0
  Prot. control                 11             0              6              0
  Route                         13             0              0              0
  Socket name                    0             0              0              0
  Socket options                 0             0              0              0
  Fragment reassembly            0             0              0              0
  IP address                     2             0              0              0
Size of cluster              13056         19136          13120              0
Free Current Peak Waits Drops
Small Buffers 66 67 0 0
Large Buffers 2 10 0 0
IRPs 3 0 3 0 0
Small clusters Large clusters Non UCX buffers
Free 0 0 0
TCP
Connect initiated: 0 Connect accepted: 12
Connect established: 12 Connect closed: 1
Connect dropped: 0 Embry connect drop: 0
Attempt rtt: 87422 Succeeded rtt: 84412
XMT Delayed ACKs: 35238 Connect timeout: 0
ReXMT timeout: 3159 Persist timeout: 0
Keepalive timeout: 0 Keepalive probes: 0
Keepalive drops: 0 Total XMT segments: 127744
XMT segments: 87409 XMT bytes: 12872705
XMT packet reXMT: 3159 XMT bytes reXMT: 483374
XMT ACK only: 37175 XMT window probes: 0
XMT URG only: 0 XMT wind update pack: 0
XMT CTRL segments: 1 Total RCV segments: 95246
RCV segments: 87424 RCV bytes: 5821967
RCV chksum error: 26 RCV bad offset: 0
RCV too short: 0 RCV dup only pack: 1924
RCV dup only bytes: 123939 RCV part dup pack: 0
RCV part dup bytes: 0 RCV bad order pack: 0
RCV bad order bytes: 0 RCV pack after wind: 0
RCV bytes after wind: 0 RCV pack after close: 0
RCV window probes: 0 RCV dup ACKs: 1
RCV ACK for unXMT: 0 RCV ACK segments: 87422
RCV ACK bytes: 12872718 RCV wind update pack: 0
TCP
MTU size segment: disabled
Delay ACK: enabled
Loopback: disabled
Window scale: enabled
Drop timer: 600
Probe timer: 75
Receive Send
Checksum: enabled enabled
Push: disabled disabled
Quota: 250000 250000
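As a rough check on the TCP counters above, the retransmission and checksum-error rates can be worked out from the display itself. This is only a sketch in Python: it assumes "XMT segments" counts data segments sent and "XMT packet reXMT" counts retransmitted data segments (the usual BSD-style meaning, not confirmed here for UCX), and `parse_counters` and `ratio` are illustrative helper names.

```python
import re

# Selected counters copied from the first "UCX SHOW PROTOCOL TCP" display.
SNAPSHOT = """\
XMT segments: 87409 XMT bytes: 12872705
XMT packet reXMT: 3159 XMT bytes reXMT: 483374
XMT CTRL segments: 1 Total RCV segments: 95246
RCV chksum error: 26 RCV bad offset: 0
"""

# A counter is a label (letters, spaces, dots), a colon, then an integer.
# Two counters share each line, so the label match is kept lazy.
PAIR = re.compile(r"([A-Za-z][A-Za-z. ]*?):\s+(\d+)")


def parse_counters(text):
    """Turn 'Label: value' pairs into a dict of integers."""
    return {name.strip(): int(value) for name, value in PAIR.findall(text)}


def ratio(counters, numerator, denominator):
    """Return one counter as a fraction of another."""
    return counters[numerator] / counters[denominator]


counters = parse_counters(SNAPSHOT)
rexmt = ratio(counters, "XMT packet reXMT", "XMT segments")
bad_sum = ratio(counters, "RCV chksum error", "Total RCV segments")

if __name__ == "__main__":
    print(f"retransmitted data segments: {rexmt:.2%}")
    print(f"received checksum errors:    {bad_sum:.3%}")
```

With these numbers, roughly 3.6% of data segments were retransmitted. Whether that rate points at the network, the PCs, or the server is a judgment call that the counters alone cannot settle.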
If a TCPIPTRACE to one of the PCs would help, how should I set it up? It
would have to run overnight during a build, and I would think the output would
be quite huge. Is there some way I can limit it to only show the necessary
things?
Any hints would be appreciated...
Lisa
T.R | Title | User | Personal Name | Date | Lines |
---|
5524.1 | oops, misleading... | TLE::MICHAUD | Lisa Michaud, DTN 381-0879 | Mon May 19 1997 16:33 | 30 |
| The "UCX SHOW PROTOCOL TCP" output above is misleading: it shows
"connect dropped" as 0 only because I had recently restarted UCX. The "connect
dropped" number normally matches the number of SYSTEM-F-CONNECFAIL
messages in the PATHWORKS server log. Here's another TCP snapshot
after some errors have occurred:
Connect initiated: 0 Connect accepted: 20
Connect established: 20 Connect closed: 8
Connect dropped: 7 Embry connect drop: 0
Attempt rtt: 143492 Succeeded rtt: 138621
XMT Delayed ACKs: 62935 Connect timeout: 0
ReXMT timeout: 5131 Persist timeout: 0
Keepalive timeout: 2 Keepalive probes: 2
Keepalive drops: 0 Total XMT segments: 214573
XMT segments: 143440 XMT bytes: 21150670
XMT packet reXMT: 5127 XMT bytes reXMT: 822845
XMT ACK only: 65994 XMT window probes: 0
XMT URG only: 0 XMT wind update pack: 0
XMT CTRL segments: 12 Total RCV segments: 156509
RCV segments: 142533 RCV bytes: 9478894
RCV chksum error: 42 RCV bad offset: 0
RCV too short: 0 RCV dup only pack: 3041
RCV dup only bytes: 195834 RCV part dup pack: 0
RCV part dup bytes: 0 RCV bad order pack: 0
RCV bad order bytes: 0 RCV pack after wind: 0
RCV bytes after wind: 0 RCV pack after close: 0
RCV window probes: 0 RCV dup ACKs: 7
RCV ACK for unXMT: 0 RCV ACK segments: 143447
RCV ACK bytes: 21173736 RCV wind update pack: 3
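To compare the "Connect dropped" counter with the server log without counting by hand, something like the following Python sketch could be used. It assumes every relevant entry contains the text "8412 (%SYSTEM-F-CONNECFAIL" or "8412(%SYSTEM-F-CONNECFAIL", as in the excerpts quoted in .0; which of those entries UCX actually counts as a drop is not known, and `count_connecfail` and `matches_drop_counter` are illustrative names.

```python
import re

# Two entries copied from the server log excerpts in .0 (a short sample,
# not the full log).
SAMPLE = """\
15-May-97 11:28:21 (netio) Device bg, unit 51 (bg51): Network read AST returned
error status 8412 (%SYSTEM-F-CONNECFAIL, connect to network object timed-out or
failed)
shutdown_socket: IO$_DEACCESS channel 480 error 8412(%SYSTEM-F-CONNECFAIL,
connect to network object timed-out or failed)
"""

# Status 8412 followed by its symbolic name, with or without a space between.
CONNECFAIL = re.compile(r"8412\s*\(%SYSTEM-F-CONNECFAIL")


def count_connecfail(log_text):
    """Count CONNECFAIL entries; whitespace is collapsed so wrapping is harmless."""
    flattened = re.sub(r"\s+", " ", log_text)
    return len(CONNECFAIL.findall(flattened))


def matches_drop_counter(log_text, connect_dropped):
    """True when the log count equals the 'Connect dropped' value."""
    return count_connecfail(log_text) == connect_dropped


if __name__ == "__main__":
    print("CONNECFAIL entries in sample:", count_connecfail(SAMPLE))
```

For a meaningful comparison, the log text would need to cover the same period as the counter, that is, everything since UCX was last started.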
|
5524.2 | fragmented disk | TLE::MICHAUD | Lisa Michaud | Wed May 28 1997 09:52 | 17 |
| FYI, for anyone who might be encountering the same types of errors:
the problem *in this case* ended up being the disk that they were
accessing. It was very badly fragmented. A complete image backup and
restore fixed the problem. The system must have had a hard time keeping
the connection alive while trying to piece together the fragmented files
(the worst file had *26,500* extents!). Also, the system that they
were attaching to wasn't the system where the actual disk resides, so
that probably added to the problems.
The problem probably didn't start as "suddenly" as they thought; it
just became unbearable a couple of months ago, when the disk got to a
point where it was so fragmented that it was unusable.
Lisa
|