T.R | Title | User | Personal Name | Date | Lines |
---|
516.1 | Does "no response" = "no solution" ?? | HPSTEK::JBATES | John D. Bates | Sat Apr 29 1989 01:40 | 18 |
| After placing the DECW$SERVER_RETRY_WRITE_MIN and
DECW$SERVER_RETRY_WRITE_MAX parameters on our server nodes and
increasing the number of LRPs our 2DBA002 problems "virtually"
went away. The only problems we now see are when we have network
problems.
HOWEVER: I now have a user at another site that has a
configuration very similar to .0 and has done the things that
made us fly and he seems to have a LOT of these errors. Since
this note has not been responded to in almost a month is it safe
to assume that if these things mentioned above don't fix the
problem there are no other fixes to try?
I am going to the user site next week and am not looking forward
to saying "Hey learn to live with it". Any suggestions welcome.
John
|
516.2 | Our favorite - the 0x2dba002 error | 34858::SOCHA | Out in the Field | Mon May 22 1989 14:00 | 39 |
| There have been numerous Notes in this Conference regarding
the infamous 0x2dba002 error which can occur when running applications
remotely. From the internal QAR database for DECwindows, it was stated
that this would not be fixed until DECwindows v2.0.
-.1
>>> Since
>>> this note has not been responded to in almost a month is it safe
>>> to assume that if these things mentioned above don't fix the
>>> problem there are no other fixes to try?
Since no one seems to have a definite fix for those of us afflicted with
this problem, I would like to gather a list here of those actions which
seem to help. Perhaps someone from engineering could indicate whether
this problem is always caused by resource shortages, or whether program
errors can come up under this error.
So far, I have seen the following recommendations:
(1) Set DECW$SERVER_RETRY_WRITE_MIN = 150000
Set DECW$SERVER_RETRY_WRITE_MAX = 3000000
(2) Increasing the system parameter LRPCOUNT.
(3) Increasing the number of DECnet Line Receive Buffers on the workstation.
(4) Increasing the number of global pages.
Unfortunately, the problem still persists. Remote DECwrite applications
are aborted around every 10 minutes, and performance of remote applications
is very jerky. I have observed a large number of DECnet Line User and
System buffers unavailable on the workstations. The DECnet circuit only
shows a few, and there are only a few on the remote client.
So what do we try next??
Kevin
|
516.3 | One catalyst found... | 38320::KIRK | Steve Kirk | Mon May 22 1989 18:06 | 10 |
|
No suggestions on parameter settings here.
I have however observed that the ListBox widget exacerbates the
problem. Our product group eventually wrote our own list-box
equivalent because anytime we did a large number of changes to the
contents of a ListBox widget, the connection was lost via the infamous
0x2dba002 error. Rather irritating...
|
516.4 | My collected wisdom on the subject, FWIW | DECWIN::FISHER | Burns Fisher 381-1466, ZKO3-4/W23 | Tue May 23 1989 11:57 | 36 |
| The reason there are not too many answers is that there are not too many
workarounds. I think, after some experimentation, that I would reduce the
retry_min number substantially. For VMS 5.1, try something more like
5000 (one retry every 500 ms.) On VMS version 2, the units of these numbers
have been fixed to be milliseconds, so the number would be more like 500.
As to the max, I'm not sure that 3000000 is reasonable. This means that you
will keep trying (and thus hanging your server) for 5 minutes. Chances
are if it does not work in 30 seconds, it's not going to. I would put
300000 for V5.1.
Note that for VMS V5.2, the numbers will default to what I suggested. You
will want to remove any logical name definitions that you have made, since
the multiplier factor has changed. (Sorry about that, but it was just
plain wrong in V5.1, and it is not documented anyway!) We
did some tests with people who use DECwrite, and they thought these numbers
were reasonable.
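
To make the arithmetic behind these numbers explicit, here is a small sketch in Python. It assumes that one V5.1 unit is 100 microseconds; that figure is only inferred from "5000 = one retry every 500 ms" and "3000000 = 5 minutes" above, it is not taken from any documentation. The function names are made up for illustration.

```python
# Assumption: on VMS V5.1 one unit of the retry logicals is 100 microseconds.
# This is inferred from the figures quoted in this note, not from documentation.
V51_UNIT_US = 100


def v51_units_to_ms(units):
    """Convert a V5.1 retry value to milliseconds using integer arithmetic."""
    return units * V51_UNIT_US // 1000


def describe(label, units):
    """Return a one-line summary of what a V5.1 retry value means in time."""
    ms = v51_units_to_ms(units)
    return f"{label}: {units} units = {ms} ms ({ms / 1000:g} s)"


if __name__ == "__main__":
    # Values recommended in .2, followed by the values suggested in this reply.
    print(describe("retry_min from .2", 150000))
    print(describe("retry_max from .2", 3000000))
    print(describe("retry_min suggested", 5000))
    print(describe("retry_max suggested", 300000))
```

Under that assumption, a value already expressed in milliseconds is simply the V5.1 number divided by 10, which matches the "more like 500" figure given above.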
If that does not help, the next step is to try the TCP/IP transport (see
the TCPIP keyword in this conference), or wait for the internal field test
of Version 2 DECwindows. TCP will make more efficient use of transport
buffers. Version 2 will improve the transport buffer use efficiency for
all transports, and will give you some "knobs to turn" for the buffer sizes.
You asked about whether this can be caused by program error. The answer is
yes, but the reverse is not true. Bugs can cause the problem, but the problem
does not necessarily imply bugs in the client. If you do anything which
prevents the client from reading for a while, you can cause the problem.
For example, when you respond to a button/menu/whatever and go off and do
some work without periodically looking at the input queue, you are making
the problem more likely to happen. However, it can still happen with a
perfectly ok client which is swamped with requests. Scroll bars with
a "magnifier window" like mail are my favorite way to make this happen,
for example.
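
The effect of a client that stops reading can be shown with a toy model in Python. This is only a sketch under invented assumptions: time is in abstract ticks, the transport buffer holds a fixed number of events, and the server gives up after being blocked for more than retry_max_ticks in a row. None of the names or numbers come from the real server code.

```python
def simulate(total_ticks, events_per_tick, capacity,
             busy_from, busy_to, poll_every, retry_max_ticks):
    """Return the tick at which the server gives up, or None if it never does.

    The client drains the whole buffer whenever it reads. It reads on every
    tick while idle. While busy (busy_from <= tick < busy_to) it reads only
    every poll_every ticks; poll_every == 0 means it never reads while busy.
    """
    queued = 0   # events sitting in the transport buffer
    blocked = 0  # consecutive ticks on which the server could not write
    for tick in range(total_ticks):
        busy = busy_from <= tick < busy_to
        polls = poll_every > 0 and (tick - busy_from) % poll_every == 0
        if not busy or polls:
            queued = 0  # client empties its input queue
        if queued + events_per_tick > capacity:
            blocked += 1  # buffer full: the server has to retry the write
            if blocked > retry_max_ticks:
                return tick  # server drops the connection
        else:
            queued += events_per_tick
            blocked = 0
    return None


if __name__ == "__main__":
    # Client goes off to do work for ticks 10..89 without looking at input.
    print(simulate(100, 4, 20, 10, 90, poll_every=0, retry_max_ticks=30))
    # Same work, but the client checks its input queue every 3 ticks.
    print(simulate(100, 4, 20, 10, 90, poll_every=3, retry_max_ticks=30))
```

In this model the only difference between the two runs is whether the client looks at its input queue during the long operation, which is the point made above. A client that is simply swamped corresponds to raising events_per_tick until even regular reads cannot keep up.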
Burns
|
516.5 | | 4315::KONING | NI1D @FN42eq | Tue May 23 1989 17:30 | 6 |
| Why does DECnet use buffers less efficiently than TCP? Or to put it
differently, is the DECnet transport going to be fixed up? There aren't
any obvious reasons why one would be less efficient than the other.
paul
|
516.6 | Internally... | STAR::BRANDENBERG | Si vis pacem para bellum | Tue May 23 1989 18:31 | 8 |
|
What Burns meant was that TCP/IP internally buffers more efficiently
than DECnet does. (Actually, it buffers less efficiently but more
effectively.) The DECwindows transports themselves are nearly
identical.
monty
|
516.7 | Under VMS that is... | 56579::thomas | The Code Warrior | Tue May 23 1989 19:32 | 2 |
| Under Ultrix, DECnet and TCP buffer almost identically.
|
516.8 | | ORPHAN::WINALSKI | Paul S. Winalski | Wed May 24 1989 02:01 | 19 |
| My favorite way to make this happen is to start any DECwindows application
with the debugger, say GO in debug, then pull down the Commands menu and
select EXIT. Debug's exit handler gets control before the server has finished
queueing up the flurry of expose events, unmaps, etc. that accompany tearing
down a widget hierarchy. The client isn't dispatching events because it's
stuck in debug. If you ran the application from a DECterm, you can't even
unstick things by giving debug a ^Z or exit, because it's the whole server
that's hung, and it won't deliver stuff to the DECterm any more.
It's more that TCP/IP and DECnet buffer differently than that one works better
than the other. The MIT X server code was designed originally on Unix and a
TCP/IP-based transport. Its transport management code thus fits that
transport very effectively. Had they run on DECnet from day 1, they would
have approached the problem differently and this hang problem might not exist.
It's possible that, under those circumstances, it would have been TCP/IP with
the problems.
--PSW
|
516.9 | | 4315::KONING | NI1D @FN42eq | Wed May 24 1989 12:36 | 4 |
| Re .6: fine, but my question still applies.
paul
|
516.10 | maybe | STAR::BRANDENBERG | Si vis pacem para bellum | Wed May 24 1989 13:27 | 7 |
|
It *may* happen as part of the IPC project. We talked to them and
begged for stream semantics for DECnet IPC connections as an option.
If they have time/resources, they will do it.
m
|
516.11 | | DECWIN::FISHER | Burns Fisher 381-1466, ZKO3-4/W23 | Wed May 24 1989 13:27 | 12 |
| re .5: I will tell you my explanation as a non-expert and let Monty
fill in the gaps and fix up the misconceptions...
DECnet is record-oriented and TCP is stream-oriented. Given that the flow
of info between client and server is stream-oriented, it favors TCP. The
version 2 DECwindows server has code which compresses more X packets into
DECnet buffers, which makes it look more stream-like.
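
As a rough illustration of why packing helps, here is a Python sketch that counts transport buffers for the same series of requests, first with one buffer per request and then with requests packed greedily into fixed-size buffers. The request sizes and the buffer size are hypothetical, and this is not the actual server logic, just the general idea.

```python
def buffers_one_per_request(request_sizes):
    """Record-style use: every request occupies its own transport buffer."""
    return len(request_sizes)


def buffers_packed(request_sizes, buffer_size):
    """Stream-like use: pack consecutive requests into each buffer.

    Requests are kept whole (never split across buffers) to keep the
    sketch simple.
    """
    buffers = 0
    free = 0  # bytes still free in the buffer currently being filled
    for size in request_sizes:
        if size > buffer_size:
            raise ValueError("request larger than a transport buffer")
        if size > free:
            buffers += 1        # start a new buffer
            free = buffer_size
        free -= size
    return buffers


if __name__ == "__main__":
    requests = [32] * 100  # hypothetical: 100 small requests of 32 bytes
    print(buffers_one_per_request(requests))
    print(buffers_packed(requests, buffer_size=512))
```

With many small requests the packed count is far lower, so a fixed pool of buffers lasts much longer before the writer has to wait.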
Monty?
Burns
|
516.12 | | STAR::BRANDENBERG | Si vis pacem para bellum | Wed May 24 1989 13:29 | 5 |
| Burns and I collided but what he says is so. The changes alleviate but
do not eliminate the problem.
m
|
516.13 | Rehashing old warmed over hash | 40470::PETTENGILL | mulp | Wed May 24 1989 23:35 | 7 |
| See the infamous note 60, response 66; I actually looked at what was going on
with a datascope. DECnet/VAX gets to send 18 datagrams with 10,000 bytes of
quota while VAX Ultrix Connection allows 60 datagrams with 4096 bytes of quota.
Another implementation of DECnet, for example, DECnet Ultrix, may not have the
same behavior as DECnet/VAX. However, in the simple test in 60.66, the server
aborted the connection in both cases.
|
516.14 | another fix for 0x2dba002 | MDVAX3::SOCHA | Out in the Field | Fri Aug 18 1989 13:42 | 27 |
| I got this response in the DECwrite conference to repeated
problems with the 0x2dba002 error. It is another thing to
try, especially if you have a lot of nodes on your LAN.
Kevin
<<< QUEEN::PIX1:[PUBLIC.NOTES]EPIC.NOTE;6 >>>
-< You can't go wrong with DECwrite >-
================================================================================
Note 1959.21 DECwrite or DECwindows error? 21 of 21
DCC::HAGARTY "Essen, Trinken und Shaggen..." 13 lines 18-AUG-1989 04:24
-< Network configuration! >-
--------------------------------------------------------------------------------
Ahhh Gi'day...
Sounds like the infamous BROADCAST NONROUTERS problem! MAKE SURE THAT
THIS IS DONE ON ALL SYSTEMS IN THE LAN, but firstly yours...
Count the numbers of nonrouters on the LAN (say in the region of 300),
and do a:
$ MC NCP SET EXEC MAX BROADCAST NONROUTERS 512
$ MC NCP DEF EXEC MAX BROADCAST NONROUTERS 512
This will stop the timeouts happening to the other nodes in the LAN! On
big machines, make it 1024!
|
516.15 | Curious | EAGLE1::BRUNNER | VAX Vector Architecture | Thu Jan 04 1990 20:33 | 4 |
| What I am trying to figure out as a novice is why I get this error when I
invoke a remote DECWRITE through a remote FileVUE (both on the same system)
but not when I invoke the remote DECWRITE directly (by remote job or
logging into the remote system.) How is FileVUE getting in the way?
|