T.R | Title | User | Personal Name | Date | Lines |
---|
2427.1 | Fix the client stack | CFSCTC::HUSTON | Steve Huston | Thu Jan 30 1997 09:43 | 16 |
| >The W3.1 clients seem to *always* leave their sockets
>open on the server. Even if they shut down tidily (release and rundown).
This indicates a problem with the W3.1 TCP stack. You should get a
TCP trace of one of these TCP connections, verify that the client is
not handling the session correctly, and go to the stack vendor for
a real fix.
One other off-the-cuff thing you could try on the client side... do the
rundowns and then sleep a few seconds before exiting the program. Maybe if
the stack code has a bit more time, it'll finish completely.
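
For reference, a minimal sketch of what a complete shutdown looks like at the plain-socket level: send our end-of-stream, wait (with a timeout) for the peer's, then close. This is generic Python socket code, not ObjectBroker's rundown path; the function name `orderly_close` and the 5-second default are assumptions made for illustration.

```python
import socket

def orderly_close(sock, timeout_s=5.0):
    """Half-close, wait for the peer's end-of-stream, then close.

    Returns the number of bytes discarded while waiting.
    """
    drained = 0
    sock.settimeout(timeout_s)
    try:
        sock.shutdown(socket.SHUT_WR)   # send our FIN; reading still works
        while True:
            chunk = sock.recv(4096)
            if not chunk:               # b"" means the peer closed its side too
                break
            drained += len(chunk)
    except OSError:
        pass                            # timeout or reset: stop waiting
    finally:
        sock.close()
    return drained
```

The bounded wait plays the same role as the "sleep a few seconds" suggestion: it gives the stack time to finish the exchange before the process exits.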
I'd be willing to help more with this if needed - you can contact me
off-line at [email protected].
-Steve
|
2427.2 | | LEMAN::DONALDSON | Froggisattva! Froggisattva! | Thu Jan 30 1997 10:45 | 28 |
| Steve, thanks for the reply. I'm pretty sure you must be right
(I've seen W3.1 OBB v2.5 cleaning up properly, so I was a
bit surprised with their claims). However, the solution
"fix the stack" will probably not be accepted (they don't
want to spend money on something that will be thrown away
soon).
In any case we need to put in place a solution for those
clients that end abruptly (powerfail or whatever).
At the moment: we monitor (with a DCL script) the number
of sockets and re-start the server if they go above danger-level
(about 70 sockets); we check each socket and ping the remote
end - if it's inactive we disconnect the socket. As a
solution it's clunky but it seems to work.
I'd like any sockets which haven't been used for x minutes
to go away automatically. Any ideas how to do this?
(Wishlist: allow me to configure this).
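
A minimal sketch of the bookkeeping behind "go away after x minutes", assuming the server side can see each connection's traffic and can close a connection itself. The class `IdleTracker` and its method names are hypothetical; nothing here is an ObjectBroker interface.

```python
import time

class IdleTracker:
    """Remember when each connection was last used and report the stale ones."""

    def __init__(self, limit_s, clock=time.monotonic):
        self.limit_s = limit_s      # the configurable "x minutes", in seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}

    def touch(self, conn_id):
        """Call whenever a request or reply passes over the connection."""
        self.last_seen[conn_id] = self.clock()

    def forget(self, conn_id):
        """Call after the connection has been closed."""
        self.last_seen.pop(conn_id, None)

    def expired(self):
        """Return the ids that have been idle longer than the limit."""
        now = self.clock()
        return sorted(c for c, t in self.last_seen.items()
                      if now - t > self.limit_s)
```

A periodic job would call `expired()`, disconnect each id it returns, and then call `forget()`, which is what the DCL script approximates from outside the server.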
In the medium term the clients will get a release
which releases the objref and re-connects if a request
fails. This will avoid the worst effects of re-starting
the servers.
(Wishlist: add this binding - something like
OBB_BINDING_AUTOMATIC_WITH_FAILOVER).
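
A minimal sketch of that client-side behaviour, assuming a failed request surfaces as a `ConnectionError`. `FailoverBinding`, `connect`, `release` and `request` are hypothetical stand-ins for whatever the client really uses to obtain, drop and call through an object reference.

```python
class FailoverBinding:
    """Hold one handle; on a failed request release it, rebind, and retry."""

    def __init__(self, connect, release, max_attempts=2):
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self._connect = connect          # returns a fresh handle
        self._release = release          # discards a stale handle
        self._max_attempts = max_attempts
        self._handle = None

    def invoke(self, request):
        last_error = None
        for _ in range(self._max_attempts):
            if self._handle is None:
                self._handle = self._connect()
            try:
                return request(self._handle)
            except ConnectionError as err:
                last_error = err
                self._release(self._handle)   # drop the dead reference
                self._handle = None           # force a rebind on the next pass
        raise last_error
```

Retrying is only safe for requests that can be repeated without harm; a request that may already have been executed on the server needs more care than this sketch shows.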
John D.
|
2427.3 | | CFSCTC::HUSTON | Steve Huston | Thu Jan 30 1997 12:19 | 40 |
| >"fix the stack" will probably not be accepted (they don't
>want to spend money on something that will be thrown away
>soon).
Well, if the problem is a bug in the TCP stack, maybe the vendor would
give you a fix for free. Especially such an obvious problem as not
handling connection shutdown correctly.
Ok, I won't push this... you know your customer. I was trying
to get ObjectBroker (and DEC) off the hook for the problem by
transferring blame (and attention) to the appropriate place.
>In any case we need to put in place a solution for those
>clients that end abruptly (powerfail or whatever).
This is where OBB may be able to help. If keepalives are enabled
on the sockets, it may help to catch this condition and kill the
socket. But it can take keepalives quite a while to notice and kill
a dead connection. And depending on what the PC end is doing, keepalives
may not be any help at all. More info would be needed (below...)
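
A minimal sketch of enabling keepalives on an ordinary TCP socket in Python. `SO_KEEPALIVE` is the standard switch; the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` tuning options are platform-specific (Linux names shown), so the code sets them only where they exist. The helper name and timer values are assumptions, and this says nothing about how ObjectBroker or UCX expose the setting.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and tune the timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
              ("TCP_KEEPINTVL", interval_s),  # gap between unanswered probes
              ("TCP_KEEPCNT", probes))        # probes before giving up
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is None:
            continue                          # this platform lacks the option
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass                              # keep the system default
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

Without such tuning, the usual default idle time before the first probe is two hours, which is why keepalives alone can take so long to notice a dead peer.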
>At the moment: we monitor (with a DCL script) the number
>of sockets and re-start the server if they go above danger-level
>(about 70 sockets); we check each socket and ping the remote
>end - if it's inactive we disconnect the socket. As a
>solution it's clunky but it seems to work.
Do you know what state the TCP connection is in when they're "stuck"
or hung? This is a key piece of info to getting the base problem
taken care of.
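
As a rough guide to why the state matters, here is a small lookup of what the standard TCP states would suggest when seen on the server end of a stuck connection. The state names follow the TCP specification; the exact spelling a given tool prints (hyphen or underscore, upper or lower case) is an assumption, so the helper normalises it.

```python
HINTS = {
    "ESTABLISHED": "no FIN or RST has arrived; the peer never closed, or it "
                   "vanished without a word (power failure)",
    "CLOSE_WAIT": "the peer's FIN arrived but the local application "
                  "has not closed its socket",
    "FIN_WAIT_1": "the local side closed; its FIN is not yet acknowledged",
    "FIN_WAIT_2": "the local side closed and the FIN was acknowledged; "
                  "still waiting for the peer's FIN",
    "LAST_ACK": "both sides closed; waiting for the peer to acknowledge our FIN",
    "TIME_WAIT": "normal after an active close; it expires on its own",
}

def diagnose(state):
    """Map a TCP state name to a one-line interpretation."""
    key = state.strip().upper().replace("-", "_")
    return HINTS.get(key, "unrecognised state: " + state)
```

For example, `diagnose("close-wait")` points at the server application, while `ESTABLISHED` or `FIN_WAIT_2` points back at the client stack or a vanished client.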
>I'd like any sockets which haven't been used for x minutes
>to go away automatically. Any ideas how to do this?
>(Wishlist: allow me to configure this).
No, this is an application responsibility (e.g. OBB) and it's incredibly
hard to get it right without intimate knowledge of what both sides
are doing, which ObjectBroker does not (and can not) have.
-Steve
|
2427.4 | | LEMAN::DONALDSON | Froggisattva! Froggisattva! | Mon Feb 10 1997 05:49 | 19 |
| The customer is still investigating at a low priority level.
(At least their current production system is stable, if not
exactly elegant). I'll come back if we make any more progress.
>>I'd like any sockets which haven't been used for x minutes
>>to go away automatically. Any ideas how to do this?
>>(Wishlist: allow me to configure this).
>
>No, this is an application responsibility (e.g. OBB) and it's incredibly
>hard to get it right without intimate knowledge of what both sides
>are doing, which ObjectBroker does not (and can not) have.
Well, I think it should be possible to add some method
like timeout which could do whatever is appropriate to kill
off the link: OBB::Object::timeout or something (OBB_Object_set_timeout),
which basically says: if this object is active, then time out its
link (network dependent) if there's no activity for a certain period.
John D.
|
2427.5 | Can't time out on per-object basis | REQUE::ctxobj.zko.dec.com::Patrick | ObjectBroker Engineering | Mon Feb 10 1997 08:30 | 27 |
| ObjectBroker does not create a separate logical link for each
object. If it did, we'd not scale in systems with large numbers of
objects. It's for this same reason we don't track every object in
the system.
What we do support is multiplexing objects over a single network
link when the client holds more than one object reference that
resolves to the same server process.
Given this, if we timed out the link because Object #1 had not
been used, we'd also time out the link for Object #2 which had
just been used.
As indicated, this is a problem in the TCP/IP implementation
being used. We will continue to consider mechanisms that
would allow idle links to be shut down and re-established to
be supported in some future version, but nothing definite yet.
FWIW: it appears that many of the TCP/IP implementations
for Windows 3.x (16-bit) just don't work properly. As
a result, you see things just like you're seeing. My
suggestion is to move to 32-bit Windows quickly, if at
all possible.
Paul Patrick
|
2427.6 | | LEMAN::DONALDSON | Froggisattva! Froggisattva! | Tue Feb 11 1997 10:30 | 29 |
| I'm glad you're multiplexing - I hadn't got a good inside
story on that (any chance of an 'internals' one-off course?).
>What we do support is multiplexing objects over a single network
>link when more than one object reference is being held by the
>client that has resulted in going to the same server process.
>Given this, if we timed out the link because Object #1 had not
>been used, we'd also time out the link for Object #2 which had
>just been used.
Well, if you take a simplistic approach that's true.
But you could use some kind of 'reference count' technique.
The common-sense expectation ought to be implementable
here - when a link hasn't been used - time it out.
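
A minimal sketch of that idea, assuming activity is recorded per link, not per object, so that any object using the link keeps it alive. The class `Link` and its methods are hypothetical and do not describe how ObjectBroker manages its connections.

```python
class Link:
    """One network link shared by several object references."""

    def __init__(self, idle_limit_s, now):
        self.idle_limit_s = idle_limit_s
        self.last_used = now
        self.refs = set()            # object references multiplexed on this link

    def bind(self, obj_id):
        self.refs.add(obj_id)

    def release(self, obj_id):
        self.refs.discard(obj_id)

    def use(self, obj_id, now):
        """Record a request made through one of the bound objects."""
        if obj_id not in self.refs:
            raise KeyError(obj_id)
        self.last_used = now         # activity on any object refreshes the link

    def should_close(self, now):
        """Close when nothing references the link or the whole link is idle."""
        return not self.refs or now - self.last_used > self.idle_limit_s
```

With this bookkeeping, an idle Object #1 does not take the link down while Object #2 is still using it; the link only times out once every object on it has been quiet for the full period.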
>FWIW: it appears that many of the TCP/IP implementations
> for Windows 3.x (16-bit) just don't work properly. As
> a result, you see things just like you're seeing. My
> suggestion is to move to 32-bit Windows quickly, if at
> all possible.
I understand what you're saying and I'm trying to get
more accurate data on *when* the sockets are getting
left behind. You can imagine that in a large global enterprise
there are *lots* of different PCs and versions of software etc.
John D.
|
2427.7 | | REQUE::BOWER | Peter Bower, ObjectBroker | Sat Feb 15 1997 07:57 | 10 |
|
Steve's question from .3 is a good one. Do you know the answer
to it?
> Do you know what state the TCP connection is in when they're "stuck"
> or hung? This is a key piece of info to getting the base problem
> taken care of.
A UCX SHOW DEVICE/full on a hung socket may be useful.
|
2427.8 | | LEMAN::DONALDSON | Froggisattva! Froggisattva! | Mon Feb 17 1997 05:48 | 14 |
| > Steve's question from .3 is a good one. Do you know the answer
> for it ?
>
> > Do you know what state the TCP connection is in when they're "stuck"
> > or hung? This is a key piece of info to getting the base problem
> > taken care of.
Yes, I know. I was on site with this customer for a week
recently and we discovered this problem. I delivered the
consulting I'd been hired to do and I'm working elsewhere
now. So, I can't push too much to get the customer to look
at this. When I can, I'll be back with more info.
John D.
|