
Conference tuxedo::dce-products

Title:	DCE Product Information
Notice:	Kit Info - See 2.-4.
Moderator:	TUXEDO::MAZZAFERRO

Created:	Fri Jun 26 1992
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2269
Total number of notes:	10003

2157.0. "Automatic Rebind via SetRebind" by BHAJEE::KONRAD (pour des nouvelles aventures) Tue Feb 11 1997 09:56

We're implementing a distributed system based on the C++ support of DCE for
Dunix V2.0. 

	I tested the Automatic rebind in case of server failure. A really nice
feature and it would help us a lot for the implementation of soft-fail
strategies. 

	The test is quite simple: 

	- start the server, 
	- create named object & register
	- listen

	- start the client
	- bind to the server object
	- SetRebind(wait_on_rebind)
	- start calling the server in an endless loop

	- stop the server   
	- restart the server 

	When the server goes down, the client waits until the server comes up
again and restarts the operation at the point it failed.

	All o.k. so far - *but* if I have a look at the system while the
server is down, the client is eating up ~40% of the CPU time.
So either I'm doing something wrong or the rebind time interval is *very* small.

	Is there any way to slow down this polling or any hints to avoid the
heavy load ??


Thanks 
	-conny

T.R	Title	User	Personal Name	Date	Lines
2157.1	put a delay in the client call loop on error	FOUNDR::WOODRUFF		Tue Feb 11 1997 12:58	22

> - start the client
> - bind to the server object
> - SetRebind(wait_on_rebind)
> - start calling the server in an endless loop

> So either I'm doing something wrong or the rebind time interval is *very* small.

> Is there any way to slow down this polling or any hints to avoid the
> heavy load ??

What I normally do is have a retry time that is used to wait between calls to the server. The value is used as the time for pthread_delay_np.

The other option is to write a background thread that signals the client when a binding has been made.

Both of these should reduce the CPU cycles.

garry

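The retry delay suggested in this reply can be sketched roughly as follows. This is only an illustration, assuming the communication failure surfaces in the client code as an exception. `CommFailure` and `call_with_retry_delay` are hypothetical names, not DCE APIs, and `std::this_thread::sleep_for` stands in for the `pthread_delay_np` call the reply mentions.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for a DCE communication failure; not a real DCE type.
struct CommFailure : std::runtime_error {
    CommFailure() : std::runtime_error("server unreachable") {}
};

// Calls `remote_call` until it succeeds, sleeping `retry_delay` after each
// failure so the client does not spin while the server is down.
// Returns the number of failed attempts that preceded the successful call.
int call_with_retry_delay(const std::function<void()>& remote_call,
                          std::chrono::milliseconds retry_delay) {
    assert(retry_delay.count() >= 0);
    int failures = 0;
    for (;;) {
        try {
            remote_call();
            return failures;
        } catch (const CommFailure&) {
            ++failures;
            // With DCE threads, this is where pthread_delay_np would be used.
            std::this_thread::sleep_for(retry_delay);
        }
    }
}
```

A delay of this kind only takes effect if control actually returns to the client between attempts; it cannot slow down a retry loop that runs inside the generated stub.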
2157.2	what to try	PTHRED::VIVENEY	Bob Viveney	Tue Feb 11 1997 16:32	14

.-1 suggested putting a delay between calls to the server. I don't believe this will help you, since the looping is happening in the client stub. I don't think you want to change the stub code. But you could use 'attempt_rebind_n' as the policy, which will try to bind n times before giving up, at which point it returns control to your client code via an exception you can catch.

If you are using a rebind policy at all, then you must be assuming that the server will be available at all times (allowing for faults). This would justify a tight loop in the client stub. If the server goes down, then there should be an automated way for it to restart so that availability is high. You might want to look at the dynamic invocation features of dced (see the server(8dce) man page) or at Digital's Resource Broker, which will also dynamically start a DCE server if it's not running, as well as load balance between multiple instances of a server.

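The 'attempt_rebind_n' idea can be pictured with a small simulation: a bounded number of immediate attempts, then an exception that the client catches and answers with a pause of its own choosing. Everything below is a hypothetical model of that control flow, not the DCE C++ API; `RebindFailed`, `stub_call_attempt_n` and `call_until_available` are made-up names.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical exception raised once the bounded rebind attempts are used up.
struct RebindFailed : std::runtime_error {
    RebindFailed() : std::runtime_error("rebind attempts exhausted") {}
};

// Models the stub side: tries to bind up to `n` times with no delay between
// attempts, then gives up and throws so the caller regains control.
void stub_call_attempt_n(const std::function<bool()>& try_bind, int n) {
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        if (try_bind()) {
            return;
        }
    }
    throw RebindFailed();
}

// Models the client side: after each failed round it sleeps for `pause`
// before starting the next round. Returns the number of rounds needed.
int call_until_available(const std::function<bool()>& try_bind, int n,
                         std::chrono::milliseconds pause) {
    int rounds = 0;
    for (;;) {
        ++rounds;
        try {
            stub_call_attempt_n(try_bind, n);
            return rounds;
        } catch (const RebindFailed&) {
            std::this_thread::sleep_for(pause);
        }
    }
}
```

The point of the split is that the pause lives in client code, so its length can be matched to the expected server restart time without touching the stub.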
2157.3	More about the "tight" rebind loop	BHAJEE::KONRAD	pour des nouvelles aventures	Wed Feb 12 1997 03:41	54

re .-1

>> If you are using a rebind policy at all, then you must be assuming that the
>> server will be available at all times (allowing for faults).

That's how we want to implement it. The client should simply wait for the server (or a standby server) to come up again and then restart operation at the point it failed.

>> If the server goes down, then there should be
>> an automated way for it to restart so that availability is high.

Agreed! We'll do it that way.

>> This would justify a tight loop in the client stub.

Hmmh! Depends on what you mean by "tight":

The automatic server restart takes a while, since the server has to set up and access a couple of heavyweight resources (e.g. some large databases). This process will in general take more than a few seconds, let's say more than 30 sec. Therefore, it's not necessary to try to rebind at very tight intervals.

A tight interval will also generate quite some network load, since there may be some hundred client workstations trying simultaneously to rebind to the server that failed.

I don't know what rebind time interval is used in the stub implementation, but it looks like something well below 1 sec.

>> I don't think you want to change the stub code.

Hmmh again! Why not? We do that anyway. We have to map C++ and DCE exceptions at the RPC interfaces. To do that, we use a little idl postprocessor that automagically inserts this mapping into the client/server stub code. (Works perfectly btw. :-) )

So if the rebind loop is part of the idl-generated client stub source code, it would be a SMOP to manipulate this interval in the same postprocessor.

>> to look at the dynamic invocation features of dced (see server(8dce) man page)
>> or look at Digital's Resource Broker which will also dynamically start a dce
>> server if its not running as well as load balance between multiple instances
>> of a server.

Good idea - I'll have a look.

Thanks for the responses
	-conny

2157.4	where to delay	PTHRED::VIVENEY	Bob Viveney	Wed Feb 12 1997 10:21	13

conny wrote:
> I don't know what rebind time interval is used in the stub implementation,
> but it looks like something well below 1 sec.

When you attempt a call on a binding handle, rpc_mgmt_set_com_timeout() is called with a timeout value of rpc_c_binding_min_timeout. So if the call fails, control goes from the DCE library back to the stub, which loops around to try another handle. There is no built-in delay between attempts.

But according to your mail, you are willing to modify stub code, so you could put a delay in the section of code where an exception is caught for a communication failure, connection reject, ..., inside the 'if' statement before the line 'goto IDL_find_server'.

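A postprocessor step along these lines might look like the sketch below. It assumes only what the reply states, namely that the generated stub contains a line with `goto IDL_find_server`; the function name `insert_rebind_delay` and whatever delay statement is passed in are hypothetical, and real generated stubs may need a more careful match.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns a copy of `stub_source` in which `delay_stmt` is inserted on its
// own line before every line containing "goto IDL_find_server", reusing that
// line's leading whitespace. Every output line ends with '\n'.
std::string insert_rebind_delay(const std::string& stub_source,
                                const std::string& delay_stmt) {
    assert(!delay_stmt.empty());
    std::istringstream in(stub_source);
    std::string out;
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("goto IDL_find_server") != std::string::npos) {
            // The line contains non-blank text, so this position is valid.
            std::string::size_type indent = line.find_first_not_of(" \t");
            out += line.substr(0, indent) + delay_stmt + "\n";
        }
        out += line + "\n";
    }
    return out;
}
```

Because the transformation is purely textual, it would have to be rerun on every regeneration of the stub, which fits the postprocessor workflow described earlier in the thread.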
2157.5	That's what I looked for !!	BHAJEE::KONRAD	pour des nouvelles aventures	Wed Feb 12 1997 11:59	6

re .-1

Thanks, I'll give it a try

	-conny