
Conference bulova::decw_jan-89_to_nov-90

Title:	DECWINDOWS 26-JAN-89 to 29-NOV-90
Notice:	See 1639.0 for VMS V5.3 kit; 2043.0 for 5.4 IFT kit
Moderator:	STAR::VATNE

Created:	Mon Oct 30 1989
Last Modified:	Mon Dec 31 1990
Last Successful Update:	Fri Jun 06 1997
Number of topics:	3726
Total number of notes:	19516

876.0. "user-written X IO Error Handler question" by CSC32::K_TICE (Ada...Keeping the world safe for bureaucracy!) Fri Jun 02 1989 13:14

    The documentation for XSetIOErrorHandler says a user-written error 
    handler can be created to handle "...any type of system-call error, 
    such as losing the connection to the server..."
    
    OK, fine.  I have a customer who wants to do exactly this.
    
    The manual also says "This is assumed to be a fatal condition; the
    error handler should not return.  If the IO error handler does 
    return, the client process exits."
    
    Ouch!  Picture the following scenario.  One client process has done
    several XtOpenDisplay's, each to a different server.  This kind of
    architecture is one that, from my experience, occurs frequently in 
    many military applications as well as many process control applications.
    In this case, it is a power plant.
    
    ...so you lose a connection to ONE server, and you blow away the 
    client process and ALL other displays with it?  In the meantime, 
    your reactor core is about to melt down, but you don't know it because
    all your displays are gone just because somebody tripped over a
    thin-wire Ethernet cord.
    
    Is there a way to recover from this kind of error without destroying 
    the client process?
    
    
    Ken

T.R	Title	User	Personal Name	Date	Lines
876.1	setjmp/longjmp	STAR::BRANDENBERG	Si vis pacem para bellum	`Fri Jun 02 1989 14:38`	10
	This limitation is understood by both DEC and MIT. I don't know if they're planning to change the behaviour of the error handlers but rws' response to such comments has been that the programmer could setjmp/longjmp out of such an I/O error handler and never refer to the connection again. How well this will work in a full toolkit environment I can't say. monty
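
	A minimal sketch of the setjmp/longjmp escape described in this reply. It deliberately avoids linking against Xlib so that it stays self-contained: `Display`, `flush_display` and `service_displays` are hypothetical stand-ins, and only the handler's shape (takes a display pointer, must not return) mirrors what XSetIOErrorHandler expects. Whether real Xlib or toolkit state stays usable after such an unwind is exactly the open question raised above, so treat this as an illustration of the control flow only.

```c
#include <setjmp.h>

/* Hypothetical stand-in for an Xlib Display; real code would include
 * <X11/Xlib.h> and use the real type instead. */
typedef struct {
    const char *name;
    int alive;              /* 1 while the connection may still be used */
} Display;

static jmp_buf io_error_escape;   /* where the handler unwinds to */
static Display *failed_display;   /* which connection broke */

/* Same shape as a handler installed with XSetIOErrorHandler. It must not
 * return, so it records the broken display and unwinds with longjmp. */
static int io_error_handler(Display *dpy)
{
    failed_display = dpy;
    longjmp(io_error_escape, 1);
    return 0;                     /* not reached */
}

/* Hypothetical stand-in for a library call that hits a fatal I/O error. */
static void flush_display(Display *dpy, int simulate_failure)
{
    if (simulate_failure)
        io_error_handler(dpy);
}

/* Service each live display once. The display at broken_index loses its
 * connection; it is marked dead and never referred to again. Returns the
 * number of displays still alive. */
int service_displays(Display *dpys, int n, int broken_index)
{
    int i, alive = 0;

    for (i = 0; i < n; i++) {
        if (!dpys[i].alive)
            continue;
        if (setjmp(io_error_escape) == 0) {
            flush_display(&dpys[i], i == broken_index);
        } else {
            /* Arrived here via longjmp from the handler. */
            failed_display->alive = 0;
        }
    }
    for (i = 0; i < n; i++)
        alive += dpys[i].alive;
    return alive;
}
```

	Calling `service_displays` on three displays with the middle one failing leaves the other two marked alive, which is the behaviour the base note asks for.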
876.2	?	ULTRA::WRAY	John Wray, Secure Systems Development	`Fri Jun 02 1989 15:14`	2
	What does "SETJMP/LONGJMP" mean?
876.3		TLE::REAGAN	Pascal, A kinder and gentler language	`Fri Jun 02 1989 15:32`	6
	It's a C-ism. Setjmp/longjmp are C's cousin to Pascal's non-local GOTO (the C RTL uses a "clone" of the PASRTL's non-local GOTO code to do the work...). -John
876.4		STAR::BRANDENBERG	Si vis pacem para bellum	`Fri Jun 02 1989 15:34`	4
	Sorry, I'm operating in C-mode. C RTL routines which perform stack unwinds. On VMS, implemented with signals and condition handlers.
876.5	More Unixism/OS than C-ism...	FUEL::graham	If people lead, the leaders will follow	`Fri Jun 02 1989 18:35`	16
	Probably, the real question is... how does one design a portable error or signal handling mechanism without breaking the disparate language models on various operating systems or environments? One of the strongest selling points of X has been its portability... however, this area of difficulty looks like a mean task for the folks at MIT. Check out note 691.* for a discussion of setjmps, longjmps and error/signal handling under X11. Kris...
876.6		ULTRA::WRAY	John Wray, Secure Systems Development	`Fri Jun 02 1989 20:55`	13
	Note 691 seems to imply that the problem is difficult, but also seems to say that the X consortium is dragging its feet on the issue. Is this a correct analysis, or is there active work going on to solve the problem? Meanwhile, what do we tell customers who are investigating a distributed DECwindows-based process-control solution? The previous reply indicated that part of the problem is due to some deficiency in the UNIX signalling mechanism. Is this correct, or is it simply that the toolkit doesn't cope with a stack-unwind properly (which would be a SMOP to fix, surely - a re-entrant toolkit that could cope with being unwound would be fully upwards compatible with the MIT version)?
876.7	Other problems and some ramblings...	FUEL::graham	If people lead, the leaders will follow	`Sat Jun 03 1989 02:18`	93
	This topic should attract a lot of interest, especially as it relates to mission-critical applications requiring fast and unbuffered X responses to system and user errors. I have been asked this question a few times by people developing critical applications for Wall Street traders, who put several checks and balances in their callback routines (required to trade volatile financial instruments).

	The asynchronous nature of X events is actually part of the problem, although synchronization can be forced in debugging mode. An excerpt from the C Library Reference Protocol by Scheifler, Gettys and Newman follows:

	"Because Xlib usually does not transmit requests to the server immediately (that is, it buffers them), errors can be reported much later than they actually occur..."

	However, they go on to advise those users with critical needs for custom error handlers...

	"When Xlib detects an error, it calls an error handler, which your program can provide..."

	BUT, they NEVER tell you how to recover from errors with your own routines. That must have been the smartest thing to do at the time if you remember what the goals of X were...

	- The X Protocol, as repository for X data structures needed to communicate (send/receive) between X clients and the server, had to guarantee that the X interface operate correctly, regardless of operating system, network transports, and programming languages.

	Securing the above goals is nothing trivial, especially when you think of how one achieves portability for error handling routines for different operating systems and machine architectures without compromising the X mission goals.

	Also, it must be remembered that a lot of work was done initially using UNIX and C. And C is a by-product of UNIX, if one is to trace the history of UNIX correctly (remember the 'B' language?). Technically, the LONGJMP and SETJMP algorithms were conceived during the design of UNIX. These routines are used for creating program objects/processes and restoring stack location (pushes and pops) during kernel context switches. It is easy to confuse setjmps and longjmps as they relate to system context switching with their use during user program development.

	The confusion is not very serious, especially when one is dealing with X as a networked system, where numerous interrupts (via signals) are generated. One can think of a high-priority Xlib program (call) that cannot be interrupted at any arbitrary point because that program was in the middle of updating a very complicated data structure. In this instance, the way to force critical errors (caused elsewhere) to be generated would be to set flags in the interrupt routine. In this instance, setjmps and longjmps can never save your ass!

	I am beginning to think that this problem is bigger than most of us would presume.

	RE: .6

	>The previous reply indicated that part of the problem is due to some
	>deficiency in the UNIX signalling mechanism. Is this correct, or is it
	>simply that the toolkit doesn't cope with a stack-unwind properly

	[I am not an X designer, so the ideas in here are my own... I do not claim to represent the DECwindows designers or the people at MIT. Maybe they will see this note and provide better comments than mine.]

	I believe the problem is a combination of several unresolved problems. The fact that the UNIX model was the bias of the MIT designers (for a good reason) produced a lot of X features inherent in UNIX. The UNIX style of handling errors and signals is good testimony to that. So, how do you design a toolkit that is re-entrant without breaking the fundamental goals of X as a heterogeneous platform with common user code? Tough question!

	Error handling comes in different flavors (at least, in UNIX). Sometimes there is a need to mix and match error handling with signal handling, for instance a test to catch floating point errors, such as when a floating point number overflows. The same can be said for the use of networked X applications. How do we apply clean signals that determine or pinpoint the exact location of LAN failures without combining signals and error handlers with application re-entrancy (at the toolkit level)?

	Hopefully, future X extensions will come out with pragmatic ways to deal with most of these problems. Some of us see the problems... just that we cannot prescribe any solutions just yet :-)

	Kris...
876.8		VWSENG::KLEINSORGE	Toys 'R' Us	`Sat Jun 03 1989 15:40`	21
	Aren't you trying to solve too much? At its simplest level, all that is needed is a way to tell Xlib that the error isn't fatal and that it should convert the error into a non-fatal error. There are two error handler routines, XSetErrorHandler and XSetIOErrorHandler. The basic difference is that in one case the result of the return is to die, and in the other it is to dismiss the error and continue. Well, the choice of fatal/non-fatal is pretty arbitrary in "application" terms. Note that HOW this is done can be as O/S specific as you'd like. If the user was able to return a status (i.e. a "return (1);" which means "hey, forget it, we'll handle the problem", or "return (0);" which might mean "die an ugly death"), then it's up to the application to worry about the implications of the error. What the application does with the error is then the application's business... very X'ish, huh? Personally, I'd like it if error handlers could be set for each display (and even perhaps window) as opposed to application-wide...
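
	The interface wished for in this reply might look roughly like the sketch below. None of it is Xlib API: `Connection`, `IOErrorProc`, `keep_running` and `report_io_error` are hypothetical names invented for illustration. It assumes a handler stored per connection that returns 1 for "we'll handle the problem" and 0 for "die", with the library side deciding what to do based on that status.

```c
#include <stddef.h>

typedef struct Connection Connection;

/* Hypothetical handler type: return 1 if the application will cope with
 * the lost connection, 0 if the error should be treated as fatal. */
typedef int (*IOErrorProc)(Connection *);

struct Connection {
    const char *name;
    int open;                 /* 1 while the connection is usable */
    IOErrorProc on_io_error;  /* per-connection handler, may be NULL */
};

/* Example application handler: losing one display is not fatal. */
int keep_running(Connection *c)
{
    (void)c;
    return 1;
}

/* Hypothetical library-side entry point, called when I/O on c fails.
 * Marks the connection closed, then asks the per-connection handler.
 * Returns 1 if the process should continue, 0 if it should exit. */
int report_io_error(Connection *c)
{
    c->open = 0;
    if (c->on_io_error != NULL && c->on_io_error(c) == 1)
        return 1;
    return 0;                 /* no handler, or handler asked to die */
}
```

	With no handler installed the sketch falls back to the fatal behaviour, so existing programs would see no change.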
876.9		CASEE::LACROIX	Gone with the wind	`Sun Jun 04 1989 14:48`	10
	> how does one design a portable error or signal handling mechanism
	> without breaking the disparate language models on various operating
	> systems or environments?

	I like signal handling mechanisms similar to the one described in a recent SRC report; it seems to work fine on various OSes, though it doesn't really fit all language models.

	Denis.
876.10		ULTRA::WRAY	John Wray, Secure Systems Development	`Sun Jun 04 1989 19:10`	17
	> <<< Note 876.8 by VWSENG::KLEINSORGE "Toys 'R' Us" >>>
	>
	> Aren't you trying to solve too much? At its simplest level, all that
	> is needed is a way to tell Xlib that the error isn't fatal and that it
	> should convert the error into a non-fatal error.

	In the case of a "connection aborted" error, you also want a way to tell Xlib to do some stuff of its own, i.e. clean up any structures that had anything to do with the connection. This sort of clean-up has to be provided by Xlib; the application can't do it.

	What does the protocol allow to happen if the client detects that the connection has gone away like this, but the server hasn't noticed yet? If the client application were allowed to continue, and tried to establish a new connection to the server, would the server get confused?
876.11		VWSENG::KLEINSORGE	Toys 'R' Us	`Mon Jun 05 1989 08:44`	7
	I'd be happy to keep stale data structures wasting memory rather than be forced to have the image terminated... though I wouldn't be unhappy if there was also a way to get Xlib to clean up...