Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
Notice: | Welcome to the Digital UNIX Conference |
Moderator: | SMURF::DENHAM |
Created: | Thu Mar 16 1995 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 10068 |
Total number of notes: | 35879 |
8821.0. "sigwait returns with EINTR while ALL signals are blocked !!!" by NETRIX::"[email protected]" (Laurent Martin) Thu Feb 13 1997 11:34
Hi all,
We are seeing strange behavior from sigwait.
First, the context:
DECss7 is a multithreaded application written in ADA/C++.
Versions:
DIGITAL Unix 4.0B
ADABASE330 installed DEC Ada V3.3 Primary Subset
ADAPAL330 installed DEC Ada V3.3 Predefined Library
ADALIB400 installed DEC Ada runtime library for Digital UNIX V4.0
CXXBASE550 installed DEC C++ (cxx) for Digital UNIX
CXLSHRDA410 installed DEC C++ Class Shared Libraries
Other Topics dealing with DECss7 and signals:
#1303 #3318 #6192
Four signals are blocked in all threads, and one thread waits (sigwait)
for one of these signals; that thread has ALL signals blocked. The
problem is that sigwait returns with an EINTR error, which should not
happen because all signals are blocked.
We first thought it was a problem we had already experienced with
exceptions/signals, but note #8748 says that this bug was fixed in the
Unix 4.0 release.
We don't understand where the problem is. No signal handler is invoked,
which should mean that no signal is delivered. So what's going on?
Thanks for your responses!
Laurent
The program's behaviour is:
---------------------------
caught SIGXCPU
sigset(before)=0xfffffffffffefeff (unblocked signals are KILL and STOP)
sigset(after)=0xfffffffffffefeff
<!!!!!!!!!!!!!!!same thing during 2 minutes or so...>
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
############################## Error #4 #############
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
############################## Error #4 #############
sigset(before)=0xfffffffffffefeff
<!!!!!!!!!!!!!!!some 50 occurrences of Error 4>
############################## Error #4 #############
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
....and so on
The relevant code is here:
void handler(int sig)
{
    printf("FV_handler sig=%d\n",sig);
    MM_check_system_call ( kill(getpid(), SIGTERM ) ) ;
}
main()
{
    sigaction(<all signals but KILL STOP CONT>,handler)
    block signals SIGABRT, SIGXCPU , SIGURG , SIGTERM; (-> EV_sigset )
    *create service thread*
    *create other threads*
    loop on work
}
void
service_thread ()
{
    int VI_signal;
    sigset_t VS_sigset_all ;
    sigset_t VS_sigset_old ;
    sigset_t VS_empty ;
    sigfillset(&VS_sigset_all);
    sigemptyset ( &VS_empty ) ;
    // mask all signals so that they can't be handled
    MM_check_posix95_call ( pthread_sigmask ( SIG_SETMASK , &VS_sigset_all ,
                                              &VS_sigset_old ));
    loop
    {
        int VI_error;
        MM_check_posix95_call ( pthread_sigmask ( SIG_SETMASK , &VS_sigset_all ,
                                                  &VS_sigset_old ));
        printf("sigset(before)=%#lx\n",VS_sigset_old);
        VI_error=sigwait( &EV_sigset , &VI_signal ) ;
        MM_check_posix95_call ( pthread_sigmask ( SIG_SETMASK , &VS_sigset_all ,
                                                  &VS_sigset_old ));
        printf("sigset(after)=%#lx\n",VS_sigset_old);
        if (VI_error)
        {
            printf("############################## Error #%d #############\n",VI_error);
            continue;
        }
        switch ( VI_signal )
        {
        case SIGABRT :
            printf("caught SIGABRT\n");
            // Generate core file.
            abort () ;
            break ;
        case SIGXCPU :
            printf("caught SIGXCPU\n");
            // Log Alarm event.
            break ;
        case SIGURG :
            printf("caught SIGURG\n");
            // SIGURG handlers are defined in "io_channel.cc" & "tracepoint.cc" source files.
            break ;
        case SIGTERM :
            printf("caught SIGTERM\n");
            // Actually exit now !!!
            exit ( 0 ) ;
        default :
            MM_throw_system_error ( errno ) ;
        }
    }
}
[Posted by WWW Notes gateway]
T.R | Title | User | Personal Name | Date | Lines |
---|
8821.1 | | SMURF::DENHAM | Digital UNIX Kernel | Thu Feb 13 1997 15:15 | 12 |
| This sounds a little like an anomaly I discovered while tracking down
the infamous SS7 signal lock timeout program on V3.2.
While working on the patches for the V4.0 base, I found that
sigwait very occasionally behaved exactly as you're saying.
It turned out to be a side effect of the extremely complex
code required to deal with signals in a 2-level thread scheduling
environment.
Let me do some testing here and see whether I can manage to reproduce
the behavior. I do have a V4.0B patch you can try if you like.
It's not official yet -- in process still.
|
8821.2 | QAR for this problem is QAR51813 | NETRIX::"[email protected]" | Laurent | Wed Mar 05 1997 12:33 | 9 |
| Hi,
We entered a QAR for this problem, which was related to topic #8718.
This allows the fix to be tracked in future Unix versions.
Laurent
SISB-Telecom
[Posted by WWW Notes gateway]
|