
Conference turris::digital_unix

Title:DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

8821.0. "sigwait returns with EINTR while ALL signals are blocked !!!" by NETRIX::"[email protected]" (Laurent Martin) Thu Feb 13 1997 11:34

Hi all,
  We are seeing strange behavior from sigwait.

  First of all, the context:
    DECss7 is a multithreaded application written in Ada/C++.
Versions:
DIGITAL Unix 4.0B
ADABASE330           installed  DEC Ada V3.3 Primary Subset
ADAPAL330            installed  DEC Ada V3.3 Predefined Library
ADALIB400            installed  DEC Ada runtime library for Digital UNIX V4.0
CXXBASE550           installed  DEC C++ (cxx) for Digital UNIX
CXLSHRDA410          installed  DEC C++ Class Shared Libraries

  Other Topics dealing with DECss7 and signals: 
     #1303 #3318 #6192

  Four signals are blocked for all threads, and one thread waits (sigwait)
for one of these signals (this thread has ALL signals blocked). The problem
is that sigwait is interrupted with an EINTR error, which shouldn't happen
because all signals are blocked.

We first thought it was a problem we had already experienced with
exceptions/signals, but note #8748 says that this bug was fixed in the
UNIX 4.0 release.

We don't understand where the problem is. No signal handler is invoked,
which should mean that no signal is delivered. So what's going on?


Thanks for your responses!

Laurent



The observed behaviour is:
--------------------------
caught SIGXCPU
sigset(before)=0xfffffffffffefeff         (unblocked signals are KILL and STOP)
sigset(after)=0xfffffffffffefeff
<!!!!!!!!!!!!!!!same thing for 2 minutes or so...>
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
############################## Error #4 #############
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
############################## Error #4 #############
sigset(before)=0xfffffffffffefeff
<!!!!!!!!!!!!!!!some 50 occurrences of Error 4>
############################## Error #4 #############
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
caught SIGXCPU
sigset(before)=0xfffffffffffefeff
sigset(after)=0xfffffffffffefeff
....and so on


The relevant code is here:
void handler(int sig)
{
  printf("FV_handler sig=%d\n",sig);
  MM_check_system_call ( kill(getpid(),  SIGTERM ) ) ;  
}

main()
{
  sigaction(<all signals but KILL STOP CONT>,handler)
  block signals SIGABRT, SIGXCPU , SIGURG  , SIGTERM; (-> EV_sigset )
  *create service thread*
  *create other threads*
  loop on work
}

void
service_thread ()
{
  int              VI_signal;

  sigset_t         VS_sigset_all ;
  sigset_t         VS_sigset_old ;
  sigset_t         VS_empty ;

  sigfillset(&VS_sigset_all);
  sigemptyset ( &VS_empty ) ;

  // mask all signals so that they can't be handled
  MM_check_posix95_call ( pthread_sigmask ( SIG_SETMASK , &VS_sigset_all , &VS_sigset_old ));
    
  loop
  {
    int VI_error;

    MM_check_posix95_call ( pthread_sigmask ( SIG_SETMASK , &VS_sigset_all , &VS_sigset_old ));
    printf("sigset(before)=%#lx\n",VS_sigset_old);

    VI_error=sigwait( &EV_sigset , &VI_signal ) ;

    MM_check_posix95_call ( pthread_sigmask ( SIG_SETMASK , &VS_sigset_all , &VS_sigset_old ));
    printf("sigset(after)=%#lx\n",VS_sigset_old);

    if (VI_error)
      {
	printf("############################## Error #%d #############\n",VI_error);
	continue;
      }

    switch ( VI_signal )
    {
      case SIGABRT :

	printf("caught SIGABRT\n");

	// Generate core file.

	abort () ;

	break ;

      case SIGXCPU :

	printf("caught SIGXCPU\n");
	// Log Alarm event.

	break ;

      case SIGURG :

	printf("caught SIGURG\n");
	// SIGURG handlers are defined in "io_channel.cc" & "tracepoint.cc" source files.

	break ;

      case SIGTERM :

	printf("caught SIGTERM\n");

	// Actually exit now !!!

	exit ( 0 ) ;

      default :

	MM_throw_system_error ( errno ) ;
    }
  }
}

[Posted by WWW Notes gateway]

8821.1. by SMURF::DENHAM (Digital UNIX Kernel) Thu Feb 13 1997 15:15 (12 lines)

    This sounds a little like an anomaly I discovered while tracking down
    the infamous SS7 signal lock timeout problem on V3.2.
    
    While working on the patches for the V4.0 base, I found that,
    very occasionally, sigwait behaved exactly as you're saying.
    It turned out to be a side effect of the extremely complex
    code required to deal with signals in a 2-level thread scheduling
    environment.
    
    Let me do some testing here and see whether I can manage to reproduce
    the behavior. I do have a V4.0B patch you can try if you like.
    It's not official yet -- in process still.

8821.2. "QAR for this problem is QAR51813" by NETRIX::"[email protected]" (Laurent) Wed Mar 05 1997 12:33 (9 lines)

Hi,
  We entered a QAR for this problem, which is related to topic #8718.
This is so that the fix can be traced in future UNIX versions.

Laurent

SISB-Telecom

[Posted by WWW Notes gateway]