
Conference turris::digital_unix

Title:DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

9139.0. "Real Time and Asynchronous I/O implementation" by EVTIS2::ES_ANCELIN () Wed Mar 12 1997 11:44

Hi,

The customer is porting to V4.0B an application that used to work fine under
V3.2G.

The processing may be summarized as follows
( ---> represents the shared memory segments used to communicate between the
       different processes ):

        data-input --->  filter ---> treatment ---> data-output
SCHED   FIFO             RR          RR             RR
PRIO    FIFO_MAX         RR_MAX      RR_MAX         RR_MAX

For historical reasons, the 'data-input' process is built with the -laio and
-threads options (despite the fact that asynchronous I/O is not used at all in
this process).

When the read operation is not restarted within a few ms, the board generates
an 'overflow' interrupt (at which point we crashed the system).


The crash analysis revealed the following:

The 3 processes running with sched RR have the correct policy/priority.
The 'data-input' process is multithreaded (this is the effect of -threads
or -pthread under V4.0), and all of its kernel threads are scheduled with
SCHED_OTHER, except one, which is correctly scheduled with SCHED_FIFO.
As you can guess, the thread running the critical code is SCHED_OTHER.
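
For anyone wanting to check this from inside the application rather than from
a crash dump, here is a minimal sketch in C using the POSIX call
pthread_getschedparam(). The helper names policy_name() and
report_thread_policy() are made up for illustration. Note the caveat: with a
two-level thread library, the policy this reports for the user-level thread
may not match the policy of the kernel thread it happens to run on, which is
what the crash dump showed.

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Map a POSIX scheduling policy constant to a printable name. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";
    }
}

/*
 * Hypothetical helper: print the policy and priority of the calling
 * thread. Returns the policy constant, or -1 if the query failed.
 */
int report_thread_policy(const char *label)
{
    struct sched_param sp;
    int policy;
    int rc = pthread_getschedparam(pthread_self(), &policy, &sp);

    if (rc != 0) {
        fprintf(stderr, "%s: pthread_getschedparam failed (%d)\n", label, rc);
        return -1;
    }
    printf("%s: policy=%s priority=%d\n",
           label, policy_name(policy), sp.sched_priority);
    return policy;
}
```

Calling report_thread_policy("reader") at the top of the thread that services
the board would show which policy the thread library believes it has.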

Removing the unneeded -laio -pthread options makes the problem disappear.

I suppose this phenomenon is related to two-level scheduling/contention scope.

In any case, from the customer's point of view, sched_setscheduler()
does not seem to do what it should.
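
To make the contention-scope point concrete, here is a hedged sketch in C of
how the POSIX thread attribute calls let a program request SCHED_FIFO and
system contention scope explicitly for one thread, instead of relying on the
process-wide sched_setscheduler(). The helper name init_fifo_system_attr() is
made up for illustration. This only shows what the standard API looks like;
whether V4.0B honours PTHREAD_SCOPE_SYSTEM is an assumption that nothing in
this note confirms, so it should not be read as a verified fix.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: initialise *attr so that a thread created with it
 * asks for SCHED_FIFO at the maximum FIFO priority, in system contention
 * scope, without inheriting the creator's scheduling settings.
 * Returns 0 on success or an errno-style code on failure.
 */
int init_fifo_system_attr(pthread_attr_t *attr)
{
    struct sched_param sp;
    int rc;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Use the attributes set below rather than the creator's policy. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);

    /* Ask for the thread to compete system-wide, not only in-process. */
    if (rc == 0)
        rc = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM);

    /* Set the policy before the priority: the valid range depends on it. */
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);

    if (rc == 0) {
        memset(&sp, 0, sizeof sp);
        sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
        rc = pthread_attr_setschedparam(attr, &sp);
    }

    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

The resulting attribute object would then be passed to pthread_create() for
the thread that services the board. On most systems creating a SCHED_FIFO
thread also needs suitable privileges, so pthread_create() can still fail
even when the attribute calls succeed.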

Any detailed explanation is welcome.

What solution do we have to work around this (in case we really need async I/O)?

Denis.

9139.1. "Known problem." by WTFN::SCALES (Despair is appropriate and inevitable.) Wed Mar 12 1997 14:03

.0> I suppose this phenomenon is related to two-level scheduling/contention scope

Correct.  The kernel folks are familiar with this problem and are devising a way
to address it.

.0> What solution do we have to workaround this

There is none, as far as I'm aware, prior to PtMin.


				Webb

9139.2. "One workaround (makes it hard to maintain, though)" by WIBBIN::NOYCE (Pulling weeds, pickin' stones) Wed Mar 12 1997 14:41

Webb, if it worked in 3.2*, can they link it (non_shared?) there, and
get the same behavior on 4.0*?

9139.3. "Yep, that should work..." by WTFN::SCALES (Despair is appropriate and inevitable.) Wed Mar 12 1997 16:24

.2> if it worked in 3.2*, can they link it (non_shared?) there, and
.2> get the same behavior on 4.0*?

Yes, that's true...that would be a workaround.