Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
Notice: | Welcome to the Digital UNIX Conference |
Moderator: | SMURF::DENHAM |
Created: | Thu Mar 16 1995 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 10068 |
Total number of notes: | 35879 |
Hi,

A customer is porting to V4.0B an application that used to work fine under V3.2G. The processing may be summarized as follows (---> represents the shared memory segments used to communicate between the different processes):

         data-input ---> filter ---> treatment ---> data-output
SCHED    FIFO            RR          RR             RR
PRIO     FIFO_MAX        RR_MAX      RR_MAX         RR_MAX

For historical reasons, the 'data-input' process is built with the -laio and -threads options (even though asynchronous I/O is not used at all in this process).

When the read operation is not restarted within a few ms, the board generates an 'overflow' interrupt (at which point we crashed the system).

The crash analysis revealed the following:

- The 3 processes running with SCHED_RR have the correct policy/priority.
- The 'data-input' process is multithreaded (this is the effect of -threads, or -pthread under V4.0), and all of its kernel threads are scheduled with SCHED_OTHER, except one which is correctly scheduled with SCHED_FIFO.
- As you might guess, the thread running the critical code is SCHED_OTHER.

Removing the unneeded -laio -pthread options makes the problem disappear.

I suppose this phenomenon is related to two-level scheduling/contention scope. In any case, from the customer's point of view, sched_setscheduler() does not seem to do what it should.

Any detailed explanation is welcome. What solution do we have to work around this (in case we really need async I/O)?

Denis.
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
9139.1 | Known problem. | WTFN::SCALES | Despair is appropriate and inevitable. | Wed Mar 12 1997 14:03 | 11 |
.0> I suppose this phenomenon is related to two-level scheduling/contention scope. Correct. The kernel folks are familiar with this problem and are devising a way to address it. .0> What solution do we have to work around this? There is none, as far as I'm aware, prior to PtMin. Webb | |||||
9139.2 | One workaround (makes it hard to maintain, though) | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Wed Mar 12 1997 14:41 | 2 |
Webb, if it worked in 3.2*, can they link it (non_shared?) there, and get the same behavior on 4.0*? | |||||
9139.3 | Yep, that should work... | WTFN::SCALES | Despair is appropriate and inevitable. | Wed Mar 12 1997 16:24 | 4 |
.2> if it worked in 3.2*, can they link it (non_shared?) there, and .2> get the same behavior on 4.0*? Yes, that's true...that would be a workaround. |
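
For readers who want to see what per-thread scheduling looks like at the API level, here is a minimal sketch in generic POSIX C. It is not taken from the customer's application and is not offered as a fix for the V4.0 behavior discussed above (Webb states there is none before PtMin). The helper names `policy_name`, `get_self_sched` and `make_self_fifo_max` are hypothetical. Note also that on a two-level (process contention scope) implementation, the values these calls report describe the user-level thread and may not match the policy of the kernel thread that a crash dump shows.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: printable name for a POSIX scheduling policy. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";
    }
}

/* Hypothetical helper: read the policy and priority that the threads
 * library reports for the calling thread.
 * Returns 0 on success, or an error number on failure. */
int get_self_sched(int *policy_out, int *prio_out)
{
    struct sched_param sp;
    int policy;
    int rc;

    assert(policy_out != NULL && prio_out != NULL);
    rc = pthread_getschedparam(pthread_self(), &policy, &sp);
    if (rc != 0)
        return rc;
    *policy_out = policy;
    *prio_out = sp.sched_priority;
    return 0;
}

/* Hypothetical helper: request SCHED_FIFO at the maximum FIFO priority
 * for the calling thread only (not for the whole process).
 * Returns 0 on success, or an error number such as EPERM when the
 * caller lacks the privilege to use a realtime policy. */
int make_self_fifo_max(void)
{
    struct sched_param sp;
    int max = sched_get_priority_max(SCHED_FIFO);

    if (max == -1)
        return errno;
    memset(&sp, 0, sizeof sp);
    sp.sched_priority = max;
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
}
```

One plausible use would be to call `make_self_fifo_max()` at the top of the thread that runs the time-critical read loop, then log the result of `get_self_sched()` through `policy_name()`, so that a mismatch between what the process requested with sched_setscheduler() and what the individual thread reports is at least visible at run time.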