T.R | Title | User | Personal Name | Date | Lines |
---|
1470.1 | We need some basic information... | WTFN::SCALES | Despair is appropriate and inevitable. | Thu Jan 23 1997 14:58 | 15 |
1470.2 | Here's a little more info | STOSS1::KASEFANG | | Fri Jan 24 1997 10:33 | 33 |
|
DUNIX 4.0a is the OS.
The clock thread performs a timed wait for 1/60 of a second on a
condition variable. The variable is not signaled.
When the wait times out the clock thread signals the 3 medium priority
threads. They are currently using 3 threads at the same priority. They
will want to use 7 priorities for their implementation. The low
priority threads use the lowest priority.
Schedule Policy/Priorities
Seven priorities are defined specific to each schedule policy.
Highest is pri_rr_max - 1, pri_fifo_max - 1 .... .
Lowest is pri_rr_max - 7, pri_fifo_max - 7 .... .
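As a rough illustration of the "max - 1 through max - 7" scheme, here is a sketch that derives the seven levels at run time. It assumes the POSIX sched_get_priority_max() and sched_get_priority_min() calls stand in for the pri_rr_max and pri_fifo_max values named above; rt_level_priority and RT_LEVELS are hypothetical names.

```c
#include <sched.h>

#define RT_LEVELS 7   /* seven application priority levels per policy */

/* Map an application level (1 = highest, 7 = lowest) to a priority
 * for the given policy, computed as max - level. Returns -1 if the
 * level is out of range or the policy's range is too narrow. */
int rt_level_priority(int policy, int level)
{
    int max = sched_get_priority_max(policy);
    int min = sched_get_priority_min(policy);

    if (max == -1 || min == -1)
        return -1;                 /* unknown policy */
    if (level < 1 || level > RT_LEVELS)
        return -1;                 /* not one of the seven levels */
    if (max - level < min)
        return -1;                 /* policy has too few priorities */
    return max - level;
}
```

Computing the values per policy keeps the seven levels inside the valid range for whichever policy is being tested.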
The following policies were tested: round robin, fifo, bg_np and other.
The higher priority threads never appeared to execute in fifo and rr.
All threads appeared to function unpredictably in other and bg_np.
The high priority thread didn't appear to execute on a timely basis in
other.
I've requested code and the output from setld. I should have both a
little later.
Thanks for the assistance,
David
|
1470.3 | Need to use "Real-time" scheduling policies | WTFN::SCALES | Despair is appropriate and inevitable. | Fri Jan 24 1997 11:18 | 20 |
| OK, David, it looks like the customer has avoided most of the worst pitfalls.
As observed, in order for the clock thread to be responsive, it must have a
scheduling policy of either FIFO or RR; OTHER will not necessarily preempt
the currently running thread.
The fg_np (i.e., OTHER) and the bg_np scheduling policies do not run in
strict priority order. These policies are intended to run high priority
threads more than low priority threads while avoiding starvation of any of
the threads. Thus, this customer probably wants to stick to the "strict
priority/preemption" (aka "real-time") scheduling policies, at least for all
but the three low priority threads.
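For reference, a sketch of how a thread attributes object is typically set up for one of the strict-priority policies under the standard (1003.1c) interface. The helper name init_rt_attr is hypothetical, and this is not a claim about what the customer's code does.

```c
#include <pthread.h>
#include <sched.h>

/* Initialize *attr so that a thread created with it runs with the
 * given policy (SCHED_FIFO or SCHED_RR) and priority. Returns 0 on
 * success or an error number, in the standard interface's style. */
int init_rt_attr(pthread_attr_t *attr, int policy, int priority)
{
    struct sched_param param = { 0 };
    int status;

    status = pthread_attr_init(attr);
    if (status != 0)
        return status;

    /* Without PTHREAD_EXPLICIT_SCHED the new thread inherits the
     * creator's policy and priority, and the values set below are
     * ignored. */
    status = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (status == 0)
        status = pthread_attr_setschedpolicy(attr, policy);
    if (status == 0) {
        param.sched_priority = priority;
        status = pthread_attr_setschedparam(attr, &param);
    }
    if (status != 0)
        pthread_attr_destroy(attr);
    return status;
}
```

The resulting attributes object is passed as the second argument to pthread_create(). On some systems creating a FIFO or RR thread requires privilege, so the return value of pthread_create() should be checked as well.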
.2> The higher priority threads never appeared to execute in fifo and rr.
This sounds like a bug. I'd be interested to know if you can reproduce it.
Webb
|
1470.4 | | SMURF::DENHAM | Digital UNIX Kernel | Fri Jan 24 1997 17:11 | 8 |
| If this is a serious realtime application (OK, soft, but serious),
then these folks are probably going to want system-contention-scope
threads coming in the next functional release (V4.0D tentatively).
Until then, the threads may be FIFO with a high priority, but
they'll be competing against other threads/processes on the system
on a timeshare basis.
Webb will explain more :^) if he thinks it's relevant.
|
1470.5 | Is this a dedicated platform? | WTFN::SCALES | Despair is appropriate and inevitable. | Mon Jan 27 1997 11:28 | 12 |
| OK, since Jeff prompted...
David, is the customer's system dedicated to this job? Or, does the customer
expect to run other processes/applications on this system "in the background"
while the simulator is running?
If the answer is that there will be other stuff running on the system, then,
yes, the customer will need the SCS stuff, which will be available in the next
release.
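A sketch of how an application might ask for system contention scope once it is available, using the standard pthread_attr_setscope() call. The helper name effective_scope is hypothetical, and which scopes a given release accepts is an assumption to verify on the target system.

```c
#include <pthread.h>

/* Try to select system contention scope on an initialized attributes
 * object; if the implementation refuses, try process scope instead.
 * Returns the scope recorded in the attributes object afterwards, or
 * -1 if it cannot be read back. */
int effective_scope(pthread_attr_t *attr)
{
    int scope;

    if (pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM) != 0) {
        /* System scope not supported here; keep going with process
         * scope and ignore a failure, since the default remains. */
        (void)pthread_attr_setscope(attr, PTHREAD_SCOPE_PROCESS);
    }
    if (pthread_attr_getscope(attr, &scope) != 0)
        return -1;
    return scope;
}
```

Reading the scope back, rather than assuming the request succeeded, lets the program log which behavior it is actually getting.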
Webb
|
1470.6 | I'm the lucky winner!! | RHETT::PARKER | | Mon Jan 27 1997 16:52 | 40 |
|
Thanks for the replies Jeff & Webb. David is in class this week so
he called us to work with the customer. They have narrowed down the
application to a mere 5000 lines of code! :-)
I'm working on it now - I thought I would try building it on 3.2C
statically in order to get the system contention scope behavior as
well as building it on 4.0/A/B to see what happens there as well.
Basically, what I am seeing is that lower priority threads do not
appear to be preempted by higher priority threads. If I change the
code so that the lower priority threads call sched_yield(), then
it's much better. I've tried both FIFO/RR as well as the OTHER -
bg_np/fg_np scheduling policies without much difference.
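The sched_yield() change mentioned above amounts to something like the following sketch. The loop body (spin_work) and the function name background_loop are hypothetical placeholders for the customer's low-priority work.

```c
#include <sched.h>

/* Hypothetical stand-in for one unit of low-priority work. */
static unsigned long spin_work(unsigned long x)
{
    return x * 2654435761UL + 1;
}

/* Low-priority loop that offers the processor to other runnable
 * threads after every unit of work. */
unsigned long background_loop(unsigned long iterations)
{
    unsigned long acc = 0;
    unsigned long i;

    for (i = 0; i < iterations; i++) {
        acc = spin_work(acc);
        sched_yield();   /* workaround: give other threads a chance */
    }
    return acc;
}
```

This only works around the symptom. With FIFO or RR policies a higher priority thread is expected to preempt without any cooperation from the lower priority ones.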
So, yes I think it's a process versus contention scope related issue
but I thought I should test it to make sure. We are up against SGI
in this bid so it may be pretty tough!
I'll update here with findings in a day or so. Of course, if anybody
has any other suggestions, I'm open to them. There are about a dozen
threads in all, some at low priority, some at medium, and some at high
priority - namely the clock related thread.
On 4.0, they are linking like:
cc -g -o relsim *.o -lpthread -lpthreads -lmach -lexc -lc
On 3.2C, I know I need -non_shared but will I also use the following
libraries ?
-lpthreads -lmach -lc_r -lc in that order? I always get confused by
this and this time, I'll write it down and plaster it to my cube wall!
:-)
Thanks for the input!!
Lee Parker
Realtime Expertise Center
|
1470.7 | Building on Digital UNIX | DCETHD::BUTENHOF | Dave Butenhof, DECthreads | Tue Jan 28 1997 08:09 | 42 |
| > On 4.0, they are linking like:
>
> cc -g -o relsim *.o -lpthread -lpthreads -lmach -lexc -lc
They should be linking "cc -g -o relsim *.o -pthread" if they're using the
POSIX api, or "cc -g -o relsim *.o -threads" if they're using the DCE thread
api (also known, though slightly inaccurately, as "draft 4 POSIX").
There is no reason not to use the proper switches when compiling and linking
with cc. For example, not only have you gotten the libraries in the wrong
order, but you've omitted the CRITICAL definition of _REENTRANT.
The only reason to ever use the -l list directly is that, when building a
shared library, one must use ld to link, and ld doesn't accept either
-pthread or -threads. You should still always use -threads or -pthread for
the compilation, and let the compiler driver take care of remembering
-D_REENTRANT for you!
(For the record, "-threads" is "-D_REENTRANT -lpthreads -lpthread -lmach
-lexc" and "-pthread" is "-D_REENTRANT -lpthread -lmach -lexc"... and "cc"
ALWAYS adds "-lc" at the end, so you don't need to.)
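One way to catch a build that dropped the switch is to have the program report whether _REENTRANT was defined when it was compiled. This is only a sketch; built_reentrant is a hypothetical helper, not part of DECthreads.

```c
/* Returns 1 if this file was compiled with _REENTRANT defined (as
 * -pthread and -threads arrange), 0 otherwise. */
int built_reentrant(void)
{
#ifdef _REENTRANT
    return 1;
#else
    return 0;
#endif
}
```

A stricter variant would use #error in the #else branch so that a build without the definition fails at compile time instead of at run time.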
> On 3.2C, I know I need -non_shared but will I also use the following
> libraries ?
>
> -lpthreads -lmach -lc_r -lc in that order? I always get confused by
> this and this time, I'll write it down and plaster it to my cube wall!
> :-)
No, you do NOT need -non_shared on 3.2C, unless you really want to build
non-shared. (I recommend against it, but if they really want to, they can.)
Again, you should always use the proper compiler switch when you're compiling
and linking with cc. 3.2C didn't support POSIX threads, only DCE threads, so
there's no "-pthread", only "-threads". There's no libpthread on 3.2C, and
threaded programs didn't use libexc (there wasn't a .so anyway), and threaded
code additionally required the reentrant version of the C runtime. So the
libraries were "-lpthreads -lmach -lc_r" in that order. Again, cc always
includes -lc at the end, so you don't need to worry about it. And, as in 4.0,
it's CRITICAL that you define _REENTRANT -- either by using "-threads", or by
using the -D_REENTRANT directly. It's really much easier just to remember
"-threads".
|
1470.8 | Thanks for the clarification! | RHETT::PARKER | | Tue Jan 28 1997 09:25 | 71 |
|
Hi Dave,
Thank you for the clarification. I know this has been discussed many
times, but your note now makes it crystal clear! Just having the one switch
as opposed to specifically linking in several libraries is much better!
> On 4.0, they are linking like:
>
> cc -g -o relsim *.o -lpthread -lpthreads -lmach -lexc -lc
That's what was in their Makefile and I did wonder if it was correct.
Of course, it worked... :-)
>>They should be linking "cc -g -o relsim *.o -pthread" if they're using
>>the POSIX api, or "cc -g -o relsim *.o -threads" if they're using the
>>DCE thread api (also known, though slightly inaccurately, as "draft 4
>>POSIX").
Well, I tried that and now I'm getting an Unresolved on pthread_exit.
I/they must be doing something wrong!
>>There is no reason not to use the proper switches when compiling and
>>linking with cc. For example, not only have you gotten the libraries
>>in the wrong order, but you've omitted the CRITICAL definition of
>>_REENTRANT.
That was in their makefile but it was not being defined. Thanks for
pointing that out.
>>The only reason to ever use the -l list directly is that, when building
>>a shared library, one must use ld to link, and ld doesn't accept either
>>-pthread or -threads. You should still always use -threads or -pthread
>>for the compilation, and let the compiler driver take care of remembering
>>-D_REENTRANT for you!
Understood. Thanks.
>>(For the record, "-threads" is "-D_REENTRANT -lpthreads -lpthread -lmach
>>-lexc" and "-pthread" is "-D_REENTRANT -lpthread -lmach -lexc"... and "cc"
>>ALWAYS adds "-lc" at the end, so you don't need to.)
> On 3.2C, I know I need -non_shared but will I also use the following
> libraries ?
>
> -lpthreads -lmach -lc_r -lc in that order? I always get confused by
> this and this time, I'll write it down and plaster it to my cube wall!
> :-)
>>No, you do NOT need -non_shared on 3.2C, unless you really want to build
>>non-shared. (I recommend against it, but if they really want to, they can.)
Well, I was going to do this in order to make sure I get system contention
scope behavior. I think it was you who suggested doing this IF one really
needs that. But, since 3.2C didn't support POSIX threads, I guess I can
scrap that idea. :-) I think it's time for me to take a course on
DECthreads!
>>Again, you should always use the proper compiler switch when you're compiling
>>and linking with cc. 3.2C didn't support POSIX threads, only DCE threads, so
>>there's no "-pthread", only "-threads". There's no libpthread on 3.2C, and
>>threaded programs didn't use libexc (there wasn't a .so anyway), and threaded
>>code additionally required the reentrant version of the C runtime. So the
>>libraries were "-lpthreads -lmach -lc_r" in that order. Again, cc always
>>includes -lc at the end, so you don't need to worry about it. And, as in 4.0,
>>it's CRITICAL that you define _REENTRANT -- either by using "-threads", or by
>>using the -D_REENTRANT directly. It's really much easier just to remember
>>"-threads".
|
1470.9 | | DCETHD::BUTENHOF | Dave Butenhof, DECthreads | Wed Jan 29 1997 07:29 | 25 |
| > Well, I tried that and now I'm getting an Unresolved on pthread_exit.
> I/they must be doing something wrong!
Probably. If you can't figure it out, I'll need a more complete example to
guess what's happening, though.
> Well, I was going to do this in order to make sure I get system contention
> scope behavior. I think it was you who suggested doing this IF one really
> needs that. But, since 3.2C didn't support POSIX threads, I guess I can
> scrap that idea. :-) I think it's time for me to take a course on
> DECthreads!
Ah. You wanted to link static on 3.2C and bring the binary to 4.0. I
understand now. Yes, if you're using POSIX threads rather than DCE threads or
CMA, you don't have that option. (And, by the way, I still wouldn't RECOMMEND
that -- there are a lot of improvements in the 4.0 libraries that you'll miss
by doing that.)
And, uh, by the way... are you SURE they're really using "POSIX threads"?
It's hard not to be a little confused by the fact that POSIX threads and DCE
threads both have "pthread_" names. (It's unfortunate that OSF insisted we
use the pthread_ prefix for DCE threads, and it's unfortunate that I gave in
to them instead of fighting harder, but none of that helps now.)
/dave
|
1470.10 | More info | RHETT::PARKER | | Wed Jan 29 1997 10:57 | 91 |
|
Hi Dave,
>> Thanks for the information. I found the 4.0 "Guide to DECthreads"
>> manual yesterday morning - sorry about .6, asking how to build.
>> You should have just told me to RTFM! :-)
>> I went to appendix A and right there in black and white is how
>> to compile and link and lots of other good information.
>> Great job on the manual!! - I'm going to be getting very familiar
>> with it over the next few months! From what I've read so far, it's
>> very well written - clear and concise!
> Well, I tried that and now I'm getting an Unresolved on pthread_exit.
> I/they must be doing something wrong!
Probably. If you can't figure it out, I'll need a more complete example to
guess what's happening, though.
>> Sorry - they had left out pthread.h in a couple of their .c's. Guess
>> that's just one reason to make sure you are building correctly! ;-)
> Well, I was going to do this in order to make sure I get system contention
> scope behavior. I think it was you who suggested doing this IF one really
> needs that. But, since 3.2C didn't support POSIX threads, I guess I can
> scrap that idea. :-) I think it's time for me to take a course on
> DECthreads!
Ah. You wanted to link static on 3.2C and bring the binary to 4.0. I
understand now. Yes, if you're using POSIX threads rather than DCE threads or
CMA, you don't have that option. (And, by the way, I still wouldn't RECOMMEND
that -- there are a lot of improvements in the 4.0 libraries that you'll miss
by doing that.)
>> Ok - thanks for the info! We are just starting to see issues w/ using
>> realtime and threads. For some apps that "broke" on 4.0 (due to not
>> having system contention scope), that was the only alternative. Now
>> people are starting to port to the final 1003.1c POSIX routines and
>> they are going to wait for 4.0D and just go with process contention
>> scope until then (and may wind up sticking with it depending on the
>> degree of determinism they can achieve).
And, uh, by the way... are you SURE they're really using "POSIX threads"?
It's hard not to be a little confused by the fact that POSIX threads and DCE
threads both have "pthread_" names. (It's unfortunate that OSF insisted we
use the pthread_ prefix for DCE threads, and it's unfortunate that I gave in
to them instead of fighting harder, but none of that helps now.)
>> Well, I think I am. They are including pthread.h and the routines
>> being used are those described in PART II of the manual. Now I'm really
>> confused! :-} I'm now building using the -pthreads option to cc and it
>> seems to compile fine. I tried -threads and that seems to work too.
>> Is there an easy way to tell?
>> Perhaps I should file a high/med priority QAR and see if someone can
>> look into it. The application is a realtime flight simulator and they
>> are comparing Digital UNIX/Alpha to SGI. They narrowed it down to a
>> test case of about 2400 lines of code. I've been running it here and
>> I am finding that even though they appear to be filling in the attribute
>> structure correctly, specifying PTHREAD_EXPLICIT_SCHED, the threads
>> are not shown by ps axm -OSCHED to be using RR scheduling policy. Of
>> course, since we are now using process contention scope, I'm not sure
>> that ps(1) will be able to correctly report that anymore.
>> Just called the customer ...
>> They started this application in summer/95 and it runs correctly on
>> Solaris and SGI. On Solaris the routines are different, don't contain
>> the pthread_ prefix... But the fact that it runs correctly on SGI
>> does make me concerned. I could go into a lot more detail on what I
>> have tried so far but I'm not sure it's worth the time. I guess I could
>> also file an IPMT instead but I'm not convinced they are not doing
>> something incorrectly on Digital UNIX. Looks like they started with the
>> draft 4 routines and are bringing that up to the final POSIX routines.
>> Would someone up there be able to spend a little time on this over the
>> next day or two if I file a QAR? Or, I could put a tar file on our
>> internal anonymous ftp server if someone wants to pick it up.
>> BTW: Where does one file QAR's for DECthreads on Digital UNIX ? If it's
>> the same as any other Digital UNIX QAR, I already know how to do that.
>> Thank you all for your input so far!! It's been very helpful!!
Lee Parker
Realtime Expertise Center
|
1470.11 | Show us the reproducer! (;-) | WTFN::SCALES | Despair is appropriate and inevitable. | Wed Jan 29 1997 11:27 | 53 |
| .10> I'm now building using the -pthreads option to cc and it
.10> seems to compile fine. I tried -threads and that seems to work too.
.10>
.10> Is there an easy way to tell?
Both interfaces provide routines with the same names for the most part, and
the routine signatures are pretty much the same as well. (Unfortunately.)
The most obvious difference is that the D4 interface returns -1 on error, and
the standard interface returns the ERRNO code.
Also, PTHREAD_EXPLICIT_SCHED is a constant in the standard interface (its
analog in the D4 interface is PTHREAD_DEFAULT_SCHED), so it sounds like they
are using the standard (-pthread) interface.
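To make the difference in error conventions concrete, here is a sketch of a status check written for the standard interface, where each routine returns 0 or an error number directly. The name check_status is hypothetical and is not the customer's check() routine.

```c
#include <stdio.h>
#include <string.h>

/* Standard (1003.1c) convention: the routine's return value is 0 on
 * success or the error number itself. Prints a message on failure and
 * passes the status through so the caller can decide what to do.
 * (Under the D4 convention the test would instead be status == -1,
 * with the error number read from errno.) */
int check_status(int status, const char *what)
{
    if (status != 0)
        fprintf(stderr, "%s: %s\n", what, strerror(status));
    return status;
}
```

Code ported from D4 that still tests for -1 will silently miss every failure under the standard interface, since error numbers are positive.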
.10> the threads are not shown by ps axm -OSCHED to be using RR scheduling
.10> policy
Process contention scope scheduling parameters do not show up in the ps
output (it shows the scheduling parameters of the DECthreads "virtual
processor" instead).
.10> BTW: Where does one file QAR's for DECthreads on Digital UNIX ? If it's
.10> the same as any other Digital UNIX QAR, I already know how to do that.
RTFNF. ;-) See note 3.3. (Yes, it's the same as for any other component of
the Digital Unix base operating system.)
.6> Basically, what I am seeing is that lower priority threads do not
.6> appear to be preempted by higher priority threads. If I change the
.6> code so that the lower priority threads call sched_yield(), then
.6> it's much better. I've tried both FIFO/RR as well as the OTHER -
.6> bg_np/fg_np scheduling policies without much difference.
This would strike me as a bug. Can you write a small test program which
demonstrates that you don't see preemption with FIFO/RR policies? If so,
please enter a QAR or open an IPMT case, as appropriate.
Webb
P.S. Lee, your convention for quoting previous text is very confusing. The
typical convention is to add characters at the beginning of the lines which
you are quoting, not at the beginning of the lines you are writing. (This
way, it's the lines which are quoted from quotations that acquire multiple
characters at the front; the new text is obvious by its lack of quotation
characters...) [Alternatively, in notes conferences it makes sense to put
the note/reply number in the quote prefix, which can be useful since it
serves as "bibliography" as well as quote-indicator.]
|
1470.12 | Coming up! | RHETT::PARKER | | Wed Jan 29 1997 12:16 | 30 |
|
Hi Webb,
Ok, I think I may try to narrow it down to a smaller test case.
I kinda thought that ps(1) would not work correctly when using
process contention scope. Thanks for the info...How about when
system contention scope comes back? Or, is this not going to work
then either since the scheduling for threads is now done in user
mode?
.11> P.S. Lee, your convention for quoting previous text is very confusing.
.11> The typical convention is to add characters at the beginning of the
.11> lines which you are quoting, not at the beginning of the lines you are
.11> writing. (This way, it's the lines which are quoted from quotations
.11> that acquire multiple characters at the front; the new text is obvious
.11> by its lack of quotation characters...) [Alternatively, in notes
.11> conferences it makes sense to put the note/reply number in the quote
.11> prefix, which can be useful since it serves as "bibliography" as well
.11> as quote-indicator.]
Amen for that suggestion!! I just about drove myself crazy trying to
follow a couple of my own note strings!! :-)
Now, if we can just get a vi or emacs editor for notes...
Thanks,
Lee
|
1470.13 | | DCETHD::BUTENHOF | Dave Butenhof, DECthreads | Wed Jan 29 1997 12:43 | 16 |
| > I kinda thought that ps(1) would not work correctly when using
> process contention scope. Thanks for the info...How about when
> system contention scope comes back? Or, is this not going to work
> then either since the scheduling for threads is now done in user
> mode?
ps will be able to show system contention scope threads, but it won't help to
associate those kernel threads with the user threads in your program. (But
that's not a new problem.) Pete's toyed with the notion of modifying ps to
use libpthreaddebug.so to show user thread information. That'd be "cute", but
maybe not practical.
You can always run the program with ladebug to see the full scoop on all the
threads.
/dave
|
1470.14 | More info | RHETT::PARKER | | Thu Jan 30 1997 15:35 | 115 |
|
Thanks for the info!
.13> You can always run the program with ladebug to see the full
.13> scoop on all the threads.
I'm new to ladebug too - any special tricks to doing this? I have
used ladebug on this app w/ some interesting side-effects. These
led me to try to fix this compiler warning :
cc: Warning: rt_pkg.c, line 183: In this statement, the referenced
type of the pointer value "&rt_clock" is "function (pointer to unnamed
struct) returning void ", which is not compatible with "function (pointer
to void) returning pointer to void".
rt_clock_address = &rt_clock;
--^
cc: Warning: rt_pkg.c, line 274: In this statement, the referenced type
of the pointer value "&rt_task" is "function (pointer to unnamed struct)
returning void" , which is not compatible with "function (pointer to void)
returning pointer to void".
rt_address = &rt_task;
--^
The offending line :
void *(*rt_clock_address)(void *);
rt_clock_address = &rt_clock;
I changed to :
void (*rt_clock_address)(clock_specific_struct *);
rt_clock_address = &rt_clock;
And, the other one:
void *(*rt_address)(void *);
rt_address = &rt_task;
changed to:
void (*rt_address)(thread_specific_struct *);
rt_address = &rt_task;
Those warnings maybe were ok but I thought I should try to fix it ...
And, once I've fixed that, I start getting other warnings that really
concern me! Like :
cc: Warning: rt_pkg2.c, line 232: In this statement, the referenced
type of the pointer value "rt_clock_address" is "function (pointer to
unnamed struct) returning void", which is not compatible with "function
(pointer to void) returning pointer to void".
status = pthread_create(&rt_clock_id,
-----------^
cc: Warning: rt_pkg2.c, line 337: In this statement, the referenced type
of the pointer value "rt_address" is "function (pointer to unnamed struct)
returning void", which is not compatible with "function (pointer to void)
returning pointer to void".
status = pthread_create(&rt_task1_id,
-----------^
cc: Warning: rt_pkg2.c, line 348: In this statement, the referenced type
of the pointer value "rt_address" is "function (pointer to unnamed struct)
returning void", which is not compatible with "function (pointer to void)
returning pointer to void".
status = pthread_create(&rt_task2_id,
-----------^
cc: Warning: rt_pkg2.c, line 355: In this statement, the referenced type
of the pointer value "rt_address" is "function (pointer to unnamed struct)
returning void", which is not compatible with "function (pointer to void)
returning pointer to void".
status = pthread_create(&rt_task3_id,
-----------^
Needless to say, warnings like this on pthread_create() concern me
a lot! :-)
The definitions that it's complaining about are :
status = pthread_create(&rt_clock_id,
&rt_clock_attr,
rt_clock_address,
&rt_clock_data);
check(status, "rt_clock create error");
....
BTW: The check routine checks for (!= 0) and not for (!= -1)
Thanks for that tip in an earlier note.
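For readers who hit the same warnings: pthread_create() expects a start routine of type "function (pointer to void) returning pointer to void", so the usual approach is to give the routine that signature and convert the argument inside it. A minimal sketch, in which clock_data_t, rt_clock_start, and start_clock are hypothetical stand-ins for the customer's types and routines:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the customer's clock-specific struct. */
typedef struct {
    int ticks;
} clock_data_t;

/* Start routine with the exact signature pthread_create() expects.
 * The void pointer is converted back to the real type inside. */
void *rt_clock_start(void *arg)
{
    clock_data_t *data = arg;

    data->ticks++;          /* placeholder for the real clock work */
    return NULL;
}

/* Create the thread without casting the function pointer. Returns 0
 * or an error number, in the standard interface's style. */
int start_clock(pthread_t *id, const pthread_attr_t *attr,
                clock_data_t *data)
{
    return pthread_create(id, attr, rt_clock_start, data);
}
```

Changing the type of the function pointer variable to match the existing routine only moves the warning to the pthread_create() call, which is what the second set of messages shows.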
Anyway, I was getting TRAP signals when running in ladebug when the
pthread_create() occurred, but I was stepping and maybe that's why?
Remember, I'm just trying to find out why their code is not behaving
as expected. They say this all works on SGI - I'm doubtful!!
Any comments or suggestions? I've narrowed the code down a bit
but it's still ~2000 lines...
Thanks again!!
Lee
|
1470.15 | Duh! | RHETT::PARKER | | Thu Jan 30 1997 16:32 | 10 |
|
Oh, duh! Nevermind, I looked at the man page for pthread_create(3)
and now I see what's going on. Well, almost! :-0
I'll keep plugging away at it. Man, this threads stuff is really
different!
Lee
|
1470.16 | Incoming... | RHETT::PARKER | | Tue Feb 04 1997 14:49 | 26 |
|
Hi Folks,
Well, I've worked with this program enough to convince myself
that we are dealing with a bug here. I added some calls to
pthread_getschedparam(3) to verify that the scheduling policy
is round robin and the priorities are set correctly. This looks
good but, unless the lower priority threads do something that
blocks, like sleep(2), the higher priority thread does not get
to run. If I call sched_yield(3) instead of the sleep, then it
helps a little. But the one second sleep allows the highest
priority thread to run the most.
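The kind of check described above can be sketched as follows, with each thread reporting its own policy and priority through pthread_getschedparam(). The helper names policy_name and report_self_sched are hypothetical and are not the calls actually added to the customer's program.

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Printable name for the three standard POSIX policies. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";
    }
}

/* Print the calling thread's policy and priority and hand them back
 * to the caller. Returns 0 or an error number. */
int report_self_sched(const char *label, int *policy_out, int *prio_out)
{
    struct sched_param param;
    int policy;
    int status;

    status = pthread_getschedparam(pthread_self(), &policy, &param);
    if (status != 0)
        return status;

    printf("%s: policy=%s priority=%d\n",
           label, policy_name(policy), param.sched_priority);
    *policy_out = policy;
    *prio_out = param.sched_priority;
    return 0;
}
```

Calling this at the top of each start routine confirms that the attributes took effect, which separates "the attributes were ignored" from "the scheduler is not preempting".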
This is just a heads up on an incoming IPMT - priority 2.
My apologies if I have overlooked something. I don't think I
have though. In any event, we can win against SGI if we can
show them that this works. Unfortunately, it already works
on their UNIX. It works on Solaris too but they got the boot
anyway! ;-)
Feel free to beat me up if I missed something!!
Lee
|
1470.17 | It's a (now) known problem | WTFN::SCALES | Despair is appropriate and inevitable. | Tue Feb 04 1997 17:36 | 12 |
| We've had a report of a similar problem (preemption after blocking in a system
call not working) from another customer. It turns out that the problem reported
in this note boils down to the same problem, because on V4.0a, the manager
thread (the one responsible for, among other things, waking threads at timeout),
is basically like any other thread and it's blocking in a system call.
I expect that the problem in this note will be resolved in V4.0c, in which the
manager thread is treated specially. The general fix to the preemption problem
will be available in the following release at the earliest, and possibly not
until the next functional release.
Thanks for the IPMT case (it just arrived).... :-)
|