T.R | Title | User | Personal Name | Date | Lines |
---|
1235.1 | | HELIX::SONTAKKE | | Tue Mar 04 1997 10:43 | 2 |
| You should ask this question in the DECnet/OSI conference as that
device is supported by the WAN Support for Digital UNIX Systems.
|
1235.2 | some help | HELIX::KAUFFMAN | And I don't know why... | Tue Mar 04 1997 10:46 | 28 |
| Hi Sylvia,
The only way to validate the performance for sure is
to do a benchmark with the hardware and software configuration
of the customer's system.
I can give you some general information that may help. If you look
at the realtime performance report available on the external
web under: http://www.digital.com/oem/products/rtunix/rtunix.htm
you'll find basic realtime performance information including ISR
latency numbers. We do not have results for a 4100 but they
are most likely better than the VME 2100 (sable based) system
that is reported.
I am not familiar with the PBXDP interface but hope that it is
capable of doing some local buffering (and perhaps DMA). If
so and the application does buffering of the data it should be
able to keep up with a 1 kHz data rate (this assumes that
equates to 1 interrupt per frame, or one every millisecond). Looking at the
VME 2100 ISR latency numbers in a multi-user test the Mean ISR latency
time is 12.5 microseconds and the worst case is 262 microseconds
so the data rate seems very reasonable assuming the rest of
the data manipulation is simple as you have stated.
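As a rough illustration of the arithmetic above, the following C sketch compares an ISR latency with the frame period at a given frame rate. The function names are hypothetical, and the figures in the usage note are the VME 2100 numbers quoted above, not measurements from a 4100.

```c
#include <assert.h>

/* Frame period in microseconds for a given frame rate in Hz. */
double frame_period_us(double rate_hz)
{
    assert(rate_hz > 0.0);
    return 1.0e6 / rate_hz;
}

/* Fraction of one frame period consumed by a given ISR latency. */
double latency_fraction(double latency_us, double rate_hz)
{
    return latency_us / frame_period_us(rate_hz);
}

/* Time left in the period once the latency has elapsed (may be negative). */
double headroom_us(double latency_us, double rate_hz)
{
    return frame_period_us(rate_hz) - latency_us;
}
```

With a 1 kHz frame rate the period is 1000 microseconds, so the 262 microsecond worst case uses about a quarter of it and leaves 738 microseconds for everything else; the 12.5 microsecond mean uses a little over 1%.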
Hope this helps a little. Perhaps others that know the
PBXDP can add some additional information.
Good Luck...Jeff
|
1235.3 | more questions about RT-DU??? | BEJVC::SYLVIAXIONG | | Thu Mar 06 1997 04:55 | 29 |
| Jeff,
I am just back from the customer site with more info about the
project and more questions about the realtime features of Digital UNIX.
1) How much memory does the realtime kernel of Digital UNIX take up?
What's the maximum system overhead of the realtime kernel of Digital UNIX?
2) Is there a limit on the memory space locked by a realtime process? The
customer told me that there would be a 7GB data file, which needs to
be locked in memory in his realtime application.
3) Is it possible to get better realtime responsiveness when
ubc_maxpercent is reduced below 50% or set to 0?
3) The system clock resolution on Digital Alpha systems is 1/1024 second.
How does the CLOCK_REALTIME clock measure time in nanoseconds? Is the
CLOCK_REALTIME clock a soft or hard clock? How is the high-resolution clock
implemented by configuring the kernel option MICRO_TIME?
4) The realtime performance data of Digital UNIX is based on processes. Is
there data based on threads?
I will post more info about the customer's project and his concerns.
Thanks & Regards
Sylvia
|
1235.4 | Some info | RHETT::PARKER | | Thu Mar 06 1997 09:40 | 61 |
|
Sylvia,
I'll take a shot at these questions. I'm sure Jeff or someone will clarify
if I've missed some things.
Hth,
Lee
--------------------------------------------------------------------------
> 1) How much memory does the realtime kernel of Digital UNIX take up?
> What's the maximum system overhead of the realtime kernel of Digital UNIX?
All kernels are realtime as of Digital UNIX 3.0. The kernel is usually
between 7-9MB in size on 4.0. In order to make the kernel more fully
preemptable, you enable rt-preempt-opt in the generic subsystem in
/etc/sysconfigtab, like so:
generic:
rt-preempt-opt = 1
> 2) Is there a limit on the memory space locked by a realtime process? The
> customer told me that there would be a 7GB data file, which needs to
> be locked in memory in his realtime application.
No, not really. You would need to "tune" a couple of vm parameters in
order to wire down 7GB - vm-maxwire and vm-maxvas + possibly some others.
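For illustration, here is a minimal C sketch of pinning a data file in memory with the POSIX mmap() and mlock() calls. It assumes those calls behave on the target system as POSIX specifies; the function name map_and_lock is hypothetical, and whether a 7GB lock succeeds still depends on the vm tuning described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a file read-only and pin its pages in physical memory.
 * Returns the mapping (length stored in *len_out), or NULL with errno set.
 */
void *map_and_lock(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int saved;
    int fd;

    assert(path != NULL && len_out != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return NULL;
    }
    if (st.st_size <= 0) {          /* nothing to map */
        close(fd);
        errno = EINVAL;
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    saved = errno;
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED) {
        errno = saved;
        return NULL;
    }
    if (mlock(p, (size_t)st.st_size) != 0) {   /* fails if wiring limits are too low */
        saved = errno;
        munmap(p, (size_t)st.st_size);
        errno = saved;
        return NULL;
    }
    *len_out = (size_t)st.st_size;
    return p;
}
```

A caller would release the memory with munlock() and munmap() when done. A failure from mlock() with ENOMEM, EPERM or EAGAIN usually means a locking limit or privilege is in the way rather than a problem with the file.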
> 3) Is it possible to get better realtime responsiveness when
> ubc_maxpercent is reduced below 50% or set to 0?
If they are doing a lot of writes to the file, then yes, lowering the
ubc_maxpercent attribute would help with the issue where the system response
is less than ideal when the update daemon sync's the disks. I don't think
I would lower it to 0 though.
> 3) The system clock resolution on Digital Alpha systems is 1/1024 second.
> How does the CLOCK_REALTIME clock measure time in nanoseconds? Is the
> CLOCK_REALTIME clock a soft or hard clock? How is the high-resolution clock
> implemented by configuring the kernel option MICRO_TIME?
You can't get nanosecond granularity without an external clock. The MICRO_TIME
option does give finer granularity - the time returned by the clock_gettime
routine is extrapolated between clock ticks to give apparent microsecond
granularity - so it's sort of a soft clock. The actual resolution of the
CLOCK_REALTIME is 976.5625 microseconds (1/1024 second).
> 4) The realtime performance data of Digital UNIX is based on processes. Is
> there data based on threads?
If you are using 4.0, then you are going to have to wait until 4.0D
to get the best realtime performance. Currently, only process contention
scope has been implemented in the DECthreads 2 level scheduler. They
should have system contention scope back in by 4.0D, I think. As long as
not much else is running on the system, then that may be ok.
|
1235.5 | | BEJVC::SYLVIAXIONG | | Fri Mar 07 1997 04:16 | 17 |
| Lee,
The physical memory configuration will be 8GB. In this
situation, is it possible to lock a 7GB data file in memory?
Is there a limit on the memory space locked by a realtime process
on Digital UNIX?
In section 6.1 Clock Functions, on page 6-2, chapter 6 of the
Guide to Realtime Programming, there is a statement, "The
CLOCK_REALTIME clock measures time in nanoseconds". How should
I understand it?
Thanks & Regards
Sylvia
|
1235.6 | | RHETT::PARKER | | Fri Mar 07 1997 10:10 | 25 |
|
Sylvia,
With 8GB of physical memory, you should be able to lock a 7GB
file into memory -> you will need to up some of the VM params
mentioned before, esp. vm-maxwire.
> CLOCK_REALTIME clock measures time in nanoseconds
We are required by POSIX to return seconds & nanoseconds. If you
had an external clock that actually had nanosecond resolution,
then you would get that. But, since the system clock only has
~millisecond granularity, that's about as fine as you can get unless
you use the MICRO_TIME kernel config option. Then, you can
get apparent micro-second granularity. Sections 6.1.4 & 6.1.5 in the
"Guide to Realtime Programming" explain this fairly well. 6.1.4 also
explains that POSIX 1003.1b mandates that if a program requests a
timer value that is not an exact multiple of the system clock
resolution (976.5625 microseconds), the actual time period will be
slightly larger than the requested time period.
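The rounding described above can be sketched in C as follows. This is only the arithmetic, assuming the 1024 ticks per second quoted in this thread; it does not model when within a tick the timer is armed, so a real timer may take up to one further tick. The names are hypothetical.

```c
#include <assert.h>

#define TICKS_PER_SECOND 1024LL         /* clock rate quoted in this thread */
#define NS_PER_SECOND    1000000000LL

/* Smallest whole number of clock ticks that covers requested_ns. */
long long ticks_needed(long long requested_ns)
{
    assert(requested_ns >= 0);
    return (requested_ns * TICKS_PER_SECOND + NS_PER_SECOND - 1) / NS_PER_SECOND;
}

/* Duration of that many ticks in nanoseconds (half nanoseconds preserved). */
double ticks_to_ns(long long ticks)
{
    return (double)ticks * (double)NS_PER_SECOND / (double)TICKS_PER_SECOND;
}
```

A request for exactly 1 millisecond (1000000 ns) is slightly more than one tick, so it needs two ticks, and ticks_to_ns(2) gives 1953125 ns, close to 1.95 ms.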
Hope this helps!
Lee
|
1235.7 | | HELIX::SONTAKKE | | Mon Mar 10 1997 14:56 | 13 |
| The specific question about wiring down the 7GB file in memory
needs to be moved to the Digital_Unix conference. For all I know, nobody else
might have tried to do that yet. Not many organizations in the world
have a budget which allows them to have 8GB of memory on board and can afford
to allocate 7GB of that to a single file :-)
The MICRO_TIME option gives the ability for high granularity time stamping.
You could measure events with an apparent granularity of a microsecond.
However, event scheduling is still limited to the granularity of the
system clock, which is running at around 1ms (actually 0.9765625 ms to be
exact)
- Vikas
|
1235.8 | | HGOM11::SYLVIAXIONG | | Tue Mar 11 1997 04:27 | 117 |
| Here is the info about my customer's project -- a realtime satellite data
processing system.
1. Proposed configuration
------------------------
                 Side 1                          Side 2
             ---------------                 ---------------
Satellite    Data Collection                 Data Collection
Station        Instrument                      Instrument
             ---------------                 ---------------
                    .                               .
                    . HDLC                     HDLC .
                    .                               .
                    | PBXDP                   PBXDP |
               -----------                     -----------   Realtime
Data           AlphaServer --------    ------- AlphaServer   Data
Processing        4100            |    |          4100       Acquisition
Center         -----------        |    |       -----------
                                  |    |
                                  |    |
               -----------        ------       -----------   Realtime
               AlphaServer ------|MC Hub|----- AlphaServer   Data
                  4100            ------          4100       Processing
               -----------        |    |       -----------
                                  |    |
                                  |    |
               -----------        |    |       -----------   Database
               AlphaServer --------    ------- AlphaServer   Server
                  4100                            4100
               -----------                     -----------
A) Six AlphaServers form a Memory Channel cluster. Side one
and side two are the backup systems for each other.
B) For the realtime data collection and processing systems, each AlphaServer 4100
is configured with 2 CPUs and 8GB of memory.
C) Realtime data collection is through the PBXDP PCI-based synchronous
communication controller using the HDLC protocol. The PBXDP is capable of
providing line speeds up to 5Mbps.
D) The OS is Digital UNIX configured with the realtime kernel.
2. Application and performance requirements
-------------------------------------------
A) Realtime data acquisition system
At peak, the system must receive 1000 data frames per second and
send 300 data frames per second through the PBXDP, and simultaneously
pass 500 data frames per second through Memory Channel to the data
processing system. One data frame is at most 100 bytes. In addition
to data collection, the system will do some simple data processing,
such as encoding, decoding, comparison and formatting. The
procedure is to receive 2 data frames, do some processing to form one
initial data item, and pass it to the processing system. The procedure
must finish within 3 milliseconds without data loss.
B) Data processing system
Each initial data item requires a computation of 400,000 instructions.
The result is passed back to the data acquisition system for sending to
the satellite station.
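As a quick check of the data rates implied by these requirements, the following C sketch converts frames per second into payload bits per second. It counts payload bytes only and ignores HDLC framing overhead (flags, address and control fields, frame check sequence, bit stuffing), so real line utilisation will be somewhat higher. The function name is hypothetical.

```c
#include <assert.h>

#define FRAME_BYTES 100L    /* maximum frame size stated in 2A */

/* Payload bit rate for a stream of fixed-size frames, in bits per second. */
long payload_bps(long frames_per_second, long frame_bytes)
{
    assert(frames_per_second >= 0 && frame_bytes >= 0);
    return frames_per_second * frame_bytes * 8L;
}
```

With the peak figures above, receiving 1000 frames per second is 800000 bit/s, sending 300 frames per second is 240000 bit/s, and passing 500 frames per second over Memory Channel is 400000 bit/s.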
3. Questions about Digital realtime features
--------------------------------------------
A) The realtime performance of context switch and preemption was reported
for processes, which is of little interest to this customer. Since the
overhead of threads is generally smaller than the overhead of processes,
the customer asked whether performance will improve if the processes
Input A and Output A are combined into one process with two threads, Input A
and Output A.
      Data acquisition system         |  Data processing system
                                      |
  --> Input A         Output A -->    |
      First           Second          |
      Priority        Priority        |
                                      |
                                      |
  <-- Output B        Input B  <--    |
      Fourth          Third           |
      Priority        Priority        |
B) Because the procedure from Input A to Output A must finish in
3 milliseconds, this customer thinks that Digital UNIX system overhead
will be the major factor. How large is the system overhead in this case?
C) Currently, the data acquisition system and the data processing system are
both 2-CPU systems. There will be a large amount of data transmission between
the two systems, which may be the major influence on the realtime response of
the application. Is it possible to get better performance if the data
acquisition and processing systems are combined into one system with 4 CPUs?
D) The Digital UNIX realtime interface supports a timesharing scheduling policy
and a fixed-priority, preemptive scheduling policy. Can these two kinds of
scheduling policies coexist, so that some processes run under the timesharing
policy while others run under the fixed-priority, preemptive policy?
Please kindly give your advice, many thanks!
Regards
Sylvia
|
1235.9 | | HELIX::SONTAKKE | | Mon Mar 17 1997 13:00 | 1 |
| I thought the memory channel hub only supports 4 systems.
|
1235.10 | 4 was an Encore limit??? | BBPBV1::WALLACE | john wallace @ bbp. +44 860 675093 | Mon Mar 17 1997 19:06 | 8 |
| I'm pretty sure it *works* with more. As usual, "support" may be a
different matter.
Adding more nodes and longer cables is one part of Digital's "added
value" to what came in from Encore.
regards
john
|
1235.11 | | HELIX::SONTAKKE | | Tue Mar 18 1997 10:57 | 1 |
| Is this being discussed anywhere else?
|
1235.12 | Some late-in-the-day thoughts. No warranty, etc. | BBPBV1::WALLACE | john wallace @ bbp. +44 860 675093 | Tue Mar 18 1997 15:32 | 87 |
| Hi Sylvia,
Re .8: You have quite a project on here. Who else do you have to help?
You probably need more than folks can offer here.
For a start, you need someone who has in-depth knowledge of the HDLC
stuff. Do the drivers and software you plan to use give you frame level
access with the kind of functionality you need ? Some software doesn't
give you full access; I don't know about the DU stuff. You may need
full access if the HDLC link has been "modified" in any way.
You also need a good estimate of the per-frame CPU time requirement
just to pass the data up or down the driver(s).
But here are my thoughts so far.
2A:
The processing you describe in 2A and 2B doesn't seem to need a 7GB
file in memory. Where does the 7GB come in ?
main() {                        /* Data acquisition program */
    while (1) {
        receive (frame1);
        receive (frame2);
        mangle_1 (frame1, frame2, big_7gb_array?, frame3);
        /* ^------ what kind of work goes on in here ? */
        send (frame3);          /* the "initial" frame */
    }
}
2B: You mention that the computation per "initial message" will be some
400,000 instructions. Do you have more info? The time to perform these
instructions will vary enormously depending on whether the data is in
on-chip cache, main memory, or somewhere in between. Obviously, the
100 bytes per HDLC frame fit nicely in on-chip cache but the 7GB database
is main-memory speed. What kind of mix can you expect ?
main() {                        /* Data processing program */
    while (1) {
        receive (frame_in);     /* Get "initial message" */
        mangle_2 (frame_in, frame_out, another_7gb_array_?);
        /* ^---- 400K instructions */
        send (frame_out);
    }
}
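To make the dependence on memory behaviour concrete, this C sketch turns an instruction count and an assumed sustained instruction rate into a time per result. The rates in the usage note are hypothetical illustrations, not measured AlphaServer 4100 figures; the real sustained rate depends on the cache and main-memory mix asked about above.

```c
#include <assert.h>

/*
 * Microseconds needed to execute `instructions` at a sustained rate given
 * in millions of instructions per second (instructions / MIPS = microseconds).
 */
double compute_time_us(double instructions, double sustained_mips)
{
    assert(sustained_mips > 0.0);
    return instructions / sustained_mips;
}

/* Results per second one CPU could produce at that sustained rate. */
double results_per_second(double instructions, double sustained_mips)
{
    return 1.0e6 / compute_time_us(instructions, sustained_mips);
}
```

For example, 400,000 instructions take 4000 microseconds at an assumed 100 MIPS sustained and 1000 microseconds at an assumed 400 MIPS, which is why the cache hit rate matters so much here.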
3A: As you say, in general you get much better context switch times
between threads than between processes.
If your data acquisition system really is an input-processing job and
an output-processing job, with little interaction between them, it
*might* be convenient to have separate processes for simplicity. But if
you use two threads you might save context switch time.
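As a sketch of the two-thread arrangement, the following C code runs an input thread and an output thread in one process and hands frames between them through a one-slot mailbox protected by a mutex and condition variable. It uses only standard POSIX threads calls; the structure, the names, and the stand-in frame contents are hypothetical, and no thread priorities are set.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define FRAME_BYTES 100                 /* maximum frame size quoted in .8 */

struct mailbox {                        /* one-slot hand-off between threads */
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    unsigned char   frame[FRAME_BYTES];
    int full;                           /* 1 while a frame is waiting */
    int done;                           /* 1 once the input thread has finished */
    int to_produce;                     /* frames the input thread generates */
    int consumed;                       /* frames the output thread has taken */
};

static void *input_a(void *arg)         /* stands in for frame reception */
{
    struct mailbox *mb = arg;
    int i;

    for (i = 0; i < mb->to_produce; i++) {
        pthread_mutex_lock(&mb->lock);
        while (mb->full)
            pthread_cond_wait(&mb->changed, &mb->lock);
        memset(mb->frame, i & 0xff, FRAME_BYTES);   /* fake frame contents */
        mb->full = 1;
        pthread_cond_broadcast(&mb->changed);
        pthread_mutex_unlock(&mb->lock);
    }
    pthread_mutex_lock(&mb->lock);
    mb->done = 1;
    pthread_cond_broadcast(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

static void *output_a(void *arg)        /* stands in for processing and sending */
{
    struct mailbox *mb = arg;

    pthread_mutex_lock(&mb->lock);
    for (;;) {
        while (!mb->full && !mb->done)
            pthread_cond_wait(&mb->changed, &mb->lock);
        if (!mb->full)
            break;                      /* input finished and nothing pending */
        mb->consumed++;
        mb->full = 0;
        pthread_cond_broadcast(&mb->changed);
    }
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Run both threads; returns the number of frames handed over, or -1 on error. */
int run_pipeline(int frames)
{
    struct mailbox mb;
    pthread_t in, out;
    int result;

    assert(frames >= 0);
    memset(&mb, 0, sizeof mb);
    pthread_mutex_init(&mb.lock, NULL);
    pthread_cond_init(&mb.changed, NULL);
    mb.to_produce = frames;
    if (pthread_create(&out, NULL, output_a, &mb) != 0)
        return -1;
    if (pthread_create(&in, NULL, input_a, &mb) != 0) {
        pthread_mutex_lock(&mb.lock);   /* release the waiting output thread */
        mb.done = 1;
        pthread_cond_broadcast(&mb.changed);
        pthread_mutex_unlock(&mb.lock);
        pthread_join(out, NULL);
        return -1;
    }
    pthread_join(in, NULL);
    pthread_join(out, NULL);
    result = mb.consumed;
    pthread_cond_destroy(&mb.changed);
    pthread_mutex_destroy(&mb.lock);
    return result;
}
```

Both threads share one address space, so a frame is handed over without any inter-process copy. Whether this actually beats two processes on a given Digital UNIX release would still need to be measured.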
3B: What is the impact of missing your 3mS deadline? If it is
catastrophic, maybe DU as configured in .8 is not the right answer. If
it is acceptable to be late "occasionally" then we carry on looking at
DU. (I would be surprised if it was catastrophic because I don't expect
100% availability and integrity from a satellite downlink).
3C: The 4100 has excellent system bus bandwidth. At first glance it
might be reasonable to expect a single system with all the data in one
box to "perform better" than the same number of CPUs in several boxes
with an MC interconnect, because the latency and thruput of the 4100
system bus is better than the MC interconnect. You save money, too, so
long as you can fit enough processing power in one box. And you've
already got a master/standby setup. Maybe you don't need MC at all?
NB "perform better" here means "give better thruput".
3D: Yes, realtime and timesharing can coexist in the same box. Do you
want them to ? With everything in the same box you may find you have
worse latencies for some RT things, so if that is important there may
still be advantages to splitting RT from TS in separate boxes, so the
TS stuff cannot block the RT stuff so easily...
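For illustration, a process opts into fixed-priority scheduling with the POSIX sched_setscheduler() call while other processes simply stay on the default timesharing policy. The sketch below assumes standard POSIX behaviour; the function names are hypothetical, and the call normally needs suitable privilege.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Clamp a wanted priority into the valid range for a scheduling policy. */
int clamp_priority(int policy, int wanted)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    assert(lo <= hi);
    if (wanted < lo)
        return lo;
    if (wanted > hi)
        return hi;
    return wanted;
}

/* Put the calling process under fixed-priority FIFO scheduling.
 * Returns 0 on success, or the errno value on failure (EPERM if unprivileged). */
int make_fixed_priority(int wanted)
{
    struct sched_param sp = {0};

    sp.sched_priority = clamp_priority(SCHED_FIFO, wanted);
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return errno;
    return 0;
}

/* Non-zero if the calling process is still on the timesharing policy. */
int is_timesharing(void)
{
    return sched_getscheduler(0) == SCHED_OTHER;
}
```

Only the processes that call something like make_fixed_priority() change policy; everything else keeps running timeshared, which is how the two coexist on one system.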
What is the "output" of this system ? Files on disk (or records in a
database) ? Users at screens ? Frames back up the uplink within n mS of
the matching incoming frames ? That affects where you split the RT and
the "timesharing"...
hope this helps a little
regards
john
|
1235.13 | | HELIX::SONTAKKE | | Wed Mar 19 1997 08:42 | 8 |
| By the way,
VM enforces a system wide wiring limit of 80% of available memory.
That will be ~6G on an 8G system. The limit can be reconfigured by
changing vm-syswiredpercent in /etc/sysconfigtab.
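The arithmetic behind this limit can be sketched in C. It treats "available memory" as the full physical memory, which is an assumption: the kernel's own wired pages also count against the limit, so the real headroom is smaller. The function names are hypothetical.

```c
#include <assert.h>

#define GIB (1024ULL * 1024ULL * 1024ULL)

/* Bytes that may be wired when the limit is `percent` of physical memory. */
unsigned long long wire_limit_bytes(unsigned long long phys_bytes, unsigned percent)
{
    assert(percent <= 100);
    return phys_bytes * percent / 100ULL;
}

/* Smallest whole percentage that allows want_bytes to be wired. */
unsigned min_percent_for(unsigned long long want_bytes, unsigned long long phys_bytes)
{
    assert(phys_bytes > 0);
    return (unsigned)((want_bytes * 100ULL + phys_bytes - 1ULL) / phys_bytes);
}
```

On an 8GB system the default 80% gives about 6.4GB, which is less than a 7GB file, and min_percent_for(7 * GIB, 8 * GIB) returns 88, before allowing anything for the kernel and the rest of the application.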
- Vikas
|
1235.14 | | SMURF::DENHAM | Digital UNIX Kernel | Thu Mar 20 1997 08:40 | 5 |
| And further by the way, change that 80% wire limit very carefully!
The kernel really doesn't like running out of wired memory. You
can end up wedging things if you suck up too much wired memory.
Anyone who's dealt with wired memory leaks knows that things can
fall apart pretty badly...
|
1235.15 | | HGOM11::SYLVIAXIONG | | Tue Mar 25 1997 04:37 | 60 |
| John,
Yes, it's about a US$ 1M project.
The processing described in 2B of .8 will need 8GB of basic data.
As for the 400,000 instructions, my customer says it means that at
least 0.4 MIPS of computing capacity is needed to generate a result
from one initial data item.
One of the outputs in 2B goes back up to the satellite station over
the HDLC connection.
I don't know much about HDLC, so I posted the questions to the
conference OZROCK::X25_OSF. Attached is the answer to my ABC
questions. Could I post your questions to that conference in
order to get answers for my project?
Thanks & Regards
Sylvia
<<< OZROCK::DISK$NAC$PUBLIC:[NOTES$LIBRARY]X25_OSF.NOTE;1 >>>
-< Proudly built by the engineers of NaC Australia >-
================================================================================
Note 872.1 Questions about PBXDP??? 1 of 1
OZROCK::MUGGERIDGE "X.25 is 1-2-3" 29 lines 9-MAR-1997 20:10
--------------------------------------------------------------------------------
>> 1) Is PBXDP supported on AlphaServer 4100 running Digital UNIX?
Yes.
>> 2) Data communication is based on the HDLC protocol. Which software
>> driver is better for this case? WAN Support for Digital
>> UNIX System V2.0A (SPD 42.47) or DECnet/OSI V4.0 for Digital
>> UNIX (SPD 41.92)? Are there HDLC APIs which support
>> network programming to the PBXDP?
WAN Support for Digital UNIX System V2.0A is the only software you require
for this.
There are HDLC APIs. The WDD Programmer's reference contains more details.
>> 3) The performance requirements for PBXDP are to receive 1000
>> data frames/sec and to send 500 data frames/sec simultaneously.
>> One data frame is 100 bytes maximum. Is the communication
>> capacity of PBXDP able to meet these requirements?
This is a definite maybe! Looking at the numbers, your system will be required
to process more than 1500 interrupts per second and run the lines between 1-2
Mb/s. I don't see any problems with that, as long as you're aware of the other
activity on your system.
Naturally, you will need to use V.35 or X.21 type interfaces for these speeds.
Matt.
|