Title: Mathematics at DEC
Moderator: RUSURE::EDP
Created: Mon Feb 03 1986
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 2083
Total number of notes: 14613
1610.0. "Wanted: Sanity check on my rusty statistics" by MINDER::WIGLEYA () Mon May 18 1992 12:21
After years keeping my head down as a humble programmer, I somehow
managed to get the job of doing some statistical modelling of a messaging
system that we are designing. I have done some analysis, consulting my
old school books in the process, and I would now like to throw open the
fruits of my labour to comment from greater intellects than my own (and
there will be many of those!).
Please feel free to criticize, comment, suggest alternative approaches!
- Andy Wigley @MCO
1.0 Problem:
It is required to deliver a message to 300 recipients with a 99%
certainty of completion within 20 seconds.
This is to take place on a system that has to be designed to handle a
peak messaging rate of 70 messages/sec.
1.1 Redefinition of problem:
Allowing for MTA-MTA and UA-MTA transmit times, and other software delays,
reduces the time available for delivery to 15 seconds.
At a time of peak activity (70 msgs/sec) any one message must be
delivered with less than 1% probability of delivery time exceeding 15
seconds.
2.0 Message Delivery time
Test programs have shown that a single message can be 'delivered' to an
RF72 (i.e. the file cabinet server can make the appropriate file updates)
in 0.076 seconds.
2.1 Determining the number of messages for a disk that can be delivered
in 15 seconds with 99% certainty
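The ERLANG C queueing model is appropriate. With a single server it
reduces to the M/M/1 queue, which assumes Poisson arrivals and
exponentially distributed service times.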
Using just one server (disk), we can adjust the arrival rate to find the
number of messages that can be queued up for this disk such that the
probability that a single message will have to wait more than 15 seconds
is approximately 1%.
No. of Servers              1
Service Time (secs)         0.076
Arrival Rate (/sec)         12.852
Util. per server (%)        97.6752
All servers idle (%)        2.32606
All servers busy (%)        97.728
Avge No. in the queue       41.0599
Avge No. in system          42.0366
Average wait time (secs)    3.19482
Average flow time (secs)    3.27082
PROBABILITY OF WAIT TIME IN QUEUE EXCEEDING t secs IS:
t (secs)    Probability (%)
 1          71.9732
 2          53.0057
 3          39.0367
..
13          1.83226
14          1.34942
15          0.993783
This arrival rate of 12.852 messages/second is equivalent to 192.78
messages/15sec.
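As a cross-check on the figures above, here is a minimal Python sketch of the single-server case. It assumes the standard M/M/1 results (utilisation rho = arrival rate x service time, and P(wait > t) = rho * exp(-(mu - lambda) * t)), which may not be exactly what the original modelling tool computed, so expect agreement only to two or three significant figures. The function names are mine, purely for illustration.

```python
import math


def mm1_metrics(arrival_rate, service_time):
    """Basic M/M/1 figures for one server (assumed model, not the original tool)."""
    rho = arrival_rate * service_time          # utilisation of the single server
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilisation must be below 1")
    avg_in_queue = rho * rho / (1.0 - rho)     # mean number waiting
    avg_wait = avg_in_queue / arrival_rate     # mean wait in queue (Little's law)
    return {
        "utilisation": rho,
        "avg_in_queue": avg_in_queue,
        "avg_in_system": avg_in_queue + rho,
        "avg_wait": avg_wait,
        "avg_flow": avg_wait + service_time,
    }


def prob_wait_exceeds(arrival_rate, service_time, t):
    """P(wait in queue > t) for M/M/1: rho * exp(-(mu - lambda) * t)."""
    mu = 1.0 / service_time
    rho = arrival_rate * service_time
    return rho * math.exp(-(mu - arrival_rate) * t)


def max_arrival_rate(service_time, t, target, tol=1e-9):
    """Largest arrival rate with P(wait > t) <= target, found by bisection.

    Relies on prob_wait_exceeds increasing with the arrival rate.
    """
    lo, hi = 0.0, 1.0 / service_time
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prob_wait_exceeds(mid, service_time, t) > target:
            hi = mid
        else:
            lo = mid
    return lo


if __name__ == "__main__":
    service_time = 0.076
    rate = max_arrival_rate(service_time, t=15.0, target=0.01)
    print("max arrival rate (/sec):", round(rate, 3))
    print("messages per 15 secs:   ", round(rate * 15.0, 2))
    for t in (1, 2, 3, 13, 14, 15):
        p = prob_wait_exceeds(12.852, service_time, t)
        print(t, round(100.0 * p, 4))
```

Running the bisection instead of adjusting the arrival rate by hand gives the same kind of answer as the table: the rate at which the 15 second tail probability just reaches 1%.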
3.0 Distribution of addressees across disks
In the 15 second period, the distribution of addressees across disk
volumes becomes crucial.
The problem now becomes:
Given a user population evenly distributed across different disk volumes,
how many disks are required for the probability that, in any 15 second
period, more than 192.78 messages will go to the same disk to become
insignificant?
If you repeatedly deliver samples of 932 messages to a set of disks and
count how many of the messages go to each disk, over a large number of
samples, the counts will be distributed symmetrically around the mean,
with rapidly diminishing observations the further from the mean you go.
This can be drawn as a normal distribution curve.
p is the probability that one message will arrive at a given disk,
which is 1/number of disks
q is the probability that one message will go to any other disk (1 - p)
n is the number of messages in a sample (932)
m is the expected mean, which is np = n/number of disks
The standard deviation of the distribution is squareroot(npq)
The probability that a single sample will have more than x messages
addressed to one disk is derived by dividing the deviation of x from the
mean m by the standard deviation and looking the result up in the 'Table
of Partial areas under the Normal Curve'.
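Written out, the table lookup described above amounts to the following. The symbols X (the number of messages landing on one disk) and Phi (the standard normal cumulative distribution) are my notation, not part of the original note.

```latex
\[
m = np, \qquad \sigma = \sqrt{npq}, \qquad
z = \frac{x - m}{\sigma}, \qquad
P(X > x) \approx 1 - \Phi(z)
\]
```

For example, with 5 disks and n = 932: p = 0.2, m = 186.4, sigma is about 12.21, and x = 192.78 gives z of about 0.52, which corresponds to an upper tail of roughly 30%.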
Total population of messages during the 15 seconds is:
10 seconds at 70 msgs/sec (peak rate)
+ 5 seconds at the average rate of 46 msgs/sec
TOTAL : 932 msgs/(15 seconds).
No. of disks    Probability of more than 192.78 in 932
                messages going to 1 disk
2               ~ 99.99%
3               ~ 99.99%
4               ~ 99.99%
5               30.08%
6               ~ 0.00%
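The table can be approximated with a few lines of Python. This is only a sketch of the normal approximation to the binomial, with no continuity correction, and it uses math.erfc in place of the printed table of areas. The function name is hypothetical, and the results should match the table only approximately.

```python
import math


def prob_disk_exceeds(total_messages, n_disks, threshold):
    """Approximate P(more than `threshold` messages go to one given disk).

    Normal approximation to the binomial; assumes messages are spread
    evenly and independently across the disks.
    """
    if n_disks < 2:
        raise ValueError("need at least 2 disks for a non-zero spread")
    p = 1.0 / n_disks                      # chance a message goes to this disk
    q = 1.0 - p                            # chance it goes to some other disk
    mean = total_messages * p
    std_dev = math.sqrt(total_messages * p * q)
    z = (threshold - mean) / std_dev
    # Upper tail of the standard normal: 1 - Phi(z)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for disks in range(2, 7):
        prob = prob_disk_exceeds(932, disks, 192.78)
        print(disks, "disks:", round(100.0 * prob, 2), "%")
```

Note that this is the probability for one given disk. The chance that at least one of the disks exceeds the threshold is somewhat larger, though never more than the number of disks times the per-disk figure.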
4.0 Conclusions
As long as the potential addressees are evenly distributed across 6 or
more disks, it is virtually impossible for more than 192.78 of the 932
messages to go to the same disk, so delivery should complete within 15
seconds with at least 99% probability.
T.R | Title | User | Personal Name | Date | Lines |
---|
1610.1 | How long are the messages? | CIV009::LYNN | Lynn Yarbrough @WNP DTN 427-5663 | Mon May 18 1992 12:44 | 3 |
| Perhaps I missed something - but is the *length* of the messages relevant?
Perhaps you have assumed that message length is short enough that it is
overwhelmed by other considerations, but it should be stated somewhere.
|
1610.2 | Do I get my grade?? | MINDER::WIGLEYA | | Tue May 19 1992 04:48 | 16 |
| No, the length of the message doesn't affect this, since 'a delivery' of a
message doesn't result in a copy of the full message being written to
every recipient's directory.
It is like ALL-IN-1 - a single shared copy of the message is created for
all recipients, and the individual 'deliveries' are updates to each
recipient's indexed file (their file cabinet).
The time to create the single shared copy is assumed to always be shorter
than the time to make the file updates, and all the operations go on in
parallel in any case.
The service time of 0.076 seconds/message quoted was taken from tests
making the kind of updates that will be required.
- Andy
|