
Conference noted::pwv50ift

Title: Kit: Note 4229; Please use NOTED::PWDOSWIN5 for V4.x server
Notice: Kit: Note 4229; Please use NOTED::PWDOSWIN5 for V4.x server
Moderator: CPEEDY::KENNEDY
Created: Fri Dec 18 1992
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 4319
Total number of notes: 18478

4267.0. "Data cache saturated" by UTRTSC::HARLE () Thu Apr 17 1997 10:00

Could somebody explain what the warning "data cache
saturated" in the server logfile means?

Is it possible that a large and fast file transfer
will overflow the data cache because the data in the
cache cannot be written to disk fast enough? Isn't
there some sort of flow control for incoming data?

Are there known bugs in the caching algorithm?

Rein Harle
Digital Pathworks Support
4267.1. "cache saturation -> increase the cache's size" by CPEEDY::KENNEDY (Steve Kennedy) Thu Apr 17 1997 14:09 (35 lines)
    .0> Could somebody explain what the warning "data cache
    .0> saturated" in the server logfile means?
    
    It means that during the cache's processing of a new request for client
    file data the cache found that all cache buffers are actively in use
    with other operations and that no buffers are available for use in
    processing the new request and bringing in the requested data. If this
    occurs, you should consider following the message's recommendation to
    increase the cache size.
    
    .0> Is it possible that a large and fast file transfer 
    .0> will overflow the data cache because the data in the
    .0> cache cannot be written to disk fast enough ? Isn't
    .0> there some sort of flow-control for incoming data ?
    
    No, it should not be possible for a single large file transfer to 
    cause this situation.  There are multiple algorithms and heuristics in
    place to deal with buffer starvation and loads.  As things become
    busier, data stays in the cache for less time, and cache activity aimed
    at freeing up buffers for other requests increases.  If things become
    really bad, the cache goes into a "serialized access" mode to limit the
    activity in the cache until the situation improves and the cache is 
    "back on its feet".  A cache saturation condition will only occur after
    all other possible dynamic parameters are pulled back to their minimal
    levels and even then the message you refer to is output only after
    every 100th consecutive detection that the cache is saturated.
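    The detection and warning cadence described above can be sketched in a
    few lines of code. The following is a minimal, hypothetical Python
    model: only the "every 100th consecutive detection" behavior comes from
    this note; the class and variable names are illustrative and not the
    actual PATHWORKS implementation.

```python
# Hypothetical sketch of the saturation-warning cadence described above.
# Only the "every 100th consecutive detection" behavior comes from the
# note; names and structure are illustrative, not PATHWORKS code.

SATURATION_LOG_INTERVAL = 100  # warn on every 100th consecutive detection

class DataCache:
    def __init__(self, num_buffers):
        self.free_buffers = num_buffers
        self.consecutive_saturations = 0
        self.log = []

    def acquire_buffer(self):
        """Claim a buffer for a new client-data request, if one is free."""
        if self.free_buffers > 0:
            self.free_buffers -= 1
            self.consecutive_saturations = 0   # any success ends the streak
            return True
        # All buffers are actively in use with other operations: saturated.
        self.consecutive_saturations += 1
        if self.consecutive_saturations % SATURATION_LOG_INTERVAL == 0:
            self.log.append("data cache saturated")
        return False

    def release_buffer(self):
        """Return a buffer to the free pool when its operation completes."""
        self.free_buffers += 1
```

    With one buffer and 250 back-to-back failed requests, only two warnings
    are logged (at the 100th and 200th detections), which is why a single
    message in the logfile can stand for a long stretch of saturation.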
    
    .0> Are there known bugs in the caching algorithm ?
    
    No, no known problems.
    
    How big is the cache you're referring to?  Have you used the Monitor
    utility to check the load on the system?
    
    \steve
4267.2. by UTRTSC::HARLE () Fri Apr 18 1997 07:02 (32 lines)
The cache in question has a size of 2048 Kbytes. I advised
the customer to increase it to 8192 Kbytes. I expect this will
help in the case of peak load problems, but I doubt it will
work in the case of a general disk i/o overload problem.

It is clear to me that the caching algorithm is very clever
in deciding when to write data blocks to disk. But what
happens if the disk just can't keep up with the rate at which
the caching algorithm issues write-disk requests? Won't the
PW data cache (or the RMS data cache) eventually overflow?

Suppose, I have a PW server, with PW data cache and RMS data
cache. When the PW data cache threatens to overflow, it will
write data to RMS. The data will initially be written in the
RMS data cache, and eventually end up on disk. I presume the
rate at which data can be written to disk is much lower than
the rate at which data can be read from the network.
So, when I have a large number of clients, each one transferring
large files to the PW server, there will be some point where
the disks can't keep up with the rate the data is coming from 
the network. The use of PW and RMS caching will evidently postpone
the problem, by keeping the data in the cache as long as possible,
but sooner or later both caches will be full. 

My worry is: is there some sort of flow-control mechanism at
the NetBIOS level that will keep the clients from sending more
data? Or better: is there some sort of flow-control mechanism
at the NetBIOS level that will tell the DECnet layer to stop
accepting more data?

Rein Harle
Digital Pathworks Support
4267.3. "flow control mechanisms are localized" by CPEEDY::KENNEDY (Steve Kennedy) Fri Apr 18 1997 12:29 (65 lines)
.2> The cache in question has a size of 2048 Kbytes. I advised
.2> the customer to increase to 8192 Kbytes. I expect this will
.2> help in the case of peak load problems, ... 

    The 2Mbyte cache size is the default.  Assuming the customer's system
    has the capacity for increasing the cache size, this should help.

.2> ... but I doubt it will
.2> work in the case of a general disk i/o overload problem.
.2> 
.2> [...] But what
.2> happens if the disk just can't keep up with the rate the
.2> caching algorithm issues write-disk requests ?

    First, PW doesn't use RMS for servicing client data - it uses the XQP,
    so the double caching you describe doesn't happen.

    (Ignoring the references to RMS ...)

    Disk i/o bottlenecks are not a PW-only problem.  Obviously you can
    create circumstances where a disk is the bottleneck in the system - but
    you can do that with or without PW.  If PW is running in such an
    environment, PW will suffer along with anything else.

    If an environment has both a requirement for speedy transfers and a
    need to service a large number of clients, then one shouldn't use slow
    disks nor concentrate the load on a single disk. Again, not a PW-only
    problem. Obviously if you spread out the load, the chances of a
    bottleneck are decreased ... and that would help PW too.
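    The point about spreading the load can be made with back-of-envelope
    arithmetic. A small Python sketch, with assumed throughput numbers (not
    measurements from any real PW system):

```python
def per_disk_utilization(total_write_rate, disk_rate, num_disks):
    """Fraction of each disk's bandwidth consumed, assuming the client
    write load is spread evenly across num_disks identical disks."""
    return (total_write_rate / num_disks) / disk_rate

# Assumed numbers: 4000 KB/s of aggregate client writes against disks
# that each sustain 2500 KB/s.  One disk is overloaded; two are not.
print(per_disk_utilization(4000, 2500, 1))   # 1.6 -> bottleneck
print(per_disk_utilization(4000, 2500, 2))   # 0.8 -> comfortable
```

    Once per-disk utilization stays below 1.0, the caches only have to
    absorb short bursts rather than a sustained overload.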

.2> My worry is: is there some sort of flow control mechanism at 
.2> the netbios level that will keep the clients from sending more 
.2> data ? Or better: is there some sort of flow control mechanism
.2> at the netbios level that will tell the decnet layer to stop 
.2> accepting more data ?

    Why would one want to stop the flow to buckets A, B & C when only
    bucket C (the data cache) is full?  A network request doesn't always
    translate into load on the data cache, therefore the flow control
    mechanism (for cache saturation) is in the data cache itself, not at
    the network. When PW cache saturation occurs, mechanisms in the cache
    code reduce the flow of requests into the cache (thereby reducing the
    load on the cache) in the hopes that the cache will be able to catch up
    under a 'controlled' load.  As things get better, the flow of requests
    to the cache is opened a bit more as the cache thinks it has the
    capacity to deal with an increased load.  If cache saturation is only
    hit during peak-load situations, then hopefully throttling the flow of
    requests into the data cache will not be necessary during non-peak
    situations.
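    A common way to implement this kind of cache-local flow control is an
    admission window that shrinks sharply on saturation and reopens
    gradually as pressure eases. A hypothetical Python sketch of that
    general technique (not the actual PATHWORKS mechanism):

```python
class CacheThrottle:
    """Admission control local to the cache: requests beyond the current
    window are deferred by the caller; the network layers are untouched.
    Illustrative only -- not PATHWORKS code."""

    def __init__(self, max_inflight=64, min_inflight=1):
        self.max_inflight = max_inflight
        self.min_inflight = min_inflight
        self.window = max_inflight     # how many requests we admit at once
        self.inflight = 0

    def try_admit(self):
        """Admit a request into the cache only if under the window."""
        if self.inflight < self.window:
            self.inflight += 1
            return True
        return False                   # caller retries later

    def complete(self, saturated):
        """On completion, halve the window if the cache was saturated,
        otherwise reopen it one step at a time."""
        self.inflight -= 1
        if saturated:
            self.window = max(self.min_inflight, self.window // 2)
        else:
            self.window = min(self.max_inflight, self.window + 1)
```

    Repeated saturation drives the window down to its floor (the
    "serialized access" extreme), while a run of clean completions walks it
    back up as the cache finds it has capacity for an increased load.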

    Finally, the cache saturation message originally referenced is only an
    FYI message.  You can ignore it and things will still work - they'll
    just be working under far less than optimal conditions (note that the
    message will be output proportionally to the number of times the cache
    detects this situation).  If the PW system has the capacity to increase
    the size of the cache, that's what we recommend. Obviously anything
    else which can be done to generally help i/o throughput to the disks
    will usually also benefit PW throughput.

    And finally (I lied the last time ;-), as mentioned in .1, PW provides
    the client-based Monitor utility to monitor load on the PW system.  The
    Monitor utility can be used to try to characterize the client load on
    PW and help see when/where bottlenecks are occurring.

    \steve