
Conference turris::digital_unix

Title: DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice: Welcome to the Digital UNIX Conference
Moderator: SMURF::DENHAM
Created: Thu Mar 16 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 10068
Total number of notes: 35879

9075.0. "Blocking all other processes via "mv foo bar" on 4.0B ..." by APACHE::CHAMBERS () Fri Mar 07 1997 16:20

Yet another problem with 4.0B that doesn't seem to be matched by any keywords,
and which probably means something needs a bit of tuning up ...

I had a good-sized (about 100MB) file in /tmp in the root partition, and wanted
to move it elsewhere, so of course I typed a command:
   mv /tmp/foo /var/tmp/foo
where /var/tmp is of course in a different partition.  Imagine my surprise when
this command caused everything on my screen to freeze!  I could move the pointer
around all I wanted, but the focus never changed, and other X magic (such as
clicking on borders to raise windows) had no effect whatsoever.  The focus was
still in the window where I typed the "mv" command, and a ^C there had no effect
at all.  I even tried telnetting from another nearby machine, and it didn't
work; I got a timeout but no login prompt.

Eventually the mv command terminated, and all the queued-up operations took
place rapidly.  (The ^C wasn't too effective, of course. ;-)

This (mis)behavior is of course a total showstopper for most of our applications,
so I'd like to find out 1) what causes it, and 2) how to make sure that no
customer sees such behavior.  Freezing the GUI when some process triggers a
"mv" command isn't going to produce rave reviews.

This was on a DEC 3000 machine, with 32MB memory, a newly-installed "default"
4.0B Unix (as of yesterday), and "which mv" said that /sbin/mv was what I was
running.

I'd guess that there might be some system tuning parameters that I need to know
about to fix this.  Digging around in the Installation Guide didn't turn up any
recognizable clues.

9075.1. "ubc consuming memory?" by RHETT::MOORE () Fri Mar 07 1997 16:27
    I'll take a WAG and assume that the system was able to read the data
    from /tmp faster than it wrote to /var/tmp, and as a result wound up
    filling up the unified buffer cache with pending disk writes.  With
    the default setting of ubc-maxpercent=100%, the UBC can grow to 100%
    of available memory.  If this is happening, you've got little or no
    memory left for your other processes to run.
    
    Try tuning ubc-maxpercent down to 50% and see if the same thing
    happens.  To do this, add the following to /etc/sysconfigtab:
    
    vm:
    ubc-maxpercent = 50
    
    then reboot and try the test again.
    
    Martin Moore
    Digital UNIX Support
    Atlanta CSC
    
9075.2. "Any other suggestions?" by APACHE::CHAMBERS () Mon Mar 10 1997 13:37
>    Try tuning ubc-maxpercent down to 50% and see if the same thing
>    happens.  To do this, add the following to /etc/sysconfigtab:
>    
>    vm:
>    ubc-maxpercent = 50
>    
>    then reboot and try the test again.
    
Well, I did that, and verified via dbx that the value of ubc_maxpercent was
indeed 50.  It didn't change a thing.  In fact, I did an even more basic
test:  I wrote a couple-line C program that merely fills an N-byte buffer
with junk and a newline in position N, and then writes the result to stdout
until it has written M bytes.  With M=100000000, N=1000 and stdout redirected
to a disk file, I got the same misbehavior:  After a few seconds, the only
thing that worked on the screen was the pointer, which moved around just fine.
But any other operations (changing focus, clicking to change stacking, input
to the active window) simply hung until the disk write completed.
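
The write test program itself was not posted.  A minimal sketch of what such
a writer might look like follows; the function name write_junk, the 'x' filler
byte, the argument order and the STANDALONE build switch are all assumptions
made for illustration, not the original code.

```c
/* Hypothetical reconstruction of the write test; not the original program. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write `total` bytes to `fd` from an N-byte buffer (`reclen` bytes) that is
 * filled with junk ('x') and has a newline in its last position.  The final
 * write is shortened so that exactly `total` bytes go out.  After a short
 * write the next write restarts at the top of the buffer, so record
 * boundaries may drift; only the byte count is guaranteed.
 * Returns the number of bytes written, or -1 on error.
 */
long write_junk(int fd, size_t reclen, long total)
{
    char *buf;
    long done = 0;

    assert(reclen > 0);
    buf = malloc(reclen);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', reclen);
    buf[reclen - 1] = '\n';

    while (done < total) {
        size_t want = reclen;
        ssize_t got;

        if ((long)want > total - done)
            want = (size_t)(total - done);
        got = write(fd, buf, want);
        if (got < 0) {
            free(buf);
            return -1;
        }
        done += got;
    }
    free(buf);
    return done;
}

#ifdef STANDALONE
/* Usage: fill [M [N]] > file   (defaults: M=100000000, N=1000) */
int main(int argc, char **argv)
{
    long total = argc > 1 ? atol(argv[1]) : 100000000L;
    size_t reclen = argc > 2 ? (size_t)atol(argv[2]) : 1000;

    return write_junk(STDOUT_FILENO, reclen, total) == total ? 0 : 1;
}
#endif
```

Built with -DSTANDALONE and run with stdout redirected to a disk file, this
issues about 100,000 small write() calls and does no other work, which matches
the shape of the test described above.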

Further evidence that the problem isn't with output buffers:  I also wrote a
little C program that merely reads stdin and does nothing with the data.  I
redirected its stdin to the 100-MB file generated above.  After a few seconds,
most of the stuff on the screen was again hung.  This one wasn't quite as
drastic; the hangs typically lasted 5-10 seconds, then everything woke up 
and ran like crazy for a few seconds before hanging again.
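
The read test program was not posted either.  A minimal sketch of a reader
that consumes its input and discards it might look like the following; the
function name drain, the 8 KB buffer size and the STANDALONE build switch are
assumptions made for illustration, not the original code.

```c
/* Hypothetical reconstruction of the read test; not the original program. */
#include <assert.h>
#include <unistd.h>

/*
 * Read from `fd` until end of file and throw the data away.
 * Returns the total number of bytes read, or -1 on a read error.
 */
long drain(int fd)
{
    char buf[8192];
    long total = 0;
    ssize_t got;

    assert(fd >= 0);
    while ((got = read(fd, buf, sizeof buf)) > 0)
        total += got;
    return got < 0 ? -1 : total;
}

#ifdef STANDALONE
/* Usage: drain < file */
int main(void)
{
    return drain(STDIN_FILENO) < 0 ? 1 : 0;
}
#endif
```

Built with -DSTANDALONE and run with stdin redirected from a large file, the
process does nothing but sequential read() calls, so any stall it provokes
cannot be blamed on pending writes from that process.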

So it's not just mv that causes the problem.  A small loop that just reads
or just writes data from/to a disk file can hang most of the process table.

During both of these tests, I kept a copy of top running in one window with
a 1-sec refresh time.  Its output would hang at the same time as everything
else's.  This could be either because top itself was hung, or because its
xterm was hung; I don't know how to distinguish these cases.  But top itself 
is a rather small process (RES=196K), and uses only a couple percent of the 
cpu at its highest refresh rate, so the problem isn't only with big programs.

Any other ideas?

[This is a serious "showstopper" to some customers; if it can't be solved, the
next test case will probably be on an NT system, and if that also fails, then
maybe a Solaris or SGI system. ;-]

9075.3. "Show Stopper Problem = High Priority IPMT !!!" by namix.fno.dec.com::jpt (FIS and Chips) Mon Mar 10 1997 16:06

	As always...

9075.4. "" by HELIX::SONTAKKE () Thu May 29 1997 17:21

    Did IPMT get entered?  This problem seems to be pretty bad and needs to be
    fixed with or without formal escalation.
    
    - Vikas