T.R | Title | User | Personal Name | Date | Lines |
--------------------------------------------------------------------------------
9307.1 | monitor resources, then tune | RIPPER::STRAUSS | talking through my binoculars | Mon Mar 31 1997 18:34 | 9 |
| Have you monitored the system to see which resources are exhausted?
CPU? Memory? I/O?
Run vmstat, iostat, anything-you-can-think-of-stat to see what is
slowing the system. Then take whatever action you need to correct the
problem.
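For example (the exact columns vary between versions, and the 5-second
interval is just one choice):

    # memory and paging activity, every 5 seconds
    vmstat 5
    # per-disk transfer activity, every 5 seconds
    iostat 5
    # CPU load averages
    uptime

If free memory stays low and page-outs never stop, it's memory; if one
disk carries nearly all of the transfers, it's I/O.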
Hope this helps
leon
|
9307.2 | Need more help | TPOMC1::DAVIDHSIEH | | Wed Apr 02 1997 23:44 | 14 |
| Thanks. We have run those tools to monitor system resource usage and
have tuned some things - ubc_maxpercent, free_page_target ...
After tuning, the system worked a little better for a while. Now we see
that the I/O overhead is very heavy. How can we tune the I/O subsystem?
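For reference, this is roughly how we have been looking at and changing
the vm settings (whether a given attribute can be changed at run time,
and its exact spelling, depend on the Digital UNIX version, so this is
only a sketch):

    # list the current vm subsystem attributes and values
    sysconfig -q vm
    # change one at run time - 50 is only an example value
    sysconfig -r vm ubc-maxpercent=50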
BTW, do you have any experience with tuning for ERGO Sar? We are new in
this field; can you point out which system parameters need to be tuned
to meet the ERGO Sar requirements?
Thanks in advance.
My email address is tpovc::davidhsieh; if you need more information,
please contact me.
|
9307.3 | Need more information | GIDDAY::STRAUSS | talking through my binoculars | Thu Apr 03 1997 17:13 | 14 |
| > BTW, do you have any experience with tuning for ERGO Sar?
No, sorry
> How can we tune the I/O subsystem?
What does this mean?
Is all the disk i/o going to one disk? If so, try to spread it across
more disks.
If it's already spread across many disks, perhaps you need faster disks
or controllers.
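One quick way to check (device names and column layout depend on the
configuration, so this is only a sketch):

    # per-device transfer counts, every 5 seconds
    iostat 5

If one device column shows nearly all of the transfers while the others
sit idle, that disk is the bottleneck.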
It's hard to advise you without specific information.
leon
|
9307.4 | | XIRTLU::schott | Eric R. Schott USG Product Management | Fri Apr 04 1997 15:00 | 9 |
| Hi
I suggest you run sys_check
see http://www-unix.zk3.dec.com/tuning/tools/sys_check/sys_check.html
it will at least provide data for you to point people at...
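Typical use is just to run it and capture the report somewhere, e.g.
something like

    # options and output format vary by sys_check version - see the URL
    # above for the current instructions; the output path is just an example
    sys_check > /var/tmp/`hostname`_syscheck.html

and then post or mail the report.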
|
9307.5 | How to tune the scheduling? | TPOMC1::DAVIDHSIEH | | Sun Apr 06 1997 22:04 | 16 |
| Thanks. I'll post the result of sys_check later.
The user complains that the scheduler is not working very well -
1. When he runs a program to create a 500MB file, the other processes hang
for a long while. The test needs about 2 minutes on a local disk, but it
takes more than 20 minutes on a RAID disk. Both operations are on AdvFS.
Any hints to explain this? (A way to reproduce and time it is sketched at
the end of this note.)
2. When the user runs the same program on IBM and HP UNIX, it does not
hurt the performance of other processes while it does the I/O. This is
why we must tune the I/O subsystem to reduce the I/O bottleneck.
Is there any way to make the scheduler suspend a specified job, or reduce
its priority? We have tried nice (and renice), but it has not helped.
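Here is roughly how we reproduce and time the 500MB write on each disk
(the mount points below are just placeholders for our local and RAID
AdvFS filesets; bs and count only need to multiply out to about 500MB):

    # ~500MB sequential write to the local AdvFS fileset
    time dd if=/dev/zero of=/local_fs/testfile bs=64k count=8000
    # the same write to the RAID-backed AdvFS fileset
    time dd if=/dev/zero of=/raid_fs/testfile bs=64k count=8000

While this runs, vmstat and iostat in another window show where the
time goes.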
|
9307.6 | | SSDEVO::ROLLOW | Dr. File System's Home for Wayward Inodes. | Mon Apr 07 1997 11:20 | 45 |
| Since your earlier note mentioned ubc-maxpercent, you
probably already know this, but...
Digital UNIX uses a unified buffer cache where all of free
memory can be used as file system buffer cache. When a large
file is written sequentially, the cache quickly fills with
the data from the file. Left to itself, the system will trim
pages from programs and swap out idle programs to get more memory
for the large file. During the writes, other programs will
want to run, the cache will be flushed to free up memory, and
these programs will be paged back in. As soon as they
go idle again, they'll be paged and swapped out.
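If you do want to cap how much memory the UBC can take, the usual place
is /etc/sysconfigtab (the value below is only an example, and the exact
attribute spelling should be checked against the output of
'sysconfig -q vm' on the customer's version):

    vm:
        ubc-maxpercent = 50

The new limit takes effect at the next reboot (or via sysconfig -r,
where the attribute allows run-time changes).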
In the right configuration the performance of this might not
be too bad. In the wrong configuration the performance will
be terrible. It might not be true in V4, but some earlier
version preferred to pick large processes to swap out when
it needed memory. But, when this process needed to run again,
it would swap the whole thing back in, run a short time and
then get swapped out again. Because of the program size, the
paging/swapping time would dominate the run-time and the
system would be busy most of the time moving data between
the page/swap device(s) and memory, instead of doing useful
work. I think it was possible to tune around that. You may
want to see if the customer has such a large process.
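A quick way to look for one (column positions in the 'aux' listing can
differ a little between versions; VSZ is the virtual size):

    # processes sorted by virtual size, largest first
    ps aux | sort -rn -k 5 | head

A huge process that keeps cycling between the run queue and the swap
device is the one to look at.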
Another problem with this system is probably that RAID set. RAID-5
writes can have very poor performance. At best they won't have
good performance. With the sequential writes going through the
RAID-5 it is understandable that the write performance is poor.
And, since the performance of everything else that needs to do I/O
(file system, paging, etc.) is tied to that RAID-5, it slows the whole
system down. If the page/swap device(s) are also on that RAID-5,
that will just make things worse.
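The rough arithmetic behind the RAID-5 write penalty, for writes smaller
than a full stripe:

    one small write = read old data  + read old parity
                    + write new data + write new parity
                    = 4 disk I/Os, versus 1 on a plain disk

so random or partial-stripe writes can run at something like a quarter
of the speed of a single disk, unless the controller can gather full
stripes or hide the writes in cache.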
Paging I/O has the advantage of tending to be large sequential
writes. The HSZ40/50 family can turn such writes into RAID-3
like writes that offer better sequential I/O performance. I
don't know if the backplane RAID controllers (SWXCR family) do
this. If the customer's I/O load is going to be this write-intensive
on a regular basis, they'd do well to move away from RAID-5
for the devices that will see heavy writes. RAID-5 is fine for
read-intensive loads where you need the redundancy, but there
isn't much you can do for writes except throw lots of
controller memory at it.
|