Title: | VAX and Alpha VMS |
Notice: | This is a new VMSnotes, please read note 2.1 |
Moderator: | VAXAXP::BERNARDO |
Created: | Wed Jan 22 1997 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 703 |
Total number of notes: | 3722 |
259.0. "Can OpenVMS handle this much auditing?" by GIDDAY::GILLINGS (a crucible of informative mistakes) Thu Feb 27 1997 21:49
I have a customer who wants to run with full file access auditing. That's
SUCCESS, FAIL, SYSPRV, GRPPRV, READALL and BYPASS for all access types.
They have specifically chosen systems which should have sufficient resources
to do this. Indeed, they've been running for more than 1 year with auditing
enabled as above. However, they have also been suffering crashes. At first
these were occasional, but recently they have increased to 1 or more per day.
It seems the frequency is related to the gradual increase in workload over
time. The node is pretty much exclusively a Teamlinks server.
The crash footprint is always very similar, the process is always OAFC$SERVER
with READ ACCVIOs in EXE$CHKPRO at varying offsets. Those that I've checked
are in the "item list scanner" at the targets of the large CASE statement
leading to ITEM_xxx: labels. Possibly also of interest, the AUDIT_SERVER is
always current on the other CPU. This doesn't match anything in CANASTA that
I can access.
The customer is running OpenVMS/Alpha V6.1 and Pathworks V4.2 in a 5
node cluster of 2 x dual-processor 2100s + 3 VAXes. They won't upgrade
OpenVMS or Pathworks for various reasons. We also can't get them to install
all the patches we want installed. It's a secure site in another city so
detailed analysis is very difficult.
The good news is that I was able to convince them to disable SUCCESS and
SYSPRV audits. Their audit load for this node went from anything up to
500,000 blocks of journal per day to 20 events (yes, that's TWENTY events :-).
Now, touch wood with all fingers and toes crossed, we've had 3 crash free
days, so maybe the crashes were caused by stressing the security audit
subsystem, or perhaps a small synchronisation window with AUDIT_SERVER.
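For a sense of scale, here is a minimal back-of-envelope sketch that converts the peak journal volume quoted above into bytes. It assumes the standard 512-byte OpenVMS disk block; the function and variable names are illustrative only, and the per-second figure is a daily average, not a measured rate.

```python
# Rough sizing of the peak audit journal volume quoted in the note.
# Assumption: one OpenVMS disk block is 512 bytes.
BLOCK_BYTES = 512
SECONDS_PER_DAY = 24 * 60 * 60


def journal_volume(blocks_per_day):
    """Convert a daily journal size in blocks to bytes, MiB and average bytes/s."""
    total_bytes = blocks_per_day * BLOCK_BYTES
    return {
        "bytes_per_day": total_bytes,
        "mib_per_day": total_bytes / 2**20,
        "avg_bytes_per_second": total_bytes / SECONDS_PER_DAY,
    }


peak = journal_volume(500_000)  # "anything up to 500,000 blocks" per day
print(f"{peak['bytes_per_day']:,} bytes/day")
print(f"{peak['mib_per_day']:.1f} MiB/day")
print(f"{peak['avg_bytes_per_second']:.0f} bytes/s averaged over 24 hours")
```

The average is modest in raw bandwidth terms. It says nothing about how many individual audit events that volume represents, or how bursty they are.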
So, the question is, is OpenVMS qualified to run with that level of
auditing, assuming sufficient disk, CPU and I/O bandwidth? We can point
out the absurdity of it all, but it *does* sell hardware, so I'm very
loath to try and dissuade them from their wishes over the long term.
Is there any specific tuning or system configuration that we should
suggest? Perhaps there are some upgrades or patches we can convince them
to install if/when the spooks insist that they need all that audit data.
John Gillings, Sydney CSC
T.R | Title | User | Personal Name | Date | Lines |
---|
259.1 | | UTRTSC::thecow.uto.dec.com::JurVanDerBurg | Change mode to Panic! | Fri Feb 28 1997 05:49 | 9 |
| VMS should be able to deal with this, period. No crashes whatsoever should
result from this. It looks like a synchronization problem somewhere where
the CHKPRO system service and the audit server play some role.
I suggest escalating this (or have a thorough look in the crash and look for
the problem instead of fighting symptoms). That's what I would do.
Jur.
|
259.2 | | ALPHAZ::HARNEY | John A Harney | Fri Feb 28 1997 09:25 | 12 |
| re: .0
Well, VMS should run with all audits enabled, it will just be
very sluggish.
There were a few remedial fixes for CHKPRO; I don't know exactly when
or what, but I know they exist. Is the customer averse to even these
patches, or just random ones?
Back to V6.1 Alpha increases the chances that a fix will be available to help.
\john
|
259.3 | | BSS::JILSON | WFH in the Chemung River Valley | Fri Feb 28 1997 09:31 | 7 |
| IMHO there appear to be a number of scenarios where auditing can cause
system hangs, crashes, etc. I IPMT'd a deadly embrace for mutexes when
auditing failed logical name table access. The auditing mechanisms, mainly
around logging, are way too resource intensive for my taste. I believe this
will get worse as systems get faster and IO struggles to keep up.
Jilly
|
259.4 | | ALPHAZ::HARNEY | John A Harney | Fri Feb 28 1997 09:48 | 9 |
| re: .3
Well, I wouldn't say "a number of." The mutex/logical-name-table/audit
problem is in fact the only (real) outstanding audit server problem I know
of, and it will likely never be solved. (It involves two types of
incompatible synchronization, and brings on a deadlock)
If you know of others, please QAR or IPMT them!
\john
|
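As a generic illustration of the "deadly embrace" mentioned in .3 and .4, the following minimal sketch has two workers take two locks in opposite order, so each ends up waiting on a lock the other holds. It does not model the actual VMS mutex/logical-name-table/audit-server interaction, whose details are not given in this thread; the lock names, the barriers and the timeout are all assumptions made for the demonstration, and the timeout exists only so the demo terminates instead of hanging.

```python
import threading


def demonstrate_deadly_embrace(timeout=0.2):
    """Two workers acquire two locks in opposite order; neither gets its second lock."""
    lock_x = threading.Lock()  # hypothetical first resource
    lock_y = threading.Lock()  # hypothetical second resource
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until each worker owns its first lock.
            both_hold_first.wait()
            # A real deadlock would block here forever; the timeout lets us report it.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            # Keep holding the first lock until both attempts have finished.
            both_tried_second.wait()

    threads = [
        threading.Thread(target=worker, args=("A", lock_x, lock_y)),
        threading.Thread(target=worker, args=("B", lock_y, lock_x)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print(demonstrate_deadly_embrace())
```

The usual cure for this textbook case is a single agreed lock order. That may not be available when the two sides use different kinds of synchronization, as .4 describes.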