| First off, none of these (8630, 8282, 7828) has pinned file corruption
on lightweight wiring (the thing new-wire-method enables). The problem
there was lww causing the system to crash. That may well have
resulted in database corruption, but it's not exactly subtle
corruption -- you'll know if you've got the problem.
Far as I can see, no one has identified the cause of the corruption
beyond the effects of a system crash.
FWIW, lightweight wiring is just that -- a faster way to wire (lock)
down kernel memory for I/O operations. When it works, it uses the
PTEs of resident pages to mark the wiring rather than going through
the more heavyweight vm-map level wiring mechanism. The system
couldn't care less whether you've got new-wire-method turned on
or off. There's a slight performance penalty to turning it off, but
that's better than a crash. The problems these notes are dealing
with involve miscommunications about the wired state of
pages between the PTE-level wirings established by lww and the
map-level wirings. In particular, when an attempt to lw-wire a
memory range fails because a nonresident page is encountered, the
higher-level wiring code (vm_map_pageable) isn't being properly
notified of the pages wired so far. This causes inconsistencies
in the wiring state of the memory range, with subsequent panics.
Maybe a vm person can speak to whether similar lww confusion could
result in corruption rather than panics.
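
To make the failure mode concrete, here is a toy user-space model of
the hand-off described above. It is only a sketch, not kernel code:
struct page, lw_wire, map_wire, wire_range and range_consistent are
all hypothetical names made up for illustration, and "each page ends
up with exactly one wiring" is an assumed stand-in for whatever
consistency the real VM code expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state; NOT the real kernel structures. */
struct page {
    bool resident;   /* page is currently in physical memory */
    int  pte_wired;  /* wirings recorded at the PTE level (lightweight) */
    int  map_wired;  /* wirings recorded at the map level (heavyweight) */
};

/* Lightweight path: mark wirings in the PTEs of resident pages in
 * [start, end). Stops at the first nonresident page and returns the
 * number of pages it managed to wire. */
size_t lw_wire(struct page *p, size_t start, size_t end)
{
    size_t i;

    assert(start <= end);
    for (i = start; i < end && p[i].resident; i++)
        p[i].pte_wired++;
    return i - start;
}

/* Heavyweight path: bring each page in and record a map-level wiring. */
void map_wire(struct page *p, size_t start, size_t end)
{
    for (size_t i = start; i < end; i++) {
        p[i].resident = true;
        p[i].map_wired++;
    }
}

/* Wire [start, end), falling back to the map level if the lightweight
 * path stops early. With handoff == true the fallback is told how many
 * pages are already wired and only covers the rest. With
 * handoff == false it is not told, and wires the whole range again. */
void wire_range(struct page *p, size_t start, size_t end, bool handoff)
{
    size_t done = lw_wire(p, start, end);

    if (done == end - start)
        return;                          /* lightweight path covered it all */
    if (handoff)
        map_wire(p, start + done, end);  /* only the unwired remainder */
    else
        map_wire(p, start, end);         /* unaware of the partial wiring */
}

/* Assumed consistency rule for this model: exactly one wiring per page. */
bool range_consistent(const struct page *p, size_t start, size_t end)
{
    for (size_t i = start; i < end; i++)
        if (p[i].pte_wired + p[i].map_wired != 1)
            return false;
    return true;
}
```

With a range whose third page is nonresident, the handoff == true
case leaves every page wired once; the handoff == false case leaves
the first two pages wired at both levels, which is the sort of
disagreement a later unwire would trip over.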
|
| We have seen other reports of database corruption, which seems to have been
cured by turning off new-wire-method. As far as I know there are no CLDs or
QARs filed against the corruption. You may want to consider filing a CLD/IPMT.
As for the LW-wire panics, a fix is in the works. The same bug may be
responsible for the corruption, provided the system is low enough on memory to
cause page stealing. If you can consistently reproduce the corruption, I can
give you a patch to verify. Send me a note if you are interested.
--shashi
[Posted by WWW Notes gateway]
|