T.R | Title | User | Personal Name | Date | Lines |
---|
632.1 | Shut Off Disk Keeper | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Fri May 23 1997 10:18 | 14 |
|
The "no such file" error you are seeing is normal for "lost" files,
but the question is why the file structure is getting corrupted in
this fashion.
An obvious potential cause of this would be a buggy on-line disk
defragmentation tool -- if such a tool is in use at this site, then
turn it off and see if the problem recurs. I assume this is what the
"Disk Keeper" package is doing.
Also, make sure the current set of shadowing patches is in use on
the system. If not, then get the current patches from the patch
area (http://www.service.digital.com/), and apply them.
|
632.2 | VAXF11X01_071 | GIDDAY::GILLINGS | a crucible of informative mistakes | Sun May 25 1997 19:18 | 14 |
|
> Also, make sure the current set of shadowing patches is in use on
> the system. If not, then get the current patches from the patch
> area (http://www.service.digital.com/), and apply them.
Even more important - make sure you have the latest F11X ECO installed.
See VAXF11X01_071 - it corrects a number of problems, some of which
could result in the symptoms you're describing if a defragger was
used on the disk.
This ECO is MANDATORY if you're using a defragger.
John Gillings, Sydney CSC
|
632.3 | This may be an operational problem rather than a corruption. | MOVIES::MCLAREN | Oh no - Not ANOTHER amusing one-liner | Mon May 26 1997 08:37 | 9 |
|
Note that you will see this error if Alias directory
entries are created using SET FILE /ENTER, and
then the target file is deleted, and the file header
re-used. This will leave dangling directory entries
which can result in the observed behaviour.
regards
Duncan McLaren.
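
The sequence described above can be sketched as a toy model. This is only
an illustration under stated assumptions, not the real file system logic:
the class, method names, and error type are hypothetical, and the only
real-world facts relied on are that a directory entry records a file ID
and that a file ID pairs a header number with a sequence number that
changes when the header is re-used.

```python
class StaleEntryError(LookupError):
    """Hypothetical error: a directory entry points at a file ID that is gone."""


class ToyVolume:
    """Toy model only; names and behaviour are assumptions for illustration."""

    def __init__(self):
        self.in_use = {}     # header number -> sequence number of the live file
        self.last_seq = {}   # header number -> highest sequence number issued
        self.entries = {}    # directory entry name -> (header number, sequence)

    def create(self, name, header):
        if header in self.in_use:
            raise ValueError("header already in use")
        seq = self.last_seq.get(header, 0) + 1   # new sequence on every re-use
        self.last_seq[header] = seq
        self.in_use[header] = seq
        self.entries[name] = (header, seq)

    def enter_alias(self, alias, target):
        # Stand-in for SET FILE /ENTER: a second name for the same file ID.
        self.entries[alias] = self.entries[target]

    def delete(self, name):
        # Removes this one entry and frees the header; other aliases stay put.
        header, _ = self.entries.pop(name)
        del self.in_use[header]

    def lookup(self, name):
        header, seq = self.entries[name]
        if self.in_use.get(header) != seq:
            raise StaleEntryError(f"{name}: entry refers to a file that no longer exists")
        return header, seq


vol = ToyVolume()
vol.create("REPORT.DAT", header=42)
vol.enter_alias("ALIAS.DAT", "REPORT.DAT")
vol.delete("REPORT.DAT")            # the alias entry is left behind
vol.create("NEW.DAT", header=42)    # header 42 re-used, new sequence number

try:
    vol.lookup("ALIAS.DAT")
    alias_is_dangling = False
except StaleEntryError:
    alias_is_dangling = True
```

In this model the alias still names the old file ID after the header is
re-used, so the lookup fails even though nothing on the volume is
corrupt. The exact status the real system returns is not modeled here.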
|
632.4 | customer experienced serious disk corruption | CUJO::SAMPSON | | Tue Jun 03 1997 02:29 | 28 |
| Re: .2, FYI, we have a customer who experienced serious disk
corruption on all disks being defragmented, in the following unsupported
configuration:
The defragmenter was X2.1A-3, running on OpenVMS Alpha V6.2,
on two AlphaServer 1000's. The 38 disks being defragmented were on
OpenVMS VAX V6.2, OpenVMS Alpha V6.2, and OpenVMS Alpha V7.1, all in
the same cluster. Note that the V6.2 systems did *not* have the
required CLUSIO ECO.
The customer has temporarily stopped using the defragmenter,
is applying the ECOs recommended by the CSC, and will upgrade the
defragmenter to V2.2. The customer finds it strange that there had
not been any apparent problems with the unsupported configuration
until May 20th, the day after the 1970+10K date. We do not yet have
enough information to escalate an IPMT on this.
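
For reference, the date arithmetic can be checked with a short sketch.
It assumes "1970+10K" means 10,000 days after 1 January 1970; that
reading is an assumption, and the variable names are illustrative only.

```python
from datetime import date, timedelta

# Assumption: "1970+10K" = 10,000 days after 1 January 1970.
ten_k_date = date(1970, 1, 1) + timedelta(days=10_000)
day_after = ten_k_date + timedelta(days=1)

print(ten_k_date.isoformat())   # 1997-05-19
print(day_after.isoformat())    # 1997-05-20
```

Under that assumption the result lines up with the customer's report of
problems first appearing on May 20th.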
Applying the ALPF11X01_071 ECO to the OpenVMS Alpha V7.1 systems
*only* has *not* been effective in preventing XQPERR crashes on those
systems. It appears that the entire cluster must receive the appropriate
ECOs in order to fully benefit from them.
The CXO CSC was apparently unaware of the possible existence of any
such disk corruption problem. Where did you find out that VAXF11X01_071 is
*mandatory* on systems using a defragmenter?
Thanks,
Bob Sampson
|
632.5 | hopefully disk corruption is finally ended | CUJO::SAMPSON | | Tue Jun 03 1997 22:29 | 6 |
| The good news: customer finally applied the ECOs (CLUSIO, DRIV02,
and F11X) to all systems in the cluster today. The bad news: corruption
of disks may have continued, even after use of the defragmenter was
discontinued. One disk is missing its master file directory [000000],
and another is just missing some of its files, some of which are "lost"
from their directories, and others of which don't seem to exist anymore.
|
632.6 | make sure all previous mess is cleaned up! | GIDDAY::GILLINGS | a crucible of informative mistakes | Wed Jun 04 1997 03:33 | 9 |
| > The bad news: corruption
>of disks may have continued, even after use of the defragmenter was
>discontinued.
Applying the ECO won't fix existing damage, which can lie "dormant" on
the disk. I'd recommend an image backup and restore of any suspect disk,
then look for signs of corruption that occur *after* the restore.
John Gillings, Sydney CSC
|
632.8 | | VMSSG::FRIEDRICHS | Ask me about Young Eagles | Thu Jun 05 1997 11:00 | 11 |
|
I anxiously await hearing from Al Meier.
It was my understanding from the grapevine this morning that the
customer was running neither COMPAT nor CLUSIO. Even without either one
of them, I know of no software incompatibility issue with V7.1.
Cheers,
jeff friedrichs
Project Leader - COMPAT and CLUSIO kits
|
632.9 | The conclusions in .7 are PREMATURE | STAR::BOAEN | LANclusters/VMScluster Tech. Office | Thu Jun 05 1997 12:31 | 18 |
| re: .7
The conclusions in .7 are PREMATURE. Al called me late yesterday, and
one of several things I suggested he check was whether an old image/TIMA
kit had been loaded on top of the compatibility kit or some other newer
image. This was just a hunch, not based on any specific knowledge.
We've never seen anything like this widespread corruption before, except
when someone's booted a CI node with VAXCLUSTER = 0 into a running cluster,
creating a partitioned cluster. The problem is still being researched
here.
PLEASE DON'T SUBMIT NOTES BASED ON IN-PROGRESS PROBLEM SOLVING. WE DON'T
KNOW WHETHER OR NOT THE SCENARIO YOU DESCRIBE IS THE PROBLEM. WE MADE A
STRONG EFFORT TO ENSURE COMPATIBILITY BETWEEN KITS, SO WE SUSPECT YOU MAY BE
WRONG.
'Gards,
Verell
|
632.10 | never mind then | CUJO::SAMPSON | | Thu Jun 05 1997 17:09 | 6 |
| Okay, I've deleted my note .7. I didn't state the conjecture as a
conclusion, but as a conjecture. Since you don't want a progress
report entered here, I will refrain from adding anything.
Sorry,
Bob Sampson
|