|
> 1) In redundancy mode, how can I see which controller
> actually has access to the disk?
HSZ> sho unit full
Unit XXX .... ONLINE to THIS CONTROLLER (OTHER CONTROLLER)
> 2) Using cache policy B and firmware version 2.7,
> when a battery fails, why are some RAID5 and mirror disks not switched
> to the good controller?
Bug in V2.7Z. Corrected in V3.0Z.
With CACHE_POLICY=B, the units will stay online if the battery is GOOD or LOW.
RAIDsets and mirrorsets need battery-backed cache because of data integrity
problems that may occur if power fails while a write is in progress.
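The rule stated above can be written down as a small check. This is only a sketch of the rule as described in this reply, not controller firmware logic; the function name and the "FAILED" state string are assumptions made for illustration.

```python
# Sketch of the rule from the reply above: with CACHE_POLICY=B,
# RAIDsets and mirrorsets stay online while the battery is GOOD or LOW.
# The function name and the "FAILED" string are hypothetical.

def stays_online_policy_b(battery_state: str) -> bool:
    """Return True if a RAID/mirror unit stays online under policy B."""
    return battery_state.upper() in {"GOOD", "LOW"}


for state in ("GOOD", "LOW", "FAILED"):
    print(state, stays_online_policy_b(state))
```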
> 3) In this condition I have to shut down both controllers to replace the
> defective batteries.
> (This is the procedure that is given for replacing the batteries.)
No. Use "set this preferred_id=(all_the_id's_you_have_configured)" on the HSZ
with the good batteries or issue a shutdown to the HSZ with the failed
batteries.
I think the shutdown is the better method, because you don't have to reconfigure
your controllers after the swap.
> Then:
> Why do I (the customer) have to use a redundant configuration if I
> have to shut down both controllers?
Where is this stated?
> In an emergency situation I shut down the defective controller and
> disconnected the trilink adapter, and the system (DEC Unix 3.2g)
> crashed. So in effect I MUST shut down both controllers!
I think the OS will most likely panic when your batteries go defective (only an
ADVFS domain panic, if you're on V3.2g and lucky :-) ).
To swap the batteries, I would use C_SWAP.
> As everyone can understand, this behaviour is very critical for our
> big customers.
> Thanks for any suggestions and comments.
Use V3.0-3 immediately.
> Gabriele
Ötzi
|
|
Recently we had an incident with a dual-redundant HSZ40-B, running
V30Z-2 in a Windows NT cluster environment.
One HSZ40 failed due to a Bad Battery and, as expected with HSOF V30Z,
it went down ("//"-LED solid on).
The other HSZ40 took over, BUT the RAID5 set on that other controller
became unavailable to the application for some unknown reason.
The WNT Cluster Manager also got confused by this and kept
trying to bring this UNIT back on either of the two Alpha systems.
Question:
If a battery fails on ONE HSZ40, HOW LONG will any UNIT be
unavailable due to housekeeping being done by the controller?
If this is a SHORT time (seconds), then a unit failover from one
HSZ to another SHOULD be TRANSPARENT to the Windows NT cluster
software. If it takes TOO long, timeouts will occur within the
WNT Cluster Manager and it will start working to get the unit back.
Jan Visser.
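
The reasoning in the question above comes down to comparing two durations. A minimal sketch, assuming both values are known in seconds; the function name and the numbers in the usage lines are hypothetical, and neither the HSZ40 failover time nor the WNT Cluster Manager timeout is given in this thread.

```python
# Sketch of the comparison described above: a unit failover looks
# transparent to the host only if it completes before the host-side
# timeout expires. All names and numbers here are hypothetical.

def failover_is_transparent(failover_seconds: float,
                            host_timeout_seconds: float) -> bool:
    """True if the unit is back before the cluster software times out."""
    return failover_seconds < host_timeout_seconds


# Illustrative values only, not measured on any real system.
print(failover_is_transparent(5.0, 30.0))
print(failover_is_transparent(45.0, 30.0))
```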
|