
Conference turris::digital_unix

Title:	DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:	Welcome to the Digital UNIX Conference
Moderator:	SMURF::DENHAM

Created:	Thu Mar 16 1995
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	10068
Total number of notes:	35879

8083.0. "media error in a RAID5 array (HSZ40) corrupts ADVFS" by MXOC00::MJUAREZ () Tue Dec 03 1996 22:26

T.R	Title	User	Personal Name	Date	Lines
8083.1		LEXS01::GINGER	Ron Ginger	`Wed Dec 04 1996 11:53`	6
8083.2	Don't shoot the messenger !	RUSURE::KATZ		`Wed Dec 04 1996 12:48`	11
8083.3	Does this happen with RAID1 as well?	VIRGIN::SUTTER	Who are you ??? - I'm BATMAN !!!	`Fri Dec 06 1996 04:04`	16
8083.4		KITCHE::schott	Eric R. Schott USG Product Management	`Fri Dec 06 1996 06:38`	25
8083.5	HSZ40 is not responsible for my case...	EPS::NGUYEN	Without fools there would be no wisdom.	`Fri Feb 21 1997 13:35`	35
	>>
	>>As I understand there is the possibility that a RAID5 logical unit on a
	>>HSZ40 fails temporarily when a media device goes bad in that set and thus
	>>corrupts the data in it:
	>>
	>> - Is this unique to the HSZ40 or would this happen on other
	>>   controllers as well (SWXCR, HSZ10, other vendor's RAID Controller)?
	>
	>this is a bug in the hsz40 firmware...I think later versions fix
	>it but you would have to ask the HSZ folks.

	I have one customer who doesn't use an HSZ40 at all. He has some 20GB
	internal disks in a RAID 5 setup. He hit a similar problem when the
	third-party software reported a media problem. He is running DU 3.2D.
	When he restores the backup on an equivalent system without RAID 5,
	the system seems to be OK. Is RAID 5 that unstable?

	>>
	>> - If this is unique to the HSZ40, is it HSOF dependent?
	>>
	>> - Would a RAID1 set be more secure here (i.e. would a RAID1 set
	>>   NEVER fail as long there is still enough device redundancy available?
	>
	>In the presence of bugs like this, it is hard to say what is
	>more secure...if you are really paranoid, use LSM mirroring.

	LSM would give the option of high availability, but it doesn't offer
	fault tolerance, which is what RAID 5 is supposed to provide.

	That leads me to two questions:

	 - Is OS V4.x better to use with RAID 5?

	 - For OS V3.2D, are there any patches that fix this problem (which
	   ones, and where can I find them)?

	Many thanks.

	Gina
8083.6		NABETH::alan	Dr. File System's Home for Wayward Inodes.	`Fri Feb 21 1997 15:48`	31
	In a perfect world you wouldn't need any form of RAID. This being an
	imperfect world where devices fail, blocks go bad and tradeoffs have
	to be made, the various RAID levels offer differing levels of
	protection against failures, with tradeoffs in cost and performance.

	With RAID-5 in the normal state, it should be able to correct errors
	on single devices or compensate for the loss of a single device. If
	a particular RAID-5 implementation can't handle this case without
	letting an I/O error get through, it is a rather poor one.

	If the array has lost one member, then any I/O failure on the others
	will be passed back up as an error, because all the other members
	contribute to the data regeneration. Some implementations keep track
	of I/O failures and the state of the data used for regeneration. If
	bad data would go into the regeneration on an error, they will treat
	that as an error instead of returning the wrong data.

	It isn't very informative just to say that a RAID-5 let an I/O error
	get through, since there are many different causes. Hopefully the
	array or software will keep track of the detailed failure and offer
	some way to determine the cause, in case it is preventable.

	re: LSM, fault tolerance and availability.

	This doesn't make any sense. LSM offers mirroring, which is superior
	to RAID-5 in every way but price. A properly configured mirror can
	survive any fault in the I/O path, except for one in the base system.
	A RAID-5 won't do any better in that case.
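
	To make the regeneration point earlier in this reply concrete, here is
	a minimal Python sketch of single-parity (XOR) regeneration. It is
	only an illustration under simplifying assumptions: one stripe, one
	parity block, no parity rotation, and the names (xor_blocks,
	make_stripe, read_member) are made up for this example. It is not how
	the HSZ40 or any particular product implements RAID-5.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read_member(stripe, index, failed):
    """Read block `index` from the stripe.

    `failed` is the set of member indexes that cannot be read.
    A failed block is regenerated from ALL the other members, so if any
    of those is also unreadable the read is reported as an error rather
    than returning wrong data.
    """
    if index not in failed:
        return stripe[index]
    others = [i for i in range(len(stripe)) if i != index]
    if any(i in failed for i in others):
        raise IOError("cannot regenerate block %d: a second member failed" % index)
    return xor_blocks([stripe[i] for i in others])


stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

# One member lost: its data is regenerated from the survivors.
assert read_member(stripe, 1, failed={1}) == b"BBBB"

# A second failure while already reduced: the error has to be passed up.
try:
    read_member(stripe, 1, failed={1, 2})
except IOError as err:
    print("I/O error:", err)
```

	The second case is the one described above: once the set is running
	reduced, every surviving member contributes to regeneration, so a
	media error on any of them surfaces as an I/O error.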