Title: | Linux, the Free Operating System |
Notice: | New here? Sign in on topic 2 |
Moderator: | EST::DEEGAN |
|
Created: | Fri Feb 11 1994 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 609 |
Total number of notes: | 2862 |
579.0. "FYI: PCI EIDE Controller Flaws Discovered" by NEWVAX::PAVLICEK (Linux: the PC O/S that isn't PC) Tue Feb 25 1997 05:59
<<< NOTED::NOTES$10:[NOTES$LIBRARY]IBMPC-95.NOTE;1 >>>
-< IBM PCs, clones, DOS, etc. >-
================================================================================
Note 2796.13 Large harddisks and BIOS support question 13 of 13
ODIXIE::SIMPSONT "PC = world's biggest con job!" 263 lines 24-FEB-1997 19:48
-< additional consideration >-
--------------------------------------------------------------------------------
Just to stir things up a little...here is something else you may want to
check into further before plunking down your hard-earned cash for a big
EIDE disk drive...
PCI EIDE Controller Flaws Discovered
BY ROEDY GREEN
Introduction
There are serious flaws affecting about one-third of all PCI
motherboards. The flaws affect any motherboard or EIDE controller
paddleboard containing the PC-Tech RZ-1000 PCI EIDE controller chip or
the CMD PCIO 640 PCI EIDE controller chip.
The flaws affect motherboards from ASUSTeK, AT&T, DEC, Dell, Gateway,
Intel, Micron, NEC, Zeos and others. Since Intel makes so many of the
motherboards sold under other brand names, the flaws affect many
machines, both 486 and Pentium PCI.
The flaws show up most frequently when you run a true multitasking
operating system such as OS/2 Warp or NT. They also show up under
Windows For WorkGroups in 32-bit mode during tape or floppy backup and
restore. In theory, the flaws could do damage under DOS, DESQview,
Windows and Windows For WorkGroups in 16-bit mode, but so far there
have been no damage reports. Windows-95 contains code to bypass the
flaws.
The RZ-1000 has two flaws. The CMD-640 has those same two flaws, plus
three others. To make matters worse, most motherboard manufacturers
using these two flawed chips connected them up incorrectly. There are
software bypasses for these flaws. However, the Warp fix for the
CMD-640 reduces performance by 50 percent.
What are the symptoms?
When you are using an IDE or EIDE hard disk attached to the EIDE
motherboard port, the flaws subtly corrupt your files by randomly
changing bytes every once in a while. The flaws introduce bugs into EXE
files, subtle errors into your spreadsheets, stray characters into your
word processing documents, changes to the deductions in last year's tax
return files, and random changes to engineering design files.
This corruption happens when you are simultaneously using your EIDE or
IDE hard disk and some other device, most commonly the floppy drive or
mag tape backup. The same sort of problem may occur on reading a CD-ROM
drive attached to an EIDE port.
Unfortunately, correcting the problem just stops further file
corruption. It will not help to clean up the existing damage to your
files. Right now, the focus is on bypassing the flaws. Preventing
further corruption is child's play compared with the nightmare of
trying to track down all the existing random errors in files. Backups,
even from day one, may be corrupt. If you have either of the flawed
chips, you will probably never be able to completely eliminate the
effects of past corruption.
Testing For The Flaws
I wrote two test programs that run under DESQview, Windows, Windows For
WorkGroups, Windows 95, NT and OS/2. EIDEtest verifies that your hard
disk is working properly, and CDTest verifies your CD-ROM. If these
tests fail, it proves you have a serious problem, but not necessarily
that you have the RZ-1000 or CMD-640 chip.
If the tests pass, you still may have a problem since, especially under
DOS, DESQview and Windows, the flaws may only show up rarely. If you
run the tests under Windows 95 they will always pass, even if you have
the defective chip, because the operating system already bypasses the
flaws.
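A test of this kind comes down to writing known data, reading it back, and comparing. The following Python sketch illustrates only that idea; it is not how EIDEtest or CDTest actually work, and the function names and the 512-byte block size are assumptions made for illustration. A real test would also need to defeat operating system caching and keep a second device (floppy or tape) busy during the run, which this sketch does not attempt.

```python
import os


def pattern_block(index, size=512):
    """Return a deterministic block whose bytes depend on the block index."""
    seed = index.to_bytes(4, "little")
    return (seed * (size // 4 + 1))[:size]


def write_pattern(path, blocks, size=512):
    """Write `blocks` pattern blocks to `path` and ask the OS to flush them."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(i, size))
        f.flush()
        os.fsync(f.fileno())


def find_bad_blocks(path, blocks, size=512):
    """Read the file back and return the indices of blocks that do not match."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(size) != pattern_block(i, size):
                bad.append(i)
    return bad
```

An empty list from `find_bad_blocks` means only that no mismatch was seen on this run; as noted above, a pass does not prove the chip is sound.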
What Can You Do If You Have A Flaw?
Pester the manufacturer. Unfortunately, the EIDE controller chips are
soldered in. The only way to repair a flaw is to replace the whole
motherboard, recycling the socketed chips: the CPU, DRAM and SRAM
cache. It would be very expensive for computer and motherboard
manufacturers to fix a flaw.
Buy a new, unpopulated Triton PCI motherboard and recycle the CPU, DRAM
and SRAM cache chips from the old motherboard.
Run the controller in degraded mode. Some BIOSes have a feature to
disable the EIDE prefetch buffer. Vendors may offer a BIOS upgrade to
allow you to manually disable prefetch. The BIOS may also turn it off
automatically if either of the defective chips is present. This will
bypass both RZ-1000 flaws and two of the five CMD-640 flaws.
Buy a PCI EIDE paddleboard controller, such as the Promise 2300+ or the
BusLogic BT-910, to replace the one on the motherboard. You must
disable the EIDE controller on the motherboard. This fix will waste one
of your precious slots. Be careful. You could be leaping out of the
RZ-1000 frying pan into the CMD-640 fire, since paddleboards often use
the CMD-640.
Buy a SCSI hard disk and CD-ROM, and avoid using the EIDE ports
entirely. Under OS/2 and Linux, SCSI gives better performance, but
costs more. DOS, Windows, Windows For WorkGroups and Windows 95 are
unable to exploit the advanced features of SCSI, but at least avoid the
EIDE flaws when you go to pure SCSI.
Find a software work-around. There are fixes for Warp to bypass all the
flaws in the RZ-1000 and CMD-640. Fixpack 5 and pre-release Fixpack 9
do not bypass the flaws. Now that Intel and IBM have revealed the
technical details, all the operating system writers can patch their
EIDE drivers to bypass the flaws. There are also fixes for NT 3.1 and
3.5.
Get a BIOS upgrade. For DOS, DESQview, and Windows 3.1, to bypass the
flaws you may need a new BIOS: an EPROM chip. If you have a flash BIOS,
you can update it simply by downloading a file. Most BIOSes already
have code to bypass the flaws for DOS, DESQview and Windows. More
advanced operating systems bypass the BIOS, however, so even a smart
BIOS will not protect you there. The BIOS CMOS settings may still allow
you to disable prefetch, which protects you even in true multitasking
operating systems.
Cut the trace. Cut the trace on the motherboard from the floppy
changeline to the EIDE controller. However, this only bypasses one of
the CMD-640's five flaws and one of the RZ-1000's two flaws.
Whatever method you use to bypass the flaws, retest with EIDEtest and
CDTest afterwards to be sure your fix worked and you caught all the
problems.
Cleaning Up The Mess
Once you have bypassed the flaws, you can start working on the problem
of cleaning up your files.
The first thing to do is to re-install your operating system and all
your application programs. This will replace any damaged EXE and DLL
files.
Catching errors in your data files is more difficult. Keep your eyes
peeled for any improbable spreadsheet results. You may have to hire a
programmer to write you some comb programs to sniff through your
databases, looking for suspicious values.
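A comb program of this sort can be quite small. The Python sketch below flags values that fail to parse as numbers or that fall outside a plausible range. The function name, the column name and the bounds in the usage note are hypothetical; the right checks depend entirely on your own data.

```python
def suspicious_rows(rows, bounds):
    """Return (row_number, column, raw_value) for every implausible value.

    rows   -- iterable of dicts, e.g. from csv.DictReader
    bounds -- {column: (lowest_plausible, highest_plausible)}
    """
    hits = []
    for number, row in enumerate(rows, start=1):
        for column, (low, high) in bounds.items():
            raw = row.get(column, "")
            try:
                value = float(raw)
            except (TypeError, ValueError):
                # Not a number at all: possibly a stray character.
                hits.append((number, column, raw))
                continue
            if not low <= value <= high:
                hits.append((number, column, raw))
    return hits
```

For example, `suspicious_rows(csv.DictReader(open("returns.csv")), {"deduction": (0, 50000)})` would list every row whose `deduction` field is unparseable or out of range. It cannot catch a corrupted value that still looks plausible.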
If you routinely use the verify feature of Lotus Magellan, it can
detect changes to files that should not have changed. This may help you
uncover some of the damage. The flaws are not polite enough to redate
the files they corrupt. :-)
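The same idea can be sketched in a few lines of Python: record a checksum and modification time for every file, then later report files whose contents changed although their timestamp did not. This illustrates the general technique, not Magellan's actual method; the function names are assumptions, and SHA-256 is chosen only because it is in the standard library.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each file under root to (modification time in ns, digest)."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            key = os.path.relpath(path, root)
            result[key] = (os.stat(path).st_mtime_ns, file_digest(path))
    return result


def silently_changed(old, new):
    """Files present in both snapshots whose content changed but whose
    modification time did not."""
    return sorted(
        path
        for path, (mtime, digest) in new.items()
        if path in old and old[path][0] == mtime and old[path][1] != digest
    )
```

A snapshot taken after the damage is done only protects you going forward; it cannot tell you which files were already corrupt when it was taken.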
If you have backups from before the time you bought the faulty machine,
you can restore them and re-key everything.
Most people will not be so fortunate. All their backups will also be
corrupt.
Most people with flaws will just have to put up with random errors
dotting their data files ever after.
What Are the Flaws?
IBM confirmed the RZ-1000 has two different flaws:
In prefetch mode, multi-sector reads often fail.
The chip erroneously responds to floppy status commands and corrupts
the hard disk or CD-ROM I/O in the process.
IBM confirmed the CMD-640 has five different flaws. It has the same
prefetch problem as the RZ-1000. It has the same floppy status problem
as the RZ-1000. It does not support simultaneous I/O on the primary and
secondary EIDE ports. There is confusion over legacy and PCI mode.
Finally, it does not support 32-bit writes.
Test Programs
When requesting files on the Internet, you must generally use lower
case.
Below are the addresses for Roedy Green's EIDEtest and CDTest programs
for DOS, DESQview, Windows, Windows For WorkGroups, Windows 95, NT,
OS/2 and Warp. By the time you read this, I will likely have posted
newer versions.
ftp://garbo.uwasa.fi/pc/diskutil/
ftp://ftp.cdrom.com/.4/os2/incoming/eidete16.zip
Intel's RZ-1000 chip detect program:
http://www.intel.com/procs/support/rz1000/rztest.exe
Intel's CMD-640 and RZ-1000 chip detect program, coming soon:
http://www.intel.com/procs/support/ctrltest/
IOTest from PowerQuest, the makers of Partition Magic, a Warp test for
the flaws.
http://www.powerquest.com/download/iotest.zip
Fixes
Warp bypass for the RZ-1000 chip flaws:
ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-us/pj19409/pj19409.zip
Warp bypass for the CMD-640 chip flaws:
ftp://ftpos2.cdrom.com/pub/os2/drivers/cmd640x.zip
Microsoft Windows NT 3.1 ATDISK.SYS fix for the CMD-640 chip:
http://www.microsoft.com/KB/softlib/mslfiles/pciatdsk.exe
Microsoft Windows NT 3.5 fix for the CMD-640 chip:
CMD's BBS at (714) 454-1134.
File 640XNT35.ZIP
Essays
Roedy Green's FAQ (Frequently Asked Questions), a 19-page unabridged
version of this article.
ftp://garbo.uwasa.fi/pc/diskutil/eidete16.zip
ftp://ftp.cdrom.com/.4/os2/incoming/eidete16.zip
PowerQuest essay:
http://www.powerquest.com/
Intel's FAQ
http://www.intel.com/procs/support/rz1000
PC-Tech's essay:
http://www.mei.micron.com/rz1000/rz1000.txt
Catch Pat Duffy's ([email protected]) essays each Sunday in:
comp.os.os2.misc, comp.os.os2.setup.misc, comp.os.os2.setup.storage and
comp.sys.ibm.pc.hardware.misc
Check out Pat Duffy's Web site at:
http://warp.eecs.berkeley.edu/os2/workbench/work.htm
ftp://ftp.netcom.com/pub/ab/abe/
Roedy Green is a computer consultant who prefers to work on Forth, C++,
Delphi, DOS, OS/2 and Internet Web projects. If you send $5 (US or
Canadian) to cover duplication, postage, and handling, he will send you
a diskette containing the relevant test programs, fixes, Internet
postings and essays. Send email to: [email protected] or discuss this
problem on the Internet newsgroup in: comp.os.os2.bugs.
You can also write via snail mail:
Roedy Green, Canadian Mind Products #601 - 1330 Burrard Street,
Vancouver, BC CANADA V6Z 2B8 (604) 685-8412
T.R | Title | User | Personal Name | Date | Lines |
---|
579.1 | | MOVIES::TWEEDIE | | Tue Feb 25 1997 11:30 | 16 |
| For what it's worth, Linux has had software auto-detect of these buggy
chipsets, plus software workaround for the bugs, for ages --- over a year
now, I think. Of course, you still lose performance because the workarounds
have got to disable some of the advanced features of EIDE, but I'm not aware
of any data-corruption problems with current Linux kernels using these
chipsets.
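Auto-detection of this kind amounts to matching each PCI device's vendor and device ID against a list of known flawed chips. The Python sketch below shows only that lookup and is not the kernel's code, which is written in C. The function name is hypothetical, and the ID values are assumptions recalled from the Linux kernel's pci_ids.h header; they should be verified there before being relied on.

```python
# (vendor_id, device_id) -> chip name.
# Assumed values, believed to match the Linux kernel's pci_ids.h.
FLAWED_EIDE_CHIPS = {
    (0x1042, 0x1000): "RZ-1000",
    (0x1095, 0x0640): "CMD-640",
}


def flawed_chip(vendor_id, device_id):
    """Return the name of a known flawed EIDE chip, or None if not listed."""
    return FLAWED_EIDE_CHIPS.get((vendor_id, device_id))
```

A driver that gets a match would then apply its workarounds, such as turning off prefetch or serializing the two ports, before touching the disk.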
The recommendation that you replace a buggy rz1000 or cmd640 chipset board
with a newer Triton board is a good one. Using the latest Triton chipsets,
Linux will allow you to perform EIDE transfers using DMA, giving you at
least one of the performance advantages usually only available on SCSI
controllers. (Of course, SCSI is _still_ a good bit faster, since even
using DMA, EIDE doesn't let you ever have more than one outstanding IO
command at a time per channel.)
Stephen.
|
579.2 | | DECWET::LOWE | Bruce Lowe, DECwest Eng., DTN 548-8910 | Sat Mar 08 1997 00:35 | 17 |
|
Hmmm ... this would explain a problem I was having, which I was going to
ask about in here.
My ASUS mboard p90 had an old Soundblaster/CD setup, and sbpcd worked OK.
I installed a more recent soundcard with an ATAPI 12x CD drive, and tried
booting bare.i. It can't see the CD (I have two EIDE drives in IDE controller
0, and the CD on controller 1).
When I disconnect the 2nd hard drive and put the CD on controller 0, it can
see it.
On booting, I see a message:
ide: buggy CMD640B interface on pci (0x80006800); serialized;
secondary port
So it's NOT the CD.
|