
Conference 7.286::digital

Title: The Digital way of working
Moderator: QUARK::LIONELON
Created: Fri Feb 14 1986
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 5321
Total number of notes: 139771

4241.0. "The RZ28M / KZPSC problem" by MBALDY::LANGSTON (our middle name is 'Equipment') Wed Nov 01 1995 17:43

The RZ28M, our most popular, most available 2GB disk drive, does not work with
the KZESC and KZPSC, our most popular SCSI RAID controllers.

This is a huge problem, from my perspective.  I work at DECsale.  We take 
400-500 calls a day on our "systems split."  A significant percentage of those 
calls are disk-related.

The most popular disk size in the industry today, everybody knows, is 2GB.  We,
at DECsale, have been configuring the RZ28M as the generic 2GB drive for many 
weeks, since the RZ28 was retired.

The stovepipes in this company and the problems they cause particularly come to 
the fore in this situation.

StorageWorks might be interpreted as implying -- by having 44 RZ28 part
numbers, 26 of which are "ACTIVE" in AQS, eight -VA and -VW variants -- that
there are many options.  But we know better.  So we check to see what's been
qualified.

The server folks test some of them, based on I'm not sure what.
The controller people tell customers and sales reps that the RZxxx works with 
these controllers: mumblefratz and whatchamacallit.

But, when, all of a sudden, the controller people find out that there's a problem
with the RZ28M and the KZxSC, do the server people find out?  Is there any
motivating process in place, to encourage them to be interested?  Do the
stovepipes prevent them from understanding the issues (or encourage them not
to)?  Do their jobs immediately depend on taking team action to fix the
problems?  I know that product management groups have sustained layoffs, just
as the rest of the corporation, so I'm not blaming anyone, just wondering how
could we do this to ourselves?

The only way we have of knowing about this is a Storage Bulletin, dated
September 18th, which says,

 "... The end-of-life date for RZ28s was moved to end-of-day, Thursday,
      September 14. New RZ28M drives will be substituted in all orders for
      RZ28s shipped after that date.
[...]
      RZ28M and RZ26N Disks:
      The SCSI adapter for which these two disks have been qualified is
      the KZTSA-AA on Digital UNIX, Alpha NT and OpenVMS. Qualification on
      other PCI-to-SCSI adapters, along with the KZESC-AA/BA and KZPSC-AA/BA
      RAID controllers, is expected to be completed in the next quarter."

No "sales flash," no e-mail to "the world."

Has anyone formed a task force to assess the risk to Digital and our customers?
How many RZ28M disks have been configured on KZPSC and KZESC controllers?
Hundreds, I'd bet, if not thousands.  I know I've done it and been doing it for 
weeks.  We have 20 people doing the same job here at DECsale.

How many times have sales reps whom we've told "use the RZ28M with the KZPSC
in the 2100" have configured it just that way?

Is there round-the-clock testing being conducted with the KZPSC and KZESC on
several parallel tracks to find the fixes?

Have we made any concerted effort to tell our customers?  Who's going to
upgrade the firmware (or perform whatever fix) on the disks when we figure it
out?  Who'll pay?

How did we get into such a fix that we have this single point of failure? Take
a look in AQS at RZ28, wild-carded.  There are 44 RZ28*-** variants, RZ28, 
RZ28B, C, D, E, M and T.

The RZ28D apparently works fine with the KZxSC controllers.  But, has it been 
tested/qualified on any systems?  None that I know of.

What can we do to fix it?  Is everything that can reasonably be done being
done?

Bruce
DECsale
4241.1. by WLDBIL::KILGORE (DEC: ReClaim The Name!) Thu Nov 02 1995 07:37 (12 lines)
    
    The first thing you have to do is make sure everyone is talking about
    the same "problem". .0 doesn't seem to have a clear reference to the
    problem -- is it that there is a bug encountered when a certain drive
    is mated with a certain controller, or is it that this combination has
    just not been qualified? (And how does one define the term "qualified"?)
    
    Experience tells me that unless you have a clear, concise,
    reproducible, *WRITTEN* and universally referenced (eg, via IPMT case #
    or equivalent) description of the "problem" you need solved, you're just
    mixing mortar for the Tower of Babel.
    
4241.2. "myriad problems" by MBALDY::LANGSTON (our middle name is 'Equipment') Thu Nov 02 1995 09:14 (28 lines)
The problems, as I see them, are manifold:

	there is a bug or incompatibility between the RZ28M firmware and
	the KZxSC controllers.  I suspect somebody's working around the
	clock to fix this.
	
	we allowed ourselves to be in a position where we don't have a 2GB 
	drive-based, PCI/EISA controller-based RAID solution.  Didn't we know,
	back in the spring or early summer when the RZ28 drives started
	running out, that we'd need a replacement?

	if we're shipping the SD001-CA and SD001-DA, which consist of a one-
	or three-channel KZPSC and 2GB drives, is it with RZ28Ms?

	if "conditions beyond our control" caused us to run out of the RZ28
	drives before we had time to qual another drive, why didn't we have a 
	plan to do *something* about it?

	if there is a plan, why don't we know about it?

	there are many situations where our customers are trying to use RZ28M
	drives on these two controllers, because we sold it that way

	evidently, the RZ28D has been tested or qualified with the controllers
	but has not been qualified in any systems


Bruce
4241.3. "See support statement in .10" by JULIET::HATTRUP_JA (Jim Hattrup, Santa Clara, CA) Fri Nov 03 1995 19:52 (14 lines)
4241.4. by tennis.ivo.dec.com::KAM (Kam WWSE 714/261.4133 DTN/535.4133 IVO) Sat Nov 04 1995 00:48 (5 lines)
    I think these discussions should be held in the
    MSGAXP::STORAGE_PRESENTERS conference.  Moderated by SHAKEY::SENGUPTA.
    
    This is the Storage PID Presenters conference, a restricted conference,
    so you need to find out the requirements for entry.
4241.5. "Don't hide the smoking guns..." by NQOS01::nqsrv326.nqo.dec.com::Werner (NORMAN WERNER) Mon Nov 06 1995 07:56 (10 lines)
RE: .4 Why in heavens name would you suggest that the discussion of a potentially critical issue 
such as this one be moved to a restricted conference. The base noter, while admittedly also 
venting a bit was, IMHO, trying to alert other unsuspecting Digits about this problem. What 
possible good would it do, other than to bring it again to the attention of people who undoubtedly 
know about this problem, to move it to a restricted Storageworks presenters conference. That would 
certainly defeat one of the basic purposes of these conferences. Hopefully, public airing of our 
own internal stupidity contributes to the process of identifying and weeding out the folks who 
make some of these stupid decisions. 

-OFWAMI-
4241.6. "Re-wrapped for the columnarly-challenged (like me)" by ATLANT::SCHMIDT (See http://atlant2.zko.dec.com/) Mon Nov 06 1995 08:38 (17 lines)
   <<< Note 4241.5 by NQOS01::nqsrv326.nqo.dec.com::Werner "NORMAN WERNER" >>>
                      -< Don't hide the smoking guns... >-

  RE: .4 Why in heavens name would you suggest that the discussion of a
  potentially critical issue such as this one be moved to a restricted
  conference. The base noter, while admittedly also venting a bit was,
  IMHO, trying to alert other unsuspecting Digits about this problem.
  What possible good would it do, other than to bring it again to the
  attention of people who undoubtedly know about this problem, to move
  it to a restricted Storageworks presenters conference. That would
  certainly defeat one of the basic purposes of these conferences.
  Hopefully, public airing of our own internal stupidity contributes
  to the process of identifying and weeding out the folks who make
  some of these stupid decisions. 

  -OFWAMI- 

4241.7. by BABAGI::VIDIOT::PATENAUDE (Pinball, PC's, Storage an more...) Wed Nov 08 1995 10:26 (26 lines)
I have more to do during the day than monitor this soapbox, BUT someone sent me
this note thread VIA mail and I had to put in my .02 worth.


>    I understand the required firmware fix for the RZ28M is one to two
>    months away.  Don't know when testing and support will be in place.....

Gee, could have fooled me. Revision 568 was cut into manufacturing the end of
August. 616 is still in qual but has no fixes that preclude its use. Only 1 fix
in that code that I am aware of and it's a minor one.

>    Only place I've actually seen info on this drive problem is in notes.
>    .................

You and me both and ALL RZ28M IPMT cases come through ME!

I'm sure product managers will be adding more responses to this as this is not
correct information being passed around.

Roger.





4241.8. "See support statement in .10" by JULIET::HATTRUP_JA (Jim Hattrup, Santa Clara, CA) Thu Nov 09 1995 19:01 (17 lines)
4241.9. "Fix Digital and these problems won't happen" by MBALDY::LANGSTON (our middle name is 'Equipment') Fri Nov 10 1995 14:36 (33 lines)
    Werner describes my motivations well.  I was venting, but mostly
    because I *know* that there have to be cases out there where customers
    have RZ28M drives and KZxSC controllers.  And friends, the combination
    doesn't work.
    
    Look in the current SOC.  In the AlphaServer 1000, 2000, 2100 sections
    we say, in adjacent configuration steps, to use the KZPSC or KZESC and
    the RZ28M as the 2GB drive of choice.  
    
    In the Sales Update article announcing the RZ28M and the RZ26N, right
    at the top of the article, we say the RZ28M replaces the RZ28.  Only
    when one looks almost at the bottom of the article does one find the
    indirect statement of non support "Support for the KZESC and KXPSC [sic] 
    RAID controllers will be included in futre [sic] high performance 
    StorageWorks disk drives."
    
    Who reads Sales Updates, anyway?
    
    The point is that it *should* work and it doesn't.  Okay?
    
    Now, it's being worked, I hope.
    
    But I'll ask the question I asked in the base note.  Does anyone, who
    can do anything, understand this problem from (a) the customers' 
    perspective and (b) the systems' perspective?
    
    Re: the RZ28D...
    It's not on the latest (September) Supported options list,
    http://www.service.digital.com/alpha/server/as2100/docs/
                              supported_options_abs.html
    
    Bruce
                              
4241.10. "It does work." by BABAGI::VIDIOT::PATENAUDE (Pinball, PC's, Storage an more...) Fri Nov 10 1995 23:23 (9 lines)
    
    If the Customer configuration is not working, then escalate it to
    engineering. Don't hope it is being worked. Drives are being released
    so fast that you cannot always trust the docs. The RZ28M is supported.
    
    Roger Patenaude
    Storage External Products
    Continuation Engineering
    
4241.11. "It is supported on the 1000 and 400" by WRKSYS::SWARTOUT Mon Nov 20 1995 10:09 (15 lines)
    I've been forwarded the base note and some of the replies of this note;
    due to a week at COMDEX this is my first opportunity to reply.
    
    The configuration of RZ28M drives and the KZESC or KZPSC is supported
    on the AlphaServer 1000 and 400 systems.  I have had NO problem reports
    forwarded to me and we've had no IPMT cases come in through the
    customer support side.
    
    It seems to me that the only problem that's actually been reported here
    to date, is that the DOCUMENTATION hasn't kept up with reality...
    
    If anyone has experienced a functional problem with these combinations,
    please let us know.
    
    
4241.12. "No writers left to update the docs?" by DELNI::DIORIO (Myopic Visionaries) Mon Nov 20 1995 10:35 (8 lines)
RE -1

>>  It seems to me that the only problem that's actually been reported here
>>  to date, is that the DOCUMENTATION hasn't kept up with reality...

Probably because the tech writer responsible for the DOCUMENTATION got 
outsourced!