
Conference mvblab::alphaserver_4100

Title:AlphaServer 4100
Moderator:MOVMON::DAVISS
Created:Tue Apr 16 1996
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:648
Total number of notes:3158

599.0. "Out of Memory error w/kzpsc ????" by APACHE::CARLTON (Mike - DTN:264-4455) Tue May 13 1997 10:05



There is a strange problem that occurs with a KZPSC when a third-party card is
installed on an AS4100. This is a brand new machine running the latest code. 

 The third party card is a DCI-1100 from Logical Company (www.logical-co.com).
This is a DRV11-W/DRQ3B look-alike which resides on the PCI bus.  

 The problem is that the KZPSC RAID manager, both stand-alone and under
OpenVMS 6.2-1H3, reports an "Out of Memory" error when initializing the RAID
array. 
The error message appears to be issued by a debugger and is displayed at the
top of the video screen in black lettering on a red background.

 The problem goes away if the third-party PCI card is removed, which indicates
some sort of conflict.  The DCI-1100 card passes all of its diagnostics 
without error.  If the array is configured without the card installed and
VMS is booted, the disks are accessible and the system runs fine.  However, the
SWXCRMGR s/w won't run.  
 
I have passed this by Logical and they indicate that there are no known
problems with this card.  It is used on numerous Alpha machines including
as8x00, as2100a, etc.  The 4100 has 1 Gb of memory and the KZPSC has 32Kb
Cache memory.  

Has anyone had any experience with this?  What does the "Out of Memory" error
really mean?  

This is a big $$ order and we need a solution asap.  

Thanks

Mike Carlton

Cross posted in ask_ssag 

    
T.R   Title   User   Personal Name   Date   Lines
599.1. by HARMNY::CUMMINS Tue May 13 1997 19:21 (22 lines)
    I may know what the problem is.
    
    The 4100/4000 SRM console has a problem where it does not initialize
    certain PCI-PCI bridge devices properly. Specifically, it does not init
    the bridge's I/O space window properly. On older bridges, this is not
    an issue. But it is on newer bridges. The SRM console's RAID driver
    happens to use I/O space and thus you can/will see conflicts.
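
For illustration of the kind of conflict described here: under the PCI-to-PCI bridge specification, a bridge forwards downstream only those I/O addresses that fall between its I/O Base and I/O Limit registers, in 4 KB units. The Python sketch below is not the SRM console's code. The function names and register values are hypothetical, and it assumes 16-bit I/O decoding only (the upper-16-bit registers are ignored).

```python
def io_window(io_base_reg: int, io_limit_reg: int):
    """Decode a PCI-PCI bridge's 16-bit I/O window.

    Returns (start, end) inclusive, or None if the window is disabled.
    """
    # Bits 7:4 of each register hold address bits 15:12. The window is
    # 4 KB granular, so the base ends in 0x000 and the limit in 0xFFF.
    start = (io_base_reg & 0xF0) << 8
    end = ((io_limit_reg & 0xF0) << 8) | 0xFFF
    if start > end:
        return None  # base above limit: the bridge forwards no I/O cycles
    return (start, end)


def forwarded(window, bar_start: int, bar_size: int) -> bool:
    """True if the whole device I/O range lies inside the bridge window."""
    if window is None:
        return False
    start, end = window
    return start <= bar_start and bar_start + bar_size - 1 <= end


# Hypothetical values: a window of 0x1000-0x1FFF and two device ranges.
window = io_window(0x10, 0x10)
inside = forwarded(window, 0x1000, 0x100)
outside = forwarded(window, 0x2000, 0x100)
```

A device behind the bridge whose I/O range falls outside the window, or a window left uninitialized so that it overlaps another device's range, would produce the sort of conflict suspected above.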
    
    The above problem was very recently discovered in our qual labs.
    
    If the above is indeed your customer's problem, a short-term solution,
    if possible, is to move the RAID card to a different hose/PCI from the
    DCI-1100. We have a fixed version of console undergoing test now. I'll
    work with you offline around a possible firmware fix.
    
    For what it's worth, the same firmware bug was found in the 8200/8400
    SRM console. Perhaps customers have just been getting lucky re: RAID
    cards and this vendor's card in the same 8X00 PCI bus.
    
    I'll contact you tomorrow to work out a longer term solution..
    
    BC
599.2. "Thanks.." by APACHE::CARLTON (Mike - DTN:264-4455) Wed May 14 1997 14:41 (8 lines)
    Thanks.  It does sound similar to the problem seen with this card. 
    FWIW, Logical engineers were here last December trying to replicate a
    problem they were seeing on 8200 machines at a customer site.  Do not
    know if the raid controller was there as well or not.
    
    I will email you early next week as we are moving to zk02 this week.
    
    Mike
599.3. by HARMNY::CUMMINS Wed May 28 1997 12:44 (131 lines)
From:	HARMNY::CUMMINS      "Bill Cummins, PKO3-2/Q21, 223-4641" 28-MAY-1997 11:21:22.36
To:	US8RMC::"[email protected]"
CC:	CUMMINS
Subj:	Re: Rawhide issue with DCI-1100 and RAID card

This doesn't sound like an SRM console issue after all.. Matthew Buchman
(OLEUM::BUCHMAN) is my counterpart in the AlphaBIOS development group.
Perhaps DECwest has a DCI-1100 in their labs; i.e. to see if they can
reproduce and possibly help debug the problem?

BC

From:	US8RMC::"[email protected]" "Mike Carlton" 27-MAY-1997 15:31:03.46
To:	"'Bill Cummins, PKO3-2/Q21, 223-4641'" <may30::cummins>
CC:	
Subj:	RE: Rawhide issue

Bill

  Yes, it is seen on both busses.  It also is only affecting the KZPSC and 
only appears to affect the SWXCRMGR operation.  It also does not appear to 
be affected by other devices.  If this device is there along with the 
KZPSC, the problem occurs.

Mfg has also stated that the error occurs even under OVMS, but I do not 
have the specifics.  Don't know if it crashes the system or just the app.

The exact message is "DEBUG - Out of Memory, System will reset, press any 
key to continue" and it appears on the video screen at the top on a red 
banner.

I have one of these devices on its way to me now from the vendor.  Looking 
for a KZPSC 3 port controller.
    
    
    
    
From:	MAY30::CUMMINS      "Bill Cummins, PKO3-2/Q21, 223-4641" 27-MAY-1997 14:53:15.02
To:	US8RMC::"[email protected]" 
CC:	CUMMINS
Subj:	Re: Rawhide issue

You're saying it's seen on both PCI buses; i.e. in the same bus as
the DCI-1100 device and also in a different bus from this device?

Could you post/send the exact "out of memory" message displayed?

I now no longer believe this is due to a console issue.

Are you able to reproduce the problem locally? [I seem to recall your
mentioning something about setting up a system in your local lab vs.
at the customer's site?]

Do you notice any message such as the following at the SRM console?

<device> <size> <hose> <slot> <bus> address space request too large
PCI sparse memory space exceeded

or PCI IO space exceeded
or PCI dense memory space exceeded

If you don't see any error messages like those above from the console, and
SHOW DEVICE at the console shows all devices okay (for those options for
which the console has a driver) when the DCI-1100 is in the system, then I
have no clue what the problem is..
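
If a capture of the console output is available as text, the check for the messages listed above can be sketched in Python. This is only an illustration: the function name is hypothetical, and the exact message wording on a given firmware revision is assumed to match the fragments quoted in this note.

```python
import re

# Message fragments quoted in the note above. The exact console wording
# on a given firmware revision is an assumption.
PATTERNS = [
    r"address space request too large",
    r"PCI sparse memory space exceeded",
    r"PCI IO space exceeded",
    r"PCI dense memory space exceeded",
]


def find_pci_space_errors(console_log: str):
    """Return (line_number, line) pairs that contain any listed message."""
    rx = re.compile("|".join(PATTERNS), re.IGNORECASE)
    return [
        (n, line.strip())
        for n, line in enumerate(console_log.splitlines(), start=1)
        if rx.search(line)
    ]
```

An empty result means none of the listed messages appeared in the capture; it does not rule out other console errors.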

BC

From:	US8RMC::"[email protected]" "Mike Carlton" 23-MAY-1997 11:17:41.23
To:	"'[email protected]'" <harmny::cummins>
CC:	
Subj:	RE: Okay. Get back to me and work out the details (next week)..

OK

I am now somewhat resettled in my new digs. Where can I get this code from? 
 I am not 100% sure that this will correct the problem as it is seen on 
both pci busses as well as stand-alone and under vms.
I am now gathering the necessary hardware together to test this again in a 
lab environment to see what I can find.

The machine has shipped to the customer with a caveat that the DCI card be 
removed if the KZPSC needs to be reconfigured.  My feeling is that this 
will go over like a ton of bricks with the customer so a real resolution is 
going to be needed.

My email is [email protected] or apache::carlton.


  Mike Carlton  

          CSS Storage Engineering
           ZKo2-1/Q38
           603-884-5544 (264-5544)
           [email protected]


----------
From: 	 "HARMNY::CUMMINS"@apache.mko.dec.com[SMTP:"HARMNY::CUMMINS"@apache.mko.dec.com
Sent: 	 Wednesday, May 14, 1997 4:40 PM
To: 	 Mike Carlton
Subject: Okay. Get back to me and work out the details (next week)..

