Conference mvblab::alphaserver_4100

Title: AlphaServer 4100
Moderator: MOVMON::DAVISS
Created: Tue Apr 16 1996
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 648
Total number of notes: 3158

493.0. "Slow loading using internal CDROM" by SALEM::ARNOLD () Fri Feb 14 1997 15:36

    Has anyone noticed how long it takes to load a UNIX subset on an
    AlphaServer 4100 compared with a 2100?
    
    I loaded the Hubwatch subset and it took over 15 minutes to load
    on a 4100 and only about 3 to 4 minutes on a 2100.  I tried
    it on 3 different 4100s, all with the same results.  I did notice the
    4100 had logged errors, which follow.  The CD is off the internal bus.
    
    AlphaBIOS	V5.24-0
    SRM		V3.0-10
    
    OS: Digital UNIX V3.2F
    
    2 CPUs
    1 GB of memory
    
    DECevent V2.2
    
    
    ******************************** ENTRY    1 ********************************
    
    
    Logging OS                        2. Digital UNIX
    System Architecture               2. Alpha
    Event sequence number            20.
    Timestamp of occurrence              14-FEB-1997 16:09:54
    Host name                            l1sm1
    
    System type register      x00000016  AlphaStation 4x00
    Number of CPUs (mpnum)    x00000002
    CPU logging event (mperr) x00000001
    
    Event validity                    1. O/S claims event is valid
    Event severity                    1. Severe Priority
    Entry type                      199. CAM SCSI Event Type
    
    
    ------- Unit Info -------
    Bus Number                        0.
    Unit Number                   x0028  Target =   5.
                                         LUN =   0.
    ------- CAM Data -------
    Class                           x22  DEC SIM - SCSI Interface Module
    Subsystem                       x22  DEC SIM - SCSI Interface Module
    Number of Packets                 2.
    
    ------ Packet Type ------       258. Module Name String
    
    Routine Name                         ss_abort_done
    
    ------ Packet Type ------       256. Generic String
    
                                         SCSI abort has been performed
    
    
    ******************************** ENTRY    2 ********************************
    
    
    Logging OS                        2. Digital UNIX
    System Architecture               2. Alpha
    Event sequence number            19.
    Timestamp of occurrence              14-FEB-1997 16:09:54
    Host name                            l1sm1
    
    System type register      x00000016  AlphaStation 4x00
    Number of CPUs (mpnum)    x00000002
    CPU logging event (mperr) x00000000
    
    Event validity                    1. O/S claims event is valid
    Event severity                    1. Severe Priority
    Entry type                      199. CAM SCSI Event Type
    
    
    ------- Unit Info -------
    Bus Number                        0.
    Unit Number                   x0028  Target =   5.
                                         LUN =   0.
    ------- CAM Data -------
    Class                           x22  DEC SIM - SCSI Interface Module
    Subsystem                       x22  DEC SIM - SCSI Interface Module
    Number of Packets                 3.
    
    ------ Packet Type ------       258. Module Name String
     
    Routine Name                         ss_perform_timeout
    
    ------ Packet Type ------       256. Generic String
    
                                         timeout on disconnected request
    
    ------ Packet Type ------      1038. SIM Working Set(SIM_WS)
    Packet Revision                   2.
    
    *flink                    xFFFFFC003FE6B900
    *blink                    xFFFFFFFF802652E8
    Controller # for HBA              0.
    Target ID                         5.
    LUN                               0.
    Cam Status                      x00  CCB Request In Progress
    TAG                       x00000000
    Sequence Number               40339.
    Time Stamp                x6E5D4742
    *nexus                    xFFFFFFFF802652E8
    *it_nexus                 xFFFFFFFF80265E30
    *sim_sc                   xFFFFFFFF80264000
    *ccb                      xFFFFFC003A333328
    Phase Bits                x00000000
    Misc Flags                x00080000  Timeout
    Cam Flags                 x00000040  Data Direction (01: DATA IN)
    Error Recovery            x00000080  SIM_WS in process of being timed out
    Recovery Status           x00000000
    (*as_callback)()          x0000000000000000
    *as_ccb                   x0000000000000000
    (*tmo_fn)()               xFFFFFC0000546B60
    *tmo_arg                  xFFFFFC003A333100
    Rest of SIM_WS                       ** Not Printed **
    
    
    ******************************** ENTRY    3 ********************************
    
    
    Logging OS                        2. Digital UNIX
    System Architecture               2. Alpha
    Event sequence number            18.
    Timestamp of occurrence              14-FEB-1997 16:07:50
    Host name                            l1sm1
    
    System type register      x00000016  AlphaStation 4x00
    Number of CPUs (mpnum)    x00000002
    CPU logging event (mperr) x00000001
    
    Event validity                    1. O/S claims event is valid
    Event severity                    1. Severe Priority
    Entry type                      199. CAM SCSI Event Type
    
    
    ------- Unit Info -------
    Bus Number                        0.
    Unit Number                   x0028  Target =   5.
                                         LUN =   0.
    ------- CAM Data -------
    Class                           x22  DEC SIM - SCSI Interface Module
    Subsystem                       x22  DEC SIM - SCSI Interface Module
    Number of Packets                 2.
    
    ------ Packet Type ------       258. Module Name String
    
    Routine Name                         ss_abort_done
    
    ------ Packet Type ------       256. Generic String
    
                                         SCSI abort has been performed
    
    
    ******************************** ENTRY    4 ********************************
    
    
    Logging OS                        2. Digital UNIX
    System Architecture               2. Alpha
    Event sequence number            17.
    Timestamp of occurrence              14-FEB-1997 16:07:50
    Host name                            l1sm1
    
    System type register      x00000016  AlphaStation 4x00
    Number of CPUs (mpnum)    x00000002
    CPU logging event (mperr) x00000000
    
    Event validity                    1. O/S claims event is valid
    Event severity                    1. Severe Priority
    Entry type                      199. CAM SCSI Event Type
    
    
    ------- Unit Info -------
    Bus Number                        0.
    Unit Number                   x0028  Target =   5.
                                         LUN =   0.
    ------- CAM Data -------
    Class                           x22  DEC SIM - SCSI Interface Module
    Subsystem                       x22  DEC SIM - SCSI Interface Module
    Number of Packets                 3.
    
    ------ Packet Type ------       258. Module Name String
    
    Routine Name                         ss_perform_timeout
    
    ------ Packet Type ------       256. Generic String
    
                                         timeout on disconnected request
    
    ------ Packet Type ------      1038. SIM Working Set(SIM_WS)
    Packet Revision                   2.
    
    *flink                    xFFFFFC003FE98500
    *blink                    xFFFFFFFF802652E8
    Controller # for HBA              0.
    Target ID                         5.
    LUN                               0.
    Cam Status                      x00  CCB Request In Progress
    TAG                       x00000000
    Sequence Number               40191.
    Time Stamp                x66E58828
    *nexus                    xFFFFFFFF802652E8
    *it_nexus                 xFFFFFFFF80265E30
    *sim_sc                   xFFFFFFFF80264000
    *ccb                      xFFFFFC003A2FD728
    Phase Bits                x00000000
    Misc Flags                x00080000  Timeout
    Cam Flags                 x00000040  Data Direction (01: DATA IN)
    Error Recovery            x00000080  SIM_WS in process of being timed out
    Recovery Status           x00000000
    (*as_callback)()          x0000000000000000
    *as_ccb                   x0000000000000000
    (*tmo_fn)()               xFFFFFC0000546B60
    *tmo_arg                  xFFFFFC003A2FD500
    Rest of SIM_WS                       ** Not Printed **
    
    Thanks for any help that can be provided.
    
    Howard
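
The entries above can be summarized mechanically. What follows is a minimal Python sketch, not a DECevent feature: it assumes the report has been saved as plain text in the layout quoted in this note, and the names `summarize`, `decode_unit` and `SAMPLE` are made up for illustration. The target/LUN split (unit = target*8 + lun) is inferred from the "x0028  Target = 5, LUN = 0" lines and is an assumption, not documented behavior.

```python
import re
from collections import Counter

# Field labels as they appear in the DECevent report quoted in this note.
SEQ_RE = re.compile(r"Event sequence number\s+(\d+)\.")
TIME_RE = re.compile(r"Timestamp of occurrence\s+(\S+ \S+)")
ROUTINE_RE = re.compile(r"Routine Name\s+(\S+)")
UNIT_RE = re.compile(r"Unit Number\s+x([0-9A-Fa-f]+)")


def decode_unit(unit):
    """Split a unit number into (target, lun).

    ASSUMPTION: unit = target * 8 + lun, which matches x0028 -> (5, 0).
    """
    return (unit >> 3) & 0x7, unit & 0x7


def summarize(report):
    """Return one dict per ENTRY block found in the report text."""
    events = []
    # Each entry starts with a banner containing "ENTRY <n>".
    for block in re.split(r"\*+\s*ENTRY\s+\d+", report)[1:]:
        seq = SEQ_RE.search(block)
        when = TIME_RE.search(block)
        routine = ROUTINE_RE.search(block)
        unit = UNIT_RE.search(block)
        if not (seq and when and routine and unit):
            continue  # not a complete CAM SCSI entry, skip it
        target, lun = decode_unit(int(unit.group(1), 16))
        events.append({
            "seq": int(seq.group(1)),
            "time": when.group(1),
            "routine": routine.group(1),
            "target": target,
            "lun": lun,
        })
    return events


# Trimmed copy of entries 1 and 2 from the log above.
SAMPLE = """
******************************** ENTRY    1 ********************************
Event sequence number            20.
Timestamp of occurrence              14-FEB-1997 16:09:54
Unit Number                   x0028  Target =   5.
Routine Name                         ss_abort_done
******************************** ENTRY    2 ********************************
Event sequence number            19.
Timestamp of occurrence              14-FEB-1997 16:09:54
Unit Number                   x0028  Target =   5.
Routine Name                         ss_perform_timeout
"""

events = summarize(SAMPLE)
counts = Counter(e["routine"] for e in events)
for e in sorted(events, key=lambda e: e["seq"]):
    print(e["seq"], e["time"], e["routine"], "target", e["target"], "lun", e["lun"])
```

Sorted by sequence number, the four entries in .0 show the same pattern twice: an ss_perform_timeout on target 5, followed by an ss_abort_done with the same timestamp.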
T.R  Title  User  Personal Name  Date  Lines
493.1  MAY30::CUMMINS  Fri Feb 14 1997 16:01  2
    Interesting. Both the 2100 and 4000/4100 use an NCR810 as the embedded
    CD/ROM controller.
493.2  POBOXB::BAK  Fri Feb 14 1997 17:24  8
Check the termination on the internal SCSI bus......

If it is a rackmount it should be at the end of the SCSI cable tucked below the
fan tray. If it is a pedestal the terminator should have been moved to the top
tray. If a CD or tape is on top make sure the bus has not been double
terminated. 

	Dennis
493.3  "Me too"  BRADEC::PODOLINSKY  Peter Podolinsky - MCS Slovakia  Fri Feb 14 1997 17:58  5
I have installed 2 RMs and 1 pedestal recently.
On both RMs I had the symptoms of .0; the pedestal worked fine.
I'll check the terminators.
Regards,
	Peter
493.4  "Here the PN# I used"  NWD002::SKRABUT_LA  Larry Skrabut  Fri Feb 14 1997 19:05  11
    	Interesting note,
    
    	I have had problems with the internal SCSI and T/S it down to the
    lack of a terminator. I looked in QRL/Comet/Service Manual/ProSIC IPB,
    and nowhere was there a part number for that terminator that I could
    find. I did talk to someone in SBU, and the way I understood it, any
    active male term should work. I finally ended up using a
    Celebris internal active terminator from cable assy 17-03459-02, or
    Term PN# 12-37791-02 (which is listed by itself for a PB30 DEC 2000
    Model 500). The system in question has run without any customer
    complaints so far.
493.5  "terminator is installed."  SALEM::ARNOLD  Sat Feb 15 1997 18:23  7
    I checked and the terminator is installed at the end of the bus.
    
    It is part # 12-37791-01.  Are there any other jumpers I need to look
    at?  I would think it was a bad part, but I am getting the same
    results on three RM boxes.
    
    Howard
493.6  "I will check the bus..."  VAXRIO::LEANDRO  Sun Feb 16 1997 19:45  11
    Hi,
    
    I have the same problem. Using UNIX I have to send "file" commands to the
    tape (TLZ09) if I want to fix the slow down problem of the RRD45. I was
    thinking I had a bad CDROM, but now I'm not sure about that. An NCR driver?
    
    Any clue?
    
    Tks,
    
    Leandro.
493.7  "cdrom scsi timeouts"  SCASS1::MURPHY  Thu Feb 27 1997 08:50  11
    I have two RMs that demonstrate the exact symptom you described: SCSI
    timeouts and aborts. Both active and passive termination have been tried
    (active made the response worse). I have taken the pedestal sub-system
    (top tray) with 4mm tape, which functions fine on the pedestal, and
    installed it on the RM, and still have the problem. The CD works on the
    pedestal with no errors. My next step is to try the SCSI cable and/or
    saddle module from a known good pedestal in the rackmount. Can you say "escalation"?
    Todd Murphy
    Dallas MCS
    
    
493.8  "cdrom scsi timeout 2"  SCASS1::MURPHY  Fri Feb 28 1997 00:50  15
    I tried swapping some parts from a known good pedestal 4100 into one of
    my rackmounts. I tried the following, in this order: (1) internal SCSI
    cable (with passive term.); (2) adding 4mm tape with term power
    enabled, term disabled, and no term on cable (a good config on the
    pedestal); (3) PCI mother bd. (saddle module) with original cable and
    passive term, no tape; (4) known good saddle module and horse module
    with original cable and term. None of these attempts yielded positive
    results. Under UNIX 3.2G I continue to receive excessive load times on
    the CD and continued SCSI aborts and timeouts entered in DECevent.
    Could this be a firmware issue or maybe the CPU mother bd.? I could use
    some input on this one.
    
    Todd Murphy
    Dallas MCS
    
493.9  "Is it solved ?"  BRADEC::PODOLINSKY  Peter Podolinsky - MCS Slovakia  Sun Mar 16 1997 09:04  7
    Hi all,
    I have checked the termination on 2 RMs (.3); all looks fine, but I
    still get aborts on the bus. Is there a solution for this problem?
    Was it already escalated?
    Regards,
    		Peter
    
493.10  "pka0_fast setting"  SALEM::ARNOLD  Wed Mar 19 1997 10:31  6
    We found a fix to our problem.  At the console, check
    pka0_fast; if it is set to 1, try setting it to 0 and reinit the machine.
    
    Once booted, try loading something and see if it improves.
    
    Howard
493.11  "Please confirm/deny these conclusions"  HARMNY::CUMMINS  Wed Mar 19 1997 12:32  28
    It appears there are several data points here; please correct me where
    I've mis-interpreted or mis-represented the facts:
    
     1. Multiple 4100/4000 installations are seeing slow access times
        on devices hanging off the embedded NCR810. Typically CD/ROM,
        but sometimes configured with tape.
    
     2. Problem is only seen on rackmount systems; pedestals appear to
        be immune to the problem.
    
     3. Slow load times are presumably attributable to SCSI bus errors
        and excessive retries, presumably due to poor signal integrity.
        Multiple machines seen with lots of errors in operating system
        error log. How are the tape drives ordered/configured? At Stage
        II MFG or are they ordered separately and installed onsite?
    
     4. Setting PKA0_FAST to 0 ("slow" mode; default is 1) remedied the
        problem on at least three of the rackmount configs, presumably
        reducing bus signal integrity problems.
    
     5. At least one site found that adding a terminator resolved the
        problem (with PKA0_FAST still set to 1).
    
    Is the above correct? I'll check to see what MFG does re: default
    installation of termination. Presumably either cable length or
    termination is at play here re: the difference between rackmount and
    pedestal. I'll most likely have MFG begin setting PKA0_FAST=0 as the
    default console EV setting. PKA0 is always the embedded NCR810 device.
493.12  "New firmware may fix it."  SALEM::ARNOLD  Wed Mar 19 1997 12:56  12
    The only difference I see between the two versions of 4100's is that
    the rackmount version will only have a CDrom, (no room for a tape
    device).  The pedestal uses an additional cable to go to the tape unit
    if present.  
    
    This morning I upgraded my firmware to SRM 4.8-6 and AlphaBios 5.2-8.
    After loading the same software I didn't see any errors.  So this could
    also be a firmware issue.
    
    I know this doesn't help much.
    
    Howard
493.13  MAY30::CUMMINS  Wed Mar 19 1997 13:20  7
    Are you sure this wasn't because you had already set pka0_fast to 0 on
    these machines? This EV is "sticky". Copy kept in NCR810 NVRAM..
    
    The default setting for SRM's PKA0_FAST has always been 1; at least
    since the V1.2-4 console release. I just verified this from the saved
    source files..
    
493.14  HARMNY::CUMMINS  Tue Mar 25 1997 12:01  6
    Design/support engineering is investigating this as a potential signal
    integrity problem. Until the problem is better understood, the
    recommendation is that customers try setting PKA0_FAST to 0 to see
    whether this helps.
    
     P00>>> set pka0_fast 0
493.15  "Does not appear to be known OpenVMS 810 driver bug"  STAR::jacobi.zko.dec.com::jacobi  Paul A. Jacobi - OpenVMS Systems Group  Tue Mar 25 1997 13:27  10
I know of one issue with the OpenVMS 810 SCSI driver, which is documented 
in ALPHASTATION, note 1776.  So far, this problem has only been known to 
occur with OpenVMS V7.1 when a non-TCQ device, such as the RZ26F, is present
with the CD-ROM.  .6 notes that the problem has been seen under UNIX, so this
does not appear to be related to the known OpenVMS driver issue with RZ26F.


							-Paul

493.16  "Slow DAT now....VMS Backup."  VAXRIO::LEANDRO  Wed Mar 26 1997 08:01  23
    Hi,
    
    Now I have found problems on this bus under VMS, but the slow device is a
    TLZ09. I have changed the DAT (TLZ09), checked terminators, and set
    PKA0_FAST to "0", and nothing helped. Now I'm trying to update the TLZ09
    firmware to 0167. I will do that this week.
    
    4100 5/300
    OpenVMS 6.2-1H3
    firmware from CD 3.8
    
    I would like to know what the parameter PKA0_DISCONNECT does.
    
    The symptom on the DAT is a very slow writing process, compared
    with a DAT TLZ07-VA working on the bus of the KZPDA (ISP1020).
    
    Any help?
    
    Tks,
    
    Leandro.
    
    MCS BRASILIA-BRASIL
493.17  HARMNY::CUMMINS  Wed Mar 26 1997 14:41  6
    Just to make sure..
    
    You have the TLZ09 hanging off the embedded NCR810, yes? Or is it
    hanging off another SCSI controller? It may not be obvious, but
    PKA0_XXXX EVs only apply to the "A" SCSI controller. PKB0_XXXX would
    apply to the next SCSI controller, etc..
493.18  "Hope is clear now..."  VAXRIO::LEANDRO  Wed Mar 26 1997 18:08  5
    Embedded NCR810. It's the bus of CD (RRD45). DKA500 and MKA600.
    
    PKB0 is a KZPDA (ISP1020) option.
    
    Leandro.
493.19  "CDrom cam scsi events"  SCASS1::MURPHY  Fri Apr 04 1997 14:35  13
    I have placed a couple of entries early on in this note. My issue with
    two rackmount 4100s still exists. Since my last entry I have tried
    horse, saddle, SCSI cable, active/passive term., matching the pedestal
    config (i.e. with TLZ09), and setting pka0_fast 0. No combination has had
    success, with the exception of a KZPAA with internal SCSI connection.
    That config ran flawlessly, although this bus then became SCSI bus 3. A
    question was raised in one of the previous notes: "is this really being
    escalated?" The answer is yes. I have been working with engineering for
    several weeks and will update notes as appropriate.
    Todd Murphy
    Dallas MCS
    
    
493.20  "Terminators revisited"  MOVMON::DAVIS  Fri Apr 04 1997 15:46  6
    I know some of the protos had the RRD45 with terminators installed when
    they shouldn't have been.  You've probably checked the termination
    already, but make sure that you're not terminated in the middle of the
    bus as well as the ends.
    
    Todd
493.21  "Terminators - I think this is the problem..."  VAXRIO::LEANDRO  Sat Apr 05 1997 23:36  24
    Hi,
    
    My cases .6 and .16 (RRD45 and TLZ09) are on pedestal systems. I don't
    know about rackmount, but in my two situations I think I have found the
    problem: TERMINATORS. In both systems the SCSI cable on the TLZ09 (frost
    white) doesn't have a terminator and the DAT is not terminated. The CD is
    not terminated either.
    
    Last night when installing four new systems (AS4000/pedestal) I
    checked and saw the difference: the new systems have a different SCSI
    cable on the TLZ09. It has a terminator and is better shielded. The PN of
    this cable is: 17-04306-01 (rev.B03). The terminator PN is:
    12-37791-01 rev.A.
    
    In my two systems I have to order two new cables and two new
    terminators, because the SCSI cables are different and there is no
    place to put the terminators on the old ones.
    
    I think this is the problem I have in both machines. I hope I have
    found the solution.
    
    Tks,
    
    Leandro. 
493.22  "Don't forget TLZxx integral Terminators."  WONDER::TOUSERKANI  Tue Apr 15 1997 16:18  8
    
    
    Just a note that TLZ09 (any TLZ drive for that matter) has embedded
    terminator integral to the drive, that can be enabled/disabled via
    jumpers or dip switches.  Bottom line is to make sure that you are 
    terminated on near end and far end and nowhere else in between.
    
    /Frank
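
The rule in .20 and .22 (terminate both ends of the bus and nothing in between) can be written down as a small check. This is only a minimal Python sketch: the function name and the example bus layouts are hypothetical, and it assumes you list the bus positions in physical cable order, counting the host adapter or a cable-end terminator as a position.

```python
def termination_problems(bus):
    """Check a SCSI bus described as [(name, terminated), ...] in cable order.

    Rule from .20/.22: both physical ends terminated, nothing in between.
    Returns a list of human-readable problems (an empty list means OK).
    """
    if len(bus) < 2:
        return ["bus needs at least two positions (two ends)"]
    problems = []
    # Both ends of the cable must be terminated.
    for name, terminated in (bus[0], bus[-1]):
        if not terminated:
            problems.append(f"{name}: end of bus is not terminated")
    # Nothing between the two ends may be terminated.
    for name, terminated in bus[1:-1]:
        if terminated:
            problems.append(f"{name}: terminated in the middle of the bus")
    return problems


# Hypothetical layouts, for illustration only.
good = [("host adapter", True), ("CD-ROM", False), ("cable-end terminator", True)]
extra = [("host adapter", True), ("CD-ROM", True), ("cable-end terminator", True)]
open_end = [("host adapter", True), ("CD-ROM", False), ("tape", False)]

for label, bus in (("good", good), ("extra", extra), ("open_end", open_end)):
    print(label, termination_problems(bus) or "OK")
```

The `extra` layout corresponds to the case in .20 (a drive shipped with its own terminators enabled in mid-bus), and `open_end` to the case in .21 (no terminator on the cable and none enabled on the last drive).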
493.23  HARMNY::CUMMINS  Tue May 06 1997 12:42  1
    See note 93.46 for a Blitz which may explain the problem seen here..