Conference ssdevo::hsz40_product

Title:HSZ40 Product Conference
Moderator:SSDEVO::EDMONDS
Created:Mon Apr 11 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:902
Total number of notes:3319

851.0. "HSZ50 - KZPSA => SCSI bus reset (reason 0x6) !" by LEMAN::MARTIN_A (Be vigilant...) Wed Apr 23 1997 03:59

    Hello,
    
    
    I opened a note in ask_ssag (#6610) and wonder whether this error
    could be linked to the HSZ50 settings.
    
    These errors just appeared shortly after the HSZ50 installation...
    
    Your advice will be welcome on the matter.
    
    				============================
    				Alain MARTIN/SSG Switzerland
    
    *********************************************************************
    
    We have had a few SCSI bus resets at a customer site over the
    last couple of days.
    
    The customer wants to know the reason, as these errors trigger
    alarms on their system... waking up the operators 8-)
    
    I could not explain them; can someone find any clue?
    
    2 different buses seem to show the problem: SCSI #1 and
    SCSI #2; both are KZPSAs connected to HSZ50s (FW V50Z)!
    
    No device errors are reported!
    
    Error Message :
    
    ****************************************************************
                    Bus reset request from adapter detected
                                         (reason = 0x6)
    ****************************************************************
    
    I discovered in note #6066 (ask_ssag) that reason 0x6 means the 
    following:
    
       Unable_to_arbitrate
                    6   This code shall be used with a Reset Request
                        message when the adapter is not participating
                        in any SCSI bus traffic, has I/O requests,
                        and has not seen an opportunity to arbitrate
                        for the bus for a significant period of time.
                        The method in which the adapter determines
                        that it has been waiting too long to arbitrate
                        for the bus is implementation specific, and
                        is not required to work when commands specify
                        an infinite timeout period (FFFFh).
    
    Here are the system config and the dia output; I hope they show
    some evidence that I am not able to interpret myself...
    
    **************************** ENTRY 1. *******************************
    
    
    ----- EVENT INFORMATION -----
    
    EVENT CLASS                             OPERATIONAL EVENT
    OS EVENT TYPE                  300.     SYSTEM STARTUP
    SEQUENCE NUMBER                  0.
    OPERATING SYSTEM                        DEC OSF/1
    OCCURRED/LOGGED ON                      Sun Apr 13 17:59:03 1997
    OCCURRED ON SYSTEM                      chpd01
    SYSTEM ID                 x00050018
    SYSTYPE                   x00000000
    MESSAGE
        PCXAL keyboard, language Francais (Suisse Romande)
        Alpha boot: available memory from 0x1860000 to 0x3ffee000
        Digital UNIX V3.2D-2 (Rev. 41.64); Sun Apr 13 16:40:27 MET DST 1997
        physical memory = 1024.00 megabytes.
        available memory = 999.61 megabytes.
        using 3924 buffers containing 30.65 megabytes of memory
        Master cpu at slot 0.
        Firmware revision: 4.7
        PALcode: OSF version 1.21
        ibus0 at nexus
        AlphaServer 2100A 5/250
        cpu 0 EV-5 4mb b-cache
        cpu 1 EV-5 4mb b-cache
        cpu 2 EV-5 4mb b-cache
        cpu 3 EV-5 4mb b-cache
        gpc0 at ibus0
        pci0 at ibus0 slot 0
        eisa0 at pci0
        ace0 at eisa0
        ace1 at eisa0
        lp0 at eisa0
        fdi0 at eisa0
        fd0 at fdi0 unit 0
        Initializing xcr0.  Please wait.
        Initializing xcr0.  Please wait.
        Initializing xcr0.  Please wait.
        Initializing xcr0.  Please wait.
        Initializing xcr0.  Please wait.
        xcr0 at eisa0
        re0 at xcr0 unit 0 (unit status = ONLINE, raid level = 0)
        re1 at xcr0 unit 1 (unit status = ONLINE, raid level = 0)
        re2 at xcr0 unit 2 (unit status = ONLINE, raid level = JBOD)
        re4 at xcr0 unit 4 (unit status = ONLINE, raid level = 0)
        re5 at xcr0 unit 5 (unit status = ONLINE, raid level = 0)
        pci2000 at pci0 slot 3
        psiop0 at pci2000 slot 1
        Loading SIOP: script 1007400, reg 81804000, data 406393c8
        scsi0 at psiop0 slot 0
        rz0 at scsi0 bus 0 target 0 lun 0 (DEC     RZ28M    (C) DEC 0616)
        rz1 at scsi0 bus 0 target 1 lun 0 (DEC     RZ28D    (C) DEC 0008)
        rz6 at scsi0 bus 0 target 6 lun 0 (DEC     RRD45   (C) DEC 1645)
        tu0: DECchip 21040-AA: Revision: 2.4
        tu0 at pci2000 slot 6
        tu0: DEC TULIP Ethernet Interface, hardware address: 00-00-F8-20-C1-70
        tu0: console mode: selecting UTP (10BaseT) port
        tu1: DECchip 21140-AA: Revision: 1.2
        tu1 at pci2000 slot 7
        tu1: DEC Fast Ethernet Interface, hardware address: 00-00-F8-03-1F-0E
        tu1: console mode: selecting UTP (100BaseT) port
        Initializing xcr1.  Please wait.
        Initializing xcr1.  Please wait.
        Initializing xcr1.  Please wait.
        Initializing xcr1.  Please wait.
        xcr1 at pci2000 slot 8
        re8 at xcr1 unit 8 (unit status = ONLINE, raid level = 0)
        re9 at xcr1 unit 9 (unit status = ONLINE, raid level = 0)
        re10 at xcr1 unit 10 (unit status = ONLINE, raid level = JBOD)
        re11 at xcr1 unit 11 (unit status = ONLINE, raid level = 0)
        re12 at xcr1 unit 12 (unit status = ONLINE, raid level = 0)
        re13 at xcr1 unit 13 (unit status = ONLINE, raid level = 0)
        re14 at xcr1 unit 14 (unit status = ONLINE, raid level = JBOD)
        vga0 at pci0 slot 6
         1024x768 (S3TRIO  )
        pza0 at pci0 slot 7
        pza0 firmware version: DEC P01 A10
        scsi1 at pza0 slot 0
        rz9 at scsi1 bus 1 target 1 lun 0 (DEC     HSZ50-AX V50Z)
        rz10 at scsi1 bus 1 target 2 lun 0 (DEC     HSZ50-AX V50Z)
        rz12 at scsi1 bus 1 target 4 lun 0 (DEC     HSZ50-AX V50Z)
        rz12 at scsi1 bus 1 target 4 lun 1 (DEC     HSZ50-AX V50Z)
        rz11 at scsi1 bus 1 target 3 lun 0 (DEC     HSZ50-AX V50Z)
        psiop1 at pci0 slot 8
        Loading SIOP: script 1013400, reg 81b32000, data 406397c8
        scsi2 at psiop1 slot 0
        tz16 at scsi2 bus 2 target 0 lun 0 (DEC     TZ877    (C) DEC 9B3C)
        pza1 at pci0 slot 9
        pza1 firmware version: DEC P01 A10
        scsi3 at pza1 slot 0
        rz25 at scsi3 bus 3 target 1 lun 0 (DEC     HSZ50-AX V50Z)
        rz26 at scsi3 bus 3 target 2 lun 0 (DEC     HSZ50-AX V50Z)
        rz27 at scsi3 bus 3 target 3 lun 0 (DEC     HSZ50-AX V50Z)
        rz28 at scsi3 bus 3 target 4 lun 0 (DEC     HSZ50-AX V50Z)
        rz28 at scsi3 bus 3 target 4 lun 1 (DEC     HSZ50-AX V50Z)
        dli: configured
    
    ************************************************************************
    
    
    DECevent V2.2
    
    ******************************** ENTRY    1 ********************************
    
    
    Logging OS                        2. Digital UNIX
    System Architecture               2. Alpha
    Event sequence number             5.
    Timestamp of occurrence              22-APR-1997 13:31:32
    Host name                            chpd01
    
        System type register      x00000018  Systype 24. Not announced yet
        Number of CPUs (mpnum)    x00000004
        CPU logging event (mperr) x00000000
    
        Event validity                    1. O/S claims event is valid
        Event severity                    1. Severe Priority
        Entry type                      199. CAM SCSI Event Type
        ------ Packet Type ------       256. Generic String
    
                                             A SCSI bus reset has been done
    
        ------ Packet Type ------      1078. SIMport Softc(SIMPORT_SOFTC)
        Packet Revision                   2.
    
        *spo_adp                  xFFFFFC003FA088A0
        Adapter State             x00002240  SIMport Thread Started
                                             Path Inquiry info Valid
        Flags - Supported Feature x00000004  Support Linked BSMs
        Max # of Queued Cmds             55.
        # of SCSI Channel                 1.
        Min KEEPALIVE Time(sec)          30.
        Min # of Free Queue               3.
        # of 4K Memory Segments           0.
        Adap Min Data Alignment         x00
        # of SAC Buffers                x00
        CAM Version                     xD0
        SCSI Capabilities               x86
        Target Mode Support             x49
        Miscellaneous Flags             x00
        HBA Engine Count              64512.
        Vendor Unique Flags
    
                  15--<-12  11--<-08  07--<-04  03--<-00   :Byte Order
         0000:    00000000  00000000  00000080  3247FFFF  *..G2...........l*
    
        Private Data Size         x00000000
        Async Capabilities        x00000000
        Highest Path ID                 x38
        SCSI Device ID                  x00
        SIM Vendor ID (ASCII)                M-q
        HBA Vendor ID (ASCII)                ORT/VA13DEC  P01
        *cam_osd_usage            x2020203031412020
        Max CDB Length                    0.
        *spo_sim_softc            x000000000000000C
        *waitq_head               xFFFFFFFF803F1000
        *waitq_tail               xFFFFFC003F998840
        Lock for Wait Queue       x3A16E120
        *spo_adap_sanity_ccb      x000000000049DBF4
        *spo_adap_ccb             xFFFFFC003FE18B28
        # 100 millsec since MIN  1071744808.
        **spo_stl_nexus           x0000000000000000
        # LUNs in Crash Recovery          0.
    
    
        ******************************** ENTRY    2 ********************************
    
    
        Logging OS                        2. Digital UNIX
        System Architecture               2. Alpha
        Event sequence number             4.
        Timestamp of occurrence              22-APR-1997 13:31:32
        Host name                            chpd01
    
        System type register      x00000018  Systype 24. Not announced yet
        Number of CPUs (mpnum)    x00000004
        CPU logging event (mperr) x00000000
    
        Event validity                    1. O/S claims event is valid
        Event severity                    1. Severe Priority
        Entry type                      199. CAM SCSI Event Type
    
        ------- Unit Info -------
        Bus Number                        3.
        Unit Number                   xFFFF  Target =   7.
                                             LUN =   7.
                                             Not Defined
        ------- CAM Data -------
        Class                           x33  SIMport Adapter - KZxSA
        Subsystem                       x33  SIMport Adapter - KZxSA
        Number of Packets                 3.
    
        ------ Packet Type ------       258. Module Name String
    
        Routine Name                         spo_bus_reset_rspn
    
        ------ Packet Type ------       256. Generic String
    
                                             Bus reset request from adapter detected
                                             (reason = 0x6)
    
        ------ Packet Type ------      1078. SIMport Softc(SIMPORT_SOFTC)
        Packet Revision                   2.
    
        *spo_adp                  xFFFFFC003FA088A0
        Adapter State             x00002240  SIMport Thread Started
                                             Path Inquiry info Valid
        Flags - Supported Feature x00000004  Support Linked BSMs
        Max # of Queued Cmds             55.
        # of SCSI Channel                 1.
        Min KEEPALIVE Time(sec)          30.
        Min # of Free Queue               3.
        # of 4K Memory Segments           0.
        Adap Min Data Alignment         x00
        # of SAC Buffers                x00
        CAM Version                     xD0
        SCSI Capabilities               x86
        Target Mode Support             x49
        Miscellaneous Flags             x00
        HBA Engine Count              64512.
        Vendor Unique Flags
    
                  15--<-12  11--<-08  07--<-04  03--<-00   :Byte Order
         0000:    00000000  00000000  00000080  3247FFFF  *..G2...........l*
    
        Private Data Size         x00000000
        Async Capabilities        x00000000
        Highest Path ID                 x38
        SCSI Device ID                  x00
        SIM Vendor ID (ASCII)                M-q
        HBA Vendor ID (ASCII)                ORT/VA13DEC  P01
        *cam_osd_usage            x2020203031412020
        Max CDB Length                    0.
        *spo_sim_softc            x000000000000000C
        *waitq_head               xFFFFFFFF803F1000
        *waitq_tail               xFFFFFC003F998840
        Lock for Wait Queue       x3A16E120
        *spo_adap_sanity_ccb      x000000000049DB8C
        *spo_adap_ccb             xFFFFFC003FE18B28
        # 100 millsec since MIN  1071744808.
        **spo_stl_nexus           x0000000000000000
        # LUNs in Crash Recovery          0.
    
    
        ******************************** ENTRY    3 ********************************
    
    
        Logging OS                        2. Digital UNIX
        System Architecture               2. Alpha
        Event sequence number             3.
        Timestamp of occurrence              21-APR-1997 12:31:32
        Host name                            chpd01
    
        System type register      x00000018  Systype 24. Not announced yet
        Number of CPUs (mpnum)    x00000004
        CPU logging event (mperr) x00000000
    
        Event validity                    1. O/S claims event is valid
        Event severity                    1. Severe Priority
        Entry type                      199. CAM SCSI Event Type
    
    
        ------- Unit Info -------
        Bus Number                        1.
        Unit Number                   xFFFF  Target =   7.
                                             LUN =   7.
                                             Not Defined
        ------- CAM Data -------
        Class                           x33  SIMport Adapter - KZxSA
        Subsystem                       x33  SIMport Adapter - KZxSA
        Number of Packets                 3.
    
        ------ Packet Type ------       258. Module Name String
    
    
        Routine Name                         spo_process_ccb
    
        ------ Packet Type ------       256. Generic String
    
                                             A SCSI bus reset has been done
    
        ------ Packet Type ------      1078. SIMport Softc(SIMPORT_SOFTC)
        Packet Revision                   2.
    
        *spo_adp                  xFFFFFC003FA08420
        Adapter State             x00002240  SIMport Thread Started
                                             Path Inquiry info Valid
        Flags - Supported Feature x00000004  Support Linked BSMs
        Max # of Queued Cmds             55.
        # of SCSI Channel                 1.
        Min KEEPALIVE Time(sec)          30.
        Min # of Free Queue               3.
        # of 4K Memory Segments           0.
        Adap Min Data Alignment         x00
        # of SAC Buffers                x00
        CAM Version                     xD0
        SCSI Capabilities               x86
        Target Mode Support             x49
        Miscellaneous Flags             x00
        HBA Engine Count              64512.
        Vendor Unique Flags
    
                  15--<-12  11--<-08  07--<-04  03--<-00   :Byte Order
         0000:    00000000  00000000  00000080  3247FFFF  *..G2...........l*
    
        Private Data Size         x00000000
        Async Capabilities        x00000000
        Highest Path ID                 x38
        SCSI Device ID                  x00
        SIM Vendor ID (ASCII)                M-q
        HBA Vendor ID (ASCII)                ORT/VA13DEC  P01
        *cam_osd_usage            x2020203031412020
        Max CDB Length                    0.
        *spo_sim_softc            x000000000000000C
        *waitq_head               xFFFFFFFF803EB000
        *waitq_tail               xFFFFFC003FA20420
        Lock for Wait Queue       x3FF63020
        *spo_adap_sanity_ccb      x000000000049DBF4
        *spo_adap_ccb             xFFFFFC003FE07B28
        # 100 millsec since MIN  1071675176.
        **spo_stl_nexus           x0000000000000000
        # LUNs in Crash Recovery          0.
    
        ******************************** ENTRY    4 ********************************
    
    
        Logging OS                        2. Digital UNIX
        System Architecture               2. Alpha
        Event sequence number             2.
        Timestamp of occurrence              21-APR-1997 12:31:32
        Host name                            chpd01
    
        System type register      x00000018  Systype 24. Not announced yet
        Number of CPUs (mpnum)    x00000004
        CPU logging event (mperr) x00000000
    
        Event validity                    1. O/S claims event is valid
        Event severity                    1. Severe Priority
        Entry type                      199. CAM SCSI Event Type
    
    
        ------- Unit Info -------
        Bus Number                        1.
        Unit Number                   xFFFF  Target =   7.
                                             LUN =   7.
                                             Not Defined
        ------- CAM Data -------
        Class                           x33  SIMport Adapter - KZxSA
        Subsystem                       x33  SIMport Adapter - KZxSA
        Number of Packets                 3.
    
        ------ Packet Type ------       258. Module Name String
    
    
        Routine Name                         spo_bus_reset_rspn
    
        ------ Packet Type ------       256. Generic String
    
                                             Bus reset request from adapter detected
                                             (reason = 0x6)
    
        ------ Packet Type ------      1078. SIMport Softc(SIMPORT_SOFTC)
        Packet Revision                   2.
    
        *spo_adp                  xFFFFFC003FA08420
        Adapter State             x00002240  SIMport Thread Started
                                             Path Inquiry info Valid
        Flags - Supported Feature x00000004  Support Linked BSMs
        Max # of Queued Cmds             55.
        # of SCSI Channel                 1.
        Min KEEPALIVE Time(sec)          30.
        Min # of Free Queue               3.
        # of 4K Memory Segments           0.
        Adap Min Data Alignment         x00
        # of SAC Buffers                x00
        CAM Version                     xD0
        SCSI Capabilities               x86
        Target Mode Support             x49
        Miscellaneous Flags             x00
        HBA Engine Count              64512.
        Vendor Unique Flags
    
                  15--<-12  11--<-08  07--<-04  03--<-00   :Byte Order
         0000:    00000000  00000000  00000080  3247FFFF  *..G2...........l*
    
        Private Data Size         x00000000
        Async Capabilities        x00000000
        Highest Path ID                 x38
        SCSI Device ID                  x00
        SIM Vendor ID (ASCII)                M-q
        HBA Vendor ID (ASCII)                ORT/VA13DEC  P01
        *cam_osd_usage            x2020203031412020
        Max CDB Length                    0.
        *spo_sim_softc            x000000000000000C
        *waitq_head               xFFFFFFFF803EB000
        *waitq_tail               xFFFFFC003FA20420
        Lock for Wait Queue       x3FF63020
        *spo_adap_sanity_ccb      x000000000049DB8C
        *spo_adap_ccb             xFFFFFC003FE07B28
        # 100 millsec since MIN  1071675176.
        **spo_stl_nexus           x0000000000000000
        # LUNs in Crash Recovery          0.
    
       
    ***********************************************************************
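
    For anyone comparing several such entries, the following is a minimal
    sketch of how the key fields could be pulled out of the DECevent text.
    The field labels are copied from the output above; the function name
    summarize, the FIELDS table, and the shortened SAMPLE text are
    illustrative assumptions, not part of any DECevent tool.

```python
import re

# Shortened sample built from entries 2 and 4 above (only the lines we parse).
SAMPLE = """\
    Event sequence number             4.
    Timestamp of occurrence              22-APR-1997 13:31:32
    Bus Number                        3.
    Routine Name                         spo_bus_reset_rspn
    Event sequence number             2.
    Timestamp of occurrence              21-APR-1997 12:31:32
    Bus Number                        1.
    Routine Name                         spo_bus_reset_rspn
"""

# One pattern per field; the labels are taken verbatim from the log.
FIELDS = {
    "seq": re.compile(r"Event sequence number\s+(\d+)\."),
    "time": re.compile(r"Timestamp of occurrence\s+(\S+ \S+)"),
    "bus": re.compile(r"Bus Number\s+(\d+)\."),
    "routine": re.compile(r"Routine Name\s+(\S+)"),
}


def summarize(text):
    """Return one dict per entry; a new entry starts at each sequence number."""
    entries = []
    for line in text.splitlines():
        for name, pattern in FIELDS.items():
            match = pattern.search(line)
            if not match:
                continue
            if name == "seq":
                entries.append({})
            if entries:  # ignore fields seen before the first sequence number
                entries[-1][name] = match.group(1)
    return entries


for entry in summarize(SAMPLE):
    print(entry)
```

    Entries that lack a field (entry 1 above has no Unit Info block) simply
    end up without that key.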

851.1. "Believe it's KZPSA FW bug or HSZ50 incompatibility !" by PANTER::MARTIN (Be vigilant...) Thu Apr 24 1997 07:15 (26 lines)

    We still think something is wrong between the HSZ50s and the
    KZPSAs!
    
    We checked for HSZ errors reported by FMU, but found none.
    
    What could have caused these SCSI bus resets: the customer runs
    a hszterm "show disk full" from crontab twice an hour (at 01' and
    31') to make sure all his disks are there...
    
    This has been running for 2 weeks (that is, 48 times/day), but only
    two runs have caused SCSI bus resets with reason 0x6
    (Unable_to_arbitrate)!
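
    For scale, a quick back-of-the-envelope check of how rare the failing
    runs are. This assumes the cron job ran every day for exactly 14 days;
    the "2 weeks" figure is approximate, so treat the result as a rough
    order of magnitude only.

```python
# Rough failure rate of the cron-driven check, using the figures quoted above.
RUNS_PER_DAY = 2 * 24   # twice per hour (at :01 and :31)
DAYS = 14               # assumption: "2 weeks" taken as exactly 14 days
RESETS_SEEN = 2         # runs that coincided with a reason 0x6 bus reset

total_runs = RUNS_PER_DAY * DAYS
failure_rate = RESETS_SEEN / total_runs

print(f"{total_runs} runs, {RESETS_SEEN} resets, {failure_rate:.2%} of runs")
# prints "672 runs, 2 resets, 0.30% of runs"
```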
    
    We checked for disk errors: none!
    We checked the disk FW versions: all RZ28s are at rev 442D and all
    RZ29Bs at rev 16.
    
    So we believe there is something wrong (either a timeout setting or
    a FW bug) that causes the KZPSAs to fail to arbitrate for the SCSI
    bus from time to time with HSZ50s (FW V50Z)!
    
    Is anybody aware of such a problem?
    
    Looking forward to hearing from somebody...
    
    					============================
    					Alain MARTIN/SSG Switzerland

851.2. by SSDEVO::T_GONZALES Mon Apr 28 1997 14:39 (6 lines)

    We have usually seen these errors as a result of a SCSI bus problem.
    
    Check termination at both ends, and also check for bad or loose
    cables and the trilinks on the HSZs. Basically, anything on the
    SCSI bus.
    
    

851.3. "errors come from show device full..." by PANTER::MARTIN (Be vigilant...) Fri May 02 1997 09:31 (27 lines)

    I agree that most of the time SCSI bus resets are caused by bad
    SCSI cabling.
    
    However, that doesn't seem to be our case, as the 2 buses show the
    same problem and we checked that the cables and terminators are
    properly connected (we haven't replaced any cable or terminator,
    though)!
    
    But we were able to determine that the reason 0x6 (unable to
    arbitrate) is caused by a script the customer runs every
    30 minutes; this script issues the hszterm "show device full"
    command.
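
    The timestamps in the DECevent entries are consistent with this. As a
    minimal sketch, the check below compares the two logged reset times
    with the cron start minutes (01' and 31'). The helper name
    seconds_after_cron and the 120-second window are assumptions chosen for
    illustration, and a match in time does not by itself prove the cause.

```python
from typing import Optional

# Timestamps of the two reason 0x6 resets, copied from the DECevent entries.
RESET_TIMES = ["22-APR-1997 13:31:32", "21-APR-1997 12:31:32"]

# Minutes past the hour at which the cron job starts.
CRON_MINUTES = (1, 31)

# Assumption: a reset within this many seconds of a cron start counts as related.
WINDOW_SECONDS = 120


def seconds_after_cron(stamp: str) -> Optional[int]:
    """Seconds since the latest cron start in the same hour, or None if outside the window."""
    _hour, minute, second = (int(part) for part in stamp.split()[1].split(":"))
    offsets = [
        (minute - start) * 60 + second
        for start in CRON_MINUTES
        if minute >= start
    ]
    if not offsets:
        return None
    delay = min(offsets)
    return delay if delay <= WINDOW_SECONDS else None


for stamp in RESET_TIMES:
    print(stamp, "->", seconds_after_cron(stamp))
```

    Both resets come out 32 seconds after the 31' run.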
    
    Each time a SCSI bus reset occurred, we saw from the log that all
    the disks in one of the BA350 shelves did not answer (though not
    always the same shelf)!
    We think this means they were busy doing something at the time of
    the "show device full". Does that make sense?
    
    This script is there to check for disk errors that would not be
    reported at the Digital UNIX level...
    
    Do you know a better way to proceed?
    
    Cheers,
    				============================
    				Alain MARTIN/SSG Switzerland