
Conference stkhlm::magtape

Title: MAGNETIC TAPEDRIVES
Moderator: STKHLM::GJOHNSSON
Created: Mon Sep 21 1987
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3775
Total number of notes: 13147

3694.0. "TL822 stuck tape, then disappearing drive" by SANITY::LEMONS (And we thank you for your support.) Tue Mar 11 1997 15:55

Hi,                                                                             
    
    We have a TL822 attached to an AlphaServer 2000 running Digital UNIX
    V4.0A that is exhibiting the problem described below.  So far, it has
    defied resolution by RDC:
                                                                                
    Yesterday there was a problem with drive /dev/nrmt0h, where the tape was
    physically stuck in the drive.  The robotic arm on the jukebox attempted to
    remove the tape, but was unable to.  The tape was removed by hand and the
    jukebox was allowed to do its normal reset.  After this, the drive had a
    problem reading the label of the tape loaded in it.  What seems to have
    happened is that the drive was dropped from the SCSI bus after timing out
    trying to remove the tape:
                                                                                
# file /dev/nrmt0h
/dev/nrmt0h:    character special (9/19459)
# file /dev/nrmt1h
/dev/nrmt1h:    character special (9/36867) SCSI #2 TZ88 tape #10 (SCSI ID #4) (SCSI LUN #0) 81630_bpi
# file /dev/nrmt2h
/dev/nrmt2h:    character special (9/37891) SCSI #2 TZ88 tape #11 (SCSI ID #5) (SCSI LUN #0) 81630_bpi
#                                                                               
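
The minor number that file still prints for /dev/nrmt0h appears to carry the drive's bus and SCSI ID even though the descriptive text is gone. Here is a minimal sketch, assuming a bit layout that happens to fit every device number quoted in this topic (bus in bits 14 and up, SCSI ID in bits 10-13, LUN in bits 6-9, low six bits device-specific). The layout is inferred from these numbers only, not taken from Digital UNIX documentation, and decode_minor is a hypothetical helper:

```python
def decode_minor(minor):
    """Split a device minor number into (bus, scsi_id, lun, low_bits).

    ASSUMPTION: the bit layout is inferred from the device numbers
    quoted in this topic; it is not taken from vendor documentation.
    """
    bus = minor >> 14
    scsi_id = (minor >> 10) & 0xF
    lun = (minor >> 6) & 0xF
    low_bits = minor & 0x3F  # device-specific flags (meaning not decoded here)
    return bus, scsi_id, lun, low_bits


# Minor numbers copied from the file output in this topic.
for name, minor in [("nrmt0h", 19459), ("nrmt1h", 36867), ("nrmt2h", 37891)]:
    bus, scsi_id, lun, low_bits = decode_minor(minor)
    print(f"{name}: bus {bus}, SCSI ID {scsi_id}, LUN {lun}, low bits {low_bits}")
```

Under this assumption nrmt0h decodes to bus 1, SCSI ID 3, LUN 0, which agrees with PATH ID 1 and TARGET ID 3 in the error log entry for the TZ88 below.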
                                                                                
    This problem has occurred before, and RDC suggested power cycling the
    system and jukebox.
    
    Here, I believe, is the uerf -o full output for an event that was
    logged in relation to this problem:
    
    ********************************* ENTRY    26. *********************************
    
    ----- EVENT INFORMATION -----
    
    EVENT CLASS                             ERROR EVENT
    OS EVENT TYPE                  199.     CAM SCSI
    SEQUENCE NUMBER                 64.
    OPERATING SYSTEM                        DEC OSF/1
    OCCURRED/LOGGED ON                      Wed Mar  5 16:10:18 1997
    OCCURRED ON SYSTEM                      robot1
    SYSTEM ID                 x00060009     CPU TYPE:  DEC 2100
    SYSTYPE                   x00000000
    
    ----- UNIT INFORMATION -----
    
    CLASS                         x0001     TAPE
    SUBSYSTEM                     x0000     DISK
    BUS #                         x0001
                                  x0058     LUN x0
                                            TARGET x3
    
    ----- CAM STRING -----
    ROUTINE NAME                            ctape_move_tape
    
    ----- CAM STRING -----
    
                                            Unexpected CCB status
    
    ----- CAM STRING -----
    
    ERROR TYPE                              Hard Error Detected
    
    ----- CAM STRING -----
    
    DEVICE NAME                             DEC     TZ88     (C) DEC.TZ88
    
    ----- CAM STRING -----
    
                                            Active CCB at time of error
    
    ----- CAM STRING -----
    
                                            Target selection timeout
    ERROR - os_std, os_type = 11, std_type = 10
    
    ----- ENT_CCB_SCSIIO -----
    
    *MY ADDR                  x07FC8800
    CCB LENGTH                    x00C0
    FUNC CODE            x01
    CAM_STATUS                    x004A     CAM_SEL_TIMEOUT
                                            SIM QFRZN
    PATH ID              1.
    TARGET ID            3.
    TARGET LUN           0.
    CAM FLAGS                 x000000C0
                                            CAM_DIR_NONE
    *PDRV_PTR                 x07FC84A8
    *NEXT_CCB                 x00000000
    *REQ_MAP                  x00000000
    VOID (*CAM_CBFCNP)()      x004990F8
    *DATA_PTR                 x00000000
    DXFER_LEN                 x00000000
    *SENSE_PTR                x07FC84D0
    SENSE_LEN            x48
    CDB_LEN              x06
    SGLIST_CNT                    x0000
    CAM_SCSI_STATUS               x0000     SCSI_STAT_GOOD
    SENSE_RESID          x00
    RESID                     x00000000
    CAM_CDB_IO           x000000000000000000000001
    CAM_TIMEOUT               x00000E10
    MSGB_LEN                      x0000
    VU_FLAGS                      x0000
    TAG_ACTION           x00
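
The CAM_STATUS word in the entry above packs a completion code and flag bits into one value. Here is a minimal sketch of how x004A splits into the two strings uerf printed, assuming the status layout from the CAM specification (low six bits are the code, 0x40 is the frozen-queue flag, 0x80 is the autosense-valid flag). Only codes relevant to this topic are listed, and decode_cam_status is a hypothetical helper:

```python
# Status layout as defined by the CAM specification; partial table only.
CAM_STATUS_MASK = 0x3F    # low six bits hold the completion code
CAM_SIM_QFRZN = 0x40      # SIM queue is frozen
CAM_AUTOSNS_VALID = 0x80  # autosense data is valid

STATUS_NAMES = {
    0x01: "CAM_REQ_CMP",
    0x04: "CAM_REQ_CMP_ERR",
    0x0A: "CAM_SEL_TIMEOUT",
    0x0B: "CAM_CMD_TIMEOUT",
}


def decode_cam_status(value):
    """Return the completion code name followed by any flag names."""
    code = value & CAM_STATUS_MASK
    parts = [STATUS_NAMES.get(code, f"unknown code {code:#04x}")]
    if value & CAM_SIM_QFRZN:
        parts.append("SIM QFRZN")
    if value & CAM_AUTOSNS_VALID:
        parts.append("AUTOSNS_VALID")
    return parts


print(decode_cam_status(0x4A))  # ['CAM_SEL_TIMEOUT', 'SIM QFRZN']
```

The same function applied to x0084, the value in the media changer entry that follows, yields CAM_REQ_CMP_ERR with AUTOSNS_VALID. A selection timeout means the target never answered on the bus, while a completed-with-error status with valid autosense means the changer did answer and returned sense data.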
    
    Several days later, we seemed to have the same type of problem occur,
    this time with the media changer:
    
    ********************************* ENTRY     3. *********************************
    
    ----- EVENT INFORMATION -----
    
    EVENT CLASS                             ERROR EVENT
    OS EVENT TYPE                  199.     CAM SCSI
    SEQUENCE NUMBER                  5.
    OPERATING SYSTEM                        DEC OSF/1
    OCCURRED/LOGGED ON                      Sun Mar  9 12:54:41 1997
    OCCURRED ON SYSTEM                      robot1
    SYSTEM ID                 x00060009     CPU TYPE:  DEC 2100
    SYSTYPE                   x00000000
    
    ----- UNIT INFORMATION -----
    
    CLASS                         x0008     MEDIA CHANGER
    SUBSYSTEM                     x0000     DISK
    BUS #                         x0001
                                  x0050     LUN x0
                                            TARGET x2
    
    ----- CAM STRING -----
    
    ROUTINE NAME                            changer_ready
    
    ----- CAM STRING -----
    
                                            Failed to ready
    
    ----- CAM STRING -----
    
    ERROR TYPE                              Hard Error Detected
    
    ----- CAM STRING -----
    
    DEVICE NAME                             DEC     TL820    (C) DEC.TL820   
    (
    
    ----- CAM STRING -----
    
                                            Active CCB at time of error
    
    ----- CAM STRING -----
    
                                            CCB request completed with an error
    ERROR - os_std, os_type = 11, std_type = 10
    
    ----- ENT_CCB_SCSIIO -----
    
    *MY ADDR                  x07FBD100
    CCB LENGTH                    x00C0
    FUNC CODE            x01
    CAM_STATUS                    x0084     CAM_REQ_CMP_ERR
                                            AUTOSNS_VALID
    PATH ID              1.
    TARGET ID            2.
    TARGET LUN           0.
    CAM FLAGS                 x000000C0
                                            CAM_DIR_NONE
    *PDRV_PTR                 x07FBCDA8
    *NEXT_CCB                 x00000000
    *REQ_MAP                  x00000000
    VOID (*CAM_CBFCNP)()      x0024981C
    *DATA_PTR                 x00000000
    DXFER_LEN                 x00000000
    *SENSE_PTR                x07FBCDD0
    SENSE_LEN            x4E
    CDB_LEN              x06
    SGLIST_CNT                    x0000
    CAM_SCSI_STATUS               x0002     SCSI_STAT_CHECK_CONDITION
    SENSE_RESID          x3F
    RESID                     x00000000
    CAM_CDB_IO           x000000000000000000000000
    CAM_TIMEOUT               x0000001E
    MSGB_LEN                      x0000
    VU_FLAGS                      x0000
    TAG_ACTION           x00
    
    ----- CAM STRING -----
    
                                            Error, exception, or abnormal
                                             _condition
    
    ----- CAM STRING -----
    
                                            NOT READY - Logical unit is NOT ready
    
    ----- ENT_SENSE_DATA -----
    
    ERROR CODE                    x0070     CODE x70
    SEGMENT              x00
    SENSE KEY                     x0002     NOT READY
    INFO BYTE 3          x00
    INFO BYTE 2          x00
    INFO BYTE 1          x00
    INFO BYTE 0          x00
    ADDITION LEN         x07
    CMD SPECIFIC 3       x00
    CMD SPECIFIC 2       x00
    CMD SPECIFIC 1       x00
    CMD SPECIFIC 0       x00
    ASC                  x04
    ASQ                  x03
    FRU                  x00
    SENSE SPECIFIC       x030000
    ADDITIONAL SENSE
    0000:   00000000  00000000  00000000  00000000       *................*
    0010:   00000000  00000000  00000000  00000000       *................*
    0020:   00000000  00000000  00000000  00000000       *................*
    0030:   00000000  00000000  00000000  00000000       *................*
    0040:   7E250000  00005E3C  00000000  00000000       *..%~<^..........*
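
The sense data above can be read more precisely than the "NOT READY" string uerf prints. Here is a minimal sketch that looks up the sense key together with the ASC and the qualifier byte (labelled ASQ in the dump), using a deliberately partial table of standard SCSI values. describe_sense is a hypothetical helper, and any vendor-specific meaning the TL820 firmware may attach to these bytes is not covered:

```python
# Partial tables of standard SCSI sense values; not exhaustive.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}

ASC_ASCQ = {
    (0x04, 0x00): "Logical unit not ready, cause not reportable",
    (0x04, 0x01): "Logical unit is in process of becoming ready",
    (0x04, 0x02): "Logical unit not ready, initializing command required",
    (0x04, 0x03): "Logical unit not ready, manual intervention required",
}


def describe_sense(key, asc, ascq):
    """Return a one-line description of a sense key / ASC / ASCQ triple."""
    key_text = SENSE_KEYS.get(key, f"sense key {key:#x}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC {asc:#04x} / ASCQ {ascq:#04x} not in table")
    return f"{key_text}: {detail}"


# Values from the ENT_SENSE_DATA section above.
print(describe_sense(0x2, 0x04, 0x03))
```

For this entry the lookup gives "manual intervention required", which would be consistent with a changer that has given up on a cartridge it could not extract, though the log alone does not prove that.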
    
    Please let me know what other information I can provide.
    
    Thoughts?
    
    Thanks!
    tl
T.R    Title    User    Personal Name    Date    Lines
3694.1. "A {perhaps} lucid summary" by SANITY::LEMONS (And we thank you for your support.) Tue Mar 11 1997 16:00  18 lines
    Hi
    
    In re-reading the mass of information I posted in .0, I thought I'd
    better add this clarification.  I believe we have two problems:
    
    1. Occasionally, the media changer is unable to remove a tape from
    drive #0 (the bottom tape drive) of our TL822;
    2. When Problem #1 occurs, either the tape drive or the media changer
    seems to disappear from the SCSI bus, and we seem to need to reload to
    bring it back.
    
    I'm guessing that Problem #2 happens in response to Problem #1.  So,
    have there been any other reports of problems with removing tapes from
    drives via the media changer?  Are there any repairs and/or
    calibrations that FS should be checking?
    
    Thanks!
    tl
3694.2. by SANITY::LEMONS (And we thank you for your support.) Tue Mar 11 1997 16:23  35 lines
    More information about this problem from Rich Witherow:
    
    "I was able to recreate the problem we have been having on Robot1. 
    I enabled drive /dev/nrmt0h and started up the cloning process.  After
    the NetWorker software loaded tape O10025 into this drive it was unable
    to read the label (and finally failed with 'no such device or address';
    I stopped the cloning process after the failure to read the label). A
    check of the device found that it had been dropped from the SCSI bus:
    
    # file /dev/nrmt0h | more
    /dev/nrmt0h:    character special (9/19459)
    
    This should have been:
    /dev/nrmt0h:    character special (9/19459)  SCSI #1 TZ88 tape #10 (SCSI ID #3) (SCSI LUN #0)  81630_bpi
    
    
    Here are all the devices used (/dev/mc10 is for the robotics) as
    currently seen:
    # file /dev/mc10 | more
    /dev/mc10:      character special (52/18432) SCSI #1 TL820 special_device #80 (SCSI ID #2) (SCSI LUN #0)
    # file /dev/nrmt0h | more
    /dev/nrmt0h:    character special (9/19459)
    # file /dev/nrmt1h | more
    /dev/nrmt1h:    character special (9/36867) SCSI #2 TZ88 tape #11 (SCSI ID #4) (SCSI LUN #0) 81630_bpi
    # file /dev/nrmt2h | more
    /dev/nrmt2h:    character special (9/37891) SCSI #2 TZ88 tape #12 (SCSI ID #5) (SCSI LUN #0) 81630_bpi"
    
    tl
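
The check Rich describes, looking for a file line that lacks its SCSI description, can be applied to saved output in one pass. Here is a minimal sketch, assuming each device's file output has been joined onto a single line. The sample strings are copied from this note, and parse_file_line and dropped_devices are hypothetical helpers:

```python
import re


def parse_file_line(line):
    """Return (device, bus, scsi_id); bus and scsi_id are None when absent."""
    device = line.split(":", 1)[0].strip()
    bus = re.search(r"SCSI #(\d+)", line)
    scsi_id = re.search(r"SCSI\s+ID #(\d+)", line)
    return (
        device,
        int(bus.group(1)) if bus else None,
        int(scsi_id.group(1)) if scsi_id else None,
    )


def dropped_devices(lines):
    """List the devices whose file output carries no SCSI bus information."""
    return [dev for dev, bus, _ in map(parse_file_line, lines) if bus is None]


sample = [
    "/dev/mc10:      character special (52/18432) SCSI #1 TL820 "
    "special_device #80 (SCSI ID #2) (SCSI LUN #0)",
    "/dev/nrmt0h:    character special (9/19459)",
    "/dev/nrmt1h:    character special (9/36867) SCSI #2 TZ88 tape #11 "
    "(SCSI ID #4) (SCSI LUN #0) 81630_bpi",
    "/dev/nrmt2h:    character special (9/37891) SCSI #2 TZ88 tape #12 "
    "(SCSI ID #5) (SCSI LUN #0) 81630_bpi",
]
print(dropped_devices(sample))  # ['/dev/nrmt0h']
```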
3694.3. by DECWET::RWALKER (Roger Walker - Media Changers) Tue Mar 11 1997 18:46  8 lines
	First you need to get the drive fixed so this does not
	continue.  I suppose you tried a cleaning tape just in case.

	If a device disappears, the best way to get it back is the
	scu command "scu scan edt bus n", where n in your case would be 1.

	If this does not work, power cycle the device and try the
	scu command again.
3694.4. by SANITY::LEMONS (And we thank you for your support.) Tue Mar 11 1997 19:58  37 lines
    Roger
    
    Thanks for the ideas.
    
    "        First you need to get the drive fixed so this does not
            continue.  I suppose you tried a cleaning tape just in case."
    
    Since this problem first appeared, FS installed the TL820-to-TL822
    upgrade.  So we've completely replaced the TZ87 drives with TZ88
    drives.
    
    We've manually taken the tape out of the drive, and have noticed that
    the cartridge comes out about an inch, then it won't move any further
    than that.  Then, we push it back in, push the lever down, wait for the
    tape to find BOT, then hit the LOAD/UNLOAD button, wait for the tape to
    unload, flip the lever up, and, voila, the tape comes right out.  So,
    it seems to be an intermittent problem with the drive itself.  We've
    noticed this problem after both data tapes and cleaning tapes.
    
    We tried:
    
    # scu scan edt bus 1
    Scanning bus 1, please be patient...
    #
    This didn't make the missing tape drive on this bus appear.
    
    We then tried:
    
    # scu scan edt bus 2
    Scanning bus 2, please be patient...
    
    and after 5 minutes, it still hasn't returned.
    
    This feels like two separate problems.  What do you think?
    
    Thanks very much!
    tl
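
Since the bus 2 scan above never came back, a rescan wrapper with a time limit may be handy when this is scripted. Here is a minimal sketch, assuming that scu is on the PATH, that it exits non-zero on failure, and that a stuck scan can actually be killed from user space. None of those assumptions are confirmed in this topic, and scan_bus is a hypothetical helper:

```python
import subprocess


def scan_bus(bus, timeout_s=300, runner=subprocess.run):
    """Run 'scu scan edt bus <bus>' and return 'ok', 'failed' or 'timeout'.

    ASSUMPTION: scu signals failure through its exit status; 'ok' only
    means the command returned, not that a missing device reappeared.
    The runner parameter exists so the function can be tried without scu.
    """
    cmd = ["scu", "scan", "edt", "bus", str(bus)]
    try:
        result = runner(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"
```

A call such as scan_bus(1) would still need to be followed by the file check on /dev/nrmt0h, because in this note the bus 1 scan returned promptly and the drive stayed missing.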
3694.5. "More data, wild ideas" by SANITY::LEMONS (And we thank you for your support.) Tue Mar 11 1997 20:14  29 lines
    Again, we seem to have two problems:  a hardware problem, where the
    tape can't be removed from the drive, and a software problem, where the
    system loses track of the tape drives and/or media changer when the
    hardware problem occurs.  To my knowledge, the inability to remove a
    tape from a drive has been confined to drive 0 (the bottom TL82n drive).
    We had this problem on our TL820 before the upgrade to TL822, and
    now are having it after the upgrade.  The problem has been seen,
    therefore, on two different drives, and two different drive types
    (TZ87, then TZ88).
    
    We've manually taken the tape out of the drive, and have noticed
    that the cartridge comes out about an inch, then it won't move any
    further than that.  Then, we push it back in, push the lever down,
    wait for the tape to find BOT, then hit the LOAD/UNLOAD button,
    wait for the tape to unload, flip the lever up, and, voila, the tape
    comes right out.   So, it seems to be an intermittent problem with the
    drive itself.
    
    Wild idea time:  could a misalignment of the media changer arm to the
    bottom drive cause this problem?  That is, if either the media changer
    or the bottom drive was slightly tilted vertically, such that
    extracting a tape did not pull it out straight, but at a slight up or
    down angle, could that cause the tape to get jammed during the
    extraction process?  We've noticed that if we manually push the tape in,
    load, unload and pull the tape out BY HAND, we don't see this jamming
    problem.
    
    Thanks!
    tl
3694.6. "Could be an alignment problem" by SUBSYS::alcor.shr.dec.com::smith (Apps Engineer) Wed Mar 12 1997 14:51  12 lines
	It (cartridge being stuck) could be related to a 
mis-alignment...  Have FS get an alignment kit and verify that 
everything is aligned properly.
	The reason the tape gets "stuck" might be related to trying 
to remove the cartridge before the leader has been released. The 
library variant of the drives have different FW than the desktop 
drives. You should wait a few seconds after the tape ejects to make 
sure the leader has released before trying to remove the cartridge 
from the drive...


Joe