Conference decwet::networker

Title: NetWorker
Notice: kits - 12-14, problem reporting - 41.*, basics 1-100
Moderator: DECWET::RANDALL.com::lenox
Created: Thu Oct 10 1996
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 750
Total number of notes: 3361

676.0. "tlz6l tape loses power then compression" by REGENT::HUMMERS () Tue May 13 1997 08:21

    Our nsr server (v4.2B + patches up to 11 on D.U V3.2C) experienced a 
    partial power fail on Saturday morning at a time when it should have 
    been idle.  The system, and all disks were 'not' powered off (UPS), 
    but the power that supports the tape drives, TLZ6Ls, was.     
    
    Upon return of power we noticed that the TLZ6Ls were 'full' at
    1900-2000Mb.  "Easy, the compression was not turned back on after
    the power fail", I thought!  I shut down nsr, moved the tape drives
    to the UPS bus and started nsr again.  This morning I see that the
    Incs from last night filled a tape at 1967Mb.
    
    So, does the tape, once started uncompressed, always stay uncompressed 
    or should restarting nsr have fixed it?     
    
    thanks
    \s\rick
     
T.R  Title  User  Personal Name  Date  Lines
676.1  DECWET::RWALKER  Roger Walker - Media Changers  Tue May 13 1997 10:41  6
	This is a hardware/driver issue that has been reported in
	this notes file before.  NetWorker does not control the
	compression, that is done by the driver when the nrmt*h device
	file is used.  Unloading and reloading the tape in the
	drive by hand will 'fix' the problem until the next
	power outage.
676.2  still at 2Gbytes per cartridge  NPSS::HUMMERS  Tue May 20 1997 09:47  21
    Hmm, I had looked for notes with a title of power, tlz6l and compression
    and could not find anything related.  If you can point me to one, that
    might help...
    
    We are still not working at 4Gs per tape.
    
    I have removed all tape cartridges and relabeled all unused tapes;
    the next backups wrote at 2Gs again.  Over the weekend this site was
    powered down; on Sunday we restarted systems, including this nsr server.
    Strange thing was the /dev/*rmt* special files were all missing?
    Perhaps related to the missing devices, we had grown the /usr partition
    at this time by saving /usr, repartitioning and restoring. Anyway, we
    did a MAKEDEV on all of them again and, guess what, 2Gs per tape...
    
    We did a dump to rmt0h and got over 4Gs on that tape.
    
    So any hints would be helpful.
    
    thanks
    \s\Rick
                                                                      
676.3  low rate too  NPSS::HUMMERS  Tue May 20 1997 09:54  3
    Oh, maybe related, we are only writing at 170-190 Kb/s too.
    
    
676.4  wow, there it is...  DECWET::EVANS  NSR Engineering  Tue May 20 1997 11:00  19
the "oh by the way" explains part of it.  by writing at a slow rate, the tape
 drive has insufficient data to cause tape streaming (constant tape media
 motion), ergo, it stops and starts.. each stop takes away valuable storage
 space -- examples I've seen from the tape folks are appx *half*, which matches
 the symptoms you are reporting.

let's go down the big things: what length tape? what firmware rev? what kind
 of data is getting backed up (oracle DB?). what else is going on with
 this system when NetWorker is trying to run?

 for example 60m tapes will never get more than appx 1.5-2G/tape, 90m
 tapes get appx 2.5-3.5 and 120m tapes cannot be used in TLZ06L drives.

  I think the latest rev is 4BQH, but verify that in the STKHLM::MAGTAPE
 conference, ok!?

  Is there a lot of thrashing from the disks while NetWorker is doing the
 backup?? I mean, can you see or hear lots of activity from the disks.. and
 are they being backed up via "save", or via "savegroup"?
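
A quick sanity check on the numbers so far, as a rough sketch only. It takes the approximate per-length capacity ranges quoted above, the 1967Mb tape from .0 and the 170-190 Kb/s rate from .3. It assumes "Kb/s" means kilobytes per second and that 1 GB = 1000 MB and 1 MB = 1000 KB; the function names are made up for illustration and are not part of NetWorker.

```python
# Approximate capacity ranges quoted in this reply, in GB per cartridge.
# These are the rough figures from the note, not drive specifications.
EXPECTED_GB = {
    60: (1.5, 2.0),   # 60 m tape
    90: (2.5, 3.5),   # 90 m tape
}


def lengths_matching(observed_mb):
    """Return the tape lengths whose quoted range contains the observed size."""
    observed_gb = observed_mb / 1000.0  # assumption: 1 GB = 1000 MB
    return [
        metres
        for metres, (low, high) in sorted(EXPECTED_GB.items())
        if low <= observed_gb <= high
    ]


def hours_to_write(size_mb, rate_kb_per_s):
    """Hours needed to write size_mb at a steady rate (assumption: 1 MB = 1000 KB)."""
    return size_mb * 1000.0 / rate_kb_per_s / 3600.0


if __name__ == "__main__":
    print(lengths_matching(1967))
    print(round(hours_to_write(1967, 170), 1), round(hours_to_write(1967, 190), 1))
```

With these assumptions a 1967Mb tape falls inside the range quoted for 60m media and below the range quoted for 90m media, and takes roughly three hours to fill at the reported rate.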
676.5  The rest of the story  REGENT::HUMMERS  Wed May 21 1997 09:40  50
! let's go down the big things: what length tape? what firmware rev? what kind
!  of data is getting backed up (oracle DB?). what else is going on with
!  this system when NetWorker is trying to run?

OK, but first let me restate that this system was running 4Gb per cartridge for
the first 2 months of its life with all the same hardware/software...

That said, on to the speeds and feeds

90 meter cartridges, the internal tlz06 is at FW 4BQE but the TLZ6Ls are at
0491 (I have been waiting for FS to find an upgrade cartridge here).   We back
up a bunch of U*ix fileservers that total just over 120Gb.   Fulls go once a
month on 2-5 systems each Sunday morning.  There is a mix of Ultrix and D.U.
fileservers, total of 11 systems (counting this server). No SQL, no Oracle,
no NT (yet), just UFS bytes. 

The system is a 3000 M800 with 160 Mb, FW 6.2, running V3.2C and NSR, err I
mean networker, V4.2B (patches up to 11 as appropriate).  It nfs-serves /usr
(read only) to several (less than 30 active is a good guess) DECathena clients
as its only other function in life.   The root, /usr, /var and one quarter of
its swap is on one internal rz26, three quarters of its swap is on an
internal dedicated rz25.   The CDrom, TLZ06, and internal RZ25+26 are on the
internal SCSI. There is a dual SCSI (PMmumbley_something) that has only a
TLZ6L on each channel.   There is a single SCSI (PMmumbly_something) that has
4 RZ28s for index files, /nsr/index is spread over three of the RZ28s, no two
systems scheduled for FULLs are on the same index disk.    The 'wire' to the
world of clients is a fddi and most clients are fddi.   The system is not
logging any errors, save a few SCSI CAM errors. 

I do label all the cartridges on the internal TLZ06 (the one that I get 4Gbs
on doing a dump -0uf /dev/[n]rmt0h) and generally only back up to the two
TLZ6Ls, although the TLZ06 sometimes gets pressed into service for backups if
we are running behind.    The TLZ06 is always where the restores get done
from. 

!   Is there a lot of thrashing from the disks while NetWorker is doing the
!  backup?? I mean, can you see or hear lots of activity from the disks.. and
!  are they being backed up via "save", or via "savegroup"?

the disks are very quiet based on green-LED activity.
backups are all NSR server scheduled (does this answer the save/savegroup 
question?)

Now my question.
Is the density determined at LABEL time and from then on a particular
cartridge is always that density or can compression be turned on/off on a
per save basis?  

thanks
\s\Rick
676.6  DECWET::RWALKER  Roger Walker - Media Changers  Wed May 21 1997 10:12  20
	Compression on a DAT drive is set by a mode select by the driver
	and may change for every data block written if the software
	desires.

	NetWorker does not deal with the device at this level on UNIX.
	The device file used selects the compression, a /dev/nrmt*h file
	selects compression, /dev/nrmt*a does not.
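
The naming rule just described can be sketched as a small lookup. This is only an illustration: the regular expression is an assumption about the shape of the device names (an optional leading "n", a unit number and a one-letter suffix), and only the 'h' and 'a' suffixes mentioned here are mapped. Anything else is reported as unknown rather than guessed.

```python
import re

# Suffix meanings as described in this reply:
# 'h' selects compression, 'a' does not. Other suffixes are not covered here.
_SUFFIX_COMPRESSES = {"h": True, "a": False}

# Assumed name shape: /dev/rmtNx or /dev/nrmtNx (N = unit number, x = suffix).
_DEVICE_RE = re.compile(r"/dev/n?rmt(\d+)([a-z])")


def selects_compression(device_path):
    """Return True or False for the 'h' and 'a' device files, None if unknown."""
    match = _DEVICE_RE.fullmatch(device_path)
    if match is None:
        return None
    return _SUFFIX_COMPRESSES.get(match.group(2))


if __name__ == "__main__":
    for path in ("/dev/nrmt0h", "/dev/nrmt0a", "/dev/rmt0h"):
        print(path, selects_compression(path))
```

Note that this only says which mode the device file asks for. Whether the drive actually ends up compressing depends on the driver, as explained next.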

	Now it is up to the tape driver, knowing that you are using the
	'h' or 'a' device, to set the compression.  The driver sets the
	compression every time it gets a unit attention telling it 
	that the tape has been changed.  (The drive clears the compression
	every time a tape is loaded).

	Here's the problem.  The TLZ6L does not report the unit attention
	to both the changer driver and the tape driver as required by 
	the SCSI-2 spec.  If the changer opens the device before the tape
	driver, the tape driver does not know that the tape drive needs
	the compression bit set so the data is not compressed.  We can't
	control or change this.
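
The ordering problem described above can be illustrated with a toy model. This is a simplified sketch of the sequence in this reply, not driver code: the class, the function names and the `reports_to_all` flag are all hypothetical. `reports_to_all=True` stands for the behaviour the reply says the spec requires, and `False` stands for the TLZ6L behaviour described here.

```python
class Drive:
    """Toy model of a drive that owes a unit attention after each tape load."""

    def __init__(self, reports_to_all):
        self.reports_to_all = reports_to_all
        self.compression = False
        self.pending = set()

    def load_tape(self):
        # The drive clears compression every time a tape is loaded
        # and owes a unit attention to both drivers.
        self.compression = False
        self.pending = {"changer", "tape"}

    def open_by(self, who):
        """Return True if this opener sees the unit attention."""
        seen = who in self.pending
        if self.reports_to_all:
            self.pending.discard(who)
        else:
            # The first opener consumes the unit attention for everyone.
            self.pending.clear()
        return seen


def tape_driver_open(drive):
    # The tape driver only sets compression when it sees a unit attention.
    if drive.open_by("tape"):
        drive.compression = True


def run(reports_to_all, changer_first):
    """Load a tape, open the drivers in the given order, return compression state."""
    drive = Drive(reports_to_all)
    drive.load_tape()
    if changer_first:
        drive.open_by("changer")
    tape_driver_open(drive)
    return drive.compression


if __name__ == "__main__":
    print("reports to both, changer first:", run(True, True))
    print("reports once, changer first:  ", run(False, True))
    print("reports once, tape first:     ", run(False, False))
```

In this model compression is only lost in the one case where the unit attention is reported once and the changer opens the device first, which is the case described above.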
676.7  is 4BQH the answer?  REGENT::HUMMERS  Wed May 21 1997 12:19  21
    Thanks for the info on compression, that helps
    
    !	Here's the problem.  The TLZ6L does not report the unit attention
    !	to both the changer driver and the tape driver as required by 
    !	the SCSI-2 spec.  If the changer opens the device before the tape
    !	driver, the tape driver does not know that the tape drive needs
    !	the compression bit set so the data is not compressed.  We can't
    !	control or change this.
    
    So are you saying that this is my 2Gb vs 4Gb problem?   If so then
    further questions are - 
    1. will upgrading to 4BQH (or whatever is current) help my problem?  
    2. why did this server work for the first two months of its life
    	at 4Gb then 'go off key'?  Could that have been the power to the 
        drive alone going off (.0)?
    
    if this is not an explanation for my current problem, where do I go
    from here?
    
    thanks
    \s\Rick
676.8  DECWET::RWALKER  Roger Walker - Media Changers  Wed May 21 1997 12:21  3
	I have not heard of any firmware that fixes this problem.
	I also do not have any idea where you could go since we 
	can't help here.
676.9  Sigh  REGENT::HUMMERS  Thu May 22 1997 07:22  8
    Thanks for your help with this, at least I have a better handle on
    the problem, still V puzzling why it worked 100% of the time for the 
    first two months and then stained the sheets in the last month?  
    
    This is a 'bumma' sooo do you think if I grump and grouse in the 
    ASK_SSAG conference I could get this fixed?   
    
    \s\Rick
676.10  DECWET::RWALKER  Roger Walker - Media Changers  Thu May 22 1997 09:04  3
	DIGITAL doesn't buy these units from the vendor any more so I don't
	expect we can get any fixed firmware.  The only option I know
	of is to change to a TLZ9L which is made by a different vendor.
676.11  workaround?  REGENT::HUMMERS  Tue Jun 03 1997 08:47  10
    Well I found that the drives were defaulting to non-compress mode (the
    default switch setting).  I switched the two tlz6ls to be compressed by 
    default and have been backing up at an average of 4500Mb per tape for 
    the last few days.
    
    It's still a mystery why this changed after two months of consistent
    compressed mode, to consistently uncompressed?
    
    thanks for your time and patience
    \s\Rick