Conference cookie::archive_backup

Title:	Archive/Backup

Moderator:	COOKIE::MHUAIG

Created:	Wed Sep 08 1993
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	479
Total number of notes:	2283

446.0. "disk skipped for a fatal error" by ROM01::MENCHICCHI () Thu Apr 24 1997 09:20

    Hi,
    
    A customer has a save request that backs up 4 disks.
    During the backup of the second disk, a fatal error occurred on the drive.
    ABS then retired the current volume and, after n minutes,
    mounted another volume on another drive in order to continue
    the backup operation.
    
    That is OK, but ABS then went on to the next disk (the third), leaving the
    whole backup incomplete (the second disk was not saved).
    
     
    How can I tell ABS not to skip the current disk (the second) but to
    retry the backup of that same disk?
    
    
    Thanks, 
    Lorena

T.R	Title	User	Personal Name	Date	Lines
446.1		COOKIE::MHUA		`Thu Apr 24 1997 11:41`	17
	If the disk has a fatal error and the same operation is retried a few minutes later, it will most likely fail again, won't it? With a fatal error, the right thing is to go on to the next operation, isn't it? Currently there is no way to have ABS redo the operation on the failed disk only in the scenario you have described. If you want to make it easy to redo a particular disk backup (this has to be done manually, since ABS will not know when you have fixed the disk problem), I recommend splitting the 4-disk backup into 4 separate save requests. Thanks, Masami
446.2	drive error and not disk error	ROM01::MENCHICCHI		`Tue Apr 29 1997 04:17`	14
	Sorry, Masami, but in my note I wrote that 'a fatal error occurs on the drive', NOT on the disk. So it is correct that ABS changes the drive, but it is not right that ABS goes on to the next operation. You recommend splitting the 4-disk backup into 4 separate save requests, but my 4 disks contain a single Oracle table, so to get a valid backup I must have all 4 disks saved at the same time. If ABS skips a disk, my backup is no good! The customer needs an answer. Thanks, Lorena
446.3	retry parameters in env	COOKIE::MHUA		`Wed Apr 30 1997 10:29`	10
	In that case, check the retry count and retry interval in the execution environment object that is used for the operation. The failed backup will be retried the specified number of times before ABS goes on to the next disk save. Will that help? Masami
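	A minimal sketch of the behavior described in 446.3, for illustration only; it is not ABS code. The names `run_save_request`, `save_disk`, `retry_count` and `retry_interval_s` are hypothetical, and it is an assumption that a retry count of 3 means one initial attempt plus up to 3 retries.

```python
import time
from typing import Callable, Iterable, List, Tuple


def run_save_request(
    disks: Iterable[str],
    save_disk: Callable[[str], bool],   # hypothetical: returns True on success
    retry_count: int,
    retry_interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, bool, int]]:
    """Back up each disk in order, retrying a failed disk before moving on.

    Assumption: retry_count counts retries after the first attempt, so a
    disk is tried at most retry_count + 1 times.
    """
    results = []
    for disk in disks:
        attempts = 0
        while True:
            attempts += 1
            ok = save_disk(disk)
            # Stop on success, or once every allowed retry has been used.
            if ok or attempts > retry_count:
                break
            sleep(retry_interval_s)  # wait out the retry interval
        results.append((disk, ok, attempts))
    return results
```

	With retry count = 3 and retry interval = 15 minutes, a disk whose save keeps failing would be attempted four times under this assumption before the next disk is started.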
446.4	retry parameters set	ROM01::MENCHICCHI		`Tue May 06 1997 05:53`	11
	> The failed backup will be retried for the number of times specified
	> before it goes on to the next disk save.
	I have retry count = 3 and retry interval = 15 min in the EE. Nevertheless, ABS skips the current disk save and goes on to the next one. Any ideas, or is this the correct way for ABS to work? Thanks, Lorena
446.5	Should retry 3 times	COOKIE::HEISLER	Chris Heisler, ABS Engineering	`Tue May 06 1997 08:14`	11
	Lorena, If you have the retry count set to 3, it should retry the backup of the disk 3 times. Can you post the log of that save operation? I would like to see the sequence of the actions in the log file. Thanks, Chris
446.6	log of save operation	ROM01::MENCHICCHI		`Tue May 06 1997 10:14`	140
	This is the log of the save operation; I have included only the interesting section...

	$ SET VERIFY
	Executing ABS LOGIN.COM
	Completed execution of ABS LOGIN.COM
	"@abs_syste:coordinator.com A7A7B86A-C60E-11D0-A990-08002B9730F5" executing, output follow :
	____________________________________________________________________________
	Starting New Request at 6-MAY-1997 12:54:36.47
	Name: P_OUTP_SAVE
	UID: A7A7B86A-C60E-11D0-A990-08002B9730F5
	COORDINATOR: Created new volume set: ABZ026
	COORDINATOR: Mounting volume set member: ABZ026 RVN 1
	COORDINATOR: (Selected drive $1$MUA10:)
	COORDINATOR: Initializing scratch volume ABZ026
	%MOUNT-I-MOUNTED, ABZ026 mounted on _$1$MUA10: (HSJ011)
	COORDINATOR: Skipping $1$MUA10: to End of Tape...
	THREAD #1: Operation #1 starting at 6-MAY-1997 12:56:40.15
	Data Movement Type: FULL_SAVE
	Incremental Level: Full
	Operation Object Set:
	  Object Type: VMS_FILES
	  Include List: USER:
	  Exclude List:
	Archive Information:
	  Storage Class Name: P_OUTP_CLASS
	  Saveset Location: ABZ026
	  Saveset Name: 6MAY199712543661.
	Execution Enviroment:
	  Name: P_OUT_ENV
	  Number of retries: 3
	  Retry Interval: 15 minute(s)
	THREAD #1: $
	THREAD #1: SET NOON
	THREAD #1: $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
	THREAD #1: $ IF (VERSION .eqs. "V6.1") then $DEFINE BACKUP ABS$SYSTEM:ALTERNATE _BACKUP.EXE
	THREAD #1: $ DEFINE SYS$COMMAND SYS$INPUT:
	THREAD #1: $ BACKUP /IMAGE USER: /LIST=_MBA7874:/FULL/RECORD/ -
	THREAD #1: _$ IGNORE=(INTERLOCK)/NOCRC/NOVERIFY/VERIFY/CRC -
	THREAD #1: _$ -
	THREAD #1: _$ $1$MUA10:6MAY199712543661./SAVE/STOR=V2SLS/NOASSIST -
	THREAD #1: _$ /EXACT_ORDER
	THREAD #1: VOL1ABZ026 3
	THREAD #1: VOL1ABZ026 3
	THREAD #1: HDR16MAY199712543661.ABZ02600010001000100 97126 97126 000000DECVMSBA CKUP
	THREAD #1: HDR16MAY199712543661.ABZ02600010001000100 97126 97126 000000DECVMSBA CKUP
	THREAD #1: HDR2F0819208192 M 00
	THREAD #1: HDR2F0819208192 M 00
	THREAD #1: --- start the backup of USER disk . . . .
	-- now the error
	THREAD #1: %BACKUP-E-FATALERROR, fatal error on $1$MUA10:[]6MAY199712543661.;
	THREAD #1: -SYSTEM-F-VOLIN, volume is not software enabled
	THREAD #1: %BACKUP-I-SPECIFY, specify option (QUIT or CONTINUE)
	THREAD #1: BACKUP>
	CORRDINATOR: Retiring volume set ABZ026 (due to fatal error during save)
	COORDINATOR: Dismounting volume set member: ABZ026 RVN 1
	THREAD #1: Operation will be retried in 15 minutes...
	COORDINATOR: Created new volume set: ABZ027
	COORDINATOR: Mounting volume set member: ABZ027 RVN 1
	COORDINATOR: (Selected drive $1$MUA11:)
	COORDINATOR: Initializing scratch volume ABZ027
	%MOUNT-I-MOUNTED, ABZ027 mounted on -$1$MUA11: (HSJ011)
	COORDINATOR: Skipping $1$MUA11: to End of Tape...
	THREAD #1: SET NOON
	THREAD #1: %BACKUP-I-SPECIFY, specify option (QUIT or CONTINUE)
	THREAD #1: BACKUP>
	THREAD #1: A fatal error was detected in data mover
	COORDINATOR: Skipping $1$MUA11: to End of Tape...
	THREAD #1:
	THREAD #2: Operation #2 starting at 6-MAY-1997 13:25:25.02
	Data Movement Type: FULL_SAVE
	Incremental Level: Full
	Operation Object Set:
	  Object Type: VMS Files
	  Include List: SPOOL:
	  Exclude List:
	Archive Information:
	  Storage Class Name: P_OUTP_CLASS
	  Saveset Location: ABZ026
	  Saveset Name: 6MAY199713251966.
	Execution Enviroment:
	  Name: P_OUT_ENV
	  Number of retries: 3
	  Retry Interval: 15 minute(s)
	THREAD #2: $
	THREAD #2: SET NOON
	THREAD #2: $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
	THREAD #2: $ IF (VERSION .eqs. "V6.1") then $DEFINE BACKUP ABS$SYSTEM:ALTERNATE _BACKUP.EXE
	THREAD #2: $ DEFINE SYS$COMMAND SYS$INPUT:
	THREAD #2: $ BACKUP /IMAGE SPOOL: /LIST=_MBA8162:/FULL/RECORD/ -
	THREAD #2: _$ IGNORE=(INTERLOCK)/NOCRC/NOVERIFY/CRC -
	THREAD #2: _$ -
	THREAD #2: _$ $1$MUA11:6MAY199713251966./SAVE/STOR=V2SLS/NOASSIST -
	THREAD #2: _$ /EXACT_ORDER
	THREAD #2: VOL1ABZ027 3
	THREAD #2: VOL1ABZ027 3
	THREAD #2: HDR16MAY199713251966.ABZ02700010001000100 97126 97126 000000DECVMSBA CKUP
	THREAD #2: HDR16MAY199713251966.ABZ02700010001000100 97126 97126 000000DECVMSBA CKUP
	THREAD #2: HDR2F0819208192 M 00
	THREAD #2: HDR2F0819208192 M 00
	THREAD #2: --- start the backup of SPOOL disk . . .

	and it continues to back up the other disks, with normal successful completion
446.7	Thanks	COOKIE::HEISLER	Chris Heisler, ABS Engineering	`Tue May 06 1997 14:09`	9
	Lorena, Thanks for posting the log file. We'll take a look at the code to see if the state of the process caused it to skip the retries. We thought that it should be retrying the first save. Chris
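	For anyone who wants to check their own logs for the same symptom, here is a rough sketch, not an ABS tool. It relies only on message strings that appear in the log in 446.6 ("Operation #n starting", "Number of retries:", "Operation will be retried", "A fatal error was detected in data mover"). The function name `find_skipped_retries` is hypothetical, and it is an assumption that other ABS versions word these messages the same way.

```python
import re
from typing import List, Tuple

OP_START = re.compile(r"Operation #(\d+) starting")
RETRIES = re.compile(r"Number of retries:\s*(\d+)")
RETRY_NOTICE = "Operation will be retried"
FATAL = "A fatal error was detected in data mover"


def find_skipped_retries(log_text: str) -> List[Tuple[int, int, int]]:
    """Return (operation number, retries announced, retries configured)
    for each operation that hit a data mover fatal error after fewer
    retry announcements than the execution environment allows."""
    starts = list(OP_START.finditer(log_text))
    findings = []
    for i, match in enumerate(starts):
        # An operation's section runs up to the start of the next one.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        section = log_text[match.start():end]
        cfg = RETRIES.search(section)
        configured = int(cfg.group(1)) if cfg else 0
        announced = section.count(RETRY_NOTICE)
        if FATAL in section and announced < configured:
            findings.append((int(match.group(1)), announced, configured))
    return findings
```

	Run against the log in 446.6, this should flag operation #1 with 1 retry announced out of 3 configured, which matches what Lorena reported.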