| Title: | Archive/Backup |
| Moderator: | COOKIE::MHUA IG |
| Created: | Wed Sep 08 1993 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 479 |
| Total number of notes: | 2283 |
Hi,
After two weeks of running backups, the following occurred.
The subsequent backups also failed because they could not allocate a
volume.
Thomas
$ SET noVERIFY
%SET-W-NOTSET, error modifying DKA0:
-SET-E-INVDEV, device is invalid for requested operation
Executing ABS LOGIN.COM
Completed execution of ABS LOGIN.COM
"@abs_system:coordinator.com D7B23A6E-0712-11D0-8001-AA000400110C"
Executing, output follows:
------------------------------------------------------
---------------------------------------------------------------
Starting New Request at 26-FEB-1997 01:30:46.87
Name: WAK005
UID: D7B23A6E-0712-11D0-8001-AA000400110C
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Mounting volume set member: SLS016 RVN 3
COORDINATOR: (Selected drive WAK005$MKA400:)
%MOUNT-I-MOUNTED, SLS015 mounted on _WAK005$MKA400:
COORDINATOR: Skipping WAK005$MKA400: to End of Tape...
THREAD #1:
Operation #1 starting at 26-FEB-1997 01:48:16.81
Data Movement Type: INCREMENTAL_SAVE
Incremental Level: Level 5 Operation
Object Set:
Object Type: VMS_FILES
Include List: WAK005$DKA100:
Exclude List:
Archive Information:
Storage Class Name: CLIENT_SC
Saveset Location: SLS016
Saveset Name: 26FEB19970130474.
Execution Environment:
Name: SYSTEM_BACKUPS_ENV
Number of retries: 0
Retry Interval: 0 minute(s)
THREAD #1: $ SET NOON
THREAD #1: $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
THREAD #1: $ IF (VERSION .eqs. "V6.1") THEN $DEFINE BACKUP ABS$SYSTEM:ALTERNATE_BACKUP.EXE
THREAD #1: $ DEFINE SYS$COMMAND SYS$INPUT:
THREAD #1: $ BACKUP WAK005$DKA100:[000000...]* -
THREAD #1: _$ -
THREAD #1: _$ /LIST=_MBA7150:/FULL -
THREAD #1: _$ /RECORD -
THREAD #1: _$ /IGNORE=(INTERLOCK) -
THREAD #1: _$ /NOCRC/NOVERIFY -
THREAD #1: _$ -
THREAD #1: _$ /SINCE="25-FEB-1997 01:32:30.74" -
THREAD #1: _$ -
THREAD #1: _$ /MODIF -
THREAD #1: _$ WAK005$MKA400:26FEB19970130474./SAVE -
THREAD #1: _$ /STOR=V2SLS/NOASSIST -
THREAD #1: _$ /EXACT_ORDER
THREAD #1: HDR126FEB19970130474.SLS01600010094000100 97057 970570000000DECVMSBACKUP
THREAD #1: HDR126FEB19970130474.SLS01600010094000100 97057 97057000000DECVMSBACKUP
THREAD #1: HDR2F0819208192 M 00
THREAD #1: HDR2F0819208192 M 00
THREAD #1: %BACKUP-W-ACCONFLICT,WAK005$DKA100:[PWRK$ROOT.LANMAN]SHAREFILE.;1 is open for write by another user
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
address=80808060, PC=805AD6D8, PS=0000001B
Improperly handled condition, image exit forced.
Signal arguments: Number = 00000005
Name = 0000000C
00000004
80808060
805AD6D8
0000001B
Register dump:
R0 = 0000000000000000 R1 = 0000000000C893D5 R2 = 000000000070CBF0
R3 = 0000000000000000 R4 = 0000000000CC6850 R5 = 0000000000000000
R6 = 0000000000CD2BB0 R7 = 0000000000F19B88 R8 = 0000000000CC6ACA
R9 = 0000000000CD2BB0 R10 = 0000000000000000 R11= 0000000000000000
R12 = 0000000000000000 R13 = 0000000000000000 R14= 0000000000000000
R15 = 000000000065E0E0 R16 = 0000000000C897D1 R17= 0000000000CD2BB0
R18 = FFFFFFFFFF32D450 R19 = 0000000000000000 R20= 0101010101010101
R21 = 8080808080808080 R22 = 000000007F3C2878 R23= 000000000000005B
R24 = 0000000000002BB0 R25 = 0000000000000003 R26= 0000000000813FDC
R27 = 000000007F3C2878 R28 = 000000000000000C R29= 0000000000000000
SP = 000000007EE4E000 PC = FFFFFFFF805AD6D8 PS = 000000000000001B
ABS job terminated at 26-FEB-1997 01:49:49.19
Accounting information:
Buffered I/O count: 8831 Peak working set size: 12960
Direct I/O count: 380 Peak page file size: 67376
Page faults: 1103 Mounted volumes: 2
Charged CPU time: 0 00:00:07.01 Elapsed time: 000:19:10.90
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 387.1 | some ideas | COOKIE::MHUA | | Wed Feb 26 1997 14:51 | 33 |
Thomas,
We have never seen this access violation before. I need to discuss it
with more people here to see how we can debug it. If ABS gets an
ACCVIO, it should get a CMA exception from the CMA block we use. I did
not think it was possible to die in the way reported in the log...
Please post the output of a show of the save request for this request.
I am assuming that you are using ABS V2.1 on OpenVMS Alpha V6.2.
What is the image ID of the backup image you are using? Do analyze/image
sys$system:backup.exe and report the image ID from the header.
I am assuming that you got the "failure to access storage class" error
in the subsequent backup. The way to fix this problem is to
abs set storage XXXXX/novol
and let ABS pick a new volume. Due to the fatal error during backup,
the volume is no longer usable. However, ABS has a mechanism to detect
this and remove the volume automatically from the storage class.
Before you issue the "abs set storage XXXXX/novol" command, do the
following and report the results to us.
Do show system and see whether a process called ABS$COORD_CLEAN is
out there. Do abs show storage XXXXX/full and post the result. Do
storage show volume SLS015 and post the result.
That's it for now. We'll keep looking.
Masami
| 387.2 | COORD_CLEANUP also ACCVIOS | SUOBOS::RUCKH | | Thu Feb 27 1997 00:38 | 254 |
Masami,
you're right, it's OpenVMS Alpha V6.2 with ABS V2.1.
Backup image id: AXP V61R_62-000
Thomas
The remaining clients have no errors in their log files; they all show only:
...
...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Failed to allocate writeable volume set after 105 attempts
COORDINATOR: Continuing to retry every 60 seconds...
COORDINATOR: Attempting to allocate volume set SLS016...
COORDINATOR: Attempting to allocate volume set SLS016...
Because the tape was never allocated, I aborted the jobs after a few
hours (yesterday), set the storage class to /NOVOL, and restarted the
jobs. Tonight the backups are working fine again.
$TY ABS$COORD_CLEANUP_WAK005.LOG
$ SET noVERIFY
%SET-W-NOTSET, error modifying DKA0:
-SET-E-INVDEV, device is invalid for requested operation
Executing ABS LOGIN.COM
Completed execution of ABS LOGIN.COM
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
address=80808010, PC=00941994, PS=0000001B
Improperly handled condition, image exit forced.
Signal arguments: Number = 00000005
Name = 0000000C
00000004
80808010
00941994
0000001B
Register dump:
R0 = 0000000000000000 R1 = 0000000000000000 R2 = 0000000000872F00
R3 = 0000000000000000 R4 = 000000000000004F R5 = 00000000009D21D0
R6 = 00000000009D21D0 R7 = 0000000000000000 R8 = 000000007EE4D460
R9 = 00000000000113B0 R10 = 000000007FF9D228 R11 = 000000007FFBE3E0
R12 = 0000000000000000 R13 = 000000007EEF4900 R14 = 0000000000000000
R15 = 000000007EEF3DA0 R16 = 000000007F3DB23C R17 = 00000000008A2AE0
R18 = 0000000000000000 R19 = 0000000001346B50 R20 = 0000000000000000
R21 = 0000000000000000 R22 = 0000000000000049 R23 = 0000000000000001
R24 = 4F000A3A3A393030 R25 = 0000000000000006 R26 = 00000000008C69A4
R27 = 000000000087EB20 R28 = 000000000000000C R29 = FFFFFFFF863B7E80
SP = 000000007EE4E000 PC = 0000000000941994 PS = 000000000000001B
ABS job terminated at 26-FEB-1997 01:50:34.54
Accounting information:
Buffered I/O count: 61820 Peak working set size: 7312
Direct I/O count: 40484 Peak page file size: 76608
Page faults: 6305 Mounted volumes: 0
Charged CPU time: 0 00:04:55.75 Elapsed time: 2109:58:54.02
Save Request
Name - WAK005
Version - 5
UID - D7B23A6E-0712-11D0-8001-AA000400110C
Movement Type - SELECTIVE_ARCHIVE
Data Object Set
Sequence Option - SEQUENTIAL
Commit Option - KEEP_PARTIAL
Node Name - WAK005
Include Spec - WAK005$DKA100:
Exclude Spec -
Object Type Name - VMS_FILES
Agent Qualifiers -
Data Safety Options - None
Compression Options - None
Source File System Options - FILE_IGNORE_WRITERS
Span Filesystem Options - SPAN FILESYSTEMS
Symbolic Links Option - LINKS_ONLY
Object Date Options - None
Selection Options - None
Restore Options - RETAIN_EXISTING_VERSIONS
Date Identifier - NONE
Low Limit Date - 17-NOV-1858 00:00:00.00
High Limit Date - 17-NOV-1858 00:00:00.00
Node Name - WAK005
Include Spec - WAK005$DKA200:
Exclude Spec -
Object Type Name - VMS_FILES
Agent Qualifiers -
Data Safety Options - None
Compression Options - None
Source File System Options - FILE_IGNORE_WRITERS
Span Filesystem Options - SPAN FILESYSTEMS
Symbolic Links Option - LINKS_ONLY
Object Date Options - None
Selection Options - None
Restore Options - RETAIN_EXISTING_VERSIONS
Date Identifier - NONE
Low Limit Date - 17-NOV-1858 00:00:00.00
High Limit Date - 17-NOV-1858 00:00:00.00
Owner - WAK005::SYSTEM
Access Right - WAK005::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Control - WAK001::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Control - WAK006::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Control - WAK008::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Control - WAK010::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Media Management Info
Storage Class Name - CLIENT_SC
Media Type - None
Device Name - None
Start Time - NEVER
Schedule Interval - DAILY_WITH_WEEKLY_FULL
Explicit Interval -
Special Day On - None
Special Day Off - None
Execution Envir - SYSTEM_BACKUPS_ENV
Storage Class retention value used.
Original Object Action - RECORD_BACKUP_DATE
Restart Interval - 17-NOV-1858 00:00:00.00
Wait Flag - NO
Prologue Command - None
Epilogue Command - None
Storage Class
Name - CLIENT_SC
Version - 160
UID - 7E708E79-8FC5-11D0-8001-AA000400110C
Node Name -
Archive File System
Primary Archive Location -
Staging Location -
Primary Archive Type - SLS/MDMS
Staging Archive Type - NOT IMPLEMENTED
Staging Retention Period - 17-NOV-1858 00:00:00.00
Owner - WAK009::SYSTEM
Access Right - WAK009::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK009::ABS
Access Granted - READ, WRITE, SET, EXECUTE
Access Right - WAK008::ABS
Access Granted - READ, WRITE, SET, EXECUTE
Access Right - WAK008::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK001::ABS
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK001::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK005::ABS
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK005::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK006::ABS
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK006::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK010::ABS
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Access Right - WAK010::SYSTEM
Access Granted - READ, WRITE, SET, SHOW, DELETE, CONTROL,EXECUTE
Tape Pool - MAG03
Volume Set Name - SLS014
Retention Criteria
Number Of Copies - 4
Retention Period - 40 00:00:00.00
Consolidation Criteria
Count - 0
Capacity - 0
Size - 0
Interval- 20 00:00:00.00
Catalog Name - ABS_CATALOG
Retain When Idle Flag - NO
Streams - 1
Media Management Info
Media Location - WAK
Media Type - tz877
Device Name - None
$STOR SHOW VOL SLS015
Volume: SLS016 Owner: WAK010::ABS
Format: BACKUP Brand:
Allocated: 17-FEB-1997 11:16 Scratch: 7-APR-1997 01:30
Purchased: 23-JUL-1996 09:13 Cleaned: 23-JUL-1996 09:13
Media type: TZ877 Length: 0
Mounts: 2 UIC: [ABS]
Location: WAK Protection: S:RW,O:RW,G:R,W:
Notes:
Offsite: Onsite:
IO Errors: 0 Flag: ALLOCATED
Next volume: SLS020 Previous: *none*
Pool: MAG03 Rec len: 0 Block factor: 0
Initialized: 5-FEB-1997 00:00 Density: COMP
Slot number:
Side: Other side:
Jukebox / slot: WAK_JUKEBOX1 / 2
Drive: *WAK005::WAK005$MKA400
Volume is in a jukebox slot.
Volume is bound to slot 2 in magazine WAKMAG03.
Volume: SLS020 Owner: WAK010::ABS
Format: BACKUP Brand:
Allocated: 22-FEB-1997 11:17 Scratch: 7-APR-1997 01:30
Purchased: 23-JUL-1996 09:13 Cleaned: 23-JUL-1996 09:13
Media type: TZ877 Length: 0
Mounts: 2 UIC: [ABS]
Location: WAK Protection: S:RW,O:RW,G:R,W:
Notes:
Offsite: Onsite:
IO Errors: 0 Flag: ALLOCATED
Next volume: SLS015 Previous: SLS016
Pool: MAG03 Rec len: 0 Block factor: 0
Initialized: 5-FEB-1997 00:00 Density: COMP
Slot number:
Side: Other side:
Jukebox / slot: WAK_JUKEBOX1 / 6
Drive: *WAK005::WAK005$MKA400
Volume is in a jukebox slot.
Volume is bound to slot 6 in magazine WAKMAG03.
Volume: SLS015 Owner: WAK006::ABS
Format: BACKUP Brand:
Allocated: 23-FEB-1997 03:42 Scratch: 7-APR-1997 01:30
Purchased: 23-JUL-1996 09:13 Cleaned: 23-JUL-1996 09:13
Media type: TZ877 Length: 0
Mounts: 3 UIC: [ABS]
Location: WAK Protection: S:RW,O:RW,G:R,W:
Notes:
Offsite: Onsite:
IO Errors: 0 Flag: ALLOCATED
Next volume: *none* Previous: SLS020
Pool: MAG03 Rec len: 0 Block factor: 0
Initialized: 5-FEB-1997 00:00 Density: COMP
Slot number:
Side: Other side:
Jukebox / slot: WAK_JUKEBOX1 / 1
Drive: *WAK005::WAK005$MKA400
Volume is in a jukebox slot.
Volume is bound to slot 1 in magazine WAKMAG03.
| 387.3 | more info | COOKIE::MHUA | | Thu Feb 27 1997 10:22 | 22 |
Thomas,
Since abs$coord_clean hit an access violation and died, it makes sense
that the subsequent save was not able to allocate the tape.
abs$coord_clean automatically does set storage XXXX/novol for you
if the save aborts due to a fatal error (or if you stop/id the save
process).
As for the access violations, both took place within a minute of each
other. Is there a possibility that some sort of system failure
occurred around that time?
Also, the access violation from the save log is mysterious. Since
the code it is executing is within a CMA try - endtry block (CMA
exception handling), it should have been caught by CMA and produced a
CMA error along with the access violation information. I posted a
note in the CMA notes conference asking whether this is at all
possible. I will also confer with a local CMA expert.
Masami
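As a quick check of the "within a minute" observation, the two "ABS job terminated at" timestamps from the save log and the cleanup log can be compared directly. This is only a sketch: the helper name parse_vms_time is made up for illustration, and it assumes the timestamps always have the form DD-MMM-YYYY HH:MM:SS.CC with exactly two digits of hundredths.

```python
from datetime import datetime

# Month abbreviations as they appear in the OpenVMS timestamps in the logs.
MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}


def parse_vms_time(text):
    """Parse 'DD-MMM-YYYY HH:MM:SS.CC' (hypothetical helper, two-digit hundredths assumed)."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    hms, hundredths = time_part.split(".")
    hour, minute, second = (int(field) for field in hms.split(":"))
    # Hundredths of a second are converted to microseconds for datetime.
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    hour, minute, second, int(hundredths) * 10_000)


# "ABS job terminated at" values copied from the two logs in this topic.
save_end = parse_vms_time("26-FEB-1997 01:49:49.19")
cleanup_end = parse_vms_time("26-FEB-1997 01:50:34.54")

gap_seconds = (cleanup_end - save_end).total_seconds()
print(f"{gap_seconds:.2f} seconds apart")
```

The two jobs ended about 45 seconds apart, which fits the suspicion of a single system-wide event rather than two unrelated failures.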
| 387.4 | pathworks also crashed | SUOBOS::RUCKH | | Thu Feb 27 1997 11:56 | 7 |
Masami,
the customer told me today that, at the moment of the dump, the PATHWORKS
license server also crashed with a register dump. Could this be the
cause of the problem?
Thomas
| 387.5 | | COOKIE::MHUA | | Fri Feb 28 1997 08:58 | 9 |
Thomas,
I don't think the PATHWORKS dump will affect us. We do not have
anything to do with the PATHWORKS process. However, it's getting more
and more likely that something went wrong with the system at that
moment and caused at least 3 processes to die.
Masami