[Search for users]
[Overall Top Noters]
[List of all Conferences]
[Download this site]
Title: | Archive/Backup |
|
Moderator: | COOKIE::MHUA IG |
|
Created: | Wed Sep 08 1993 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 479 |
Total number of notes: | 2283 |
477.0. "RDEV 4.1a server inactivity timer timeout explanation ?" by PRSSOS::FONDI () Thu Jun 05 1997 07:39
RDEV 4.1a server inactivity timer timeout causes ABS remote backup to fail
ABS OpenVMS Clients Server Configuration, all nodes are Alpha nodes.
server ABS V2.1 remote nodes: Clients ABS V2.1
VMS V6.2-1H3 VMS V6.2-1H3
MDMS 2.8A MDMS 2.8A
RDEV 4.1a that serves a TL810 for remote nodes RDEV 4.1a
DECNET/OSI V6.2 ECO7 DECNET/OSI V6.2 (no ECO)
or DECNET IV it depends
the client nodes.
Probleme: This customer uses ABS to save disks of each remote nodes to
********* the TL810 connected to the ABS server named LAVX06.
Often remote backups from any remote nodes (phase 4 or 5)abnormaly
stops with
%BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1: %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1: -SYSTEM-F-DEVOFFLINE, device is not in configuration or not
available
In the following LAVX06 is the server node
LAVX08 is one remote node (DECnet/OSI V6.2 no ECO)
Looking at different log this seems to come from RDEV 4.1a server that
deallocates the tape_device served because it has (inactivity timer expired)
freeing the tapedrive unloading it ... see RDV:079 [06/03-23:15:30.79] in
SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG i posted below
The RD client tries to reconnect but see the served tape device is already
freed =>thus disconnecting thus aborting the remote backup with
-SYSTEM-F-DEVOFFLINE, device is not in configuration or not available
(see :79 [06/03-23:17:23.77] in SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
and lavx08::ABS_LOG:X08RMS.LOG the log of the save request showing the error)
Questions:
**********
1)What to do to increase the RDEV 4.1a server inactivity timeout ?
(actually how many seconds is it ?)
2)Could it be a DECNET problem performance, or DECNET/OSI not
up-to-date?
(cutomer will install DECNET/OSI 6.3 ECO6 on every node)
Thank you
Henri
below are:
1) log of the save request lavx08::ABS_LOG:X08RMS.LOG
2) RDserver log SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG
3) RDclient log SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
4) opcom message seen on LVXC08 at time of error LAVX08_operator.log
1)
*******************************************
FICHIER DE LOG : lavx08::ABS_LOG:X08RMS.LOG
*******************************************
$ vl=f$verify(0) ! SYLOGIN.COM - debut
Executing ABS LOGIN.COM
Completed execution of ABS LOGIN.COM
"@abs_system:coordinator.com 383531EA-C5F2-11D0-87FE-4C4156583036" Executing, o
------------------------------------------------------
---------------------------------------------------------------
Starting New Request at 3-JUN-1997 20:13:07.89
Name: X08RMS
UID: 383531EA-C5F2-11D0-87FE-4C4156583036
COORDINATOR: Attempting to allocate volume set AGA515...
COORDINATOR: Mounting volume set member: AGA515 RVN 1
COORDINATOR: (Selected drive LAVX08$RDEVA0:)
%MOUNT-I-MOUNTED, AGA515 mounted on _LAVX08$RDEVA0:
COORDINATOR: Skipping LAVX08$RDEVA0: to End of Tape...
THREAD #1:
Operation #1 starting at 3-JUN-1997 20:17:02.81
Data Movement Type: INCREMENTAL_SAVE
Incremental Level: Level 6 Operation
Object Set:
Object Type: VMS FILES
Include List: DKA100:
Exclude List:
Archive Information:
Storage Class Name: X08RMS
Saveset Location: AGA515
Saveset Name: 3JUN199720130855.
Execution Environment:
Name: X08RMS_ENV
Number of retries: 0
Retry Interval: 0 minute(s)
THREAD #1: $
THREAD #1: SET NOON
THREAD #1: $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
THREAD #1: $ IF (VERSION .eqs. "V6.1") THEN $DEFINE BACKUP ABS$SYSTEM:ALTERNAT
THREAD #1: $ DEFINE SYS$COM
THREAD #1: MAND SYS$INPUT:
THREAD #1: $ BACKUP DKA100:[000000...]* -
THREAD #1: _$ -
THREAD #1: _$ /LIST=_MBA176:/FULL -
THREAD #1: _$ /RECORD -
THREAD #1: _$ /IGNORE=(INTERLOCK) -
THREAD #1: _$ /NOCRC/NOVERIFY/CRC -
THREAD #1: _$ -
THREAD #1: _$ /SINCE=" 2-JUN-1997 20:15:14.93" -
THREAD #1: _$ -
THREAD #1: _$ /MODIF -
THREAD #1: _$ LAVX08$R
THREAD #1: DEVA0:3JUN199720130855./SAVE -
THREAD #1: _$ /STOR=V2SLS/NOASSIST -
THREAD #1: _$ /EXACT_ORDER
THREAD #1: HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1: HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1: HDR2F0819208192 M 00
THREAD #1: HDR2F0819208192 M 00
THREAD #1: %BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1: %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1: -SYSTEM-F-DEVOFFLINE, device is not in configuration or not availab
THREAD #1: $
COORDINATOR: Retiring volume set AGA515 (due to fatal error during save)
COORDINATOR: Dismounting volume set member: AGA515 RVN 1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 1563, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 1307, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 7971, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1: Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error
THREAD #1: %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
THREAD #1: Failed to deallocate volume set
COORDINATOR: Skipping LAVX08$RDEVA0: to End of Tape...
COORDINATOR: Dismounting volume set member: AGA515 RVN 1
COORDINATOR: Failed to access Storage Class
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972013
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 1309, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 801, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 704, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 469, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
COORDINATOR: Facility ABS: ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR: Line = 1717, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate
COORDINATOR: Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR: Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR: %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
COORDINATOR: Facility ABS: ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR: Line = 4819, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR: Facility ABS: ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR: Line = 507, File = RESD$:[SRC]TAPE_OPERATIONS.C;1
COORDINATOR: Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR: %SYSTEM-F-DEVOFFLINE, device is not in configuration or not av
COORDINATOR: Final status is Invalid archive file system specified
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1277, File = RESD$:[SRC]COORDINATOR.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1772, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 916, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 811, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1667, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1895, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error in diag blo
%LIB-E-ACTIMAGE, error activating image !AS
%SYSTEM-F-OPINCOMPL, operation is incomplete
ABS job terminated at 3-JUN-1997 23:19:12.23
<CR><LF> Accounting information:
Buffered I/O count: 957 Peak working set size: 12960
Direct I/O count: 345 Peak page file size: 67680
Page faults: 1141 Mounted volumes: 2
Charged CPU time: 0 00:00:03.29 Elapsed time: 0 03:06:16.09
2)
type SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG
RDV:079 [06/03-20:11:23.19] Client connecting
RDV:079 [06/03-20:11:23.38] ( RCN) client DECnet addr (get_cli
ent_addr() 0)
RDV:079 [06/03-20:11:23.56] ALLOCATE rqst (from LAVX08::
ABS)
lavx06$mkc500 RDV:079 [06/03-20:11:23.69] Allocated
RDV:080 [06/03-21:10:47.01] Client connecting
RDV:080 [06/03-21:10:47.26] ( RCN) client DECnet addr (get_cli
ent_addr() 0)
RDV:080 [06/03-21:10:47.40] ALLOCATE rqst (from LAVX09::
ABS)
....
lavx06$mkc500 RDV:079 [06/03-23:15:30.79] Deallocating (inactivity ti
mer expired)
lavx06$mkc500 RDV:079 [06/03-23:15:30.88] ( RCN) LL_Done (chl 320
, _NET6623:)
lavx06$mkc500 RDV:079 [06/03-23:15:30.88] ( RCN) Freeing (chl 320
, _NET6623:)
lavx06$mkc500 RDV:079 [06/03-23:15:30.89] Unloading tape
lavx06$mkc500 RDV:079 [06/03-23:15:30.89] ( RCN) Freeing (chl 320
, _NET6623:)
lavx06$mkc500 RDV:079 [06/03-23:15:31.04] Deallocated
3)
LAVX08>TYPE SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
$ vl=f$verify(0) ! SYLOGIN.COM - debut
RDEV - VMS Remote Device Facility (Version V4.1a) - --RDclient-- Procedure
Copyright (c) 1990-1996 by Touch Technologies, Inc.
: [06/03-09:55:45.00] RDCLIENT startup (V4.1)
: [06/03-09:55:45.06] network connect ready (_MBA52:)
: [06/03-09:55:45.25] OPCOM relay ready (Pty=_FTA1:, Mbx=_MBA5
4:)
: [06/03-09:55:45.25] Compression DISABLED for this client ()
: [06/03-09:55:45.25] Startup complete ()
LAVX06:: RDV: [06/03-20:13:15.55] Connecting to server
:79 [06/03-23:17:23.77] 23:17:23.77 80F60D80 NRP done w/error statu
s 20E4, iocount 59, flags 00050035
:79 [06/03-23:17:23.78] 23:17:23.77 80F60D80 NRP error status 20E4
flags 00058035
:79 [06/03-23:17:23.78] 23:17:23.77 Attempting reconnect. Old: 80E
8AEC0 netucb 80E8AF84 netwcb
LAVX06:: RDV:79 [06/03-23:17:23.78] DECnet link error (reconnect requested)
:79 [06/03-23:17:23.80] 23:17:23.77 SEND_RCNCT_CMD ok
LAVX06:: RDV:79 [06/03-23:17:33.57] Attempting reconnect (attempt 1)
LAVX06:: RDV:79 [06/03-23:17:33.62] ...sending rcnct data
LAVX06:: RDV:79 [06/03-23:17:33.78] Served device already freed; server rejecte
d reconnect (%SYSTEM-F-ABORT, abort)
LAVX06:: RDV:79 [06/03-23:17:33.78] Client done with link to server
LAVX06:: RDV:79 [06/03-23:17:33.80] Disconnecting from server
LAVX06:: RDV:79 [06/03-23:17:33.85] Failed resp to pkt from server (%NONAME-W-N
OMSG, Message number 7FF9C9E0)
: [06/03-23:17:33.86] Disconnect request, bad sl ()
4)
%%%%%%%%%%% OPCOM 3-JUN-1997 22:55:32.13 %%%%%%%%%%%
Logfile time stamp
%%%%%%%%%%% OPCOM 3-JUN-1997 23:17:33.79 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS FATAL ERROR: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972
0130855.;
X08RMS, thread 1 X08RMS, thread 1
Include spec = (VMS FILES) DKA100:
Last VMS FILES processed: [BASES.A]DYNAMIQUE_I_1.DBF
%%%%%%%%%%% OPCOM 3-JUN-1997 23:18:22.31 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS FATAL ERROR: Failed to deallocate volume set
X08RMS, thread 1 X08RMS, thread 1
Include spec = (VMS FILES) DKA100:
Last VMS FILES processed: [BASES.A]DYNAMIQUE_I_1.DBF
%%%%%%%%%%% OPCOM 3-JUN-1997 23:19:11.83 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS JOB SUMMARY: X08RMS, thread 1
%BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.;
Include spec = (VMS FILES) DKA100:
Last VMS FILES processed: [BASES.A]DYNAMIQUE_I_1.DBF
%%%%%%%%%%% OPCOM 3-JUN-1997 23:20:03.49 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS_CLEANUP: Cleanup performed for aborted job
Request Name: X08RMS
Scheduler Job: 11
Deleted PID: 000000BD
T.R | Title | User | Personal Name | Date | Lines |
---|
477.1 | | COOKIE::MHUA | | Thu Jun 05 1997 17:30 | 12 |
|
Henri,
We do not know how to get around RDF problems. ABS just let MDMS layer
handle remote tapes (via RDF). We need to get with MDMS people to see
what they say about this.
I would recommend cross-posting the same note to SLS conference to
get exposue that way. ABS just interprets the error and fails and
ABS itself does not cause this failure....
Masami
|
477.2 | < cross posted COOKIE::SLS note 2269.0 > | PRSSOS::FONDI | | Fri Jun 06 1997 04:29 | 6 |
| Thank you Masami,
This note is now cross posted: note 2269.0 in COOKIE::SLS conference
Henri
|