[Search for users]
[Overall Top Noters]
[List of all Conferences]
[Download this site]
Title: | Storage Library System |
|
Moderator: | COOKIE::REUTER |
|
Created: | Sun Oct 13 1991 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2270 |
Total number of notes: | 7850 |
2269.0. "RDF 4.1a server inactivity timer timeout explanation ?" by PRSSOS::FONDI () Fri Jun 06 1997 04:24
RDEV 4.1a server inactivity timer timeout causes ABS remote backup to fail
ABS OpenVMS Clients Server Configuration, all nodes are Alpha nodes.
server ABS V2.1 remote nodes: Clients ABS V2.1
VMS V6.2-1H3 VMS V6.2-1H3
MDMS 2.8A MDMS 2.8A
RDEV 4.1a that serves a TL810 for remote nodes RDEV 4.1a
DECNET/OSI V6.2 ECO7 DECNET/OSI V6.2 (no ECO)
or DECNET IV it depends
the client nodes.
Probleme: This customer uses ABS to save disks of each remote nodes to
********* the TL810 connected to the ABS server named LAVX06.
Often remote backups from any remote nodes (phase 4 or 5)abnormaly
stops with
%BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1: %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1: -SYSTEM-F-DEVOFFLINE, device is not in configuration or not
available
In the following LAVX06 is the server node
LAVX08 is one remote node (DECnet/OSI V6.2 no ECO)
Looking at different log this seems to come from RDEV 4.1a server that
deallocates the tape_device served because it has (inactivity timer expired)
freeing the tapedrive unloading it ... see RDV:079 [06/03-23:15:30.79] in
SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG i posted below
The RD client tries to reconnect but see the served tape device is already
freed =>thus disconnecting thus aborting the remote backup with
-SYSTEM-F-DEVOFFLINE, device is not in configuration or not available
(see :79 [06/03-23:17:23.77] in SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
and lavx08::ABS_LOG:X08RMS.LOG the log of the save request showing the error)
Questions:
**********
1)What to do to increase the RDEV 4.1a server inactivity timeout ?
(actually how many seconds is it ?)
2)Could it be a DECNET problem performance, or DECNET/OSI not
up-to-date?
(cutomer will install DECNET/OSI 6.3 ECO6 on every node)
3)In the mean time i have discovered that there is a RDF V4.1D
(Kit SLSFT01A028.A, that corrects crash problem caused by
RDCDRIVER_A62 on deallocating pages , as LAVXC08 sometimes crash like
this, we will install this patch (CFS. 50555) on both server and
clients nodes.
Thank you
Henri
below are:
1) log of the save request lavx08::ABS_LOG:X08RMS.LOG
2) RDserver log SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG
3) RDclient log SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
4) opcom message seen on LVXC08 at time of error LAVX08_operator.log
1)
*******************************************
FICHIER DE LOG : lavx08::ABS_LOG:X08RMS.LOG
*******************************************
$ vl=f$verify(0) ! SYLOGIN.COM - debut
Executing ABS LOGIN.COM
Completed execution of ABS LOGIN.COM
"@abs_system:coordinator.com 383531EA-C5F2-11D0-87FE-4C4156583036" Executing, o
------------------------------------------------------
---------------------------------------------------------------
Starting New Request at 3-JUN-1997 20:13:07.89
Name: X08RMS
UID: 383531EA-C5F2-11D0-87FE-4C4156583036
COORDINATOR: Attempting to allocate volume set AGA515...
COORDINATOR: Mounting volume set member: AGA515 RVN 1
COORDINATOR: (Selected drive LAVX08$RDEVA0:)
%MOUNT-I-MOUNTED, AGA515 mounted on _LAVX08$RDEVA0:
COORDINATOR: Skipping LAVX08$RDEVA0: to End of Tape...
THREAD #1:
Operation #1 starting at 3-JUN-1997 20:17:02.81
Data Movement Type: INCREMENTAL_SAVE
Incremental Level: Level 6 Operation
Object Set:
Object Type: VMS FILES
Include List: DKA100:
Exclude List:
Archive Information:
Storage Class Name: X08RMS
Saveset Location: AGA515
Saveset Name: 3JUN199720130855.
Execution Environment:
Name: X08RMS_ENV
Number of retries: 0
Retry Interval: 0 minute(s)
THREAD #1: $
THREAD #1: SET NOON
THREAD #1: $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
THREAD #1: $ IF (VERSION .eqs. "V6.1") THEN $DEFINE BACKUP ABS$SYSTEM:ALTERNAT
THREAD #1: $ DEFINE SYS$COM
THREAD #1: MAND SYS$INPUT:
THREAD #1: $ BACKUP DKA100:[000000...]* -
THREAD #1: _$ -
THREAD #1: _$ /LIST=_MBA176:/FULL -
THREAD #1: _$ /RECORD -
THREAD #1: _$ /IGNORE=(INTERLOCK) -
THREAD #1: _$ /NOCRC/NOVERIFY/CRC -
THREAD #1: _$ -
THREAD #1: _$ /SINCE=" 2-JUN-1997 20:15:14.93" -
THREAD #1: _$ -
THREAD #1: _$ /MODIF -
THREAD #1: _$ LAVX08$R
THREAD #1: DEVA0:3JUN199720130855./SAVE -
THREAD #1: _$ /STOR=V2SLS/NOASSIST -
THREAD #1: _$ /EXACT_ORDER
THREAD #1: HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1: HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1: HDR2F0819208192 M 00
THREAD #1: HDR2F0819208192 M 00
THREAD #1: %BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1: %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1: -SYSTEM-F-DEVOFFLINE, device is not in configuration or not availab
THREAD #1: $
COORDINATOR: Retiring volume set AGA515 (due to fatal error during save)
COORDINATOR: Dismounting volume set member: AGA515 RVN 1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 1563, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 1307, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 7971, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1: Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1: Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error
THREAD #1: %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
THREAD #1: Failed to deallocate volume set
COORDINATOR: Skipping LAVX08$RDEVA0: to End of Tape...
COORDINATOR: Dismounting volume set member: AGA515 RVN 1
COORDINATOR: Failed to access Storage Class
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972013
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 1309, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 801, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 704, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR: Facility ABS: ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR: Line = 469, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
COORDINATOR: Facility ABS: ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR: Line = 1717, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR: Facility ABS: ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate
COORDINATOR: Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR: Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR: %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
COORDINATOR: Facility ABS: ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR: Line = 4819, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR: Facility ABS: ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR: Line = 507, File = RESD$:[SRC]TAPE_OPERATIONS.C;1
COORDINATOR: Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR: %SYSTEM-F-DEVOFFLINE, device is not in configuration or not av
COORDINATOR: Final status is Invalid archive file system specified
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1277, File = RESD$:[SRC]COORDINATOR.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1772, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 916, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 811, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1667, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS: ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
Line = 1895, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS: ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error in diag blo
%LIB-E-ACTIMAGE, error activating image !AS
%SYSTEM-F-OPINCOMPL, operation is incomplete
ABS job terminated at 3-JUN-1997 23:19:12.23
<CR><LF> Accounting information:
Buffered I/O count: 957 Peak working set size: 12960
Direct I/O count: 345 Peak page file size: 67680
Page faults: 1141 Mounted volumes: 2
Charged CPU time: 0 00:00:03.29 Elapsed time: 0 03:06:16.09
2)
type SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG
RDV:079 [06/03-20:11:23.19] Client connecting
RDV:079 [06/03-20:11:23.38] ( RCN) client DECnet addr (get_cli
ent_addr() 0)
RDV:079 [06/03-20:11:23.56] ALLOCATE rqst (from LAVX08::
ABS)
lavx06$mkc500 RDV:079 [06/03-20:11:23.69] Allocated
RDV:080 [06/03-21:10:47.01] Client connecting
RDV:080 [06/03-21:10:47.26] ( RCN) client DECnet addr (get_cli
ent_addr() 0)
RDV:080 [06/03-21:10:47.40] ALLOCATE rqst (from LAVX09::
ABS)
....
lavx06$mkc500 RDV:079 [06/03-23:15:30.79] Deallocating (inactivity ti
mer expired)
lavx06$mkc500 RDV:079 [06/03-23:15:30.88] ( RCN) LL_Done (chl 320
, _NET6623:)
lavx06$mkc500 RDV:079 [06/03-23:15:30.88] ( RCN) Freeing (chl 320
, _NET6623:)
lavx06$mkc500 RDV:079 [06/03-23:15:30.89] Unloading tape
lavx06$mkc500 RDV:079 [06/03-23:15:30.89] ( RCN) Freeing (chl 320
, _NET6623:)
lavx06$mkc500 RDV:079 [06/03-23:15:31.04] Deallocated
3)
LAVX08>TYPE SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
$ vl=f$verify(0) ! SYLOGIN.COM - debut
RDEV - VMS Remote Device Facility (Version V4.1a) - --RDclient-- Procedure
Copyright (c) 1990-1996 by Touch Technologies, Inc.
: [06/03-09:55:45.00] RDCLIENT startup (V4.1)
: [06/03-09:55:45.06] network connect ready (_MBA52:)
: [06/03-09:55:45.25] OPCOM relay ready (Pty=_FTA1:, Mbx=_MBA5
4:)
: [06/03-09:55:45.25] Compression DISABLED for this client ()
: [06/03-09:55:45.25] Startup complete ()
LAVX06:: RDV: [06/03-20:13:15.55] Connecting to server
:79 [06/03-23:17:23.77] 23:17:23.77 80F60D80 NRP done w/error statu
s 20E4, iocount 59, flags 00050035
:79 [06/03-23:17:23.78] 23:17:23.77 80F60D80 NRP error status 20E4
flags 00058035
:79 [06/03-23:17:23.78] 23:17:23.77 Attempting reconnect. Old: 80E
8AEC0 netucb 80E8AF84 netwcb
LAVX06:: RDV:79 [06/03-23:17:23.78] DECnet link error (reconnect requested)
:79 [06/03-23:17:23.80] 23:17:23.77 SEND_RCNCT_CMD ok
LAVX06:: RDV:79 [06/03-23:17:33.57] Attempting reconnect (attempt 1)
LAVX06:: RDV:79 [06/03-23:17:33.62] ...sending rcnct data
LAVX06:: RDV:79 [06/03-23:17:33.78] Served device already freed; server rejecte
d reconnect (%SYSTEM-F-ABORT, abort)
LAVX06:: RDV:79 [06/03-23:17:33.78] Client done with link to server
LAVX06:: RDV:79 [06/03-23:17:33.80] Disconnecting from server
LAVX06:: RDV:79 [06/03-23:17:33.85] Failed resp to pkt from server (%NONAME-W-N
OMSG, Message number 7FF9C9E0)
: [06/03-23:17:33.86] Disconnect request, bad sl ()
4)
%%%%%%%%%%% OPCOM 3-JUN-1997 22:55:32.13 %%%%%%%%%%%
Logfile time stamp
%%%%%%%%%%% OPCOM 3-JUN-1997 23:17:33.79 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS FATAL ERROR: %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972
0130855.;
X08RMS, thread 1 X08RMS, thread 1
Include spec = (VMS FILES) DKA100:
Last VMS FILES processed: [BASES.A]DYNAMIQUE_I_1.DBF
%%%%%%%%%%% OPCOM 3-JUN-1997 23:18:22.31 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS FATAL ERROR: Failed to deallocate volume set
X08RMS, thread 1 X08RMS, thread 1
Include spec = (VMS FILES) DKA100:
Last VMS FILES processed: [BASES.A]DYNAMIQUE_I_1.DBF
%%%%%%%%%%% OPCOM 3-JUN-1997 23:19:11.83 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS JOB SUMMARY: X08RMS, thread 1
%BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.;
Include spec = (VMS FILES) DKA100:
Last VMS FILES processed: [BASES.A]DYNAMIQUE_I_1.DBF
%%%%%%%%%%% OPCOM 3-JUN-1997 23:20:03.49 %%%%%%%%%%%
Message from user ABS on LAVX08
ABS_CLEANUP: Cleanup performed for aborted job
Request Name: X08RMS
Scheduler Job: 11
Deleted PID: 000000BD
T.R | Title | User | Personal Name | Date | Lines
|
---|