[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference cookie::archive_backup

Title:Archive/Backup
Moderator:COOKIE::MHUAIG
Created:Wed Sep 08 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:479
Total number of notes:2283

477.0. "RDEV 4.1a server inactivity timer timeout explanation ?" by PRSSOS::FONDI () Thu Jun 05 1997 07:39

RDEV 4.1a server inactivity timer timeout causes ABS remote backup to fail 

ABS OpenVMS Clients Server Configuration, all nodes are Alpha nodes.

   server ABS V2.1			remote nodes: 	Clients ABS V2.1
   VMS V6.2-1H3						VMS V6.2-1H3
   MDMS 2.8A						MDMS 2.8A
   RDEV 4.1a that serves a TL810 for remote nodes	RDEV 4.1a
   DECNET/OSI V6.2 ECO7			                DECNET/OSI V6.2 (no ECO)
 							or DECNET IV it depends
                                                        the client nodes.


Probleme: This customer uses ABS to save disks of each remote nodes to
********* the TL810 connected to the ABS server named LAVX06.
          Often remote backups from any remote nodes (phase 4 or 5)abnormaly 
          stops with
              %BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1:  %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1:  %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1:  -SYSTEM-F-DEVOFFLINE, device is not in configuration or not 
            available

 In the following  LAVX06 is the server node
                   LAVX08 is one remote node (DECnet/OSI V6.2 no ECO) 

 Looking at different log this seems to come from RDEV 4.1a server that 
 deallocates the tape_device served because it has (inactivity timer expired)
 freeing the tapedrive unloading it ... see RDV:079 [06/03-23:15:30.79]  in 
 SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG i posted below

 The RD client tries to reconnect but see the served tape device is already 
 freed =>thus disconnecting thus aborting the remote backup with 
        -SYSTEM-F-DEVOFFLINE, device is not in configuration or not available
 (see :79  [06/03-23:17:23.77] in SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
  and lavx08::ABS_LOG:X08RMS.LOG the log of the save request showing the error)

Questions:
**********
	1)What to do to increase the RDEV 4.1a server inactivity timeout ?   
	  (actually how many seconds is it ?)
        2)Could it be a DECNET problem performance, or DECNET/OSI not           
          up-to-date?
          (cutomer will install DECNET/OSI 6.3 ECO6 on every node)


   Thank you
   Henri


below are:
1) log of the save request   lavx08::ABS_LOG:X08RMS.LOG
2) RDserver log     SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG
3) RDclient log     SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
4) opcom message seen on LVXC08 at time of error  LAVX08_operator.log




1)
                *******************************************
                FICHIER DE LOG : lavx08::ABS_LOG:X08RMS.LOG
                *******************************************

$ vl=f$verify(0) ! SYLOGIN.COM - debut
Executing ABS LOGIN.COM
Completed execution of ABS LOGIN.COM
"@abs_system:coordinator.com 383531EA-C5F2-11D0-87FE-4C4156583036" Executing, o
------------------------------------------------------
---------------------------------------------------------------
Starting New Request at  3-JUN-1997 20:13:07.89
   Name:   X08RMS
   UID:    383531EA-C5F2-11D0-87FE-4C4156583036

COORDINATOR:  Attempting to allocate volume set AGA515...
COORDINATOR:  Mounting volume set member: AGA515 RVN 1
COORDINATOR:     (Selected drive LAVX08$RDEVA0:)
%MOUNT-I-MOUNTED, AGA515 mounted on _LAVX08$RDEVA0:
COORDINATOR:  Skipping LAVX08$RDEVA0: to End of Tape...
THREAD #1:

Operation #1 starting at  3-JUN-1997 20:17:02.81

   Data Movement Type:    INCREMENTAL_SAVE
   Incremental Level:     Level 6 Operation

   Object Set:
      Object Type:        VMS FILES
      Include List:       DKA100:
      Exclude List:

   Archive Information:
      Storage Class Name: X08RMS
      Saveset Location:   AGA515
      Saveset Name:       3JUN199720130855.

   Execution Environment:
      Name:               X08RMS_ENV
      Number of retries:  0
      Retry Interval:     0 minute(s)


THREAD #1:  $
THREAD #1:  SET NOON
THREAD #1:  $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
THREAD #1:  $ IF (VERSION .eqs. "V6.1") THEN $DEFINE BACKUP ABS$SYSTEM:ALTERNAT
THREAD #1:  $ DEFINE SYS$COM
THREAD #1:  MAND SYS$INPUT:
THREAD #1:  $ BACKUP DKA100:[000000...]* -
THREAD #1:  _$  -
THREAD #1:  _$ /LIST=_MBA176:/FULL -
THREAD #1:  _$ /RECORD -
THREAD #1:  _$ /IGNORE=(INTERLOCK) -
THREAD #1:  _$ /NOCRC/NOVERIFY/CRC -
THREAD #1:  _$  -
THREAD #1:  _$ /SINCE=" 2-JUN-1997 20:15:14.93" -
THREAD #1:  _$  -
THREAD #1:  _$ /MODIF -
THREAD #1:  _$ LAVX08$R
THREAD #1:  DEVA0:3JUN199720130855./SAVE -
THREAD #1:  _$ /STOR=V2SLS/NOASSIST -
THREAD #1:  _$ /EXACT_ORDER
THREAD #1:  HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1:  HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1:  HDR2F0819208192                     M             00               
THREAD #1:  HDR2F0819208192                     M             00               
THREAD #1:  %BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1:  %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1:  %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1:  -SYSTEM-F-DEVOFFLINE, device is not in configuration or not availab
THREAD #1:  $
COORDINATOR:  Retiring volume set AGA515 (due to fatal error during save)
COORDINATOR:  Dismounting volume set member: AGA515 RVN 1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 1563, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 1307, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 7971, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1:  Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error
THREAD #1:     %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
THREAD #1:  Failed to deallocate volume set
COORDINATOR:  Skipping LAVX08$RDEVA0: to End of Tape...
COORDINATOR:  Dismounting volume set member: AGA515 RVN 1
COORDINATOR:  Failed to access Storage Class
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972013
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 1309, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 801, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 704, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 469, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
COORDINATOR:  Facility ABS:  ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR:     Line = 1717, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate 
COORDINATOR:     Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR:  Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR:     %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
COORDINATOR:  Facility ABS:  ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR:     Line = 4819, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR:  Facility ABS:  ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR:     Line = 507, File = RESD$:[SRC]TAPE_OPERATIONS.C;1
COORDINATOR:  Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR:     %SYSTEM-F-DEVOFFLINE, device is not in configuration or not av
COORDINATOR:  Final status is Invalid archive file system specified
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1277, File = RESD$:[SRC]COORDINATOR.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1772, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 916, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 811, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1667, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1895, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error in diag blo
   %LIB-E-ACTIMAGE, error activating image !AS
%SYSTEM-F-OPINCOMPL, operation is incomplete
  ABS          job terminated at  3-JUN-1997 23:19:12.23
<CR><LF>  Accounting information:
  Buffered I/O count:             957         Peak working set size:  12960
  Direct I/O count:               345         Peak page file size:    67680
  Page faults:                   1141         Mounted volumes:            2
  Charged CPU time:           0 00:00:03.29   Elapsed time:     0 03:06:16.09


2)
type SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG

                RDV:079 [06/03-20:11:23.19]  Client connecting
                RDV:079 [06/03-20:11:23.38] ( RCN) client DECnet addr   (get_cli
ent_addr() 0)
                RDV:079 [06/03-20:11:23.56]  ALLOCATE rqst        (from LAVX08::
ABS)
 lavx06$mkc500  RDV:079 [06/03-20:11:23.69]  Allocated
                RDV:080 [06/03-21:10:47.01]  Client connecting
                RDV:080 [06/03-21:10:47.26] ( RCN) client DECnet addr   (get_cli
ent_addr() 0)
                RDV:080 [06/03-21:10:47.40]  ALLOCATE rqst        (from LAVX09::
ABS)
....
 lavx06$mkc500  RDV:079 [06/03-23:15:30.79]  Deallocating         (inactivity ti
mer expired)
 lavx06$mkc500  RDV:079 [06/03-23:15:30.88] ( RCN) LL_Done              (chl 320
, _NET6623:)
 lavx06$mkc500  RDV:079 [06/03-23:15:30.88] ( RCN) Freeing              (chl 320
, _NET6623:)
 lavx06$mkc500  RDV:079 [06/03-23:15:30.89]  Unloading tape
 lavx06$mkc500  RDV:079 [06/03-23:15:30.89] ( RCN) Freeing              (chl 320
, _NET6623:)
 lavx06$mkc500  RDV:079 [06/03-23:15:31.04]  Deallocated


3)
LAVX08>TYPE SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25

$ vl=f$verify(0) ! SYLOGIN.COM - debut
RDEV - VMS Remote Device Facility (Version V4.1a) - --RDclient-- Procedure
Copyright (c) 1990-1996 by Touch Technologies, Inc.

            :    [06/03-09:55:45.00] RDCLIENT startup     (V4.1)
            :    [06/03-09:55:45.06] network connect ready (_MBA52:)
            :    [06/03-09:55:45.25] OPCOM relay ready    (Pty=_FTA1:, Mbx=_MBA5
4:)
            :    [06/03-09:55:45.25] Compression DISABLED for this client ()
            :    [06/03-09:55:45.25] Startup complete     ()
LAVX06:: RDV:    [06/03-20:13:15.55] Connecting to server
            :79  [06/03-23:17:23.77] 23:17:23.77 80F60D80 NRP done w/error statu
s 20E4, iocount 59, flags 00050035
            :79  [06/03-23:17:23.78] 23:17:23.77 80F60D80 NRP error status 20E4
flags 00058035
            :79  [06/03-23:17:23.78] 23:17:23.77 Attempting reconnect.  Old: 80E
8AEC0 netucb  80E8AF84 netwcb
LAVX06:: RDV:79  [06/03-23:17:23.78] DECnet link error    (reconnect requested)
            :79  [06/03-23:17:23.80] 23:17:23.77 SEND_RCNCT_CMD ok
LAVX06:: RDV:79  [06/03-23:17:33.57] Attempting reconnect (attempt 1)
LAVX06:: RDV:79  [06/03-23:17:33.62] ...sending rcnct data
LAVX06:: RDV:79  [06/03-23:17:33.78] Served device already freed; server rejecte
d reconnect (%SYSTEM-F-ABORT, abort)
LAVX06:: RDV:79  [06/03-23:17:33.78] Client done with link to server
LAVX06:: RDV:79  [06/03-23:17:33.80] Disconnecting from server
LAVX06:: RDV:79  [06/03-23:17:33.85] Failed resp to pkt from server (%NONAME-W-N
OMSG, Message number 7FF9C9E0)
            :    [06/03-23:17:33.86] Disconnect request, bad sl ()


4)
%%%%%%%%%%%  OPCOM   3-JUN-1997 22:55:32.13  %%%%%%%%%%%
Logfile time stamp

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:17:33.79  %%%%%%%%%%%
Message from user ABS on LAVX08
   ABS FATAL ERROR:  %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972
0130855.;
   X08RMS, thread 1   X08RMS, thread 1
   Include spec = (VMS FILES) DKA100:
   Last VMS FILES processed:  [BASES.A]DYNAMIQUE_I_1.DBF

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:18:22.31  %%%%%%%%%%%
Message from user ABS on LAVX08
   ABS FATAL ERROR:  Failed to deallocate volume set
   X08RMS, thread 1   X08RMS, thread 1
   Include spec = (VMS FILES) DKA100:
   Last VMS FILES processed:  [BASES.A]DYNAMIQUE_I_1.DBF

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:19:11.83  %%%%%%%%%%%
Message from user ABS on LAVX08
   ABS JOB SUMMARY:     X08RMS, thread 1
   %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.;
   Include spec = (VMS FILES) DKA100:
   Last VMS FILES processed:  [BASES.A]DYNAMIQUE_I_1.DBF

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:20:03.49  %%%%%%%%%%%
Message from user ABS on LAVX08
ABS_CLEANUP:  Cleanup performed for aborted job
    Request Name:   X08RMS
    Scheduler Job:  11
    Deleted PID:    000000BD


T.RTitleUserPersonal
Name
DateLines
477.1COOKIE::MHUAThu Jun 05 1997 17:3012
    
    Henri,
    
    We do not know how to get around RDF problems. ABS just let MDMS layer
    handle remote tapes (via RDF). We need to get with MDMS people to see 
    what they say about this.
    
    I would recommend cross-posting the same note to SLS conference to
    get exposue that way.  ABS just interprets the error and fails and
    ABS itself does not cause this failure....
    
    Masami 
477.2< cross posted COOKIE::SLS note 2269.0 >PRSSOS::FONDIFri Jun 06 1997 04:296
	Thank you Masami,

   This note is now cross posted:   note 2269.0 in COOKIE::SLS conference


	Henri