
Conference csc32::consolemanager

Title:POLYCENTER Console Manager
Notice:Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS:
Moderator:CSC32::BUTTERWORTH
Created:Thu Aug 06 1992
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1541
Total number of notes:6564

817.0. "PCM T1.6-031 archive; bug/access violation" by HLSM04::GERRIT (Gerrit Woertman, UTO, 838-2535) Thu Jun 15 1995 14:57

    
T.R | Title | User | Personal Name | Date | Lines
817.1 | the real problem | HLSM04::GERRIT | Gerrit Woertman, UTO, 838-2535 | Thu Jun 15 1995 14:58 | 1
    
817.2 | and now finally: the problem | HLSM04::GERRIT | Gerrit Woertman, UTO, 838-2535 | Thu Jun 15 1995 15:11 | 135
Hi,

On PCM T1.6-031 I experience strange things with archiving:

When I do: 
$       console archive all/bef="01-jun-1995"/noconfirm

everything seems to be fine.

A $ cons stat/cont gives

Archive in progress (DUPER)

until the end of the archiving-job.
Normally the system being archived is updated continuously.

This system DUPER has 0 blocks in the logfile.

HLCC00-PCM_manager> dir/size=all/date console$logfiles:duper*

Directory CONSOLE$ROOT:[LOG]

DUPER.EVENTS;1             0/0        10-MAY-1995 16:24:33.05
DUPER.LOG;1                0/0        10-MAY-1995 16:24:33.26
DUPER.TIMES;1              0/0        10-MAY-1995 16:24:32.99

Total of 3 files, 0/0 blocks.

Excerpt from the batch-logfile:
Starting archive procedure for system DELTA
Archive procedure for system DELTA completed successfully
Starting archive procedure for system DUPER
Archive for system DUPER complete - log file is empty
Starting archive procedure for system ED
Archive procedure for system ED completed successfully

Restarting the same archive job:
$       console archive all/bef="02-jun-1995"/noconfirm

gives the same for node BAMI

and also this system has 0 blocks in the logfile.

Okay, this is a bug in archiving.

But this is not the complete story:
Both times Controller_2 stops with an access violation, and the systems
BAMI and DUPER are not served by this controller!

There is however another point:
In note 774.0 I mentioned a system for which I couldn't do a console
extract.
Simon Jackson pointed out to me that the user of this system had messed up
the logfile. This same user had done this for 2 other systems.
Good thing: console verify !
Well, last Monday night, just before installing PCM T1.6-031, I deleted
the corrupted logfiles, and PCM simply recreated them.
These 3 systems are served by this controller.
This whole week I have had no problems with PCM, until I did this
console archive.

The controller logfile follows:
HLCC00-PCM_manager> ty console$tmp:controller_02.log
$!
$! This command procedure is always run when anybody on the entire system
$! logs in. It is equivalent to LOGIN.COM except that the instructions
$! contained herein are executed everytime anyone on the VMS system
$! logs in to their account.
$!
$! For interactive processes, turn on Control T, and set the terminal type
$!
$ mode = f$mode()
$ tt_devname = f$trnlnm("TT")
$ session_mgr_login = (mode .eqs. "INTERACTIVE") .and.	-
    (f$locate("WSA",tt_devname) .ne. f$len(tt_devname))
$ session_detached_process = (mode .eqs. "INTERACTIVE") .and. -
    (f$locate("MBA",tt_devname) .ne. f$len(tt_devname))
$ unknown_devtyp = (mode .eqs. "INTERACTIVE") .and. -
    (f$getdvi("sys$command","devtype") .eq. 0) 
$!
$ if (mode .eqs. "INTERACTIVE") .and. unknown_devtyp .and. .not. -
     (session_mgr_login .or. session_detached_process)
$ endif
$!
$ if (mode .eqs. "INTERACTIVE") .and. .not. -
     (session_mgr_login .or. session_detached_process)
$ endif
$!
$! MicroVAX Support Removed from OpenVMS Alpha
$!
$! Place your site-specific LOGIN commands below
$!
$ !
$ ! Start a Child Controller process, name_num 2, child_num 2
$ !
$ CHILD :== $CONSOLE$IMAGE:CONSOLE$DAEMON.EXE
$ CHILD "child" 2
POLYCENTER Console Manager
Console Controller Daemon Version T1.6-031
Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved

%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=00000104, PC=0008082C, PS=0000001B

  Improperly handled condition, image exit forced.
    Signal arguments:   Number = 00000005
                        Name   = 0000000C
                                 00000000
                                 00000104
                                 0008082C
                                 0000001B

    Register dump:
    R0  = 0000000000000000  R1  = 0000000000000001  R2  = 0000000000016CE0
    R3  = 0000000000000000  R4  = 0000000000030090  R5  = 0000000000000000
    R6  = 0000000000091160  R7  = 0000000000000001  R8  = 000000007FF9C208
    R9  = 000000007FF9C410  R10 = 000000007FF9D198  R11 = 000000007FFBE3E0
    R12 = 0000000000000000  R13 = 000000007F9FE820  R14 = 0000000000000000
    R15 = 0000000500000000  R16 = 0000000000000000  R17 = 00000000000200D8
    R18 = 000000007FE74020  R19 = 0000000000000000  R20 = 000000007F955370
    R21 = 0000000000000000  R22 = 0000000000000010  R23 = 000000007FB640B0
    R24 = 0000000000000004  R25 = 0000000000000001  R26 = 0000000000074AB8
    R27 = 0000000000016618  R28 = 000000000004C000  R29 = 000000007F955570
    SP  = 000000007F955570  PC  = 000000000008082C  PS  = 300000000000001B
  SYSTEM       job terminated at 15-JUN-1995 11:02:17.77
  Accounting information:
  Buffered I/O count:         1016935         Peak working set size:   4976
  Direct I/O count:            141382         Peak page file size:    20272
  Page faults:                    544         Mounted volumes:            0
  Charged CPU time:           0 00:23:20.61   Elapsed time:     2 11:47:07.51


regards,

Gerrit
    
817.3 | error in releasing locks | HLSM04::GERRIT | Gerrit Woertman, UTO, 838-2535 | Thu Jun 15 1995 19:48 | 43
Hi,

I have been looking at some timestamps:

The controller process dies just after having archived all the systems.
This means that the access violation has something to do with
releasing locks.

In my original note (.2; my fault, I used get instead of include)
I mentioned two access violations, and yes, looking at the timestamps
this is reproducible.
If there is a patch, I can volunteer to test it.

regards,

Gerrit

HLCC00-PCM_manager> dir console$archive:/since/bef=11:25/date
Directory SYS$SYSDEVICE:[CONSOLE.ARCHIVE]

ALFA.7001010000_TO_9506010000_EVENTS;1
                     15-JUN-1995 09:46:54.11
ALFA.7001010000_TO_9506010000_LOG;1
                     15-JUN-1995 09:46:54.20
ALFA.7001010000_TO_9506010000_TIMES;1
                     15-JUN-1995 09:46:53.97
	.
	.
	.
WIT.9505162250_TO_9506010000_EVENTS;1
                     15-JUN-1995 11:02:09.92
WIT.9505162250_TO_9506010000_LOG;1
                     15-JUN-1995 11:02:09.99
WIT.9505162250_TO_9506010000_TIMES;1
                     15-JUN-1995 11:02:09.86
ZOEF.9505162250_TO_9506010000_EVENTS;1
                     15-JUN-1995 11:02:15.20
ZOEF.9505162250_TO_9506010000_LOG;1
                     15-JUN-1995 11:02:14.78
ZOEF.9505162250_TO_9506010000_TIMES;1
                     15-JUN-1995 11:02:15.01

Total of 294 files.
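
The timestamp comparison above can be checked with a short script. This is only a sketch in Python, not anything PCM provides: it takes the creation times of the last archive files in the listing (system ZOEF) and the "job terminated at" time from the controller logfile in .2, and prints the gap between them. The helper name parse_vms_time is made up for this example, and it assumes English month abbreviations.

```python
from datetime import datetime

# Assumption: timestamps are VMS absolute times with English month
# abbreviations, e.g. "15-JUN-1995 11:02:15.20".
VMS_TIME_FORMAT = "%d-%b-%Y %H:%M:%S.%f"


def parse_vms_time(text):
    """Parse a VMS absolute time string into a datetime (hypothetical helper)."""
    # .title() turns "JUN" into "Jun"; the digits are left unchanged.
    return datetime.strptime(text.strip().title(), VMS_TIME_FORMAT)


# Creation times of the last archive files in the listing (system ZOEF).
last_archive_files = [
    "15-JUN-1995 11:02:15.20",  # ZOEF ..._EVENTS;1
    "15-JUN-1995 11:02:14.78",  # ZOEF ..._LOG;1
    "15-JUN-1995 11:02:15.01",  # ZOEF ..._TIMES;1
]

# "job terminated at" line from controller_02.log in note 817.2.
controller_exit = "15-JUN-1995 11:02:17.77"

last_write = max(parse_vms_time(t) for t in last_archive_files)
gap = parse_vms_time(controller_exit) - last_write

print(f"Controller exited {gap.total_seconds():.2f} s after the last archive file")
# prints: Controller exited 2.57 s after the last archive file
```

A gap of under three seconds fits the observation that the controller dies right after the last system is archived. It does not show on its own which step fails.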