
Conference orarep::nomahs::rdb_60

Title:	Oracle Rdb - Still a strategic database for DEC on Alpha AXP!
Notice:	RDB_60 is archived, please use RDB_70..
Moderator:	NOVA::SMITHISON

Created:	Fri Mar 18 1994
Last Modified:	Thu May 29 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5118
Total number of notes:	28246

4985.0. "SYSTEM-F-ACCVIO from recovery process" by ukvms3.uk.oracle.com::LWILES (Louise Wiles, UK Rdb support) Fri Jan 31 1997 05:20

    Hi,

    Rdb V6.0-14
    VAX VMS V6.1

    A customer of mine found a DBR bugcheck with no exception, but loads of
    ACCVIOs & bugcheck retries.

    There was also an ACCVIO in the monitor log for the recovery process.
    This has happened 2 or 3 times in the past month. Each time, the
    database has recovered transparently - they've not had to shut it down
    etc.

    I just wondered if anyone can shed any light on what's happening.

    The bugcheck has the following in it:

    %SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
    address=00000000, PC=0007BA5E, PSL=03C00000
    Bugcheck retry count is 0, depth is 0
    %SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
    address=00000000, PC=0007BA5E, PSL=03C00000
    Bugcheck retry count is 1, depth is 0

    with bugcheck retries increasing to about 100.
     
    The monitor log has the following entries:

27-JAN-1997 11:38:55.84 - received recovery image termination from 2020031D:1
  - user freed 26 global buffers, 1875 free out of 1875
  - recovery failed
  - starting delete-process shutdown of database <database>
    - "%RDMS-F-DBRABORTED, database recovery process terminated abnormally"
  - database shutdown waiting for recovery to terminate

27-JAN-1997 11:38:56.35 - received request for remote node to join
  - database <database>
  - cluster watcher waiting for MEMBIT lock

27-JAN-1997 11:38:56.38 - received recovery process termination from 2020031D:1
  - final status: "%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=0001AAA3, PC=201C0000, PSL=7FED53CC"
  - recovery failed
  - continuing shutdown of database <database>
  - database shutdown of <database>

    Thanks,
    Louise.

T.R	Title	User	Personal Name	Date	Lines
4985.1		NOVA::R_ANDERSON	Oracle Corporation (603) 881-1935	`Fri Jan 31 1997 07:36`	6
	Hard to say what caused the problem. The PC indicates an accvio in the bugcheck code (duh :-), so without the bugcheck stack trace this is kind of useless... Rick
4985.2	It's a mystery	HOTRDB::PMEAD	Paul, [email protected], 719-577-8032	`Fri Jan 31 1997 09:35`	5
	This looks like another case of the bugcheck code not being able to successfully set the stall message. We see this off and on, but have never been able to determine why. In 6.1 I hacked the code to make sure it could write the stall message before attempting to actually do it. That code will be in the next ECO to 6.1.
4985.3	Accvio @ 201C0000 w/o DMP while RB on VAX ?	NOMAHS::SECRIST	Rdb WWS; [email protected]	`Mon Feb 17 1997 10:45`	26
	; ...looks like another case of the bugcheck code not being able to
	; successfully set the stall message ...
	So if we're bugchecking the bugcheck code, I'd expect there to be no bugchecks? If we get here we're going down at whatever level due to yet another problem, only we're not going to have the information we need to diagnose it, right?
	This VAX customer was dying with an access violation, reason mask 00, virtual address 0001AAA3, PC 201C0000, with a PSL that is garbled in his FAX that I can retrieve if you want it -- but we can only find an entry in RDMMON.LOG and no dump files anywhere. Is this problem always at that PC on a VAX?
	; That code will be in the next ECO to 6.1.
	Which ECO to 6.1 would you expect this to be in? Could this be part of an ECO for V6.0A or a special patch?
	Regards, rcs
4985.4		HOTRDB::PMEAD	Paul, [email protected], 719-577-8032	`Mon Feb 17 1997 11:38`	5
	That PC doesn't make any sense to me so I can't say what is going on. The hack I put in the dumper code would be in V6.1A ECO 1 (V6.1-11). I don't know of any immediate plans to begin work on that ECO at this point in time.
4985.5	Bug 421402	CHSR36::LCONS		`Mon Feb 17 1997 11:45`	4
	Could it be related to bug 421402? With no Stop/id or Ctrl Y the problem has disappeared.
	Louis
4985.6	Are all the cases of this on VAX ?	NOMAHS::SECRIST	Rdb WWS; [email protected]	`Mon Feb 17 1997 12:44`	17
	; That PC doesn't make any sense to me so I can't say what is going on.
	That is the same PC and VA from .0 of this note and that my customer saw today. They were at 6.0-14, so we're going to start by bringing them up to 6.0-16, but if they get this a lot more they're going to be interested in your hack so we can get to the root of the problem. It still happens infrequently and they can't reproduce it at will, but since it takes ~1.5-2 hours to roll everything back when whatever DOES bite them, they're getting real sick of not having a clue what it is.
	Regards, rcs
4985.7		HOTRDB::PMEAD	Paul, [email protected], 719-577-8032	`Mon Feb 17 1997 13:18`	8
	> That is the same PC and VA from .0 of this note and that my
	> customer saw today.
	Oh, the one from the monitor log. Well, that is junk. You need to make sure they don't have any files in SYS$SYSTEM. If you don't find the RDMDBRBUG.DMP files then look for KOD$TT. files. In any case, you might want to look at the bug Louis referred to.
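
Editorial aside, not part of the original thread: the question in .3 and .6 of whether two sites hit the same PC and virtual address can be checked mechanically by pulling the fields out of the ACCVIO text. The following is only a rough Python sketch under stated assumptions. `parse_accvio` and `describe_mask` are hypothetical helper names, the regular expression is fitted only to the messages quoted in .0, and the bit meanings in `REASON_BITS` are an assumption based on the usual VAX description of the access-violation reason mask, so they should be verified against the VAX architecture documentation before being relied on.

```python
import re

# Fitted to the ACCVIO text quoted in .0; tolerates the "adress" spelling
# and a message wrapped across two lines.
ACCVIO_RE = re.compile(
    r"reason mask=(?P<mask>[0-9A-Fa-f]+),\s*virtual\s+add?ress=(?P<va>[0-9A-Fa-f]+),"
    r"\s*PC=(?P<pc>[0-9A-Fa-f]+),\s*PSL=(?P<psl>[0-9A-Fa-f]+)"
)

# ASSUMPTION: VAX reason-mask bit meanings; verify before relying on them.
REASON_BITS = {
    0x01: "length violation",
    0x02: "page table entry reference",
    0x04: "write or modify intent",
}


def parse_accvio(text):
    """Return one dict of integer fields per ACCVIO message found in text."""
    flat = " ".join(text.split())  # undo line wrapping inside a message
    return [
        {name: int(value, 16) for name, value in m.groupdict().items()}
        for m in ACCVIO_RE.finditer(flat)
    ]


def describe_mask(mask):
    """List the (assumed) meanings of the bits set in a reason mask."""
    return [label for bit, label in REASON_BITS.items() if mask & bit]


if __name__ == "__main__":
    bugcheck = """%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
    adress=00000000, PC=0007BA5E, PSL=03C00000"""
    monitor = """%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=0001AAA3, PC=201C0000, PSL=7FED53CC"""
    for label, text in (("bugcheck", bugcheck), ("monitor", monitor)):
        for hit in parse_accvio(text):
            print(label, "PC=%08X VA=%08X" % (hit["pc"], hit["va"]),
                  describe_mask(hit["mask"]))
```

Under the assumed bit meanings, the bugcheck entries (mask 04, virtual address 00000000) would read as an attempted write through a null address, while the monitor log entry carries a different PC and VA, the pair that .7 dismisses as junk.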