
Conference orarep::nomahs::rdb_60

Title:	Oracle Rdb - Still a strategic database for DEC on Alpha AXP!
Notice:	RDB_60 is archived, please use RDB_70..
Moderator:	NOVA::SMITHISON

Created:	Fri Mar 18 1994
Last Modified:	Thu May 29 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5118
Total number of notes:	28246

5106.0. "rdb 7.0 aij busy why?" by ukvms3.uk.oracle.com::SHISCOCK (stand and deliver) Wed Mar 05 1997 12:07

    Hi,
    
    I have a database that hangs during an AIJ backup with the message
    "waiting for busy AIJ sequence 24"
    for no apparent reason. I've run through the same scenario on other
    databases and they're fine, so I've no idea why this hang is
    occurring.
    
    restored from a backup.
    Alter database to reserve 5 journal slots.
    Alter database to enable ALS, add 5 journals and enable fast commit.
    rmu/backup offline
    rmu/set after/switch     <---   empty aij
    rmu/back/after/quiet/nowait db_name aij_back_file
    
    Database is not opened so no users are in.
    
    AIJ info screen shows
    
    S_AIJ1                              24 *BACKUP NEEDED* Written Backing up
    S_AIJ2                              25     512       2 Current Accessible
    
    The backing up process is stuck in LEF. Current PC is 800A1308. There
    are no waiting or blocking locks.
    
    If I then disable fast commit it all works.
    
    
    rdb 7.0 axp/vms 7.0
    
    Any ideas,
    
    Steve

T.R	Title	User	Personal Name	Date	Lines
5106.1		NOVA::R_ANDERSON	Oracle Corporation (603) 881-1935	`Wed Mar 05 1997 13:05`	33
	Hmmm - works for me... I restored from backup, added 5 journals (20 slots), ALS enabled, ABS disabled, fast-commit enabled - see below...

	What process does SHOW STATS "Checkpoint Information" screen indicate is holding the AIJ journal 24???

	Rick

	ALL> rmu/backup mf_personnel mf_personnel.rbf
	ALL> rmu/set after/switch mf_personnel
	ALL> rmu/backup/after/quiet/nowait/log mf_personnel backup.aij
	%RMU-I-AIJBCKBEG, beginning after-image journal backup operation
	%RMU-I-OPERNOTIFY, system operator notification: Oracle Rdb Database KODH$:[R_ANDERSON.WORK.ALS]MF_PERSONNEL.RDB;1 Event Notification
	AIJ backup operation started
	%RMU-I-AIJBCKSEQ, backing up after-image journal sequence number 4
	%RMU-I-LOGBCKAIJ, backing up after-image journal RICK4 at 13:02:24.97
	%RMU-I-LOGCREBCK, created backup file KODH$:[R_ANDERSON.WORK.ALS]BACKUP.AIJ;4
	%RMU-I-AIJBCKSEQ, backing up after-image journal sequence number 5
	%RMU-I-LOGBCKAIJ, backing up after-image journal RICK5 at 13:02:25.91
	%RMU-I-QUIETPT, waiting for database quiet point
	%RMU-I-OPERNOTIFY, system operator notification: Oracle Rdb Database KODH$:[R_ANDERSON.WORK.ALS]MF_PERSONNEL.RDB;1 Event Notification
	AIJ backup operation completed
	%RMU-I-AIJBCKEND, after-image journal backup operation completed successfully
	%RMU-I-LOGAIJJRN, backed up 2 after-image journals at 13:02:26.92
	%RMU-I-LOGAIJBLK, backed up 508 after-image journal blocks at 13:02:26.92
	ALL>
5106.2	no clues in show stats	ukvms3.uk.oracle.com::SHISCOCK	stand and deliver	`Thu Mar 06 1997 03:40`	13
	Checkpoint info showed nothing. The checkpoint info (unsorted) just had the 1 line with the process id of the backup process and an entry beneath QuietVno.

	I've run through identical scenarios with other databases and they work fine. Any other ideas how I can find out why this one stalls?

	thanks,

	Steve
5106.3		NOVA::R_ANDERSON	Oracle Corporation (603) 881-1935	`Thu Mar 06 1997 08:42`	23
	What exactly does the SHOW STATS "Active Stall Messages Screen" display when the AIJ backup is waiting for available journal?

	Here's something to try:

	1. $ define/sys RDM$BIND_ABS_LOG_FILE "device:[directory]RDMABS70_PID.LOG"
	2. $ rmu/set after/backup=(automatic,backup_file=aij_back_file) db_name
	3. $ rmu/set after/switch db_name

	Once the ABS backup fails, examine the logfile. Note that the "_PID" portion of the filename will have been replaced with the ABS process' PID, so the filename will be something like "RDMABS70_12345678.LOG".

	You should see a line (or several) that say "waiting for journal 24 (oldest checkpoint X:Y)". I would expect the "X:Y" to be something like "24:2".

	RMU/DUMP/HEADER should list the active users, including their current checkpoint information. If this is NOT the case, use the RMU/DUMP/HEADER/OPTION=DEBUG command and search for an occurrence of "CKPT_VNO = n." where "n <> -1". This is the process stopping the AIJ backup. (Note that "n" should match the "X" above).

	Rick
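	The search for "CKPT_VNO = n." where "n <> -1" could be scripted once the debug dump has been saved to a text file. The following is only a minimal sketch: the pattern is assumed from the "CKPT_VNO = 23." excerpt quoted elsewhere in this thread, the real dump layout may differ, and the function name is hypothetical. It also treats 4294967295 as -1, since that is how the value appears in the dump quoted in this thread.

```python
import re

# Pattern assumed from the excerpt quoted in this thread ("CKPT_VNO = 23.");
# the actual RMU/DUMP/HEADER/OPTION=DEBUG layout may differ.
CKPT_RE = re.compile(r"CKPT_VNO\s*=\s*(-?\d+)\.")

# "No checkpoint" marker: -1, which can also print as unsigned 4294967295.
NO_CHECKPOINT = {-1, 0xFFFFFFFF}


def find_active_checkpoints(dump_text):
    """Return every CKPT_VNO value that is not the 'no checkpoint' marker."""
    values = (int(m.group(1)) for m in CKPT_RE.finditer(dump_text))
    return [v for v in values if v not in NO_CHECKPOINT]


# Sample text copied from the values reported in this thread.
sample = "CKPT_VNO = 23. DBID = 1.\nCKPT_VNO = 4294967295. DBID = 2.\n"
print(find_active_checkpoints(sample))
```

	Any value the sketch reports is a candidate for the "X" in the "oldest checkpoint X:Y" log line.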
5106.4	thanks	ukvms3.uk.oracle.com::SHISCOCK	stand and deliver	`Fri Mar 07 1997 02:21`	23
	Many thanks Rick.

	From the ABS log

	7-MAR-1997 02:12:35.66 - Oldest RCS checkpoint found 23:2

	which repeated itself until the backup timed out. The dump/head/opt=debug gave the 23. No active users except the backup process.

	CKPT_VNO = 23. DBID = 1.
	CKPT_VNO = 4294967295. DBID = 2.

	The second ckpt 4294967295 had a dozen or more occurrences against different DBIDs. Certainly a big gap in the numbers, which I presume to be incremental. Does that mean a really old checkpoint failed and it's never been cleared?

	cheers,

	Steve
5106.5		NOVA::R_ANDERSON	Oracle Corporation (603) 881-1935	`Fri Mar 07 1997 07:04`	8
	Ignore the 4294967295 (that's a fancy "-1" :-). Those are "no initial checkpoint" indications. For the RTUPB entry with the CKPT_VNO=23, what is the process indicated by the corresponding PID? Could you post the entire RTUPB entry, please? (just the entry with the CKPT_VNO=23). Rick
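	The 'fancy "-1"' remark is ordinary two's-complement arithmetic: 4294967295 is 2^32 - 1, the bit pattern of -1 in a 32-bit field printed as unsigned. A short sketch of the reinterpretation follows; the 32-bit field width is an assumption based on that value, and the helper name is hypothetical.

```python
import struct


def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


print(as_signed32(4294967295))  # -1
print(as_signed32(23))          # 23
```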
5106.6	no process reference	ukvms3.uk.oracle.com::SHISCOCK	stand and deliver	`Fri Mar 07 1997 07:48`	21
	There is no rtupb with 23 in it. There's only 1 in use with any detail and that's the first and that belongs to the backup process. The rest are

	RTUPB_ENT[2.] @00C91840
	00000000000000000000000000000000 0000 '................'
	:::: (1 duplicate line)

	There are only 30 user slots.

	A word of warning: this database has been tested against the FT release of ECO1 for 7.0.

	cheers,

	Steve
5106.7		NOVA::R_ANDERSON	Oracle Corporation (603) 881-1935	`Mon Mar 10 1997 08:53`	18
	An examination of the database header dump has revealed that a few important details were missing from previous descriptions.

	1. Record caching is enabled (a major missing detail).

	2. While "fast commit" was disabled (implicitly disabling record cache), 3 AIJ journals were backed up (26 is now "current" - 23, 24 & 25 backed up).

	3. After re-enabling "fast commit" (implicitly re-enabling record cache), the record cache checkpoint remained at "23" even though the "current" AIJ journal is "26".

	4. The database open mode is "manual" but it appears that the AIJ backup operation (the failing one) is being done while the database is closed.

	I would recommend manually opening the database before performing the AIJ backup operation. Also, before starting the AIJ backup operation, verify using the "Checkpoint Information" SHOW STATS screen that the record cache checkpoint was advanced successfully.

	Rick
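	The effect of a checkpoint that has not advanced can be illustrated with a toy model. This is an inference from the thread, not documented Rdb behaviour: it assumes a journal cannot be backed up while the oldest checkpoint still points at that journal or an earlier one. The function name and the sequence numbers are illustrative only.

```python
def journals_held(checkpoint_seq, journal_seqs):
    """Return the journal sequence numbers at or after the oldest checkpoint.

    Assumption inferred from this thread: these journals cannot be backed up
    until the checkpoint advances past them.
    """
    return [seq for seq in journal_seqs if seq >= checkpoint_seq]


# Checkpoint stuck at 23 while journals 24 and 25 await backup.
print(journals_held(23, [24, 25]))
# Checkpoint advanced to 25: journal 24 is no longer held.
print(journals_held(25, [24, 25]))
```

	Under that assumption, a record cache checkpoint left at 23 would hold every later journal, which matches the stall on sequence 24.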