T.R | Title | User | Personal Name | Date | Lines |
---|
5106.1 | | NOVA::R_ANDERSON | Oracle Corporation (603) 881-1935 | Wed Mar 05 1997 13:05 | 33 |
| Hmmm - works for me... I restored from backup, added 5 journals (20 slots), ALS
enabled, ABS disabled, fast-commit enabled - see below...
What process does the SHOW STATS "Checkpoint Information" screen indicate is
holding AIJ journal 24?
Rick
ALL> rmu/backup mf_personnel mf_personnel.rbf
ALL> rmu/set after/switch mf_personnel
ALL> rmu/backup/after/quiet/nowait/log mf_personnel backup.aij
%RMU-I-AIJBCKBEG, beginning after-image journal backup operation
%RMU-I-OPERNOTIFY, system operator notification: Oracle Rdb Database KODH$:[R_AN
DERSON.WORK.ALS]MF_PERSONNEL.RDB;1 Event Notification
AIJ backup operation started
%RMU-I-AIJBCKSEQ, backing up after-image journal sequence number 4
%RMU-I-LOGBCKAIJ, backing up after-image journal RICK4 at 13:02:24.97
%RMU-I-LOGCREBCK, created backup file KODH$:[R_ANDERSON.WORK.ALS]BACKUP.AIJ;4
%RMU-I-AIJBCKSEQ, backing up after-image journal sequence number 5
%RMU-I-LOGBCKAIJ, backing up after-image journal RICK5 at 13:02:25.91
%RMU-I-QUIETPT, waiting for database quiet point
%RMU-I-OPERNOTIFY, system operator notification: Oracle Rdb Database KODH$:[R_AN
DERSON.WORK.ALS]MF_PERSONNEL.RDB;1 Event Notification
AIJ backup operation completed
%RMU-I-AIJBCKEND, after-image journal backup operation completed successfully
%RMU-I-LOGAIJJRN, backed up 2 after-image journals at 13:02:26.92
%RMU-I-LOGAIJBLK, backed up 508 after-image journal blocks at 13:02:26.92
ALL>
|
5106.2 | no clues in show stats | ukvms3.uk.oracle.com::SHISCOCK | stand and deliver | Thu Mar 06 1997 03:40 | 13 |
|
Checkpoint info showed nothing.
The checkpoint info (unsorted) just had the 1 line with the process
id of the backup process and an entry beneath QuietVno.
I've run through identical scenarios with other databases and they
work fine.
Any other ideas how I can find out why this one stalls?
thanks,
Steve
|
5106.3 | | NOVA::R_ANDERSON | Oracle Corporation (603) 881-1935 | Thu Mar 06 1997 08:42 | 23 |
| What *exactly* does the SHOW STATS "Active Stall Messages Screen" display when
the AIJ backup is waiting for available journal?
Here's something to try:
1. $ define/sys RDM$BIND_ABS_LOG_FILE "device:[directory]RDMABS70_PID.LOG"
2. $ rmu/set after/backup=(automatic,backup_file=aij_back_file) db_name
3. $ rmu/set after/switch db_name
Once the ABS backup fails, examine the logfile. Note that the "_PID" portion of
the filename will have been replaced with the ABS process' PID, so the filename
will be something like "RDMABS70_12345678.LOG".
You should see one or more lines that say "waiting for journal 24 (oldest
checkpoint X:Y)". I would expect the "X:Y" to be something like "24:2".
RMU/DUMP/HEADER should list the active users, including their current checkpoint
information. If this is NOT the case, use the RMU/DUMP/HEADER/OPTION=DEBUG
command and search for an occurrence of "CKPT_VNO = n." where "n <> -1". This
is the process stopping the AIJ backup. (Note that "n" should match the "X"
above).
Rick
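
The search described above (find any "CKPT_VNO = n." where "n <> -1") can be
sketched in Python. This is only an illustration, not part of RMU: the function
name is hypothetical, the line shape is assumed from dump excerpts of the form
"CKPT_VNO = 23. DBID = 1.", and it assumes that -1 may also be printed as the
unsigned value 4294967295.

```python
import re

# Assumed line shape, e.g. "CKPT_VNO = 23. DBID = 1." (not a documented format).
CKPT_RE = re.compile(r"CKPT_VNO\s*=\s*(-?\d+)\.")

# Assumption: "no checkpoint" appears as -1, or as -1 printed unsigned (32-bit).
NO_CHECKPOINT = {-1, 0xFFFFFFFF}


def find_blocking_checkpoints(dump_text):
    """Return every CKPT_VNO value that is not the 'no checkpoint' marker."""
    found = []
    for line in dump_text.splitlines():
        match = CKPT_RE.search(line)
        if match:
            vno = int(match.group(1))
            if vno not in NO_CHECKPOINT:
                found.append(vno)
    return found


# Hypothetical sample in the assumed shape.
sample = """\
CKPT_VNO = 23. DBID = 1.
CKPT_VNO = 4294967295. DBID = 2.
"""
print(find_blocking_checkpoints(sample))
```

Any value it returns is a candidate for the checkpoint holding up the AIJ
backup, and should match the "X" reported in the ABS log.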
|
5106.4 | thanks | ukvms3.uk.oracle.com::SHISCOCK | stand and deliver | Fri Mar 07 1997 02:21 | 23 |
|
Many thanks Rick.
From the ABS log
7-MAR-1997 02:12:35.66 - Oldest RCS checkpoint found 23:2
which repeated itself until the backup timed out.
The dump/head/opt=debug gave the 23. No active users except the
backup process.
CKPT_VNO = 23. DBID = 1.
CKPT_VNO = 4294967295. DBID = 2.
The second value, 4294967295, had a dozen or more occurrences against
different DBIDs. That is certainly a big gap in the numbers, which I presume
to be incremental. Does that mean a really old checkpoint failed
and it's never been cleared?
cheers,
Steve
|
5106.5 | | NOVA::R_ANDERSON | Oracle Corporation (603) 881-1935 | Fri Mar 07 1997 07:04 | 8 |
| Ignore the 4294967295 (that's a fancy "-1" :-). Those are "no initial
checkpoint" indications.
For the RTUPB entry with the CKPT_VNO=23, what is the process indicated by the
corresponding PID? Could you post the entire RTUPB entry, please? (just the
entry with the CKPT_VNO=23).
Rick
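
To see why 4294967295 is a "fancy -1": it is the bit pattern of -1 read as an
unsigned 32-bit integer. A minimal Python sketch, assuming the field is 32 bits
wide (the helper name is hypothetical):

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as signed two's complement."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the signed reading is value - 2**32.
    return value - 0x100000000 if value & 0x80000000 else value


print(as_signed32(4294967295))  # -1
print(as_signed32(23))          # 23
```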
|
5106.6 | no process reference | ukvms3.uk.oracle.com::SHISCOCK | stand and deliver | Fri Mar 07 1997 07:48 | 21 |
|
There is no RTUPB with 23 in it. Only one is in use with any detail:
the first, which belongs to the backup process.
The rest are
RTUPB_ENT[2.] @00C91840
00000000000000000000000000000000 0000 '................'
:::: (1 duplicate line)
There are only 30 user slots.
A word of warning: this database has been tested against
the FT release of ECO1 for 7.0.
cheers,
Steve
|
5106.7 | | NOVA::R_ANDERSON | Oracle Corporation (603) 881-1935 | Mon Mar 10 1997 08:53 | 18 |
| An examination of the database header dump has revealed that a few important
details were missing from previous descriptions.
1. Record caching is enabled (a *major* missing detail).
2. While "fast commit" was disabled (implicitly disabling record cache), 3 AIJ
journals were backed up (26 is now "current" - 23, 24 & 25 backed up).
3. After re-enabling "fast commit" (implicitly re-enabling record cache), the
record cache checkpoint remained at "23" even though the "current" AIJ journal
is "26".
4. The database open mode is "manual" but it appears that the AIJ backup
operation (the failing one) is being done while the database is closed.
I would recommend manually opening the database before performing the AIJ backup
operation. Also, before starting the AIJ backup operation, verify using the
"Checkpoint Information" SHOW STATS screen that the record cache checkpoint was
advanced successfully.
Rick
|