Title: | VAX DBMS |
Notice: | THIS NOTESFILE IS NOT A FORMAL SUPPORT CHANNEL |
Moderator: | SCARY::CHARLAND |
Created: | Thu Feb 20 1986 |
Last Modified: | Tue Jun 03 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2642 |
Total number of notes: | 11044 |
6.1-1, ECO 1 not yet applied. The database is configured for 5 journals and is using 3, at 75,000 blocks each. FAST COMMIT is disabled (as seen in the header and in the bugchecks). ALS is enabled, OVERWRITE is enabled.

The customer reports that for the past 3 nights one of her journals has become inaccessible, and she receives an ALS bugcheck with no exception but with AIJUTL$ABORT in the stack:

19-OCT-1995 17:05:27.89: Linked ALS (DBMS) DBM$ALPHA_STD:[KIT]
19-OCT-1995 11:34:06.42: Compiled ALS (KODA) KOD$ALPHA_V0611D:[CODE]
19-OCT-1995 11:33:47.93: Compiled KOD$LIBRARY (KODA) KOD$ALPHA_V0611D:[CODE]
================================================================================
Stack Dump
================================================================================
Saved PC = 00133A38 : STD$DUMP_ALPHA_VMS_STACK + 00000088
Saved PC = 00119044 : KOD$BUGCHECK_DUMP + 00001014
Saved PC = 0006AACC : AIJUTL$ABORT + 000002E4
Saved PC = 00071040 : AIJUTL$SWITCH_FILE + 00000CE8
Saved PC = 000730B4 : AIJUTL$UPDATE_LEOF + 0000043C
Saved PC = 00066A40 : ALS$FLUSH_ONE_CACHE + 00000158
Saved PC = 00066540 : ALS$FLUSH + 00000180
Saved PC = 000662B8 : ALS$MAIN + 000009B8
Saved PC = 9EE69C44 : S0 address
================================================================================

One from the VAX looks like this:

1 00000000
Handler = 0007CC15, PSW = 0000, CALLS = 1, STACKOFFS = 0
Saved AP = 7FE1D2D4, Saved FP = 7FE1D2B4, PC Opcode = DD
SR2 = 00173008: 00000000 00000000 00000000 00009C22 00063A24 000054FE 00000000
SR3 = 0015EA00: 000007A8 00000000 00000000 00000005 0000002C 00000000 00000000
SR4 = 00143700: 00000000 01000000 00000000 00000000 00000001 01C9007C 00000001
SR5 = 00000004
SR6 = FFFFFFFF
SR7 = 00000E10: 0001C2B3 0000454C 49465F45 5441434E 5552545F 4F492449 534F4315
SR8 = 000007A9: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
SR9 = 00000001
16 bytes of stack data from 7FE1D2A4 to 7FE1D2B4:
0025B2100025B2200000000000000001 0000 '........ ²%..²%.'
Saved PC = 0002639F : AIJUTL$ABORT + 00000187
ARG# Argument [data...]
-----------------------------------------------------
1 00000005
2 00000001
Handler = 0002641D, PSW = 0000, CALLS = 1, STACKOFFS = 0
Saved AP = 7FE1D864, Saved FP = 7FE1D828, PC Opcode = D4
SR2 = 0015FA00: FFFFFFFE FFFFFFFF 0000000A 00010001 00110001 00000001 268109D5
SR3 = 00000005

Judging from argument 1, it looks like the AIJ is full. Well, that's OK; at that point we just want to switch to another journal, overwriting it if necessary, so we can continue processing.

A recovery (DBR) bugcheck dump is also produced, and it looks like this:

****************************** SYS$SYSROOT:[SYSEXE]DBMDBRBUG.DMP;4 *****
Exception at 0007ACC4 : DBR$RECOVER + 000009C4
%DBM-F-FILACCERR, error opening run-unit journal file SYS$COMMON:[SYSMGR.SHIPPING.RUJ.ROB]MANDB$009B3CAB4233DBFB.RUJ;1
-RMS-E-FNF, file not found
****************************** SYS$SYSROOT:[SYSEXE]DBMDBRBUG.DMP;3 *****
Exception at 0007ACC4 : DBR$RECOVER + 000009C4
%DBM-F-FILACCERR, error opening run-unit journal file SYS$COMMON:[SYSMGR.SHIPPING.RUJ.ROB]MANDB$009B3CAB39F7183B.RUJ;1
-RMS-E-FNF, file not found
****************************** SYS$SYSROOT:[SYSEXE]DBMDBRBUG.DMP;2 *****
Exception at 0007ACC4 : DBR$RECOVER + 000009C4
%DBM-F-FILACCERR, error opening run-unit journal file SYS$COMMON:[SYSMGR.SHIPPING.RUJ.ROB]MANDB$009B3CAB39F7183B.RUJ;1
-RMS-E-FNF, file not found
****************************** SYS$SYSROOT:[SYSEXE]DBMDBRBUG.DMP;1 *****
Exception at 0007ACC4 : DBR$RECOVER + 000009C4
%DBM-F-FILACCERR, error opening run-unit journal file SYS$COMMON:[SYSMGR.SHIPPING.RUJ.ROB]MANDB$009B3CAB4233DBFB.RUJ;1
-RMS-E-FNF, file not found

The 4 DBR bugchecks were produced only seconds apart from one another.

The only reason I could come up with for overwrite not to overwrite is fast commit being enabled, which it is not in her case. Have I missed something obvious? (Or something subtle...?)

Thanks,
Liz
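For anyone comparing call chains across several of these bugcheck dumps, the routine names can be pulled out of the "Saved PC" lines mechanically. What follows is only a minimal sketch in Python. It assumes the dump text keeps the "Saved PC = <hex> : <routine> + <offset>" layout of the Alpha dump above, and the names `SAVED_PC` and `call_chain` are made up for illustration; they are not part of any DBMS tool.

```python
import re

# Matches lines such as "Saved PC = 0006AACC : AIJUTL$ABORT + 000002E4".
# Lines without a "+ offset" part (for example "S0 address") are skipped.
SAVED_PC = re.compile(
    r"Saved PC = ([0-9A-F]{8}) : ([A-Z0-9_$]+) \+ ([0-9A-F]{8})"
)

def call_chain(dump_text):
    """Return (routine, offset) pairs in the order they appear in the dump."""
    return [(m.group(2), int(m.group(3), 16))
            for m in SAVED_PC.finditer(dump_text)]

# The Alpha stack dump quoted in the note above.
ALPHA_DUMP = """
Saved PC = 00133A38 : STD$DUMP_ALPHA_VMS_STACK + 00000088
Saved PC = 00119044 : KOD$BUGCHECK_DUMP + 00001014
Saved PC = 0006AACC : AIJUTL$ABORT + 000002E4
Saved PC = 00071040 : AIJUTL$SWITCH_FILE + 00000CE8
Saved PC = 000730B4 : AIJUTL$UPDATE_LEOF + 0000043C
Saved PC = 00066A40 : ALS$FLUSH_ONE_CACHE + 00000158
Saved PC = 00066540 : ALS$FLUSH + 00000180
Saved PC = 000662B8 : ALS$MAIN + 000009B8
Saved PC = 9EE69C44 : S0 address
"""

chain = call_chain(ALPHA_DUMP)
for routine, offset in chain:
    print(f"{routine} + {offset:#x}")
```

Run against the dump above, this lists the chain from the dump routines down to ALS$MAIN, which makes it easier to see that the abort is raised from AIJUTL$SWITCH_FILE.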
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
2625.1 | | m5.us.oracle.com::LWILCOX | How about Fireworks? | Tue May 06 1997 15:42 | 7 |
One other tidbit I forgot to mention: she tried to use ROTATE, and the command appeared to hang. She also mentioned that at one point she saw a stall message along the lines of "waiting for AIJ submission". Thanks. Liz | |||||
2625.2 | | NOVA::R_ANDERSON | Oracle Corporation (603) 881-1935 | Tue May 06 1997 21:38 | 15 |
For the benefit of other readers, I'll provide a synopsis of what I sent Liz offline. The ALS switch-over operation failed because 1 (or more) of 3 events occurred: 1. The switch-over timed out (not likely). 2. The switch-over was being performed by DBR (impossible :-) 3. A DBR was invoked after ALS was unable to locate a valid journal for over-writing (most likely). I'm not sure given the available information why the journals could not be over-written (especially because fast-commit is disabled so there are no checkpoint requirements), so further analysis is required. Rick | |||||
2625.3 | ACE *might* be the problem? | M5::LWILCOX | How about Fireworks? | Wed May 21 1997 15:12 | 8 |
Since the AIJ cache on electronic disk (ACE) feature was turned off, the database has not had this problem again. I think I remember seeing something about that somewhere, but I can't find it. Has anyone else had a similar experience (bugchecks for as long as ACE was on)? Thanks. Liz | |||||
2625.4 | More occurrences | M5::DMACKENZ | | Wed May 28 1997 12:57 | 28 |
This problem has occurred two more times. Today's incident occurred with AIJ cache disabled and DBMS at 6.1-11. Bugchecks are on their way; I expect to have more information soon. The first bugcheck occurred on May 4th, DBMS 6.1-1, AIJ cache enabled. No exception in the bugcheck. The AIJ_STATUS is 1 (FILACCERR, right?). The following is noteworthy information about each AIJ file: o Current: Activated May 4th, modified, inaccessible, never backed up, AIJ_STATUS=5 o Last Used: Activated May 4th, modified, 99% full, backup in progress, never backed up, AIJ_STATUS=0 o Oldest: Activated April 28th, modified, 99% full, backup in progress, last backed up April 28th, next to be backed up, AIJ_STATUS=0 I've requested that operator notification be enabled, since it is not at this point. The problem was corrected (disable AIJs, drop AIJs, create AIJs, BU database) before today's problem was reported. Would there be any clues remaining as to why the AIJ was marked inaccessible and why the AIJ backups are not completing? What does an AIJ_STATUS of 5 mean? Any other insights? Thanks, Diane | |||||
2625.5 | AIJ backups not completing | M5::DMACKENZ | | Wed May 28 1997 14:55 | 12 |
I received the ALS bugcheck from May 22nd and took a look at it. DBMS is 6.1-11 and AIJ caching is disabled. Again no exception. Two AIJ files are in the process of being backed up and the third is current and marked full and inaccessible. It appears to me that the current AIJ file is being marked inaccessible because it is full and there are no AIJ files available to write to, since the other two are in the process of being backed up. I'll be looking for clues as to why the backups aren't completing. Would there be any left behind? Diane |
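To make the hypothesis in .5 concrete, here is a toy model in Python. It is not DBMS code and does not claim to describe how ALS actually chooses a journal. It assumes, purely for illustration, that a journal can serve as a switch-over target only if it is not the current journal, is not marked inaccessible, and has no backup in progress. The class, the function names, and the journal labels are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Journal:
    name: str
    current: bool = False
    backup_in_progress: bool = False
    inaccessible: bool = False

def pick_switch_target(journals):
    """Return the first journal that could take over, or None if none qualifies.

    ASSUMPTION: a journal with a backup in progress cannot be overwritten.
    """
    for j in journals:
        if not j.current and not j.backup_in_progress and not j.inaccessible:
            return j
    return None

def switch_over(journals):
    """Try to switch away from the full current journal.

    If no target exists, the current journal is marked inaccessible,
    mirroring the state described in the bugcheck.
    """
    current = next(j for j in journals if j.current)
    target = pick_switch_target(journals)
    if target is None:
        current.inaccessible = True
        return None
    current.current = False
    target.current = True
    return target

# State reported in .4 and .5: both non-current journals stuck in backup.
journals = [
    Journal("CURRENT", current=True),
    Journal("LAST_USED", backup_in_progress=True),
    Journal("OLDEST", backup_in_progress=True),
]
result = switch_over(journals)

# Same starting point, but with the oldest journal's backup completed.
journals_ok = [
    Journal("CURRENT", current=True),
    Journal("LAST_USED", backup_in_progress=True),
    Journal("OLDEST"),
]
result_ok = switch_over(journals_ok)
```

Under these assumed rules the first scenario leaves no switch-over target and the current journal ends up inaccessible, while the second switches to the oldest journal. If the model holds, the question that matters is why the two backups never finish, not why the overwrite fails.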