T.R | Title | User | Personal Name | Date | Lines |
---|
4961.1 | | DUCATI::LASTOVICA | Is it possible to be totally partial? | Fri Jan 24 1997 19:25 | 3 |
| presuming that they are already at the current version of Rdb (you
didn't mention it), it might be interesting to dump the AIJ file and
then look for all references to the DBKEY(s) of interest.
|
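A rough sketch of the "look for all references to the DBKEY(s) of interest" step, assuming the AIJ has already been dumped to a plain text file. The `area:page:line` dbkey notation and the sample line layout are assumptions for illustration, and `find_dbkey_refs` is a hypothetical helper, not part of any Rdb tool.

```python
import re
from typing import Iterable, Iterator, Tuple


def find_dbkey_refs(lines: Iterable[str],
                    dbkeys: Iterable[str]) -> Iterator[Tuple[int, str, str]]:
    """Yield (line_number, dbkey, text) for each dump line mentioning a dbkey.

    Assumption: dbkeys appear in the dump as 'area:page:line' text.
    The lookarounds stop '56:1234:5' from matching inside '56:1234:55'
    or '156:1234:5'.
    """
    patterns = {
        key: re.compile(r"(?<![\d:])" + re.escape(key) + r"(?![\d:])")
        for key in dbkeys
    }
    for lineno, line in enumerate(lines, start=1):
        for key, pattern in patterns.items():
            if pattern.search(line):
                yield lineno, key, line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical dump lines; a real AIJ dump will look different.
    sample = [
        "TSN 100 dbkey 56:1234:5 modified",
        "TSN 101 dbkey 56:1234:55 stored",
    ]
    for lineno, key, text in find_dbkey_refs(sample, ["56:1234:5"]):
        print(f"{lineno}: [{key}] {text}")
```

Reading the hits in file order gives the sequence of journal entries that touched each dbkey, which is the history that would be needed here.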
4961.2 | | M5::PSOEHL | Go see THE RELIC!!! | Fri Jan 24 1997 20:03 | 6 |
| You're right I didn't say what version: it's 6.1-4.
As I did mention in my note, I would like to see the AIJ as well, but
regrettably they blew it away.
Thanks
|
4961.3 | | HOTRDB::LASTOVICA | Is it possible to be totally partial? | Sat Jan 25 1997 01:01 | 3 |
| > regrettably they blew it away.
along with all hope of figuring anything out.
|
4961.4 | Progress ... | NOVA::JIANG | Oracle Corporation (603) 881-0815 | Sat Jan 25 1997 10:06 | 5 |
| FYI, a day-one problem in collect locked space code has been found in
7.0 code and the fix has been backported to 6.0 and 6.1.
Our lab tests so far showed very positive results with heavy loads and
DELPRC. The fix will be available in the next ECO.
|
4961.5 | i hope we found it... | NOVA::SMITHI | Don't understate or underestimate Rdb! | Sat Jan 25 1997 15:19 | 21 |
| I'd just like to set correct expectations here. This (and other reports) of
occasional corruption have been very hard to track down. The problem we just
discovered *could* produce the type of problem described by this note.
Obviously without an AIJ we cannot say for sure. However, we have some
optimism.
The problem we found deals with collecting lock free space, in the area of
reusing the LDX/TSN index entries. Under high load environments it is
possible (although extremely unlikely) that two different processes would
reuse the same line on a page. This requires (a) high concurrency, (b)
multiple processes trying to use the same *page* and (c) the right conditions
present in the LDX vector itself. This final point in particular is why this
problem is so rare.
I believe this problem has existed for many versions. It is only in the last
year with faster processors and faster I/O that the symptoms have appeared.
Thanks go to Rick and Richard (and others) for working so hard on these
problems.
Ian
|
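For anyone trying to picture the failure mode in the note above, here is a deliberately simplified sketch of two writers reusing the same line on a page. It is not Rdb code and does not model the LDX/TSN index entries; `Page`, `find_reusable_line` and the interleaving are all hypothetical, and only the generic check-then-store race is shown.

```python
from typing import List, Optional, Tuple


class Page:
    """Toy page: a fixed number of lines, None meaning 'free for reuse'."""

    def __init__(self, n_lines: int) -> None:
        self.lines: List[Optional[str]] = [None] * n_lines

    def find_reusable_line(self) -> Optional[int]:
        # Return the first free line index, or None if the page is full.
        for index, content in enumerate(self.lines):
            if content is None:
                return index
        return None

    def store(self, line: int, record: str) -> None:
        self.lines[line] = record


def unsafe_interleaving(page: Page) -> Tuple[int, int]:
    """Both writers pick a line before either one stores its record."""
    line_a = page.find_reusable_line()
    line_b = page.find_reusable_line()  # B looks before A has stored
    page.store(line_a, "record-A")
    page.store(line_b, "record-B")      # same line: A's record is lost
    return line_a, line_b


def serialized(page: Page) -> Tuple[int, int]:
    """Each writer picks and stores before the next writer looks."""
    line_a = page.find_reusable_line()
    page.store(line_a, "record-A")
    line_b = page.find_reusable_line()
    page.store(line_b, "record-B")
    return line_a, line_b


if __name__ == "__main__":
    racy, safe = Page(4), Page(4)
    print(unsafe_interleaving(racy), racy.lines)
    print(serialized(safe), safe.lines)
```

In the unsafe ordering both writers are handed the same line and the second store silently replaces the first, which is consistent with the symptom of a fragment pointing at a line that no longer holds the rest of its record.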
4961.6 | | M5::DGROBERT | | Thu Jan 30 1997 14:54 | 10 |
| Would the bugcheck dump that Pat has for this customer be of any value
to look at? There were multiple processes writing to this page at the
time. The records that were targeted to this page fragmented and
wrote to nearby pages in the buffer. Actually it is the fragment page
that is missing the rest of the record. There were multiple records
that fragmented and the fragments went to the same two pages prior to
the target page. If we can determine that the suspected problem caused
this customer's corruption, it would be great. It would be even greater
if we could get them a fix. This cost them a lot; my hide will grow
back.
|
4961.7 | | HOTRDB::PMEAD | Paul, [email protected], 719-577-8032 | Thu Jan 30 1997 15:29 | 3 |
| If you have an AIJ then there could be some interesting tidbits.
Realistically, there is little chance we will really be able to say for
sure what happened.
|
4961.8 | May have db & aijs for this... | M5::BLITTIN | | Tue Feb 18 1997 11:29 | 5 |
|
I've got a ct with the same bugcheck, diff offset (+34D) with the db
and aijs available. Waiting to see if they will be able to send these
on tape. RDB version is 6.0-1, but believe info may be of help. Will
post updates...
|
4961.9 | +8E4 | M5::BLITTIN | | Thu Feb 20 1997 10:01 | 4 |
|
Another ct called yesterday with same bugcheck (+8E4). Running
a report and can dup it from interactive sql. Bugcheck available.
Also on an Alpha OpenVMS 6.1; RDB6.1-04.
|
4961.10 | | M5::DGROBERT | | Mon Feb 24 1997 12:51 | 6 |
| It would seem to me that it's possible, if the conditions are right, that
entire record(s) could be lost due to this problem. We are only
seeing it discovered/reported on the retrieval of fragmented records.
Ouch! How does rmu/recover of an aij that contains records referencing
the same line react? Does it die with a specific error or write the
last entry referencing the line? Anyone write an ALERT on this yet?
|
4961.11 | Another one | svrav1.au.oracle.com::MBRADLEY | I was dropped on my head as a baby. What's your excuse? | Wed Mar 05 1997 01:17 | 10 |
| I have one of these on 6.0-12.
Do we have a projection on what version/ECO may resolve the known problem?
Is Eng. interested in the DB and AIJ (which would be fairly sizable in this
case - 8M rows for the table)?
Cheers,
Mark.
|
4961.12 | Another one bites the dust | NOMAHS::SECRIST | Rdb WWS; [email protected] | Thu Mar 13 1997 16:20 | 9 |
|
I've got a customer who just got bit for the second time by this,
allegedly in the same table ! Rdb V6.1-1 and VMS 6.1 with ACMS.
Always at DIOFETCH$FETCH_ONE_LINE + 3A5. Has anyone submitted a
reproduceable case yet ?
Regards,
rcs
|
4961.13 | diofetch$fetch_one_line + 3a5 | NOMAHS::SECRIST | Rdb WWS; [email protected] | Fri Mar 14 1997 15:58 | 9 |
|
BUGCHECK AT DIOFETCH$FETCH_ONE_LINE + 3A5 has bit the same table
twice, and contention and fragmentation are a factor just like
in bug 352454, only this is a VAX. This may be worthy of an
attempt to reproduce it if anyone wants the table information,
etc.
Regards,
rcs
|
4961.14 | Which ECO please | svrav1.au.oracle.com::MBRADLEY | I was dropped on my head as a baby. What's your excuse? | Sun Mar 16 1997 21:57 | 15 |
| > <<< Note 4961.4 by NOVA::JIANG "Oracle Corporation (603) 881-0815" >>>
> -< Progress ... >-
>
> FYI, a day-one problem in collect locked space code has been found in
> 7.0 code and the fix has been backported to 6.0 and 6.1.
>
> Our lab tests so far showed very positive results with heavy loads and
> DELPRC. The fix will be available in the next ECO.
I have seen this on 6.0-12, and the customer was wondering which ECO may
fix the problem?
Thanks,
Mark.
|
4961.15 | Anybody seen this in V7.0 ? | NOMAHS::SECRIST | Rdb WWS; [email protected] | Mon Mar 17 1997 09:22 | 10 |
|
; I have seen this on 6.0-12, and the customer was wondering which
; ECO may fix the problem?
I have seen this on 6.1-1 and I have a customer that wonders that
same thing ;-)
Regards,
rcs
|
4961.16 | | M5::LWILCOX | Chocolate in January!! | Tue Mar 18 1997 13:30 | 12 |
| <<< Note 4961.13 by NOMAHS::SECRIST "Rdb WWS; [email protected]" >>>
-< diofetch$fetch_one_line + 3a5 >-
>> This may be worthy of an
>> attempt to reproduce it if anyone wants the table information,
>> etc.
Richard, I suspect that *you* will be the one tasked with this if it needs
to be bugged.
:-).
|