Title: | ase |
|
Moderator: | SMURF::GROSSO |
|
Created: | Thu Jul 29 1993 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2114 |
Total number of notes: | 7347 |
2078.0. "drd-data-compare=3 ???" by 22740::TERENCELEUNG () Wed May 21 1997 07:25
Hello,
We have a situation very similar to the one in note 1991 of this conference.
It's a dual 2100A DU4.0B, TCR 1.4 system, Oracle OPS, dual HSZ40
with 6 RAID-5 configured RZ29's.
Problem : When drd-data-compare=3, one system panics immediately when
DRD service is accessed remotely by any cluster member.
For easy reference, one system is called System A, the
other is called System B. System B always panics whenever
DRD service is accessed remotely by Oracle. For example,
svrmgrl > startup parallel
svrmgrl > drop user user_a (or any operation that accesses DRD
remotely)
System B will panic immediately regardless of which system
executes the above command, as long as the system executing
the commands is accessing the DRD service remotely. That is,
if System A owns the DRD service, B will panic if B accesses
the service remotely;
if System B owns the service, B will still panic if A
accesses the service remotely.
Actions taken so far :
- The latest 4.0B patch, DUV40BAS00003-19970425, was applied.
- New wire method was turned off with "new-wire-method=0".
- "Simport" patch has been applied.
- Swapped all hardware from System A (no panic) to System B (panic),
including CPU, I/O board, memory module, memory channel, KZPSA, all
other expansion cards in PCI slots, system disk (together with the OS) and
a local data disk.
- Replaced the memory channel cable.
- Turned off the two HSZ40s one at a time.
- Using the shell script given in 1991.8, dd'ed two 8k files to DRD,
read them back and compared. There was no comparison error and neither
system panicked. We tested this on both System A and B, with local and
remote DRD access.
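For readers without access to note 1991.8, the write/read-back/compare idea in the last item can be sketched roughly as follows. This is not the script from 1991.8 (which is not reproduced here) but a minimal Python stand-in; the function name write_read_compare and the target path are illustrative assumptions, and the demo runs against an ordinary temporary file rather than a DRD device.

```python
import os
import tempfile

BLOCK = 8192  # 8 KB, the block size mentioned in the test above


def write_read_compare(path, blocks=2):
    """Write `blocks` random 8 KB blocks to `path`, read them back,
    and return the indexes of any blocks that do not match."""
    written = [os.urandom(BLOCK) for _ in range(blocks)]

    # Write all blocks and force them out before reading back.
    with open(path, "r+b") as f:
        for buf in written:
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())

    # Read back block by block and record miscompares.
    bad = []
    with open(path, "rb") as f:
        for i, buf in enumerate(written):
            if f.read(BLOCK) != buf:
                bad.append(i)
    return bad


if __name__ == "__main__":
    # Hypothetical target: a scratch file. On a real cluster the path
    # would be a DRD device, which this sketch does not know about.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        print("miscompared blocks:", write_read_compare(path))
    finally:
        os.remove(path)
```

An empty list means every block read back identically. As in the test described above, a clean result here only shows that this particular access pattern did not miscompare.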
Question :
1. What is "drd-data-compare"? What is its default value? What is
the significance of setting it to 3? Before we set it to 3, there
were no panics, but data corruption occurred about once every one or
two weeks. After setting it to 3, System B panics immediately whenever
a cluster member accesses DRD remotely.
2. As far as UNIX and TCR are concerned, what is the difference
between accessing a DRD service remotely with "dd" and with Oracle?
Thanks in advance,
Terence
T.R | Title | User | Personal Name | Date | Lines |
---|
2078.1 | drd-data-compare set equal on all hosts? | NNTPD::"[email protected]" | Pelle | Wed May 21 1997 15:00 | 37 |
| Be sure to set drd-data-compare to the same value on ALL the hosts. See
note 2065.1.
Here is an excerpt of the man page for drd(8):
drd-data-compare
When this attribute is set to 1, 2, or 3, the DRD subsystem
performs a checksum of the data portion of read and write
requests. For proper operation, this attribute must be set to
the same value on all cluster members.
When this attribute is 0, no data check summing and comparisons
are performed.
When this attribute is 1, the bsc_stats.bsc_read_miscompares stat
counter is incremented on DRD client read miscompares and the
bss_stats.bss_write_miscompares stat counter is incremented on
DRD server write miscompares.
When this attribute is 2, the stat counters are incremented as
appropriate and one of the following error messages is written to
the console and kernel log files:
bsc_do_unmap_RM: READ check sum failure server = # client = #
bsc_rm_docopyinout: READ checksum failure server = # client #
bss_rm_server: WRITE checksum failure client = # server = #
When this attribute is 3, the stat counters are incremented as
appropriate, the pertinent messages are written to the log files,
and the system panics.
All cluster members must use the same drd-data-compare value.
Otherwise, some cluster members will not initialize the checksum
value, causing other members to erroneously report that data
corruption has occurred. <tuning not supported>
[Posted by WWW Notes gateway]
|
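The man page excerpt above can be illustrated with a toy model of the server-side write check. Everything here is an assumption made for illustration: the class and function names are hypothetical, CRC32 stands in for whatever checksum DRD really uses, and -1 stands in for an uninitialized checksum field. It only mirrors the documented behavior of levels 0 to 3 and the effect of mismatched settings; it is not how DRD is implemented.

```python
import zlib

# Placeholder for "checksum not initialized". zlib.crc32 never returns a
# negative number, so this can never match a computed checksum.
# The real uninitialized value is unknown.
UNINITIALIZED = -1


class Panic(Exception):
    """Stands in for the system panic at drd-data-compare=3."""


def client_checksum(level, data):
    # Assumption: a client running with level 0 does not fill in the checksum.
    return zlib.crc32(data) if level else UNINITIALIZED


class Server:
    def __init__(self, level):
        self.level = level
        self.write_miscompares = 0  # models bss_stats.bss_write_miscompares
        self.log = []               # models console / kernel log messages

    def receive_write(self, data, client_sum):
        if self.level == 0:
            return  # level 0: no checksumming, no comparison
        server_sum = zlib.crc32(data)
        if server_sum == client_sum:
            return
        self.write_miscompares += 1          # levels 1, 2, 3
        if self.level >= 2:                  # levels 2, 3
            self.log.append(
                "WRITE checksum failure client = %d server = %d"
                % (client_sum, server_sum)
            )
        if self.level == 3:                  # level 3 only
            raise Panic("write checksum miscompare")


if __name__ == "__main__":
    data = b"x" * 8192
    server = Server(level=3)
    server.receive_write(data, client_checksum(3, data))  # same level: passes
    try:
        server.receive_write(data, client_checksum(0, data))  # mismatched levels
    except Panic as exc:
        print("panic:", exc)
```

In this model the data is intact in both calls; the second one "panics" only because the client and server disagree on the setting, which is the failure mode the man page warns about.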