T.R | Title | User | Personal Name | Date | Lines |
---|
5313.1 | | BSS::JILSON | WFH in the Chemung River Valley | Mon May 19 1997 13:54 | 4 |
| We don't test unsupported operations so how can we describe what behaviours
happen when you do something unsupported?
Jilly
|
5313.2 | Use a different Allo class | SSDEVO::MARTENS | Bert Martens, CXO Storage Solutions | Mon May 19 1997 15:41 | 14 |
| The CI address is not part of the issue. The cause of the problem is
MSCP will not support 2 units with the same unit number in the same
allocation class. It should spin down the unit(s). Since the only
component that will notice the duplicate unit numbers is OpenVMS,
that is what took the action. Notice that the placing of the drive
into mountverify is used to prevent data integrity issues.
I don't remember the "rules" that OpenVMS uses to manage this
condition.
Regards,
Bert
|
5313.3 | See 5254.* For Another Unit Number Discussion | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Mon May 19 1997 18:48 | 0 |
5313.4 | Both drives spun down | CSC32::S_DANNEN | Live long and slobber | Tue May 20 1997 09:54 | 3 |
| Last time I saw this, both drives spun down
/steve
|
5313.5 | | STAR::CROLL | | Tue May 20 1997 10:33 | 18 |
| What do you mean by "preempt"?
Can you give some more details about exactly what
happened?
I've been poking around in DUDRIVER, and I don't see any
place where DUDRIVER spins down a unit when there are
duplicate unit numbers; DUDRIVER appears to treat
duplicate unit numbers (i.e., new unit attention
messages that match an existing UCB) as a new path to
the existing unit. From DUDRIVER's perspective, the
scenario described in .0 would simply be a third (and
fourth) path to the same gizmo.
I'll go ask the DUDRIVER maintainer to see if my
interpretation is correct....
John
|
5313.6 | It's a feature..... | STAR::CROLL | | Tue May 20 1997 11:00 | 30 |
| I talked with the DUDRIVER maintainer, and he confirmed
my interpretation in the prior reply.
If by "preempt" in the base note, you meant that I/O
operations started going to the new D210 instead of the
old one, this probably happened as a result of the
static load balancing DUDRIVER performs when it
discovers a new path.
DUDRIVER gets a snapshot of the current load on the HSJ
at the time it forms a connection, and uses this in path
assignment when new paths to existing units show up.
What probably happened is when the new path to D210 was
discovered, DUDRIVER noticed that the new HSJ had less
load than the old one, and therefore switched the path
to the new HSJ.
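As a rough illustration of that path choice (a sketch only, not DUDRIVER
source; the names Path, load_snapshot and choose_path are made up for
the example, and the real driver may weigh other factors):

```python
from dataclasses import dataclass


@dataclass
class Path:
    controller: str     # hypothetical controller name
    load_snapshot: int  # load recorded once, when the connection was formed


def choose_path(current: Path, new: Path) -> Path:
    """Pick the path to use when a new path to an existing unit shows up.

    "Static" here means the decision uses the load snapshots taken at
    connection time, not a live measurement.
    """
    if new.load_snapshot < current.load_snapshot:
        return new
    return current


old_hsj = Path("HSJ_OLD", load_snapshot=40)
new_hsj = Path("HSJ_NEW", load_snapshot=5)
print(choose_path(old_hsj, new_hsj).controller)
```

With these made-up numbers the newer, lightly loaded controller wins,
which would match I/O moving to the new D210.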
DUDRIVER matches the allocation class, device class
letters (the "dd" part of the "ddcu" device name), and
the unit number against existing units. If there's a
match, DUDRIVER assumes it's another path to the same
unit. This is why you must always have different
allocation classes for different units -- this is a good
idea even if you don't have units with the same unit
number.
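The matching rule can be sketched like this (an illustration under my
own assumptions, not the driver's code; UnitTable and unit_attention
are hypothetical names, and a plain dictionary stands in for the UCB
list):

```python
from typing import Dict, List, Tuple

# (allocation class, "dd" device class letters, unit number)
UnitKey = Tuple[int, str, int]


class UnitTable:
    """Toy stand-in for the set of units the driver already knows about."""

    def __init__(self) -> None:
        self.paths: Dict[UnitKey, List[str]] = {}

    def unit_attention(self, alloclass: int, dd: str, unit: int,
                       controller: str) -> str:
        """Handle a new-unit message and report how it was classified."""
        key = (alloclass, dd.upper(), unit)
        if key in self.paths:
            # Same key as an existing unit: treated as another path to it,
            # even if it is physically a different disk.
            self.paths[key].append(controller)
            return "new path"
        self.paths[key] = [controller]
        return "new unit"


table = UnitTable()
first = table.unit_attention(1, "DU", 210, "HSJ_OLD")
second = table.unit_attention(1, "DU", 210, "HSJ_NEW")  # same class and unit
third = table.unit_attention(2, "DU", 210, "HSJ_NEW")   # different class
print(first, second, third, sep=", ")
```

The second call is the situation in .0: same allocation class and unit
number, so it is folded into the existing unit. Giving the second
controller its own allocation class, as in the third call, keeps the
two disks distinct.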
Is this enough of an explanation? If you need something
more formal, log an IPMT....
John
|
5313.7 | 'Bout what I thought | CRLRFR::BLUNT | | Tue May 20 1997 14:36 | 12 |
|
Yes, I/O ops started going to the new unit. In their config, this was
unfortunately an Oracle index. Bad thing. However, I've received the
information that I needed. While I understand that "we" don't test
unsupported configurations, I can't imagine that "we" haven't at some
time (either planned or not) idiot tested our gear (or KNOW explicitly
what would happen). The bottom line was that the customer did, and
wanted an explanation.
So, John, this level of explanation is fine. Thanks!
bob
|
5313.8 | Ancient history | CSC32::S_DANNEN | Live long and slobber | Tue May 20 1997 15:57 | 13 |
| John,
I am referring to old history, of course. I was putting together
several stacks of RA82's on hsc70's, cut the wrong piece off
the unit plug on one drive (duplicate unit numbers). Drive would
go through self-test at power up, spin up, then set the fault led's
indicating microprocessor board, HDA, power supply, hybrid board,
and spin down. Imagine how upset I was after replacing everything
except checking the unit plug! :) Strange thing was that the duplicate
unit number drive was in another SAxxx cab, and would only fail when
it completed its idle loop self test (these disks were not on-line
to VMS at the time). Ah the good old days!
/steve
|
5313.9 | | STAR::CROLL | | Wed May 21 1997 10:59 | 19 |
| re .8:
I believe that if an HSC or HSJ sees duplicate unit
numbers on the drives directly connected, it'll spin one
(or both) down. The HSx has more knowledge about the
configuration and what's "legal" -- DUDRIVER, on the
other hand, has to deal with a lot more configuration
complexity.....
re .7:
As for idiot testing: we do do a huge amount of
testing, but we concentrate on making sure the stuff we
officially support works properly. Stuff that is not
supported either doesn't work, or was never designed to
work in the unsupported ways. The supported stuff is
complicated enough without opening up the test matrix to
everything else. Besides, we *did* know what was going
on in this situation; it just took a bit of digging.
John
|