T.R | Title | User | Personal Name | Date | Lines |
-----|-------|------|---------------|------|-------|
1977.1 | Shared SCSI not supported without ASE | NETRIX::"[email protected]" | Dave Cherkus | Wed Apr 02 1997 15:14 | 16 |
| I think the root cause is that when one node leaves your 'cluster', it
issues a SCSI bus reset, and because the ASE code is not installed
on the other nodes, they don't know how to handle it.
The NFS hangs are probably due to I/Os that will never complete
because of this.
Shared SCSI is not supported without the ASE product installed.
Shutting everything down, of course, clears the problem.
Why are the two client nodes on the shared SCSI bus if they aren't
serving the data?
Getting them off the bus will prove or disprove my theory.
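You could also confirm this from a capture of the console output
before touching any hardware. A rough sketch in Python; the log file
name is hypothetical, and the match strings are the cam_logger lines
I'd expect a bus reset and stuck I/Os to leave behind:

    LOG = "console.log"   # hypothetical: a capture of the console output

    # Lines the CAM error logger prints when I/Os time out and the
    # driver escalates to a SCSI bus reset.
    PATTERNS = (
        "cam_logger: CAM_ERROR packet",
        "timeout on disconnected request",
        "SCSI Bus Reset performed",
    )

    with open(LOG) as f:
        for lineno, line in enumerate(f, 1):
            if any(p in line for p in PATTERNS):
                print("%6d: %s" % (lineno, line.rstrip()))

If those show up on the client nodes around the time a node leaves,
that's the reset propagating across the shared bus.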
[Posted by WWW Notes gateway]
|
1977.2 | No shared bus, same behaviour | ZPOVC::JUSTIN | | Thu Apr 03 1997 04:50 | 14 |
| Hello,
I have tried what you suggested; however, the symptoms remain.
We also had a shared SCSI bus when we were using FDDI as the
interconnect, and it did not have these problems. However, I do see
that having a shared bus without ASE (which I understand prevents
dual mounting?) is potentially unsafe.
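My understanding of why it's unsafe: two hosts that both mount the
same disk each do read-modify-write on shared on-disk structures
with no arbitration between them. A loose analogy in Python (the
shared counter and the update counts are made up), with two
uncoordinated writers losing updates to shared state:

    # Two "hosts" updating shared state with no locking, the way two
    # nodes dual-mounting one disk would both update on-disk
    # structures with nothing arbitrating between them.
    from multiprocessing import Process, Value

    def host(counter, n_updates):
        for _ in range(n_updates):
            counter.value += 1   # non-atomic read-modify-write

    if __name__ == "__main__":
        counter = Value("i", 0, lock=False)   # shared state, no lock
        hosts = [Process(target=host, args=(counter, 100000))
                 for _ in range(2)]
        for h in hosts:
            h.start()
        for h in hosts:
            h.join()
        # Updates get lost - corruption in miniature.
        print("expected 200000, got", counter.value)

As I understand it, that arbitration is exactly what ASE adds.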
I am in the process of getting a full TCR-UA license to test
whether it will still behave the same.
Are there any other options that I can try?
Justin
|
1977.3 | more info | ZPOVC::JUSTIN | | Thu Apr 03 1997 07:02 | 102 |
| Hello,
Here is some additional info. When the master system is booted
(with the others shut down), this is what we get:
>>> boot
.
.
.
Dual TLEP at node 4
Dual TLEP at node 3
Dual TLEP at node 2
Dual TLEP at node 1
Dual TLEP at node 0
monitorBoot: doing it...
Cluster Memory Channel primary adaptor is online.
Rev 14 adaptor is the primary channel (pci bus 1, slot 0)
connected to virtual hub (VH1) as node 1.
dli: configured
clubase: configured
skipping test/delay for VH0/VH1 system
drd: configured.
dlmsl: configured
cnxagent: configured
dlm: configured.
memory channel thread init
checking for existing memory channel nodes
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
cam_logger: CAM_ERROR packet
cam_logger: bus 0 target 1 lun 0
ss_perform_timeout
timeout on disconnected request
cam_logger: CAM_ERROR packet
cam_logger: bus 0 target 1 lun 0
isp_termio_abort_bdr
Failed to abort specified IO - scheduling chip reinit
cam_logger: CAM_ERROR packet
cam_logger: bus 0
isp_reinit
Begining Adaptor/Chip reinitialization
cam_logger: CAM_ERROR packet
cam_logger: bus 0
isp_cam_bus_reset_tmo
SCSI Bus Reset performed
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
unresponsive mc nodes - waiting for node mask 1
crashing unresponsive node 0
It then hangs here forever. If the Memory Channel hub is turned off,
then this is the boot sequence:
.
.
.
Dual TLEP at node 4
Dual TLEP at node 3
Dual TLEP at node 2
Dual TLEP at node 1
Dual TLEP at node 0
monitorBoot: doing it...
Cluster Memory Channel primary adaptor is online.
Rev 14 adaptor is the primary channel (pci bus 1, slot 0)
connected to virtual hub (VH1) as node 1.
dli: configured
clubase: configured
skipping test/delay for VH0/VH1 system
drd: configured.
dlmsl: configured
cnxagent: configured
dlm: configured.
memory channel thread init
checking for existing memory channel nodes
booting as primary memory channel node on mc0
memory channel software inited - node 1 on mc0
ccomsub: configured
mcnet: configured
Starting secondary cpu 1
Starting secondary cpu 2
Starting secondary cpu 3
Starting secondary cpu 4
Starting secondary cpu 5
Starting secondary cpu 6
Starting secondary cpu 7
Starting secondary cpu 8
Starting secondary cpu 9
.
.
.
|
1977.4 | Bad MC jumper settings | NETRIX::"[email protected]" | Dave Cherkus | Thu Apr 03 1997 08:57 | 6 |
| Ah! You are using a real hub, yet your MC board is jumpered
for virtual hub. The MC board should have come with a manual
explaining how to change this. If not, let me know and I'll
vector you to a web page that explains it.
[Posted by WWW Notes gateway]
|
1977.5 | pin 1-2 jumpered? | ZPOVC::JUSTIN | | Thu Apr 03 1997 09:04 | 6 |
| Hi,
We've double-checked it against the manual: the jumper is across
pins 1 and 2 of the three pins. It was also the factory default. This is
the line card on the PCI bus that we are talking about, right?
Justin
|
1977.6 | Bad board? | NETRIX::"[email protected]" | Dave Cherkus | Thu Apr 03 1997 15:36 | 7 |
| According to my info, you are correct, so if Digital UNIX is
still reporting that the MC board is in virtual hub mode, I would
suspect a defective board. It will never work until UNIX
reports an STD (real hub) setting instead of VH0 or VH1.
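If you want a quick way to see what the kernel decided, scan the
saved boot messages for the MC driver's hub line. A minimal sketch
in Python; the log path is an assumption, the VH strings match your
printout above, and the exact STD wording is a guess on my part:

    import re

    LOG = "/var/adm/messages"   # assumed: where boot messages get saved

    # The MC driver reports its hub mode at boot, e.g.:
    #   connected to virtual hub (VH1) as node 1.
    # With a real hub it should report STD rather than VH0/VH1.
    mode = None
    with open(LOG) as f:
        for line in f:
            if "hub" not in line:
                continue
            m = re.search(r"\b(VH[01]|STD)\b", line)
            if m:
                mode = m.group(1)

    if mode in ("VH0", "VH1"):
        print("board reports virtual hub (%s) - recheck the jumper" % mode)
    elif mode == "STD":
        print("board reports standard (real hub) mode")
    else:
        print("no MC hub message found in", LOG)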
Dave
[Posted by WWW Notes gateway]
|
1977.7 | BTW... | NETRIX::"[email protected]" | Dave Cherkus | Thu Apr 03 1997 15:39 | 5 |
| ...your printout says VH1, which is the 'no jumper installed' setting.
I really suspect a defective board or jumper, or a misinstalled jumper.
Dave
[Posted by WWW Notes gateway]
|
1977.8 | bad jumper setting | ZPOVC::JUSTIN | | Fri Apr 04 1997 00:59 | 10 |
| Hello,
Yes, the jumper on the line card on the master node was not
inserted properly, hence the virtual hub mode. Once it was properly
inserted, everything worked fine, including NFS/NIS. We will disconnect
the clients from the shared FWD SCSI bus for safety reasons.
Thanks Dave for your help.
Justin
|
1977.9 | You're welcome. | NETRIX::"[email protected]" | Dave Cherkus | Tue Apr 08 1997 09:53 | 6 |
| > Thanks Dave for your help.
No problem. Glad things are working fine now.
Dave
[Posted by WWW Notes gateway]
|