| Hi Ivan,
I think your HO cluster is not set up correctly for DDS.
The DDS network must not contain both the cluster members (HOVAX2, HOVAX3)
and the cluster alias (HO).
Specifying only the cluster alias is sufficient, but make sure that DDS
is started on all nodes of the cluster (HOVAX2, HOVAX3).
I remember this because I ran into the same problem at a customer site a
long time ago.
A TIMA/STARS article on the subject follows below.
HTH,
Cheers
Charly
DDS: Incorrect use of DDS on clusters
COPYRIGHT (c) 1988, 1993 by Digital Equipment Corporation.
ALL RIGHTS RESERVED. No distribution except as provided under contract.
Copyright (c) Digital Equipment Corporation 1990, 1992. All rights reserved
PRODUCT: Message Router V3.1
SOURCE: Digital Customer Support Center
\by UKCSSE/SPE and MIG Engineering (999997) -
\Based on FORTY2::MAILBUS_UPDATE_V31 conference note # 41
PROBLEM:
A cluster must be treated as a single node for DDS and the cluster alias
must be used as the node name in the DDS nodes list.
Problems can occur if both the alias and an actual node name are entered
in the DDS nodes list. Problems can also occur if more than one node in
the cluster is entered in the nodes list and the cluster alias is not.
In these cases all transactions affecting maintenance objects will
be duplicated on each of the nodes. The reason for this is that the
server sends out these transactions to all named "nodes" in the nodes
list.
Should one of these node names also be a world search node, then all
transactions originating on 'this' node would be duplicated. Should
BOTH of these node names be reflected as world search nodes, then all
transactions originating on ALL nodes would be duplicated.
\
\ All the effects on Node number inconsistencies (described in Note 40 of the
\ FORTY2::MAILBUS_UPDATE_V31 VAXnotes conference) can also occur in this
\ case as the DDS config file will have one node number in it so that
\ each node in the cluster will have the same configuration node number,
\ but each will have a different nodes list node number.
\ This is described in another article in this database. To find this article
\ search on MAILBUS_UPDATE_V31, then subquery (keypad PF3) on the note
\ number: 40
SOLUTIONS:
If cluster aliasing is correctly enabled but both the cluster alias and
a single node from the cluster are included in the nodes list, use MBMAN
and remove the actual node name from the nodes list.
MBMAN> DELETE DDS NODE name
This will leave the cluster node name in the nodes list. You must ensure
that the DDS configuration number is the number for the cluster alias.
You can check the configuration number by using the command:
MBMAN> SHOW DDS CONFIG /NUMBER
Check the cluster alias number with NCP>SHOW EXEC CHAR and look for a line
with this format:
Alias node = 28.222 (NODE1)
where NODE1 is the alias name and 28.222 is the alias node number in this
example.
If cluster aliasing is not correctly enabled on the cluster and multiple
nodes from the cluster are in the nodes list, then the only real
solution is to remove the whole cluster from the DDS system, delete its
database files and start again using the cluster alias.
NOTE: Cluster aliasing in this case does not refer to the transfer service
(TS). The TS can use aliasing optionally. But DDS must use the ALIAS
if installed on a cluster. IF cluster aliasing is correctly enabled
then:
1) NCP>SHOW EXEC CHAR will show a line for the alias node name and
number using this format: Alias node = 11.111 (NODENAME)
2) Using @SYS$MANAGER:MB$CONFIG SHOW DDS will show the master
node. This should be the ALIAS name and NOT the node name when
referring to clusters.
3) On a properly set up master node, the MBMAN>SHO DDS NODE *
command will give the list of nodes in the DDS network as well
as their CORRECT DDS node number. By default, the DDS node number
is the same as the DECnet node number. (Calculate this using the
procedure in section 1.5.3 of the Message Router Management Reference
Manual, i.e., DECnet area number * 1024 + DECnet node number)
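The node-number rule quoted above (DECnet area number * 1024 + DECnet node number) can be checked with a few lines of Python. This is only a sketch for doing the arithmetic by hand: the function names are made up for illustration and are not part of Message Router or MBMAN, and the range check assumes DECnet Phase IV addressing (areas 1 to 63, nodes 1 to 1023).

```python
def decnet_to_dds(area: int, node: int) -> int:
    """Default DDS node number for a DECnet address, per the rule above.

    Hypothetical helper; the range check assumes DECnet Phase IV limits.
    """
    if not (1 <= area <= 63 and 1 <= node <= 1023):
        raise ValueError(f"not a valid DECnet Phase IV address: {area}.{node}")
    return area * 1024 + node


def dds_to_decnet(number: int) -> tuple[int, int]:
    """Inverse of decnet_to_dds: split a DDS node number into (area, node)."""
    return divmod(number, 1024)


if __name__ == "__main__":
    # The alias address used as an example in the article: 28.222
    print(decnet_to_dds(28, 222))
    # Going the other way for a number seen in a nodes list
    print(dds_to_decnet(1224))
```

If the number reported by SHOW DDS CONFIG /NUMBER does not match the value computed from the alias address shown by NCP, the configuration is not using the alias number.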
\\ %QE730 VER=3.1_MSG-RTR PROD=MSG-RTR SPD=26.33 OS=VMS GRP=MAILBUS CAT=COMM
\\ STATUS=EXPIRED DB=MAILBUS
|
| I have had the customer make some changes, and now the Node list looks
like the following.
A1_HOVAX2=> mc mbman sho dds node *
BIGTOS : 20501 World search node
ITDEV : 1026
BIGTNY : 20496 World search node
TIBOS : 20500 World search node
ITVAX1 : 1031 World search node
X400GW : 20546 World search node Master node
BIGTLN : 20495 World search node
CGVAX1 : 1035 World search node
ITVAX2 : 1028 World search node
HO : 1224 World search node
ITDXMR : 20532 World search node
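To read a listing like this more easily, the DDS node numbers can be converted back to DECnet addresses using the area * 1024 + node rule from the article quoted earlier in the thread. The following is a rough sketch only: it assumes the default numbering is in use, the regular expression is a guess at the layout shown here rather than a documented MBMAN output format, and the function name is hypothetical.

```python
import re

# Listing copied from the MBMAN output above.
LISTING = """\
BIGTOS : 20501 World search node
ITDEV : 1026
BIGTNY : 20496 World search node
TIBOS : 20500 World search node
ITVAX1 : 1031 World search node
X400GW : 20546 World search node Master node
BIGTLN : 20495 World search node
CGVAX1 : 1035 World search node
ITVAX2 : 1028 World search node
HO : 1224 World search node
ITDXMR : 20532 World search node
"""

# Assumed line layout: NAME : NUMBER [flags...]
LINE = re.compile(r"^\s*(\w+)\s*:\s*(\d+)\s*(.*)$")


def decode_listing(text):
    """Return (name, dds_number, 'area.node', flags) for each listing line."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip anything that does not look like a node entry
        name, number, flags = match.group(1), int(match.group(2)), match.group(3).strip()
        area, node = divmod(number, 1024)
        rows.append((name, number, f"{area}.{node}", flags))
    return rows


if __name__ == "__main__":
    for name, number, address, flags in decode_listing(LISTING):
        print(f"{name:<8} {number:>6}  {address:<8} {flags}")
```

Decoded this way, HO (1224) corresponds to DECnet address 1.200, which can be compared with the alias address that NCP reports on the cluster.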
The problems still persist.
For example, searches from the remote ALL-IN-1 system do not display
H* users but do display B* users. A user named Brady at the
same site, with the same DDS owning node and a similar mail address, is
displayed, whereas no users named Hanrahan are displayed at all.
The DDS logs contain errors; an extract is
below. The only explanations I can find for these errors concern
vague requests and the subsequent creation of more processes.
Server output logs have the following %DDS errors -
sear/win=10 $1$DUA3:[MB$.DDS.SCR]DDS$SERVER_OUTPUT_HOVAX2_1.LOG;79 "%DDS-"
© Digital Equipment Corporation 1978, 1994. All rights reserved.
26-APR-1997 11:16:13.99, %DDS-I-SL_SERV_START, Event type 3.3 - Server started,9
26-APR-1997 11:20:18.20, %DDS-I-SL_SERV_SHUTDOW, Event type 3.4 - Server, PID=2n
%DDS-E-INTERR, Internal logic error 2 4 - please report to DIGITAL
%DDS-E-INTERR, Internal logic error 2 5 - please report to DIGITAL
%DDS-E-SHUTDOWN, DDS has shut down
%DDS-E-INTERR, Internal logic error 2 4 - please report to DIGITAL
Also the error logs have many of the following errors.
%DDS-I-NET_OUTCON_FAIL, Event type 4.8 - Failed to establish outbound
connection to CGVAX2
-SYSTEM-F-NOSUCHOBJ, network object is unknown at remote node
%DDS-I-NET_OUTCON_FAIL, Event type 4.8 - Failed to establish outbound
connection to CGVAX2
-SYSTEM-F-NOSUCHOBJ, network object is unknown at remote node
%DDS-I-NET_OUTCON_FAIL, Event type 4.8 - Failed to establish outbound
connection to CGVAX2
-SYSTEM-F-NOSUCHOBJ, network object is unknown at remote node
Though these last errors seem self-explanatory, if they were the
cause I would expect DDS searches to fail all the time, as opposed to
just on the H's.
All help gratefully received.
Ivan
|
| Thanks for that. I have asked the customer to do the following:
Make a copy of all [MB$.DDS.DB].DAT files
MBMAN> susp dds @mb$tools:dds$compress
I believe this will reorganize the AIF files and highlight any
potential problems. I will post any results, though the customer wants
this done out of hours, so it may be a while before they get posted.
Cheers
Ivan
|
| Thanks for your help here, this was close to becoming the proverbial
hot potato. The MBMAN CREATE DDS AIFPERM command worked a treat, and
the customer now seems exceptionally happy.
Birds are singing, the sky is blue etc etc.
Cheers
Ivan.
|