
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

5788.0. "dir node4 * fails with dns." by PLUNDR::LOWEG (WANTED!! A modern day Robin Hood.) Thu Dec 16 1993 13:05

    I have cross posted this note from the DNS conference (980).
    
    One of my customers is using a 75MB DNS as the DECmcc MIR.  We find
    that some wild-carded directory commands from DECmcc fail with the oft
    mentioned infamous 'unable to communicate with server' error message. 
    I wonder if this is because DNS is taking 'too long' to scan the
    database and return the information to DECmcc or whether there is a
    more serious problem?  Our naming convention uses directories based on
    the class name, ie all node4's are in the .node4 directory, collectors
    are in .collector directory etc.
    
    The DECmcc commands:
    	dir node4 *
    	dir circuit *
    	dir remote_station *
    all fail but
    	dir collector *
    for some unknown reason works.  There are about 700 collectors
    registered and just about 30 circuits.  Thousands of node4s and
    thousands of remote_stations.
    
    If I 'limit' the DNS search of node4 and remote_station by saying:-
    	dir node4 .node4.*
    	dir remote_station .remote_station.*
    then both commands work fine.
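
    As a rough sketch of that workaround, and assuming the naming
    convention described above (each class lives in a directory named
    after the class), the scoped form of the command can be derived from
    the class name.  The helper name scoped_dir_command is made up for
    illustration and is not part of DECmcc:

```python
def scoped_dir_command(entity_class):
    """Build a directory command limited to the class's own directory.

    Assumes the site convention that entities of class X are registered
    under the .X directory; this helper is illustrative only.
    """
    return f"dir {entity_class} .{entity_class}.*"


for entity_class in ("node4", "remote_station"):
    print(scoped_dir_command(entity_class))
```

    This only helps where the command can be typed by hand; it does not
    change what an application issues internally.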
    
    This 'annoyance' has finally become a real problem since we installed
    TeMIP because it performs a "dir operation_context *" in order to give
    a list of o_cs from which you pick the ones you want.  We have only
    created one but the dir o_c * gives the unable...error message rather
    than our one o_c and we can't get further now.
    
    Also, is it possible to stop DNS building a new clearing-house
    checkpoint file from the old + transaction log on every reboot?  We
    have to reboot the system quite often due to, well ehm, DECmcc
    problems, and it takes about 25 mins to rebuild the clearing house. 
    Obviously the operation needs to be done periodically but does it have
    to be done on every reboot?
    
    System is 5000/240, ULTRIX 4.3 or 4.3A (does it on both), DECnet/OSI 5.1.
    (DECmcc can't use V5.1A because of the change in the DNS clerk.)
    
    chris
T.R  Title  User  Personal Name  Date  Lines
5788.1  Be careful when using full wildcard  TOOK::KWAK  Mon Dec 20 1993 10:59  15 lines
    
    
    RE: .0
    
    Whenever you issue a 'fully wildcarded' directory operation, DECmcc
    searches the ENTIRE namespace.  The directory operation may visit
    every DNS server (Master and Read-only).  When none of the DNS
    servers for a directory in the namespace is reachable from the MCC
    system, you get "Unable to Communicate with Server" from the DNS
    clerk.
    
    Your idea of limiting the DNS search should be used when dealing
    with a large namespace with multiple nameservers.
    
    William
5788.2  If Timeout, why do some wildcards work?  BAHTAT::BOND  Mon Dec 20 1993 12:34  15 lines
    There is only one server that contains the whole namespace.  It is the
    same system that DECmcc is running on.  MCC_DNS_CONF is set to low so
    hopefully the information will be coming from the cache anyway.  
    
    I suggested in the base note that I entered to the DNS notesfile (980)
    that timeouts might be occurring but the response in 980.1 thought this
    to be unlikely and said that a response indicating 'timeout' would be
    expected rather than 'unable to communicate with server'.  But if it is
    a timeout, why do some of the dir class * commands consistently work
    and others consistently fail?  Surely all of them have to search the
    whole namespace if they aren't limited to a particular tree?  Can we
    prove it is a timeout by setting a parameter to a really large number
    as a test (and if so, what is that parameter - the DNS is Ultrix)?
    
    Thanks, chris
5788.3  TOOK::KWAK  Mon Dec 20 1993 15:09  39 lines
    
    
>    There is only one server that contains the whole namespace.  It is the
>    same system that DECmcc is running on.  MCC_DNS_CONF is set to low so
>    hopefully the information will be coming from the cache anyway.  
>    
>    I suggested in the base note that I entered to the DNS notesfile (980)
>    that timeouts might be occurring but the response in 980.1 thought this
>    to be unlikely and said that a response indicating 'timeout' would be
>    expected rather than 'unable to communicate with server'.  But if it is
>    a timeout, why do some of the dir class * commands consistently work
>    and others consistently fail?  Surely all of them have to search the
>    whole namespace if they aren't limited to a particular tree?  Can we
>    prove it is a timeout by setting a parameter to a really large number
>    as a test (and if so, what is that parameter - the DNS is Ultrix)?
>   
    
    MCC code explicitly catches DNS_NOCOMMUNICATION, and other error codes 
    from DNS clerk.  You would not see "Unable to communicate with server"
    unless the error code from DNS is DNS_TIMEOUTNOTDONE.
    
    When you issue MCC> show class *, the following calls to DNS happen:
    	Starting from the root (.) of the DNS namespace
        1. Enumerate (this is DNS API) DNS objects with DNS$Class values
    	     ("DNS$Node" for DNA4 and DNA5 objects, "DNS$Group" for Domain
    	      objects, and "MCC_" for other MCC objects) as filter.
        2. Enumerate child directories
    
        3. For each child directory returned from step 2, perform steps
           1 and 2 recursively.
    
    From MCC's point of view, the calls made to DNS are the same for all
    classes except for the 'filter value' (DNS$Class).
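
    The walk described in steps 1 to 3 can be sketched roughly as
    follows.  This is only an illustration under stated assumptions: the
    Directory class is an in-memory stand-in and not the DNS API, the
    class values "MCC_Collector" and "MCC_Circuit" are invented, and
    treating the filter as a prefix match is a guess.  It shows that the
    set of directories visited does not depend on the filter, only on
    where the walk starts:

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Hypothetical in-memory stand-in for one DNS directory."""
    name: str                                      # full name, "" for the root
    objects: dict = field(default_factory=dict)    # object name -> DNS$Class value
    children: list = field(default_factory=list)   # child Directory instances


def walk(directory, class_filter, visited):
    """Depth-first walk returning the full names of matching objects."""
    visited.append(directory.name or ".")
    # Step 1: enumerate objects here, filtered on the class value
    # (prefix match is an assumption made for this sketch).
    matches = [f"{directory.name}.{obj}"
               for obj, cls in directory.objects.items()
               if cls.startswith(class_filter)]
    # Steps 2 and 3: enumerate child directories and recurse into each.
    for child in directory.children:
        matches.extend(walk(child, class_filter, visited))
    return matches


root = Directory("", children=[
    Directory(".node4", objects={"n1": "DNS$Node", "n2": "DNS$Node"}),
    Directory(".collector", objects={"c1": "MCC_Collector"}),
    Directory(".circuit", objects={"x1": "MCC_Circuit"}),
])

full_visited, scoped_visited = [], []
full = walk(root, "MCC_", full_visited)                       # like: dir <class> *
scoped = walk(root.children[0], "DNS$Node", scoped_visited)   # like: dir node4 .node4.*
print(full, full_visited)
print(scoped, scoped_visited)
```

    In the sketch, the full wildcard touches every directory while the
    partial wildcard touches only the subtree it names.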
    
    Also, when MCC gets "Unable to communicate with server", it retries the
    operation up to 5 times, waiting 1 second before each retry.
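
    A minimal sketch of that retry behaviour, assuming "up to 5 times"
    means five retries after the first attempt.  NoCommunication and
    call_with_retry are hypothetical names used for illustration; this is
    not the MCC source:

```python
import time


class NoCommunication(Exception):
    """Hypothetical stand-in for the 'unable to communicate' status."""


def call_with_retry(operation, retries=5, delay=1.0, sleep=time.sleep):
    """Call operation(); on NoCommunication wait `delay` seconds and retry.

    Gives up and re-raises after `retries` retries (an assumed reading of
    "up to 5 times").
    """
    attempt = 0
    while True:
        try:
            return operation()
        except NoCommunication:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(delay)


# Demo: an operation that fails twice and then succeeds.
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise NoCommunication()
    return "ok"

waits = []                       # record the waits instead of sleeping
result = call_with_retry(flaky, sleep=waits.append)
print(result, len(calls), waits)
```

    Under this reading, a lookup that never succeeds costs only about five
    seconds of waiting before the error reaches the user, in addition to
    whatever time each attempt itself takes.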
    
    William
    
5788.4  work-around for the TeMIP part  TAEC::FLAUW  Marc Flauw, CBS Network Mgt Eng, VBO  Wed Dec 22 1993 04:28  33 lines
Chris,

I think I have a work-around for your TeMIP problem. It is totally
unsupported and not a recommended way, but it should help you get
around the fact that the command "dir oc *" is not returning a correct
response.

The command "dir oc *" is only needed to get the list of all known OCs so
that they can be added to the panel. Once added to the panel, the names of
these OCs are stored in the temip_resource.dat file.  It is possible to edit
this file to add the OCs to be monitored directly, bypassing the "customize
panel entries".

First you need to have a temip_resource.dat file present in the local
directory of the user. If there is none, you can create one by using the
Options menu, entry "Save Options" in the Alarm Handling window.

Then, you stop TeMIP Alarm Handling and you edit the temip_resource.dat file,
looking for the following line:
temip.gen_listOfOc:

You add the names of the OCs to be monitored at the end of the line,
separated by commas. Note that there is a <tab> between the : and the
first OC name. In the example below, I have added the OCs .t1 and .t2:

temip.gen_listOfOc:	.t2,.t1

Restart the Iconic Map PM and the TeMIP Alarm Handling and the OCs that you
have added should be present in the panel window.
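
If you prefer not to edit the file by hand, the same change could be
scripted. The following is only a sketch under the assumptions stated
above (a line starting with temip.gen_listOfOc:, then a tab, then
comma-separated OC names); the function name add_ocs and the sample
content are made up for illustration, and it is just as unsupported as
the manual edit.

```python
def add_ocs(resource_text, oc_names, key="temip.gen_listOfOc:"):
    """Return resource_text with oc_names appended to the OC list line."""
    out = []
    found = False
    for line in resource_text.splitlines():
        if line.startswith(key):
            found = True
            # Names already present after the key (may be none).
            merged = [n for n in line[len(key):].strip().split(",") if n]
            for name in oc_names:
                if name not in merged:      # avoid duplicate entries
                    merged.append(name)
            # Key, a tab, then the comma-separated OC names.
            line = key + "\t" + ",".join(merged)
        out.append(line)
    if not found:
        raise ValueError(f"no {key!r} line found")
    return "\n".join(out) + "\n"


# Demo on in-memory text; a real run would read and rewrite
# temip_resource.dat while TeMIP Alarm Handling is stopped.
sample = "temip.gen_listOfOc:\n"
print(repr(add_ocs(sample, [".t2", ".t1"])))
```

The function works on text rather than on the file itself so that the
result can be checked before anything is written back.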

Best regards,

Marc.
5788.5  But what's wrong? I have the database...!  BAHTAT::BOND  Wed Dec 22 1993 08:10  28 lines
    re .4, Thanks Marc, I will get the customer to try this workaround. 
    Whilst there are only a few OCs, this is no problem, but if we had a lot
    then we would need the TeMIP/DECmcc MIR routines to expand them
    properly, which suggests we still need to get to the bottom of the
    problem...
    
    William, your response in .3 explains how DECmcc asks DNS for the
    objects when a show class * (or dir class * - I guess it is the same)
    command is given.  Basically, the whole tree is walked but you get DNS
    to filter the object class for you.  You say that node/node4 uses
    DNS$GROUP/DNS$NODE as the filter but the other class entities would
    filter on an MCC classname.  If this is the case, why does dir
    collector * work and dir circuit * fail?  They both have to walk the
    tree in the same way.  And because the actual objects are stored in
    .circuit and .collector directories, *if* the directories are actually
    found alphabetically (which I accept, they may not) then wouldn't we
    expect the circuit command to work and the collector command to fail?
    
    I must admit, I didn't understand your opening sentence about the
    timeout.  Were you meaning that the DNS_NOCOMMUNICATION error means
    a timeout didn't occur and that something else is wrong?  If so, what
    because obviously the server is reachable.
    
    I have copied the customer's /var/dss/dns directory contents and can
    make them available for investigation if anybody would like to take
    this further - please!
    
    chris
5788.6  TOOK::KWAK  Wed Dec 29 1993 11:07  49 lines
>    William, your response in .3 explains how DECmcc asks DNS for the
>    objects when a show class * (or dir class * - I guess it is the same)
>    command is given.  Basically, the whole tree is walked but you get DNS
>    to filter the object class for you.  You say that node/node4 uses
>    DNS$GROUP/DNS$NODE as the filter but the other class entities would
>    filter on an MCC classname.  If this is the case, why does dir
>    collector * work and dir circuit * fail?  They both have to walk the
>    tree in the same way.  And because the actual objects are stored in
>    .circuit and .collector directories, *if* the directories are actually
>    found alphabetically (which I accept, they may not) then wouldn't we
>    expect the circuit command to work and the collector command to fail?

I agree with you. I expect that the two "directory" commands do walk through
the same namespace tree using the same filter "MCC_".

BTW, if all of your circuit entities are in .circuit directory, and the
collectors in .collector directory, you can use "partial wildcard":
	MCC> dir circuit .circuit.*
	MCC> dir collector .collector.*

>    
>    I must admit, I didn't understand your opening sentence about the
>    timeout.  Were you meaning that the DNS_NOCOMMUNICATION error means
>    a timeout didn't occur and that something else is wrong?  If so, what
>    because obviously the server is reachable.

There was a typo in the reply: DNS_TIMEOUTNOTDONE should have been
DNS_NOCOMMUNICATION.
    
The meaning of DNS_NOCOMMUNICATION is defined by DECdns, not by DECmcc.
An old definition of DNS$_NOCOMMUNICATION (from The VMS Distributed Name
Service Clerk manual) says:
	No communication was possible with any name server capable of
	processing the request. Check NCP event 353.5 for the DECnet error.


>    
>    I have copied the customer's /var/dss/dns directory contents and can
>    make them available for investigation if anybody would like to take
>    this further - please!

I do not know how I would use the namespace database if I copied the
files over to my system. Do you know how to make the DNS clerk use the
files? For one thing, the namespace UID includes an ethernet address. Does
the DNS clerk use the ethernet address to talk to the server?