
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

1958.0. "Node not currently accessible." by KETJE::PACCO () Tue Dec 17 1991 12:21

    On a VAXstation 3100 M38 running DECmcc BMS V1.1, polled rules regularly
    give the exception "Node not currently accessible".  Regularly means 5-10
    times a day, where on average 2500 polls/day are being done.  Although this
    may seem "low", it is unacceptable to automatically ring somebody at
    home to tell him the node is not currently accessible, because if you look
    at the network at the same time, everything is fine.
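
To put the figures quoted above in proportion, here is a small arithmetic sketch (in Python, for illustration only). The function name is hypothetical; the only inputs are the 5-10 exceptions and the 2500 polls per day stated in the note.

```python
def failure_rate_percent(failures: int, polls: int) -> float:
    """Share of polls that raised the exception, in percent."""
    return 100.0 * failures / polls


POLLS_PER_DAY = 2500  # average number of polls per day quoted in the note

low = failure_rate_percent(5, POLLS_PER_DAY)    # lower bound: 5 exceptions/day
high = failure_rate_percent(10, POLLS_PER_DAY)  # upper bound: 10 exceptions/day

print(f"{low:.1f}% to {high:.1f}% of polls raise the exception")
```

So roughly 0.2% to 0.4% of polls fail, which is small as a ratio but still means several false call-outs every day.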
    
    The strange thing is that it is not reproducible at will, and that it
    appears both during the daytime and at night (even when traffic on the
    network is low).
    
    The entities polled are circuits on DEMSAs.  No network resource
    problems on the DEMSAs or on the workstation are being encountered,
    because all DECnet error counters remain at 0.  The number of "Maximum
    active logical links" is also far below the "Maximum logical links" allowed.
    I suspect another kind of resource problem on the workstation, something
    which perhaps was not mentioned in the release notes.  The
    quotas used are the ones described in the V1.1 release notes.  There are
    only 12 rules created and activated on the workstation.  There is only 1
    DECmcc workstation in the network.
    
    Can anybody clarify the possible causes of that NODE4 exception?
    What are, e.g., the possible VMS error codes causing that exception?  Is
    it a time-out problem?
    
    An implementation will stay or be cancelled due to this (for the customer)
    unacceptable behaviour.  I do not like to put pressure on anybody, but
    I am unable to solve it without additional help.  Can the code of the
    node4_am be searched to find where that exception could be generated,
    and after which VMS error?
    
    Regards,
    	Dominique.

T.R	Title	User	Personal Name	Date	Lines
1958.1	exception vs rule fire	DADA::DITMARS	Pete	`Thu Dec 19 1991 14:51`	47
	Hi,
	Please forgive what may be a silly question. Does this condition cause the rule to "fire" or to encounter an exception?
	If the customer is currently using the same procedure for both his "procedure" and the "exception handler", perhaps it would be acceptable to simply create a different procedure for the exception handler which would be smart enough to ignore these spurious problems.
	From the V11 online help:
	ENTITY ALARMS RULE CREATE Exception_Handler Information
	When an error occurs during the evaluation of an alarm expression, the error is passed, as string values, to the command procedure through the following parameters:
	P1 - Rule name
	P2 - Description
	P3 - Category
	P4 - Expression
	P5 - Time of error detection (Binary Absolute Time output format)
	P6 - Error that occurred. This parameter will also contain the string "The rule has been disabled." if the error caused the rule to be automatically disabled. There may be a number of separate text strings used to describe the error and/or the state of the rule. When this is the case, they will be separated by the string "<EOS>" (End Of Segment).
	P7 - Parameter
	P8 - Data file.
	ENTITY ALARMS RULE CREATE Exception_Handler Subtopic?
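
The "smarter" exception handler suggested in this reply could be sketched roughly as follows. This is an illustration in Python only: a real DECmcc exception handler is a command procedure that receives P1 to P8, and nothing below is DECmcc code. The function name `should_notify` and the suppression policy are assumptions, as is the guess that P6 carries the exception text exactly as quoted in note 1958.0. The "<EOS>" separator and the "The rule has been disabled." string come from the help text above.

```python
EOS = "<EOS>"  # segment separator documented for P6
SPURIOUS = "Node not currently accessible"  # exception text quoted in 1958.0 (assumed to appear verbatim in P6)
DISABLED = "The rule has been disabled."  # marker documented for P6


def should_notify(p6: str) -> bool:
    """Hypothetical policy: stay silent only for the known spurious error."""
    segments = [s.strip() for s in p6.split(EOS) if s.strip()]
    # Always notify when the rule was disabled, because polling has stopped.
    if any(DISABLED in s for s in segments):
        return True
    # Suppress only when every segment is the known spurious exception.
    if segments and all(SPURIOUS in s for s in segments):
        return False
    # Anything else (including an empty P6) is treated as a real problem.
    return True
```

For example, `should_notify("Node not currently accessible")` returns `False`, while the same text followed by `<EOS>The rule has been disabled.` returns `True`. Note that this only hides the symptom; it does not explain why the exception occurs.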
1958.2	The unexpected alarm is an exception.	KETJE::PACCO		`Fri Dec 20 1991 06:03`	24
	Pete,
	The alarm rule fires an exception, but the silly thing is that no exception (nor a normal alarm) should be fired if (and I am assuming this) the network is stable and operational. If the cause of the exception is something related to the management system (and I presume this), that situation is not acceptable.
	If the problem is in the management station, then:
	if it is a resource problem, that should be solved;
	if it is a DECmcc code problem, that should also be solved;
	if it is, e.g., because the management station does not see the network, that would be acceptable, but then there should be hard evidence (with DECnet EVL) that the station was temporarily "disconnected".
	For this last assumption, no event has been seen around the intervals when the exception happens, although events are being logged to a sink on the management station, and at other moments "normal" events are being collected. Therefore I want to investigate the cause of the "spurious" exceptions.
	Regards, Dominique.
1958.3	entered as QAR 1965 in NACQAR mcc_internal database	DADA::DITMARS	Pete	`Mon Dec 23 1991 11:56`	2
	You might send/post an exact log of what's going on so the Phase IV folks can be precise in figuring out what's going on? Jean Lee is responsible for DNA4.
1958.4	QIO told us it was unreachable	TOOK::PURRETTA		`Mon Dec 23 1991 15:20`	9
	I looked into this. No place in the AM do we explicitly set this exception. We truly returned from a QIO with SS$_UNREACHABLE and translated it to the specialized exception which you are seeing. As far as VMS was concerned, the node was unreachable. We will continue to look into this problem but I figured I'd give what info we have in the meantime. -- John
1958.5	What's the network like between the two systems?	TOOK::BIRNBAUM		`Mon Dec 23 1991 17:28`	15
	One possible explanation of this behavior is that the router is unable to respond to the management system's connection initiate message before the outgoing timer on the management system expires. This type of timeout can be caused by a number of factors, including network congestion and network routing cost transitions. Could you please check the value of the outgoing timer and the number of response timeouts on the management system, and the value of the incoming timer on the router? Also, can you give me some idea of the topology between the management system and the router? Thanks, Bill
1958.6	Problem is in DECnet.	KETJE::PACCO		`Mon Dec 30 1991 10:09`	9
	Thanks for specifying that the error "Node not currently accessible" unambiguously corresponds to SS$_UNREACHABLE (this could have been documented!). Now it's also clear that DECmcc is no longer under suspicion. I can reproduce the error (with some difficulty) using NCP procedures. At this time it looks like a DECrouter/X25 gateway problem. Investigation continues. Dominique.