Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

1551.0. "%MCC-E-TRANSMITERROR & %SYSTEM-DEVINACT" by ANTIK::WESTERBERG (Stefan Westerberg CS Stockholm) Thu Sep 26 1991 07:41

Has anybody seen this alarm message:

Requested operation cannot be completed
%MCC-E-TRANSMITERROR, error trying transmit
%SYSTEM-F-DEVINACT, device inactive

I get this alarm message 1-6 times a day from an alarms rule for one Translan
bridge 350 and from one LAN Bridge 150 (it's always the same bridge).
The bridges aren't located on the same network (two different customers).

Any bright ideas ?

T.R	Title	User	Personal Name	Date	Lines
1551.1	Bright ideas...	CHRISB::BRIENEN	DECmcc Bridge|Station|SNMP Management.	`Thu Sep 26 1991 09:56`	13
	The errors you are seeing probably come from the MCC_EA routines (which are used by the Bridge AM, TransLAN AM, Ethernet AM, and Concentrator AM). The type of error would indicate that the target entity is not the problem, but rather the Ethernet host port (guess: it's being reset due to some error condition). Can you provide more information about the system these errors are appearing on (e.g., what are the Ethernet host ports? DELUAs? DEBNAs? What type/version of VAX system and VMS software? Is the Ethernet host port being heavily used?)... Chris
1551.2	They aren't member of any idle club !	ANTIK::WESTERBERG	Stefan Westerberg CS Stockholm	`Thu Sep 26 1991 11:55`	8
	I've seen this type of error on a VAXstation 3100 M38 and a VAX 4000-300. The VMS version on both systems is 5.4-2. The load on the 3100 is about 3 exports every 120s and 11 alarms every 60s, and on the 4000 it is 30 exports every 120s and 30 alarms every 30s. /Stefan
1551.3	Still problems	STKMCC::LUND		`Fri Oct 18 1991 10:26`	18
	Hello. We have patched the ezdriver on the 4000-300 (CSCPAT_0252) but the problem still exists. I have seen this at different sites, and the problem seems to be related to the load on the MCC host. Is 30 exports every 120s and 30 alarms every 60s (not 30s) too much for MCC on a VAX 4000-300? We are using the Translan AM for almost all alarms and exports, accessing the bridges via 10 Mbit Ethernet. If the error message indicates a problem on the MCC host, this should be investigated. No problems are seen on the MCC host Ethernet line counters, and the VMS version is 5.4-2. Regards, Niklas.
1551.4	"120s" seconds?	TOOK::CALLANDER	MCC = My Constant Companion	`Tue Oct 29 1991 16:44`	14
	How much stuff are you set up to record/export? All partitions or only some? And by 120s do you mean every 120 seconds (2 minutes)? If so, then that's 1 minute between rules and 2 between exports... If that is the case, it might be that the intervals are a bit close together, depending on the load on your net, the load on your system, the memory in the system, and what type of devices you are looking at. I will send this off to the Translan guy to see if he knows how their AM is characterized with regard to overhead/response time to show requests. Thanks
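To put the intervals in perspective, the load quoted in .3 (30 exports every 120s, 30 alarms every 60s) can be turned into an average request rate. This is only a rough sketch: it assumes each export and each alarm-rule evaluation costs exactly one request to a bridge, which the thread does not confirm, and the function name is made up for illustration.

```python
# Rough request-rate estimate for the polling load described in .3.
# Assumption (not from the thread): every export and every alarm-rule
# evaluation costs exactly one request to a bridge.

def requests_per_minute(count, interval_seconds):
    """Average requests per minute for `count` items polled every `interval_seconds`."""
    return count * 60.0 / interval_seconds

exports = requests_per_minute(30, 120)  # 30 exports every 120 s
alarms = requests_per_minute(30, 60)    # 30 alarm rules every 60 s
total = exports + alarms

print(f"exports: {exports:.1f}/min, alarms: {alarms:.1f}/min, total: {total:.1f}/min")
```

Under that assumption the average is well under one request per second, so bursts (many rules firing at the same instant) are probably more relevant than the average rate.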
1551.5	try longer intervals; less system load.	TOOK::MCPHERSON	i'm only 5 foot one...	`Tue Oct 29 1991 17:18`	39
	The briefest exporting interval I've ever been able to make work (with the Translan AM) was 00:05:00. Note that this was with no alarms outstanding and no other exporting going on.
	I haven't done any sort of workload characterization of the Translan AM. Has anyone else?
	MCC> set person doug opinion_flag = true
	PERSON MCDOUG_NS:.Doug
	AT 29-OCT-1991 17:12:31 Characteristics
	Modification completed successfully.
	    opinion_flag = TRUE
	MCC>
	MCC> show person doug Personal Opinion
	PERSON MCDOUG_NS:.Doug
	AT 29-OCT-1991 17:14:31 Characteristics
	"I'm not sure it's even worth it to try to export on that brief an interval (for long-term exporting) since you'll create Sagans and Sagans of attribute data records... and your reporter (DTR32 or DECdecision) is going to have to dutifully plow through it..."
	MCC>
	MCC> set person doug opinion_flag = false
	BRIDGE KAJUN_NS:.br4 PERSON MCDOUG_NS:.Doug
	AT 29-OCT-1991 17:14:35 Characteristics
	Problem modifying attribute.
	    opinion_flag = TRUE
	MCC> exit
1551.6	get help now ;-)	MKNME::DANIELE		`Tue Oct 29 1991 17:29`	1
	You been doin' this too long Doug.
1551.7	Customer aren't happy !	ANTIK::WESTERBERG	Stefan Westerberg CS Stockholm	`Wed Nov 06 1991 17:19`	14
	This is a very irritating exception that constantly lights up the screen, and the customer isn't happy with that at all. A load of 25 Translan exports with a 2-minute duration and about 120 alarms with durations spanning from 1 hour to 15s doesn't sound like an overwhelming load for an 8 VUPS VAX 4000-300. In fact, it seems that we will have to increase the number of alarms to close to 700. So this problem has to be solved if we are going to be able to trust that the alarms triggered are live, not false! Is there anybody else who has seen this type of behaviour? We need a fix for this very soon. /Stefan
1551.8	Did you try lengthening the EXPORT interval?	MCDOUG::MCPHERSON	My object paradigm needs integration...	`Thu Nov 07 1991 13:09`	24
	Did you try lengthening the export interval on the Translans (as I suggested earlier)? Again, the shortest export interval that I was able to make work was 5 min. That might help lessen the load on the Ethernet interface somewhat.
	This is just a guess: when you do an
	    NCP> sho known line count
	    NCP> show known circuit count
	are you seeing a lot of "System buffer unavailable" or similar counters?
	If so, you _may_ be able to alleviate the problem by upping some sysgen parameters (that elude me right now... Maybe lrpcount? srpcount? dunno. Help?)
	I know you're looking for a solution and not more questions, but please try to work with us to isolate the problem. If anyone else out there has any ideas, please feel free to chime in.
	./doug
1551.9	Some explanations	STKMCC::LUND	Niklas Lund	`Thu Nov 07 1991 15:49`	66
	Hello Doug!
	Thanks for helping us with this problem.
	>>Did you try lengthening the export interval on the Translans (as I suggested
	>>earlier)?
	No, we haven't, and that's because this is a LIVE network monitoring system. We are managing a big financial Value Added Network with 30+ Translan bridges in it. We must be able to detect problems, like broken lines, in the network within 30 seconds. The utilization graphs that we produce daily for each 64 Kbps line should not have polling intervals longer than 120s (60s for the most important lines).
	We are exporting all line attributes on 5 Translan bridges with 5+ synch ports active. The exports work well most of the time, and we get an RDB database that has a size of 23000 blocks each day.
	The alarms are changed to START with a 2-second "duration", like this:
	    Enable mcc 0 alarms rule bridge1_line2_nofwd, at start (+00:00:02)
	    Enable mcc 0 alarms rule bridge1_line3_nofwd, at start (+00:00:02)
	    Enable mcc 0 alarms rule bridge1_line4_nofwd, at start (+00:00:02)
	    .
	    .
	Remember that we have seen these errors even on systems that have maybe 30% of the load described above, and that the AMs vary from Translan and Bridge to the Ethernet station AM. The problem is seen most on VAX 4000-300 systems.
	I have included two more error messages of the same type that show up as exceptions, but much less frequently.
	    Exception: The requested operation cannot be completed
	    %MCC-E-TRANSMITERROR, error trying to transmit a packet
	    %SYSTEM-F-DEVINACT, device inactive
	    Exception: The requested operation cannot be completed
	    %MCC-E-RECEIVEERROR, error trying to receive a packet
	    %SYSTEM-F-DEVINACT, device inactive
	    Exception: The requested operation cannot be completed
	    %MCC-F-STRTDEVERROR, start Ethernet device failed.
	    %SYSTEM-F-BADPARAM, bad parameter value
	    (This one is only seen when using the Ethernet station AM)
	>>When you do an
	>> NCP> sho known line count
	>> NCP> show known circuit count
	>>are you seeing a lot of "System buffer unavailable" or similar counters?
	>>
	>>If so, you _may_ be able to alleviate the problem by upping some sysgen
	>>parameters (that elude me right now... Maybe lrpcount? srpcount?)
	No problems are seen in the line and circuit counters, and there is no pool expansion either. The load on the customer's Ethernet is 5-10% with peaks up to 30%.
	Regards
	Niklas
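For a sense of the data volume behind the "23000 blocks each day" figure in .9, here is a small conversion sketch. A VMS disk block is 512 bytes; treating the figure as steady daily growth with nothing purged is an assumption made here for illustration, not something the note states.

```python
# Convert the daily RDB size quoted in .9 into bytes.
VMS_BLOCK_BYTES = 512        # size of one VMS disk block
blocks_per_day = 23000       # figure quoted in .9

bytes_per_day = blocks_per_day * VMS_BLOCK_BYTES
mib_per_day = bytes_per_day / (1024 * 1024)

# Assumption: the same volume accumulates every day and nothing is purged.
gib_per_year = bytes_per_day * 365 / (1024 ** 3)

print(f"{bytes_per_day} bytes/day = {mib_per_day:.1f} MiB/day, about {gib_per_year:.1f} GiB/year")
```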
1551.10	Snake eyes. sorry.	MCDOUG::MCPHERSON	My object paradigm needs integration...	`Thu Nov 07 1991 16:06`	16
	Whew. I dunno, Niklas.... From your description (and of course the meaning of the error you're getting) it doesn't sound like there's anything that can be done to help you from within the Translan AM. Unless you can get some flexibility on the export interval, there's nothing further I can think of to help you. I hope someone else can come up with something. /doug. P.S. You do know that Digital must purchase the Translan AM for ALL USE other than for use within DEC and for demo purposes, yes? I trust that the appropriate monies and licenses have changed hands for the usage of the Translan AM in this network, or we (Digital) are liable for breach of contract (among other things).
1551.11	This needs to be hidden from the user.	CHRISB::BRIENEN	DECmcc Bridge|Station|SNMP Management.	`Thu Nov 07 1991 17:06`	25
	This error is relatively common (at least to us) when pounding on the Ethernet device. It has nothing to do with CPU utilization, and is not a "code bug" in the MCC_EA routines (they're just reporting what happens).
	There are two possible solutions to the problem. Both involve hiding the problem from the user, and neither is easy to patch into V1.1:
	(1) Modify the MCC_EA routines to do retries when encountering the device inactive error. This wouldn't be the first time we did something like this (e.g., there is special code in place which handles the DELUA "differently").
	(2) Tell AM developers to do the retries themselves if they don't want to bother the user with this information. The set of AMs using the EA routines is still fairly small, so this isn't as big a deal as one would think.
	We will be looking at which makes sense very soon. This decision will be based partly on how long the fix would take to implement and on the risk associated with making the change (e.g., a change in the MCC_EA routines at this point is more risky than having the AMs do retries).
	Chris Brienen
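Both options in .11 amount to the same pattern: retry the operation a few times when the transient "device inactive" condition occurs, and report the error only if it persists. The sketch below illustrates that pattern in Python. It is not the MCC_EA interface; `DeviceInactiveError`, `call_with_retries`, and `flaky_transmit` are hypothetical names invented for this example.

```python
import time

class DeviceInactiveError(Exception):
    """Hypothetical stand-in for a transient 'device inactive' failure."""

def call_with_retries(operation, max_attempts=3, delay_seconds=0.0):
    """Call operation(); retry only on DeviceInactiveError, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except DeviceInactiveError:
            if attempt == max_attempts:
                raise  # out of retries: let the caller see the error after all
            time.sleep(delay_seconds)

# Example: a fake transmit that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_transmit():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise DeviceInactiveError("device inactive")
    return "sent"

result = call_with_retries(flaky_transmit)
print(result, "after", attempts["count"], "attempts")
```

Whether the wrapper lives in the shared routines (option 1) or in each caller (option 2) changes who owns the retry policy, not the logic itself.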
1551.12	Maybe we'll just _hide_ it next time..	MCDOUG::MCPHERSON	My object paradigm needs integration...	`Thu Nov 07 1991 17:29`	7
	Thanks for the note, Chris. That which we cannot fix, we hide. Fair enough. I'll add this to Vitalink's engineering "To Do" list for the next release of the Translan AM. /doug
1551.13	Filed as MCC_INTERNAL QAR#1335 [Priority 3]	CHRISB::BRIENEN	DECmcc Bridge|Station|SNMP Management.	`Fri Nov 08 1991 15:58`	0
1551.14	PATCH ?	ANTIK::WESTERBERG	Stefan Westerberg CS Stockholm	`Fri Nov 15 1991 05:48`	9
	When can we expect a patch for this problem? At one customer site where we have about 700 alarms rules we get 10 to 32 %MCC-E-TRANSMITERROR per hour! A patch for this problem is badly needed! /Stefan
1551.15	Please re-read .11	TOOK::MCPHERSON	My object paradigm needs integration...	`Fri Nov 15 1991 08:26`	21
	I do understand your urgency, but I think Chris made it pretty clear. The change would need to be made either
	a) to the mcc_ea routines, or
	b) to the AM(s) that are calling the mcc_ea routines (in this case, the Translan AM).
	"a" is fairly risky, given the amount of work that's focused on getting the 1.2 stuff out the door. Also, the mcc_ea routines really are working the way that they're supposed to. The _calling routine_ should really handle the retries on failure.
	"b" is really the _correct_ thing to do, but you'll need to go to Vitalink to get them to make a patch (or new .exe). AM maintenance is Vitalink's responsibility. (Personally, I doubt that they'll be able to get you a patch for the Translan AM any quicker than we could fix the mcc_ea routines.)
	I know this is exactly what you _don't_ want to hear, but my input is: pick one option and work it through the appropriate escalation mechanism(s), Digital's for 'a' and Vitalink's for 'b'.
	/doug