Title: | ALL-IN-1 (tm) Support Conference |
Notice: | Please spell ALL-IN-1 correctly - all CAPITALS! |
Moderator: | IOSG::PYE CE |
Created: | Fri Jul 01 1994 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2716 |
Total number of notes: | 12169 |
I have had a recurring problem. A couple of times a week I get hundreds of mails overnight in the Manager account like this :-

    Date:     05-Feb-1997 01:05 GMT
    From:     Mail Postmaster
              POSTMASTER
    Dept:
    Tel No:

    TO:  System Manager ( MANAGER )

    Subject: Send Failure Notification

    ALL-IN-1 was unable to send a message to Message Router. It has been
    removed from the sender queue and placed in the following files :-

    Envelope - OA_SYS:[ALLIN1.DATA_SHARE]ZWRT1IW8K.NBS;
    Message  - OA_SYS:[ALLIN1.DATA_SHARE]ZWRT1IW8F.NBS;1

    ------------------------------------ BOTTOM ------------------------------------

The MTI$ERROR log looks like this :-

    ALL-IN-1 IOS Sender/Fetcher Error*
    Created: 5-FEB-1997 01:04:55.

    5-FEB-1997 01:04:55 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:02 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:11 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:13 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:13 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:13 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:14 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:14 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 01:05:14 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR

also :-

    %EMD-I-SEND_COUNT_EXCE, SENDER re-try count exceeded
    5-FEB-1997 08:08:03 'Default Sender'
    %EMD-I-SEND_COUNT_EXCE, SENDER re-try count exceeded
    5-FEB-1997 08:08:06 'Default Sender'
    %EMD-I-SEND_COUNT_EXCE, SENDER re-try count exceeded
    5-FEB-1997 08:08:18 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:18 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:18 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:18 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:18 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:19 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:19 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:19 'Default Sender'
    %EMD-E-POSTERR, Error posting message - 0935E8BA returned from MR
    5-FEB-1997 08:08:19 'Default Sender'
    %EMD-I-SEND_COUNT_EXCE, SENDER re-try count exceeded
    5-FEB-1997 08:09:09 'Default Sender'

Analysing the removed .NBS files shows only valid off-node addresses.

The quick fix is to STOP/RESTART the SENDER. The problem stops.

The version of ALL-IN-1 is :-

    ALL-IN-1 IOS Server for OpenVMS V3.1 BL121 BRITISH 17-JUN-1994

The remote MR version is :-

    MRMAN V3.3-313

exit %X0935E8BA brings up :-

    %NONAME-E-NOMSG, Message number 0935E8BA

I had suspected a dodgy autoforward because we are migrating to Exchange. Indeed, on investigation I found some invalid autoforwards that, when mailed, created a one-off "Sender retry count exceeded" but did not create constant errors every few minutes.

Has anyone seen the 0935E8BA error before?

Thanks in advance,

Karl.

[Posted by WWW Notes gateway]
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
2501.1 | Resources problem | ZUR01::ASHG | Grahame Ash @RLE | Wed Feb 05 1997 11:30 | 12 |
    $ set mess SYS$COMMON:[SYSMSG]MRAPPMSG.EXE;1
    $ exit %X0935E8BA
    %MRIF-E-SYSERROR, system interface error

From the MRIF book:

    This error message relates to system resources. Look in the Message
    Router error log MRERR.INF [which is in MR$:] for a second message,
    which will be an error described in VMS System Messages and Recovery
    Procedures Reference Manual.

grahame
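For readers wondering why the bare `exit %X0935E8BA` printed only NOMSG: an OpenVMS condition value packs a severity, a message number and a facility number into one longword, and the text can only be looked up once the message file for that facility is loaded. The sketch below splits the value into those fields, assuming the standard condition-value layout (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27, control bits in 28-31). The function name is made up for illustration.

```python
# Minimal sketch: split an OpenVMS condition value into its bit fields.
# Assumes the standard layout: severity <2:0>, message number <15:3>,
# facility number <27:16>, control <31:28>.
SEVERITY_LETTERS = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}


def decode_condition(value: int) -> dict:
    """Return the fields of a 32-bit condition value (hypothetical helper)."""
    severity = value & 0x7
    return {
        "severity": severity,
        "severity_letter": SEVERITY_LETTERS.get(severity, "?"),
        "message_number": (value >> 3) & 0x1FFF,
        "facility_number": (value >> 16) & 0xFFF,
        "control": (value >> 28) & 0xF,
    }


fields = decode_condition(0x0935E8BA)
print(fields)
```

The severity field decodes to 2 (letter E), which agrees with the `-E-` in both the NOMSG output and `%MRIF-E-SYSERROR`; the facility number is what ties the value to MRAPPMSG.EXE.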
2501.2 | insufficient virtual memory | NETRIX::"Karl [email protected]" | Karl Strong | Wed Feb 05 1997 14:35 | 41 |
Thanks Grahame,

In the MRERR.INF at the same time as the SENDER errors :-

    %MROUTER-I-FAILOG_LSTN_S, 19970205010453 The application, ALLIN1 on node WOT*
    %MROUTER-E-OPSYS, System interface error
    -LIB-F-INSVIRMEM, insufficient virtual memory
    %MROUTER-I-FAILOG_LSTN_S, 19970205010455 The application, ALLIN1 on node WOT*
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970205010458 The application, ALLIN1 on node WOT*
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970205010500 The application, ALLIN1 on node WOT*
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970205010501 The application, ALLIN1 on node WOT*
    %MROUTER-E-OPSYS, System interface error
    -LIB-F-INSVIRMEM, insufficient virtual memory
    %MROUTER-I-FAILOG_LSTN_S, 19970205010501 The application, ALLIN1 on node WOT*
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970205010503 The application, ALLIN1 on node WOT*
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"

Time to find out who 'borrowed' our VMS System Messages and Recovery Procedures Reference Manual.

Regards,

Karl.
CCS Reading.

[Posted by WWW Notes gateway]
2501.3 | | ACISS2::LENNIG | Dave (N8JCX), MIG, @CYO | Wed Feb 05 1997 15:08 | 11 |
From your examination of the netserver log file, how long had the mrlisten been running, and how many messages had it processed?

I seem to recall a small memory leak (10-20 bytes?) in mrlisten on an error path, which unfortunately ALL-IN-1 exercises every time it processes an auto-forward message (see note 817.1 for details). So lots of auto-forwarded messages over a single very long duration connection will eventually make mrlisten get sick (insvirmem).

Dave
2501.4 | PGFLQUOTA Increase ? | NETRIX::"Karl [email protected]" | Karl Strong | Thu Feb 06 1997 13:43 | 9 |
Thanks Dave,

The processes had only been up for 3-4 days. Should closing/restarting MR more often avoid this memory leak?

Will increasing MBMANAGER PGFLQUOTA extend the amount of time before running out of virtual memory?

Regards,

Karl.

[Posted by WWW Notes gateway]
2501.5 | | ACISS2::LENNIG | Dave (N8JCX), MIG, @CYO | Fri Feb 07 1997 13:08 | 9 |
If you want to try tweaks on the MR node, you want the MRNET account, but raising pgflquota has implications (virtualpagecnt sysgen param, actual pagefile size and consumption...). Or simply restart the ALL-IN-1 sender periodically (nightly?).

Yours is only the second site I've heard of to run into this; they must have a large volume of auto-forwarding going on to hit this in 3-4 days...

Dave
2501.6 | It's getting worse! | NETRIX::"Karl [email protected]" | Karl Strong | Fri Feb 07 1997 15:42 | 108 |
Thanks Dave,

We got the error one day after restarting the sender on WOTVAX yesterday. The system has about 400 off-node autoforwards.

You can tell how busy the autoforwards are by the frequency of the code "M" errors from WOTVAX:

    RG71RW> typ/tail/cont mr$:mrerr.inf
    %MROUTER-I-FAILOG_LSTN_S, 19970207150454 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970207150708 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970207150738 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970207150742 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970207150748 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970207150752 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
    %MROUTER-I-FAILOG_LSTN_S, 19970207150755 The application, ALLIN1 on node WOTVAX, identified to mailbox A1WOTVAX, is sending a message
    %MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"

This is the MRNET account :-

    RG71RW> mc authorize sho mrnet

    Username: MRNET                            Owner:
    Account:  HV6 DC                           UIC:    [351,4] ([DECNET,MRNET])
    CLI:      DCL                              Tables: DCLTABLES
    Default:  DISK$NETWORK_12:[MB$.MR.SCR]
    LGICMD:   SYS$SYSTEM:MB$MRNET
    Flags:  DisCtlY Restricted DisMail
    Primary days:   Mon Tue Wed Thu Fri
    Secondary days:                     Sat Sun
    Primary   000000000011111111112222  Secondary 000000000011111111112222
    Day Hours 012345678901234567890123  Day Hours 012345678901234567890123
    Network:  ##### Full access ######            ##### Full access ######
    Batch:    -----  No access  ------            -----  No access  ------
    Local:    -----  No access  ------            -----  No access  ------
    Dialup:   -----  No access  ------            -----  No access  ------
    Remote:   -----  No access  ------            -----  No access  ------
    Expiration:            (none)    Pwdminimum: 14   Login Fails: 11386
    Pwdlifetime:           (none)    Pwdchange:  29-NOV-1995 15:12
    Last Login: 14-MAR-1995 19:50 (interactive),  7-FEB-1997 15:15 (non-interactive)
    Maxjobs:        50  Fillm:        80  Bytlm:        60000
    Maxacctjobs:    50  Shrfillm:      0  Pbytlm:           0
    Maxdetach:      50  BIOlm:        50  JTquota:       1024
    Prclm:           2  DIOlm:        12  WSdef:         1024
    Prio:            4  ASTlm:        80  WSquo:         8192
    Queprio:         0  TQElm:        20  WSextent:     16384
    CPU:        (none)  Enqlm:      6000  Pgflquo:      20000

This is the virtual page param :-

    RG71RW> mc sysgen sho virt
    Parameter Name            Current    Default     Min.      Max.    Unit  Dynamic
    --------------            -------    -------   -------   -------   ----  -------
    VIRTUALPAGECNT             532288      12032       512   4194304  Pages

    RG71RW> sho mem/page
    Physical Memory Usage (pages):     Total        Free      In Use    Modified
      Main Memory (256.00Mb)          524288      269889      248346        6053

    Virtual I/O Cache Usage (pages):   Total        Free      In Use     Maximum
      Cache Memory                     20924        2018       18906       66389

    Slot Usage (slots):                Total        Free    Resident     Swapped
      Process Entry Slots                420         274         146           0
      Balance Set Slots                  333         189         144           0

    Dynamic Memory Usage (bytes):      Total        Free      In Use     Largest
      Nonpaged Dynamic Memory       19999744    11095680     8904064     7509952
      Paged Dynamic Memory          10413568     7200400     3213168     7041632

    Paging File Usage (pages):                              Free  Reservable       Total
      DISK$NETWORK_12:[PAGE_AND_SWAP.RG71RW]PAGEFILE.SYS;1
                                                          108801       75260      119992
      DISK$NETWORK_12:[PAGE_AND_SWAP.RG71RW]PAGEFILE2.SYS;1
                                                          282864      192315      299992
      DISK$NETWORK_12:[PAGE_AND_SWAP.RG71RW]PAGEFILE3.SYS;1
                                                          373691      254563      399992

    Of the physical pages in use, 92556 pages are permanently allocated to VMS.

Regards,

Karl.
CCS Reading.

[Posted by WWW Notes gateway]
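Karl's idea of gauging autoforward traffic from the frequency of code "M" errors can be made less eyeball-dependent. The sketch below is only an illustration: it assumes the MRERR.INF entry shape shown in this note (a FAILOG_LSTN_S line carrying a 14-digit YYYYMMDDHHMMSS stamp, followed by the MSEMANT line), and the function name is invented here. It counts MSEMANT entries per hour.

```python
# Minimal sketch: count "code M" (MSEMANT) entries per hour in MRERR.INF text.
# Assumes each entry starts with a FAILOG_LSTN_S line holding a 14-digit
# timestamp, as in the excerpt above; other log layouts are not handled.
import re
from collections import Counter
from datetime import datetime

STAMP = re.compile(r"FAILOG_LSTN_S,\s*(\d{14})")


def msemant_per_hour(lines):
    """Return a Counter keyed by 'YYYY-MM-DD HH:00' (hypothetical helper)."""
    counts = Counter()
    last_stamp = None
    for line in lines:
        match = STAMP.search(line)
        if match:
            last_stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
        if "-E-MSEMANT" in line and last_stamp is not None:
            counts[last_stamp.strftime("%Y-%m-%d %H:00")] += 1
            last_stamp = None  # one count per entry
    return counts


sample = """\
%MROUTER-I-FAILOG_LSTN_S, 19970207150454 The application, ALLIN1 on node WOTVAX
%MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
%MROUTER-I-FAILOG_LSTN_S, 19970207150708 The application, ALLIN1 on node WOTVAX
%MROUTER-E-MSEMANT, Message semantics unacceptable with code "M"
%MROUTER-I-FAILOG_LSTN_S, 19970205010453 The application, ALLIN1 on node WOT*
%MROUTER-E-OPSYS, System interface error
""".splitlines()

for hour, count in sorted(msemant_per_hour(sample).items()):
    print(hour, count)
```

Run against a copy of the full log, the hourly totals give a rough message count per listener connection, which is the figure Dave asks about in .3.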
2501.7 | | ACISS2::LENNIG | Dave (N8JCX), MIG, @CYO | Fri Feb 07 1997 17:30 | 13 |
400 sounds way too low; are you sure about the count?

Are you sure your sender restart actually disconnected the link to the mrlisten process? This should cause the image to exit, and thus the new link would get a 'fresh' listener invocation.

Is pgflquota for the mrnet account low? Is virtualpagecnt on the MR system low?

The last time I saw this (an internal MTS machine) the count was somewhere around 30,000 before it fell over (as I recall).

Dave
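On whether a bigger pgflquota buys time: if the leak is a fixed number of bytes per auto-forwarded message, the headroom scales linearly with the quota. The sketch below is back-of-envelope arithmetic only. It assumes 512-byte pages (256.00Mb over 524288 pages in the SHOW MEMORY output in .6), uses the Pgflquo of 20000 from the MRNET record, and takes the 10-20 byte leak from Dave's recollection in .3. The baseline the listener needs just to run is unknown, so it is a parameter defaulting to zero, which makes the result a ceiling rather than a prediction.

```python
# Minimal sketch: upper bound on messages handled before a per-message leak
# exhausts the page file quota. All inputs are assumptions taken from the
# thread; baseline_pages (normal listener usage) is unknown and defaults to 0.
PAGE_BYTES = 512  # 256.00Mb / 524288 pages


def messages_until_exhaustion(pgflquo_pages, leak_bytes_per_message,
                              baseline_pages=0):
    """Hypothetical helper: whole messages that fit in the leftover quota."""
    usable_bytes = (pgflquo_pages - baseline_pages) * PAGE_BYTES
    return usable_bytes // leak_bytes_per_message


for leak in (10, 20):
    print(leak, "bytes/message:",
          messages_until_exhaustion(20000, leak), "messages at most")

# Doubling the quota doubles the ceiling; it does not remove the leak.
print(messages_until_exhaustion(40000, 20))
```

Raising the quota only moves the failure further out, which is why the periodic sender restart suggested in .5 remains the simpler safeguard.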
2501.8 | check no one has Autoforward to themselves | IOSG::COTTINGHAM | Alan Cottingham | Thu Feb 13 1997 08:46 | 5 |
I seem to remember this error occurring if a user had AutoForward set to themselves (remotely).

Regards

Alan
2501.9 | Thanks. | NETRIX::"Karl [email protected]" | Karl Strong | Mon Mar 03 1997 10:25 | 8 |
Re .5

Changed the quotas as suggested and killed off some dodgy autoforwards. Our problem seems to have gone away.

Thanks for the help.

Regards,

Karl.

[Posted by WWW Notes gateway]