Title: | POLYCENTER Console Manager |
Notice: | Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS: |
Moderator: | CSC32::BUTTERWORTH |
Created: | Thu Aug 06 1992 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1541 |
Total number of notes: | 6564 |
My customer is running PCM V1.5 ECO1 on a VAX/VMS station running V6.0. From time to time the console processes go into a RWMBX wait state. The corresponding mailbox does not show up as busy. Attached is SDA output from the running system. The messages in the mailboxes seem to be error messages. No process is there to read them.

Thanks for any help,
Joerg

VAX/VMS System analyzer
SDA> show sum
Current process summary
-----------------------
Extended Indx Process name Username State Pri PCB PHD Wkset
-- PID -- ---- --------------- ----------- ------- --- -------- -------- ------
00000081 0001 SWAPPER HIB 16 826051A0 82605000 0
00000085 0005 IPCACP SYSTEM HIB 10 82C82AC0 831F4200 86
00000086 0006 ERRFMT SYSTEM HIB 8 82C830C0 832FA200 122
00000088 0008 AUDIT_SERVER AUDIT$SERVER HIB 10 82C838C0 83400200 82
00000089 0009 JOB_CONTROL SYSTEM HIB 10 82C83AC0 83483200 91
0000008A 000A QUEUE_MANAGER SYSTEM HIB 8 82C83EC0 83506200 142
0000168C 000C OPSMGR_2 OPSMGR LEF 6 82C9E440 84F1F200 94
00002B0D 000D OPCOM SYSTEM HIB 7 82C7FFC0 8337D200 138
00001790 0010 _FTA65: OPS HIB 8 82CB2180 848FB200 87
00000092 0012 NETACP DECNET HIB 10 82C87CC0 8368F200 822
00000093 0013 EVL DECNET HIB 6 82C884C0 83589200 122
00000094 0014 SNS$WATCHDOG SYSTEM HIB 5 82C896C0 8391E200 269
00000097 0017 REMACP SYSTEM HIB 9 82C818C0 839A1200 70
00000099 0019 SCHED_REMOTE SYSTEM LEF 6 82C898C0 83AA7200 54
0000009B 001B SMISERVER SYSTEM HIB 9 82C87AC0 83795200 52
0000009C 001C LATACP SYSTEM HIB 13 82C81AC0 8360C200 109
0000009E 001E UCX$INET_ACP INTERnet HIB 10 82C872C0 83818200 203
0000009F 001F DECW$SERVER_0 OPS HIB 6 82C882C0 83B2A200 5237
00001B20 0020 OPSMGR_1 OPSMGR LEF 6 82CA7D40 8414E200 97
00001DA1 0021 FIELD FIELD CUR 10 82C53E00 8435A200 1364
000000A5 0025 DECW$SESSION OPS LEF 7 82C854C0 83712200 87
000000A7 0027 DECW$MWM OPS LEF 4 82C866C0 83277200 811
000000A9 0029 VUE$OPS_3 OPS LEFO 5 82BB9940 00000000 337
000000AA 002A VUE$OPS_4 OPS LEFO 5 82C852C0 00000000 311
000000AB 002B
VUE$OPS_5 OPS LEF 7 82C86CC0 83D36200 239 000000AC 002C DECW$TE_00AC OPS LEF 6 82C86AC0 83DB9200 4649 000000AD 002D _FTA8: OPS HIB 5 82C876C0 83A24200 79 000000AF 002F _FTA10: OPS HIB 6 82C880C0 83EBF200 87 000000B0 0030 Console Notify SYSTEM RWMBX 6 82C812C0 83F42200 1792 000000B1 0031 Console Daemon SYSTEM HIB 6 82C860C0 83FC5200 311 000000B2 0032 Console Ctrl 01 SYSTEM RWMBX 6 82C816C0 8389B200 3871 000000B3 0033 Console Ctrl 02 SYSTEM RWMBX 6 82C81EC0 84048200 3497 000000B4 0034 OPS OPS LEF 4 82C94580 840CB200 222 00002A37 0037 OPSMGR_3 OPSMGR HIB 6 82CA9040 843DD200 99 SDA> set proc/ind=30 SDA> show dev/add=@r5 I/O data structures ------------------- MBA1408 MBX UCB address: 82C54900 Device status: 00000010 online Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv 00000200 nnm Owner UIC [000001,000004] Operation count 1640 ORB address 82C35640 PID 00000000 Error count 0 DDB address 8286BAF8 Class/Type A0/01 Reference count 2 DDT address 8258CF08 Def. buf. size 1024 BOFF 0000 CRB address 8286C28C DEVDEPEND 00000008 Byte count 0000 LNM address 82AFB140 DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue empty FLCK index 28 DEVSTS 0002 DLCK address 82866800 Charge PID 00010030 *** I/O request queue is empty *** SDA> define iob=ucb SDA> @see_mbx 00000000 00B00321 00000002 00000001 ........!.�..... 82C56024 00230010 000C0042 000F0017 00000000 ........B.....#. 
82C56034 72674D43 2EE6BE96 00000000 2FFC0007 ..�/.....��.CMgr 82C56044 756F4620 544F4E20 656C6F73 6E6F4320 Console NOT Fou 82C56054 6567616E 616D454C 4F534E4F 4300646E nd.CONSOLEmanage 82C56064 6E65706F 206F7420 656C6261 6E550072 r.Unable to open 82C56074 73797320 726F6620 656C6F73 6E6F6320 console for sys 82C56084 59532520 2D203153 47574E43 206D6574 tem CNWGS1 - %SY 82C56094 6261202C 54524F42 412D462D 4D455453 STEM-F-ABORT, ab 82C560A4 736E6F43 20646567 616E614D 0074726F ort.Managed Cons 82C560B4 61766120 746F6E20 656E696C 20656C6F ole line not ava 82C560C4 4D430031 5347574E 4300656C 62616C69 ilable.CNWGS1.CM 82C560D4 6C6F736E 6F43006C 616E7265 746E4920 Internal.Consol 82C560E4 82C56100 00000072 6567616E 614D2065 e Manager....a�. 82C560F4 SDA> @see_mbx 0001FF30 000000B8 00000000 00000010 ........�...0... 82BC9924 SDA> set proc/ind=32 SDA> show dev/addr=@r5 I/O data structures ------------------- MBA94 MBX UCB address: 82C57400 Device status: 00000010 online Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv 00000200 nnm Owner UIC [000001,000004] Operation count 2737 ORB address 82BFB8C0 PID 00000000 Error count 0 DDB address 8286BAF8 Class/Type A0/01 Reference count 2 DDT address 8258CF08 Def. buf. size 1024 BOFF 0000 CRB address 8286C28C DEVDEPEND 00000009 Byte count 0000 LNM address 82AFA170 DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue empty FLCK index 28 DEVSTS 0002 DLCK address 82866800 Charge PID 00010030 *** I/O request queue is empty *** SDA> @see_mbx 00000000 00B00322 00000005 00000001 ........".�..... 82C8FFE4 002A0010 000C0023 000F0010 00000000 ........#.....*. 
82C8FFF4 72674D43 2EE6BE51 00000000 2FFC0004 ..�/....Q��.CMgr 82C90004 534E4F43 00746365 6E6E6F63 73694420 Disconnect.CONS 82C90014 20726573 55007265 67616E61 6D454C4F OLEmanager.User 82C90024 72662064 65746365 6E6E6F63 73696420 disconnected fr 82C90034 73550053 4541206D 65747379 73206D6F om system AES.Us 82C90044 63656E6E 6F637369 64207361 68207265 er has disconnec 82C90054 454C4F53 4E4F4320 6D6F7266 20646574 ted from CONSOLE 82C90064 49204D43 00534541 00726567 616E616D manager.AES.CM I 82C90074 20656C6F 736E6F43 006C616E 7265746E nternal.Console 82C90084 4D430031 5347574E 00726567 616E614D Manager.NWGS1.CM 82C90094 SDA> set proc/ind=33 SDA> show dev/addr=@r5 I/O data structures ------------------- MBA90 MBX UCB address: 82C55C00 Device status: 00000010 online Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv 00000200 nnm Owner UIC [000001,000004] Operation count 432 ORB address 82BF8B00 PID 00000000 Error count 0 DDB address 8286BAF8 Class/Type A0/01 Reference count 2 DDT address 8258CF08 Def. buf. size 1024 BOFF 0000 CRB address 8286C28C DEVDEPEND 0000000A Byte count 0000 LNM address 82AFA070 DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue empty FLCK index 28 DEVSTS 0002 DLCK address 82866800 Charge PID 00010030 *** I/O request queue is empty *** SDA> @see_mbx 00000000 00B00323 00000001 00000001 ........#.�..... 82C78BE4 00220010 000C004B 000F0012 00000000 ........K.....". 
82C78BF4 72674D43 2EE6BEB2 00000000 2FFC0007 ..�/....���.CMgr 82C78C04 4F430074 736F4C20 656C6F73 6E6F4320 Console Lost.CO 82C78C14 6E6F4300 72656761 6E616D45 4C4F534E NSOLEmanager.Con 82C78C24 79732072 6F66206B 6E696C20 656C6F73 sole link for sy 82C78C34 74736F6C 2031424E 43535620 6D657473 stem VSCNB1 lost 82C78C44 20646573 696E676F 6365726E 55202D20 - Unrecognised 82C78C54 646F6320 726F7272 65206D65 74737973 system error cod 82C78C64 69746365 6E6E6F43 00363138 35203A65 e: 5816.Connecti 82C78C74 67616E61 6D206F74 2074736F 6C206E6F on lost to manag 82C78C84 31424E43 5356006D 65747379 73206465 ed system.VSCNB1 82C78C94 6E6F4300 6C616E72 65746E49 204D4300 .CM Internal.Con 82C78CA4 82C92D00 72656761 6E614D20 656C6F73 sole Manager.-�. 82C78CB4 SDA> @see_mbx 0000FFD0 000000BB 00000000 00000010 ........�...�... 82BD61A4 SDA> show proc/chan Process index: 0033 Name: Console Ctrl 02 Extended PID: 000000B3 -------------------------------------------------------------------- Process active channels ----------------------- Channel Window Status Device/file accessed ------- ------ ------ -------------------- 0010 00000000 DKA300: 0020 82BC9B00 DKA300:[CONSOLE.IMAGES]CONSOLE$DAEMON.EXE;6 0030 82BCCD40 DKA300:[VMS$COMMON.SYSLIB]SMGSHR.EXE;1 (section file) 0040 82BCDFC0 DKA300:[VMS$COMMON.SYSLIB]LIBRTL.EXE;3 (section file) 0050 82BCC680 DKA300:[VMS$COMMON.SYSLIB]PTD$SERVICES_SHR.EXE;1 (section file) 0060 82BCE580 DKA300:[VMS$COMMON.SYSLIB]DECC$SHR.EXE;1 (section file) 0070 82BCE640 DKA300:[VMS$COMMON.SYSLIB]CMA$TIS_SHR.EXE;1 (section file) 0080 82BC9D40 DKA300:[VMS$COMMON.SYSLIB]UVMTHRTL.EXE;3 (section file) 0090 82BD0EC0 DKA300:[VMS$COMMON.SYSLIB]DECW$XLIBSHR.EXE;3 (section file) 00A0 82BCE280 DKA300:[VMS$COMMON.SYSLIB]VAXCRTL.EXE;3 (section file) 00B0 82BCE700 DKA300:[VMS$COMMON.SYSLIB]DECW$TRANSPORT_COMMON.EXE;1 (section file) 00C0 82BD1C80 DKA300:[VMS$COMMON.SYSLIB]DECW$DXMLIBSHR.EXE;1 (section file) 00D0 82BCCA40 DKA300:[VMS$COMMON.SYSLIB]LBRSHR.EXE;3 (section file) 00E0 
82BD03C0 DKA300:[VMS$COMMON.SYSLIB]DECW$DWTLIBSHR.EXE;3 (section file) 00F0 82BD1A80 DKA300:[VMS$COMMON.SYSLIB]DECW$XMLIBSHR.EXE;1 (section file) 0100 82BD1200 DKA300:[VMS$COMMON.SYSLIB]DECW$XEXTLIBSHR.EXE;1 (section file) 0110 82BD5680 DKA300:[VMS$COMMON.SYSLIB]DECW$TERMINALSHR.EXE;1 (section file) 0120 00000000 MBA88: 0130 00000000 Busy MBA89: 0140 00000000 MBA90: 0150 00000000 Busy MBA91: 0160 00000000 LTA5578: 0170 82C04740 DKA300:[CONSOLE.LOG]AES.TIMES;1 0180 82C0F900 DKA300:[CONSOLE.LOG]AES.EVENTS;1 0190 82C1B300 DKA300:[CONSOLE.LOG]AES.LOG;1 01A0 82BF0DC0 DKA300:[CONSOLE.LOG]AXCRNA.TIMES;1 01B0 00000000 LTA5064: 01C0 82BEF740 DKA300:[CONSOLE.LOG]ATLAS1.TIMES;1 01D0 82BEF2C0 DKA300:[CONSOLE.LOG]ATLAS1.EVENTS;1 01E0 82C31080 DKA300:[CONSOLE.LOG]ATLAS1.LOG;1 01F0 00000000 Busy MBA1225: 0200 00000000 LTA5065: 0210 82C3D500 DKA300:[CONSOLE.LOG]ATLAS3.TIMES;1 0220 82C21FC0 DKA300:[CONSOLE.LOG]ATLAS3.EVENTS;1 0230 82C2F040 DKA300:[CONSOLE.LOG]ATLAS3.LOG;1 0240 00000000 Busy MBA1226: 0250 82C00480 DKA300:[CONSOLE.LOG]AXCRNA.EVENTS;1 0260 82BF4D80 DKA300:[CONSOLE.LOG]AXCRNA.LOG;1 0270 00000000 Busy MBA1953: 0280 00000000 LTA5579: 0290 82C23040 DKA300:[CONSOLE.LOG]DXCERN.TIMES;1 02A0 00000000 LTA5075: 02B0 82C19140 DKA300:[CONSOLE.LOG]AXCRNB.TIMES;1 02C0 82BF6A00 DKA300:[CONSOLE.LOG]AXCRNB.EVENTS;1 02D0 82C04C80 DKA300:[CONSOLE.LOG]AXCRNB.LOG;1 02E0 00000000 Busy MBA1234: 02F0 00000000 LTA5080: 0300 82C3FD80 DKA300:[CONSOLE.LOG]AXCRNC.TIMES;1 0310 82C28B00 DKA300:[CONSOLE.LOG]AXCRNC.EVENTS;1 0320 82C0C900 DKA300:[CONSOLE.LOG]AXCRNC.LOG;1 0330 00000000 Busy MBA1239: 0340 00000000 LTA5066: 0350 82BFB2C0 DKA300:[CONSOLE.LOG]CHORUS_TEST.TIMES;1 0360 82C30A80 DKA300:[CONSOLE.LOG]CHORUS_TEST.EVENTS;1 0370 82C3BAC0 DKA300:[CONSOLE.LOG]CHORUS_TEST.LOG;1 0380 00000000 Busy MBA1227: 0390 00000000 Busy LTA5067: 03A0 82C47C40 DKA300:[CONSOLE.LOG]CMS1.TIMES;1 03B0 82C2C400 DKA300:[CONSOLE.LOG]CMS1.EVENTS;1 03C0 82C4D580 DKA300:[CONSOLE.LOG]CMS1.LOG;1 03D0 00000000 Busy MBA1228: 
03E0 00000000 Busy LTA5068: 03F0 82C2E980 DKA300:[CONSOLE.LOG]CMS2.TIMES;1 0400 82C0F9C0 DKA300:[CONSOLE.LOG]CMS2.EVENTS;1 0410 82C135C0 DKA300:[CONSOLE.LOG]CMS2.LOG;1 0420 00000000 Busy MBA1229: 0430 82C3F240 DKA300:[CONSOLE.LOG]DXCERN.EVENTS;1 0440 82C25D40 DKA300:[CONSOLE.LOG]DXCERN.LOG;1 0450 00000000 Busy MBA1954: 0460 00000000 Busy LTA5580: 0470 82C0C480 DKA300:[CONSOLE.LOG]HSC1.TIMES;1 0480 82C2FA00 DKA300:[CONSOLE.LOG]HSC1.EVENTS;1 0490 82C2FF40 DKA300:[CONSOLE.LOG]HSC1.LOG;1 04A0 00000000 Busy MBA1955: 04B0 00000000 Busy LTA5581: 04C0 82C26100 DKA300:[CONSOLE.LOG]HSC8.TIMES;1 04D0 00000000 Busy LTA5583: 04E0 82BF8740 DKA300:[CONSOLE.LOG]HSC7.TIMES;1 04F0 82C1FA40 DKA300:[CONSOLE.LOG]HSC7.EVENTS;1 0500 82BEAD00 DKA300:[CONSOLE.LOG]HSC7.LOG;1 0510 00000000 Busy MBA1964: 0520 82C06F00 DKA300:[CONSOLE.LOG]HSC8.EVENTS;1 0530 82C42600 DKA300:[CONSOLE.LOG]HSC8.LOG;1 0540 00000000 Busy MBA1956: 0570 00000000 LTA5081: 0580 82C4B6C0 DKA300:[CONSOLE.LOG]UXACB.TIMES;1 0590 82C516C0 DKA300:[CONSOLE.LOG]UXACB.EVENTS;1 05A0 82BFDF00 DKA300:[CONSOLE.LOG]UXACB.LOG;1 05B0 00000000 Busy MBA1240: 05C0 00000000 LTA5082: 05D0 82C47640 DKA300:[CONSOLE.LOG]UXCSB1.TIMES;1 05E0 82C28200 DKA300:[CONSOLE.LOG]UXCSB1.EVENTS;1 05F0 82C2F7C0 DKA300:[CONSOLE.LOG]UXCSB1.LOG;1 0600 00000000 Busy MBA1241: 0610 00000000 LTA5076: 0620 82C24FC0 DKA300:[CONSOLE.LOG]UXPLOT.TIMES;1 0630 82C44580 DKA300:[CONSOLE.LOG]UXPLOT.EVENTS;1 0640 82C33240 DKA300:[CONSOLE.LOG]UXPLOT.LOG;1 0650 00000000 Busy MBA1235: 0660 82BCD840 DKA300:[VMS$COMMON.SYSMSG]SHRIMGMSG.EXE;1 (section file) 0670 82BCEAC0 DKA300:[VMS$COMMON.SYSMSG]VAXCMSG.EXE;1 (section file) 0680 82BD4100 DKA300:[VMS$COMMON.SYSMSG]DECW$XLIBMSG.EXE;1 (section file) 0690 82BD3F00 DKA300:[VMS$COMMON.SYSMSG]DECW$TRANSPORTMSG.EXE;1 (section file) 06A0 82BD5B00 DKA300:[VMS$COMMON.SYSMSG]DECW$TERMINALMSG.EXE;1 (section file) 06B0 00000000 NLA0: 06C0 00000000 NLA0:
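
The mailbox contents in the `@see_mbx` dumps above can be read back as text. A minimal sketch in Python, assuming (as the ASCII column of the dump suggests) that each row lists its longwords right to left, lowest address last, and that each longword is stored little-endian. The function name `decode_sda_dump_row` is made up for this illustration and is not part of SDA.

```python
def decode_sda_dump_row(longwords):
    """Turn one dump row (hex longwords in display order) into printable text.

    Assumption: the rightmost longword has the lowest address, and bytes
    within a longword are little-endian, so both orders must be reversed.
    """
    raw = bytearray()
    for lw in reversed(longwords):
        raw += bytes.fromhex(lw)[::-1]
    # Show non-printable bytes as '.', the way the dump's ASCII column does.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # Two rows copied from the dump of the mailbox for process index 30.
    rows = [
        "756F4620 544F4E20 656C6F73 6E6F4320",
        "6567616E 616D454C 4F534E4F 4300646E",
    ]
    print("".join(decode_sda_dump_row(r.split()) for r in rows))
```

Fed the rows of one mailbox message in order, this gives the event text as one string, which makes it easier to see which managed system each queued event refers to.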
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
510.1 | OPG::PHILIP | And through the square window... | Thu Dec 08 1994 11:35 | 12 | |
Joerg, It is quite normal for the PCM processes to go into RWMBX, as you normally have lots of controller daemons feeding one notification process. The data you see in the mailbox are not error messages; they are in fact events! Are you saying that the processes never leave the resource wait state? Cheers, Phil | |||||
510.2 | OPCO::TSG_SJM | Coming live to you from Rosebery | Thu Dec 08 1994 20:58 | 5 | |
This is the same problem we were experiencing, until I reduced the number of systems per control process down to 5. Now PCM runs as advertised. Steve | |||||
510.3 | OPG::PHILIP | And through the square window... | Thu Dec 08 1994 21:34 | 14 | |
Steve, et al. >> This is the same problem we were experiencing, until I reduced the >> number of systems per control process down to 5. Now PCM runs as >> advertised. In that case, could you try the new CONSOLE$DAEMON.EXE_VAX that's in OPG::CM$KIT:? I have done a lot of work adjusting the quotas the controller daemons are started with. Hopefully the new quotas will alleviate some of the problems people have been experiencing, but we do need some feedback about how it works in your environments. Cheers, Phil | |||||
510.4 | Process stays forever in RWMBX state | LEMAN::NEUWEILER | System Support Geneva | Fri Dec 09 1994 10:34 | 6 |
The processes stay forever in the RWMBX state. I will try the new CONSOLE$DAEMON.EXE_VAX and report back if it doesn't fix the problem. Thanks for your help, Joerg | |||||
510.5 | Tried it... | OPCO::TSG_SJM | Coming live to you from Rosebery | Mon Dec 12 1994 20:58 | 8 |
Phil, I have copied the new daemon image, set the number of systems per controller back up to 16, and restarted PCM. I will let you know how we go. Thanks Steve | |||||
510.6 | how to limit systems per controller ? | LEMAN::NEUWEILER | System Support Geneva | Tue Dec 13 1994 11:01 | 8 |
Re .5: Question from my customer: How do you limit the number of systems per controller? Thanks for your help, Joerg | |||||
510.7 | OPG::PHILIP | And through the square window... | Tue Dec 13 1994 12:23 | 8 | |
Answer to your customer: You don't; changing this is strictly unsupported. Hopefully, the RWMBX problems will disappear with the MUP kit. Cheers, Phil | |||||
510.8 | Almost got it... | OPCO::TSG_SJM | Coming live to you from Rosebery | Tue Dec 13 1994 23:50 | 21 |
Phil, We ran with the new version of the daemon image (compile date 5/12/94) successfully for about 24 hours, but then several of the control processes and the notify process went into RWMBX. Slowly, one after the other, the remaining control processes went into RWMBX. I was in the process of identifying which of the processes was the culprit when they all came out of RWMBX. Slowly the systems began reconnecting to PCM, and then rather quickly began disconnecting themselves again. I have since gone back to the previous image, and also changed that other parameter back, as everything was running fine under that configuration. I think you have almost got it, considering that before it only ran for about 4-5 hours before falling over. I'll try another image if you want to have another go at it. Steve | |||||
510.9 | CM and OSCint | BER::MUELLER | Wed Dec 14 1994 08:35 | 26 | |
---------------------------------------------------------------------------- This entry is cross-posted in the OSCINT conference ---------------------------------------------------------------------------- Hello, I'm running CM 1.5 ECO 1 on VMS 6.1 together with OSCint V2 and have also got a problem with processes getting into the RWMBX state. The following processes went into RWMBX about every 2 hours: Console Notify, Console Ctrl 01, Console Ctrl 02, OSCINT$PSW_PCM. I copied over the latest CONSOLE$DAEMON.EXE mentioned in this topic and ran CM with the new image. I still have the RWMBX problem, but now only for the Console Notify and OSCINT$PSW_PCM processes. When I run CM without OSCint it seems to be fine (at least for 1 day). Sabine Mueller OSC-GY | |||||
510.10 | OPG::PHILIP | And through the square window... | Wed Dec 14 1994 09:51 | 19 | |
Steve, Sorry it's not working for you. You don't happen to be running OSCint too, do you? Sabine, When the processes go into RWMBX, is it as a result of a large number of events from PSW via the PSC to PCM feeder? Since you don't appear to be having problems when the PSW feeder isn't running, that may well be the cause of our problems! I am clutching at straws here as I don't really know what the problem is. I only know that we CANNOT get this stuff to behave badly on our test configuration; however, we are running a pure PCM installation, with no other external applications. Cheers, Phil | |||||
510.11 | BER::MUELLER | Wed Dec 14 1994 12:45 | 14 | ||
>> When the processes go into RWMBX, is it as a result of a large number of >> events from PSW via the PSC to PCM feeder? It would suggest that This morning I removed the filter which does automatic repairs (it automatically restarts the print queue), and OSCint came up fine. I could reproduce the RWMBX behaviour when I included the repair again. Before this RWMBX problem appeared, the automatic repair was running and behaving normally. - Sabine | |||||
510.12 | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Wed Dec 14 1994 18:12 | 15 | |
I haven't said anything up to now, but there seem to be a lot of occurrences of late, so here goes. I have been able to take a brief look at only one of these situations, as the customer had to get it back up quickly. What happens here is that ENS backs up waiting on an action to pull stuff from a mailbox, which can eventually cause *all* of the controller daemons to go into RWMBX. In the one case I looked at, it appeared the action routine had terminated abnormally and the timing was such that ENS got stuck in the RWMBX state. I too have never been able to reproduce the exact scenario that I dealt with. What needs to be done is for us to access a broken system and be given the time to really debug this. Regs, Dan | |||||
510.13 | Access to system | OPCO::TSG_SJM | Coming live to you from Rosebery | Wed Dec 14 1994 21:47 | 13 |
Phil, Nope, not currently running OSCint, but when I was, I was having all sorts of problems. These have now disappeared with the deinstallation of OSCint. The main problem we were experiencing with OSCint was that ENS would shut down on a very regular basis. If you want to look at the system running PCM here, I can arrange that. The system running PCM isn't on the network, so we can discuss access off-line. Cheers, Steve | |||||
510.14 | available for testing | BER::MUELLER | Thu Dec 15 1994 08:53 | 11 | |
If you are interested in the CM / OSCint configuration you can access my system. As I mentioned before, I could reproduce the 'error'. Unfortunately, today I have to use CM to access a remote system for an installation, but tomorrow testing won't be a problem. Just give me a call. - Sabine 859-3334 | |||||
510.15 | RWMBX again | LEMAN::NEUWEILER | System Support Geneva | Fri Dec 16 1994 10:45 | 332 |
The RWMBX problem reappeared with the new CONSOLE$DAEMON.EXE installed. There had been a temporary network problem at the customer site; the console processes went into the RWMBX state and never left it again. At that point the customer took a crash dump. Attached is an extract showing the situation:

Current process summary
-----------------------
Extended Indx Process name Username State Pri PCB PHD Wkset
-- PID -- ---- --------------- ----------- ------- --- -------- -------- ------
00000081 0001 SWAPPER HIB 16 826051A0 82605000 0
00000B02 0002 OPSMGR_1 OPSMGR LEF 6 82CC8200 844E3200 131
00000085 0005 IPCACP SYSTEM HIB 10 82C99A40 831F4200 86
00000086 0006 ERRFMT SYSTEM HIB 8 82C9A040 832FA200 113
00000088 0008 AUDIT_SERVER AUDIT$SERVER HIB 10 82C9A840 83400200 143
00000089 0009 JOB_CONTROL SYSTEM HIB 10 82C9AA40 83483200 63
0000008A 000A QUEUE_MANAGER SYSTEM HIB 8 82C9AC40 83506200 153
00000E8E 000E _RTA5: OPSMGR LEF 5 82CA1A40 84878200 1190
00000B10 0010 UCX$RSHD_BG49 WGSMONIT LEF 7 82CB2E00 84772200 288
00000091 0011 NETACP DECNET HIB 10 82C9E640 8368F200 903
00000092 0012 EVL DECNET HIB 4 82C9EE40 83589200 127
00000093 0013 SNS$WATCHDOG SYSTEM HIB 5 82CA0040 8389B200 357
00001315 0015 TAMI TAMI LEF 5 82C6BD00 847F5200 366
00000096 0016 NSCHED SYSTEM LEF 8 82C97C40 83795200 150
00000097 0017 REMACP SYSTEM HIB 9 82C99C40 83A24200 70
00000098 0018 SCHED_REMOTE SYSTEM LEF 6 82C9AE40 83AA7200 70
00001219 0019 TAMI_1 TAMI LEF 6 82D379C0 8497E200 262
0000121A 001A TAMI_2 TAMI LEF 6 82CF1780 84A84200 1027
0000131B 001B TAMI_3 TAMI HIB 5 82CE5240 83E3C200 1115
0000009C 001C SMISERVER SYSTEM HIB 9 82C9C440 8391E200 65
0000009D 001D UCX$INET_ACP INTERnet HIB 10 82C9FE40 8360C200 267
0000009E 001E OPSMGR OPSMGR LEF 6 82C99240 83712200 310
0000141F 001F DECW$TE_141F TAMI LEF 7 82CBE200 84B07200 187
000000A0 0020 LATACP SYSTEM HIB 14 82C9D840 83B2A200 117
00001421 0021 TAMI_4 TAMI LEF 5 82C92F80 84B8A200 368
000000A6 0026 DECW$SERVER_0 OPS HIB 6 82C99840
83DB9200 5471 00001627 0027 FIELD FIELD LEF 10 82C9F840 84566200 272 000000A8 0028 DECW$SESSION OPS LEF 7 82C9C640 83EBF200 852 000000AA 002A DECW$MWM OPS LEF 4 82C9DA40 83818200 959 000000AC 002C VUE$OPS_3 OPS LEF 5 82C9E440 83F42200 411 000000AD 002D VUE$OPS_4 OPS LEF 5 82C9F240 83FC5200 385 000000AE 002E DECW$TE_00AE OPS LEF 7 82C9F440 84048200 4538 000000AF 002F _FTA5: OPS LEF 6 82CA1040 83277200 225 00000130 0030 _FTA9: OPS LEF 5 82C6B700 8435A200 284 000000B1 0031 OPS OPS LEF 10 82C9EC40 8414E200 255 00000BB8 0038 _RTA4: CLUSMGR LEF 8 82CA5E00 843DD200 336 00000BB9 0039 _RTA3: OPSMGR LEF 5 82C96900 83BAD200 307 00000E41 0041 Console Notify OPSMGR RWMBX 6 82CA0C40 842D7200 1337 000013C2 0042 Console Daemon OPSMGR HIB 5 82CB1940 84254200 677 000013C3 0043 Console Ctrl 01 OPSMGR RWMBX 6 82C9DC40 8466C200 2864 000013C4 0044 Console Ctrl 02 OPSMGR RWMBX 6 82CE5580 84E19200 3004 000011C5 0045 CLUSMGR CLUSMGR LEF 5 82CE5A40 845E9200 372 000000CB 004B _FTA8: OPS LEF 5 82C9BC40 83D36200 276 000014D0 0050 SERVER_0014 OPSMGR LEF 7 82C66B00 848FB200 423 00000951 0051 BATCH_294 OPSMGR LEF 6 82CD1280 846EF200 736 00001752 0052 SERVER_0012 OPSMGR LEF 7 82C65F80 841D1200 437 000016D3 0053 SERVER_0011 OPSMGR LEF 7 82C88300 84A01200 423 00000AD4 0054 OPCOM SYSTEM HIB 7 82C9E040 8337D200 146 00000D55 0055 SERVER_000F OPSMGR LEF 7 82C82680 84D96200 435 00000CD7 0057 OPSMGR_2 OPSMGR HIB 6 82C62C00 83CB3200 331 00001358 0058 DECW$TE_1358 OPSMGR LEF 7 82C69240 83C30200 1534 000016D9 0059 OPSMGR_3 OPSMGR LEF 4 82C8A240 84460200 1184 000003E3 0063 _FTA12: OPS LEF 5 82CC2940 839A1200 254 SDA> set proc/ind=41 SDA> show proc/chan Process index: 0041 Name: Console Notify Extended PID: 00000E41 ------------------------------------------------------------------- Process active channels ----------------------- Channel Window Status Device/file accessed ------- ------ ------ -------------------- 0010 00000000 DKA300: 0020 82C93B80 DKA300:(3039,7,0) 0030 82BCCDC0 DKA300:(1108,3,0) (section 
file) 0040 82BCB600 DKA300:(191,3,0) (section file) 0050 82BCE040 DKA300:(643,3,0) (section file) 0060 82C98040 DKA300:(5266,2,0) (section file) 0070 00000000 NLA0: 0080 82C262C0 DKA300:(303,159,0) 0090 82BCC700 DKA300:(1095,3,0) (section file) 00A0 82BCE600 DKA300:(415,3,0) (section file) 00B0 82BCE6C0 DKA300:(400,3,0) (section file) 00C0 82BC9DC0 DKA300:(1232,2,0) (section file) 00D0 82BD5BC0 DKA300:(2011,2,0) (section file) 00E0 82BCE300 DKA300:(1233,2,0) (section file) 00F0 82BD1E40 DKA300:(1437,2,0) (section file) 0100 82BD6D00 DKA300:(2001,2,0) (section file) 0110 82BCCAC0 DKA300:(642,54,0) (section file) 0120 82BD6300 DKA300:(1999,2,0) (section file) 0130 82BD6A00 DKA300:(2012,2,0) (section file) 0140 82BD60C0 DKA300:(2009,2,0) (section file) 0150 82BCFC00 DKA300:(2004,2,0) (section file) 0160 00000000 DKA300: 0170 00000000 Busy MBA8360: 0180 00000000 MBA8361: 0190 00000000 Busy MBA8362: 01A0 00000000 Busy MBA8363: 01B0 00000000 MBA8364: 01C0 00000000 Busy MBA8365: 01D0 00000000 MBA8366: 01E0 00000000 Busy MBA8367: 01F0 00000000 MBA8368: 0220 00000000 MBA8372: 0230 00000000 MBA8373: 0240 00000000 MBA8378: 0250 00000000 MBA8379: 0260 00000000 MBA8380: 0270 00000000 MBA8381: 0280 00000000 Busy MBA8805: 0290 00000000 MBA8806: 0320 00000000 Busy MBA8812: 0330 00000000 MBA8813: 0340 00000000 Busy MBA8824: 0350 00000000 MBA8825: SDA> show dev/addr=@r5 I/O data structures ------------------- MBA8825 MBX UCB address: 82C63500 Device status: 00000010 online Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv 00000200 nnm Owner UIC [000001,000004] Operation count 1864 ORB address 82C27700 PID 00000000 Error count 0 DDB address 8286BAF8 Class/Type A0/01 Reference count 2 DDT address 8258CF08 Def. buf. 
size 1024 BOFF 0000 CRB address 8286C28C DEVDEPEND 00000008 Byte count 0000 LNM address 82AFDEA0 DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue empty FLCK index 28 DEVSTS 0002 DLCK address 82866800 Charge PID 001C0041 *** I/O request queue is empty *** SDA> set proc/ind=43 SDA> show dev/addr=@r5 I/O data structures ------------------- MBA8378 MBX UCB address: 82C68000 Device status: 00000010 online Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv 00000200 nnm Owner UIC [000001,000004] Operation count 162 ORB address 82C1C840 PID 00000000 Error count 0 DDB address 8286BAF8 Class/Type A0/01 Reference count 2 DDT address 8258CF08 Def. buf. size 1024 BOFF 0000 CRB address 8286C28C DEVDEPEND 00000008 Byte count 0000 LNM address 82AFB280 DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue empty FLCK index 28 DEVSTS 0002 DLCK address 82866800 Charge PID 001C0041 *** I/O request queue is empty *** SDA> show proc/chan Process index: 0043 Name: Console Ctrl 01 Extended PID: 000013C3 -------------------------------------------------------------------- Process active channels ----------------------- Channel Window Status Device/file accessed ------- ------ ------ -------------------- 0010 00000000 DKA300: 0020 82BCE380 DKA300:(469,278,0) 0030 82BCCDC0 DKA300:(1108,3,0) (section file) 0040 82BCB600 DKA300:(191,3,0) (section file) 0050 82BCE040 DKA300:(643,3,0) (section file) 0060 82C98040 DKA300:(5266,2,0) (section file) 0070 82C3C340 DKA300:(3425,24,0) 0080 82C26500 DKA300:(3016,75,0) 0090 82BCC700 DKA300:(1095,3,0) (section file) 00A0 82BCE600 DKA300:(415,3,0) (section file) 00B0 82BCE6C0 DKA300:(400,3,0) (section file) 00C0 82BC9DC0 DKA300:(1232,2,0) (section file) 00D0 82BD5BC0 DKA300:(2011,2,0) (section file) 00E0 82BCE300 DKA300:(1233,2,0) (section file) 00F0 82BD1E40 DKA300:(1437,2,0) (section file) 0100 82BD6D00 DKA300:(2001,2,0) (section file) 0110 82BCCAC0 DKA300:(642,54,0) (section file) 0120 82BD6300 DKA300:(1999,2,0) (section file) 0130 82BD6A00 DKA300:(2012,2,0) 
(section file) 0140 82BD60C0 DKA300:(2009,2,0) (section file) 0150 82BCFC00 DKA300:(2004,2,0) (section file) 0160 00000000 DKA300: 0170 00000000 MBA8374: 0180 00000000 MBA8375: 0190 00000000 MBA8378: 01A0 00000000 Busy MBA8379: 0480 00000000 LTA5105: 0490 82C0A6C0 DKA300:(3725,5,0) 04A0 82C52180 DKA300:(3726,114,0) 04B0 82C20740 DKA300:(3720,106,0) 04C0 00000000 Busy MBA8424: 0520 00000000 LTA5109: 0530 82C05440 DKA300:(485,11,0) 0540 82C2E3C0 DKA300:(486,127,0) 0550 82BFAF40 DKA300:(492,56,0) 0560 00000000 Busy MBA8428: 0570 00000000 LTA5111: 0580 82C2F380 DKA300:(505,20,0) 0590 82C4C180 DKA300:(520,135,0) 05A0 82BF1E80 DKA300:(525,202,0) 05B0 00000000 Busy MBA8430: 06F0 82BCD8C0 DKA300:(1310,2,0) (section file) 0700 82BCEB40 DKA300:(1316,2,0) (section file) 0710 82BD9D80 DKA300:(2028,2,0) (section file) 0720 82BD96C0 DKA300:(1450,2,0) (section file) 0730 82BD90C0 DKA300:(2026,2,0) (section file) SDA> set proc/ind=44 SDA> show dev/addr=@r5 I/O data structures ------------------- MBA8380 MBX UCB address: 82C86F00 Device status: 00000010 online Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv 00000200 nnm Owner UIC [000001,000004] Operation count 1613 ORB address 82BF7880 PID 00000000 Error count 0 DDB address 8286BAF8 Class/Type A0/01 Reference count 2 DDT address 8258CF08 Def. buf. 
size 1024 BOFF 0000 CRB address 8286C28C DEVDEPEND 00000009 Byte count 0000 LNM address 82AFB340 DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue empty FLCK index 28 DEVSTS 0002 DLCK address 82866800 Charge PID 001C0041 *** I/O request queue is empty *** SDA> show proc/chan Process index: 0044 Name: Console Ctrl 02 Extended PID: 000013C4 -------------------------------------------------------------------- Process active channels ----------------------- Channel Window Status Device/file accessed ------- ------ ------ -------------------- 0010 00000000 DKA300: 0020 82BDA840 DKA300:(469,278,0) 0030 82BCCDC0 DKA300:(1108,3,0) (section file) 0040 82BCB600 DKA300:(191,3,0) (section file) 0050 82BCE040 DKA300:(643,3,0) (section file) 0060 82C98040 DKA300:(5266,2,0) (section file) 0070 82C34A80 DKA300:(3441,97,0) 0080 82BFA040 DKA300:(3440,23,0) 0090 82BCC700 DKA300:(1095,3,0) (section file) 00A0 82BCE600 DKA300:(415,3,0) (section file) 00B0 82BCE6C0 DKA300:(400,3,0) (section file) 00C0 82BC9DC0 DKA300:(1232,2,0) (section file) 00D0 82BD5BC0 DKA300:(2011,2,0) (section file) 00E0 82BCE300 DKA300:(1233,2,0) (section file) 00F0 82BD1E40 DKA300:(1437,2,0) (section file) 0100 82BD6D00 DKA300:(2001,2,0) (section file) 0110 82BCCAC0 DKA300:(642,54,0) (section file) 0120 82BD6300 DKA300:(1999,2,0) (section file) 0130 82BD6A00 DKA300:(2012,2,0) (section file) 0140 82BD60C0 DKA300:(2009,2,0) (section file) 0150 82BCFC00 DKA300:(2004,2,0) (section file) 0160 00000000 DKA300: 0170 00000000 MBA8376: 0180 00000000 MBA8377: 0190 00000000 MBA8380: 01A0 00000000 Busy MBA8381: 02A0 00000000 LTA5094: 02B0 82BCD8C0 DKA300:(1310,2,0) (section file) 02C0 82BCEB40 DKA300:(1316,2,0) (section file) 02D0 82BD9D80 DKA300:(2028,2,0) (section file) 02E0 82BD96C0 DKA300:(1450,2,0) (section file) 02F0 82BD90C0 DKA300:(2026,2,0) (section file) 0300 82C022C0 DKA300:(3552,59,0) 0310 82C06640 DKA300:(3556,54,0) 0320 82C5D700 DKA300:(4111,4,0) 0330 00000000 Busy MBA8401: 0390 00000000 LTA5098: 03A0 82BF3E00 
DKA300:(3745,43,0) 03B0 82BFDAC0 DKA300:(3746,322,0) 03C0 82C19E40 DKA300:(3744,196,0) 03D0 00000000 Busy MBA8409: 03E0 00000000 LTA5100: 03F0 82C10300 DKA300:(3749,69,0) 0400 82C42C40 DKA300:(3751,150,0) 0410 82C52E40 DKA300:(3748,203,0) 0420 00000000 Busy MBA8411: 04D0 00000000 LTA5106: 04E0 82C2E840 DKA300:(3759,341,0) 04F0 82C0C1C0 DKA300:(3761,46,0) 0500 82C03F40 DKA300:(3757,191,0) 0510 00000000 Busy MBA8425: 05C0 00000000 LTA5112: 05D0 82C46480 DKA300:(3778,58,0) 05E0 82BF8600 DKA300:(3780,304,0) 05F0 82C3A240 DKA300:(3777,130,0) 0600 00000000 Busy MBA8431: 0610 00000000 LTA5114: 0620 82C0AC00 DKA300:(3631,4,0) 0630 82C4EAC0 DKA300:(3641,4,0) 0640 82C27400 DKA300:(3620,4,0) 0650 00000000 Busy MBA8433: 0660 00000000 LTA5116: 0670 82BEA680 DKA300:(502,317,0) 0680 82BEB100 DKA300:(521,71,0) 0690 82C0BD40 DKA300:(528,16,0) 06A0 00000000 Busy MBA8435: 06F0 00000000 LTA5119: 0700 82C25900 DKA300:(4114,70,0) 0710 82C3D000 DKA300:(4115,18,0) 0720 82C61300 DKA300:(4116,8,0) SDA> exit | |||||
510.16 | AZUR::HEUSBOURG | Fri Dec 16 1994 14:02 | 34 | ||
<<< EASE::DISK$ALLIN1V24:[NOTES$LIBRARY]OSC_TOOLS_DEVELOPMENT.NOTE;1 >>> -< *** DIGITAL INTERNAL USE ONLY *** >- ================================================================================ Note 160.4 RWMBX Problems 4 of 4 AZUR::HEUSBOURG 27 lines 16-DEC-1994 13:20 -------------------------------------------------------------------------------- Sabine, All, OSCint Repair is an ENS action routine. ENS uses MBXes on OVMS to perform IPC with the action routines (image mode). So as soon as ENS receives an event and the filter matches, the event is written to the dedicated MBX. The action routine then reads this MBX to get the events to be treated. What's happening is that the action routine, depending on the action it has to perform, can end up reading the MBX more slowly than ENS writes events into it. Hence, when the MBX is full, ENS goes temporarily into the RWMBX state until the action routine has read more events. To bypass this problem, we changed the OSCint Repair module to read the MBX and perform the action asynchronously, so that even if the action takes time (e.g. a long timeout on a DECnet connection), it still continues to read the MBX and queues the events. I'm going to rebuild a kit and make it public at the beginning of next week. I'll put "FT" somewhere in the name until I get some validation from you after deeper tests. Regards, Christian. | |||||
510.17 | AZUR::HEUSBOURG | Fri Dec 16 1994 14:38 | 20 | ||
RE .13 Steve, > Nope not currently running OSCint, but when I was I was having all > sorts of problems, these have now disappeared with the deinstallation > of OSCint. The main problem we were experiencing with OSCint was that > ENS would shutdown on a very regular basis. I'll be glad to help you use OSCint, as I don't feel that the ONLY solution is to deinstall it. OSCint uses PCM intensely, so it can reveal problems which would otherwise take longer to appear. OSCint is now being used at a lot of customer sites without the ENS shutdown problem you described. So I'm a little surprised, but would be very happy to help you run OSCint on your system if you need the functionality. Regards, Christian. | |||||
510.18 | Thanks for the offer. | WOTVAX::ELLISM | Are you all sitting too comfybold square on your botty? - Then w | Tue Dec 20 1994 10:04 | 13 |
Christian, Steve is leaving at the end of this month. I'll be working in there from January/February sometime. It was felt that, until Console Manager itself ran regularly, we should not throw in another 'wildcard', like OSCint, to potentially confuse the issue. Basically, this is a problem with Console Manager, and not OSCint. When this is resolved, then we can look at any problems OSCint may cause. Martin | |||||
510.19 | OPCO::TSG_SJM | Coming live to you from Rosebery | Wed Dec 21 1994 01:50 | 9 | |
Christian and Martin, Martin, I'm actually finishing up today, whooppee. We seem to have a pretty stable PCM platform at the moment, thanks to the efforts of the PCM team. Tony is taking over from me with the POLYCENTER products, so you may like to touch base with him. Cheers for the final time, Steve |
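
The mechanism described in .12 and .16 (a writer stalling on a full mailbox because the reader is busy with a slow action) can be illustrated with a small discrete-time model. This is only a sketch under stated assumptions, not PCM, ENS or OSCint code: the function `simulate`, its parameters, and the one-event-per-tick writer are all invented for the illustration.

```python
from collections import deque


def simulate(n_events, capacity, action_ticks, asynchronous):
    """Count the ticks on which the writer finds the mailbox full.

    Hypothetical model: the writer (standing in for ENS) offers one event
    per tick; the reader (standing in for an action routine) needs
    `action_ticks` ticks to act on each event.
    """
    if capacity < 1 or action_ticks < 1:
        raise ValueError("capacity and action_ticks must be at least 1")
    mailbox = deque()   # bounded by `capacity`
    backlog = deque()   # reader's private, unbounded queue (async mode only)
    busy = 0            # ticks left on the action currently running
    written = done = blocked_ticks = 0
    while done < n_events:
        # Writer: put one event in the mailbox, or stall if it is full.
        if written < n_events:
            if len(mailbox) < capacity:
                mailbox.append(written)
                written += 1
            else:
                blocked_ticks += 1  # analogue of sitting in RWMBX
        # Asynchronous reader: always empty the mailbox into the backlog,
        # even while an action is still running.
        if asynchronous:
            while mailbox:
                backlog.append(mailbox.popleft())
        # Advance the running action, if any.
        if busy > 0:
            busy -= 1
            if busy == 0:
                done += 1
        # Start the next action once the reader is idle.
        if busy == 0:
            source = backlog if asynchronous else mailbox
            if source:
                source.popleft()
                busy = action_ticks
    return blocked_ticks


if __name__ == "__main__":
    for mode in (False, True):
        stalls = simulate(n_events=10, capacity=2, action_ticks=5,
                          asynchronous=mode)
        print(f"asynchronous={mode}: writer stalled on {stalls} ticks")
```

With these parameters the synchronous reader makes the writer stall, while the asynchronous reader never does. The model also hints at the case in .12: if an action never completes, a synchronous reader stops emptying the mailbox and the writer stays stalled indefinitely.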