Title: | POLYCENTER Console Manager |
Notice: | Kits, Scans, Docs on CSC32:: as PCM$KITS:,PCM$DOCS:, PCM$SCANS: |
Moderator: | CSC32::BUTTERWORTH |
Created: | Thu Aug 06 1992 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1541 |
Total number of notes: | 6564 |
SUBJECT: System hangs - reboots often necessary

SOFTWARE: OpenVMS VAX V6.1, PCM V1.5-006 (w/MUP)

PROBLEM STATEMENT: Since adding a number of nodes to CX3PCM, there have been several problems (many more than usual) with the PCM system. Of late, after the addition of 30 more systems and the MUP, PCM has been hanging and failing to restart properly.

SYMPTOMS on 6520 w/192 active nodes:
- RECONFIGURE took 10-15 minutes to complete, if it completed at all
- High use of console output could hang the display, and PCM PAGE UP in the log file would hang the display
- Restart of PCM would usually fail

UPGRADE: CX3PCM has been reconfigured from a 6520 w/DEBNI to a 6620 w/DEMNA (XMI interface) to deal with resource requirements.

ANALYSIS: PCM is a real resource hog with PCM RECONFIGURE, high rates of logging of console data, or moving through the log file:
- If you hit "page up" several times a second, it results in about 80 DECnet packets/sec out to the NI card.
- A faster primary CPU was needed to keep up with the several hundred BIO/s that result.
- Need to replace the DEBNI card with a DEMNA to deal with excessive network I/O with multiple users.
- With the faster 6620 CPU, I reset the Console Daemon to deal with 16 consoles per Console Ctrl process - I had it set to 4 for the 6420 CPU, 8 for the 6520 CPU, and now the SUPPORTED 16 for the 6620 CPU. This will hopefully help with the RWMBX problems.
- NPAGEDYN is expanding again - next reboot will up this to 8,000,000 bytes. It seems that PCM does not deal well with expanding NPAGEDYN.
- Replaced the FOUR striped RA70 log disks with FOUR FASTER RA72 disks for speed and capacity.

ISSUES with VMS Version of PCM:

The VMS version of PCM requires HUGE resources to support 200 nodes.

The terminal I/O design of PCM appears to need work to be more efficient with DECnet I/O. The design of the monitor output for PCM apparently requires one DECnet packet I/O for EACH line of data that is sent to the monitor console. PAGE UP/PAGE DOWN can create up to 120 DECnet I/O/sec per user. This is excessive, and WILL hang the monitor window unless a VERY FAST DECnet NI card and a FAST CPU are used to keep up with the normal rate of movement up and down the monitor logs.

PAGE UP/DOWN requires HUGE amounts of CPU ... for a PAGE UP key rate of 3-5 per second, it takes about 50% of a 6610 CPU to produce the output. This seems clearly excessive for a terminal I/O function.

With 190+ nodes, RECONFIG on a 6520 w/DEBNI took 10 minutes or more, with both 6500 CPUs saturated. On a 6620 w/DEMNA, the same reconfig takes about 30-40 seconds with saturated CPUs. This does NOT scale as expected. If the primary CPU is not fast enough, PCM can hang on reconfig.

Jim Lind

POLYCENTER Console Manager Summary Totals

Configured Systems: 194    User disabled: 2
Active Systems : 192 (D:000 P:000 L:192 T:000)    Unreachable: 000
Active Users : 5 (Connect/Monitor: 002 C3: 003 Event sources: 013)
CM pid ........: 00000132 V1.5-006 Uptime: 0 08:15:54
ENS pid .......: 00000131 V1.5-006 Uptime: 0 08:15:56
Total bytes ...: 3.80M (0) Ave bps: 127.64
Total lines ...: 74.4K (0) Ave lpm: 149.96
Total events ..: 17 (0) Ave epm: 0.03
Total actions .: 0 (0) Active actions : 0 Failed actions : 0
Crit: 0 Maj: 1 Min: 0 Warn: 0 Clr: 16 Ind: 0
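The figures in the base note can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not part of PCM: the function names are made up here, the 24-line page size is an assumption (the note never states the display height), and the even spread of consoles across Console Ctrl processes is also assumed. It counts Console Ctrl processes for the 4, 8 and 16 consoles-per-process settings, estimates DECnet packets per second under the "one packet per line" observation, and recomputes the averages shown in the Summary Totals display from the totals and the uptime.

```python
import math

def controller_processes(consoles: int, per_process: int) -> int:
    """Console Ctrl processes needed if each one handles up to per_process consoles."""
    return math.ceil(consoles / per_process)

def packets_per_second(keypresses_per_sec: float, lines_per_page: int = 24) -> float:
    """Packets/sec if every displayed line costs one DECnet packet.

    lines_per_page=24 is an assumption; the note does not give the page size.
    """
    return keypresses_per_sec * lines_per_page

def average_rate(total: float, uptime_seconds: int, per_seconds: int = 1) -> float:
    """Average of `total` over the uptime, expressed per `per_seconds` seconds."""
    return total / uptime_seconds * per_seconds

ACTIVE_SYSTEMS = 192                      # from the Summary Totals display
UPTIME = 8 * 3600 + 15 * 60 + 54          # CM uptime 0 08:15:54, in seconds

for per_process in (4, 8, 16):            # settings used on the 6420, 6520, 6620
    count = controller_processes(ACTIVE_SYSTEMS, per_process)
    print(f"{per_process:2d} consoles/process -> {count} Console Ctrl processes")

for rate in (3, 5):                       # PAGE UP key rate quoted in the note
    print(f"{rate} keypresses/sec -> {packets_per_second(rate):.0f} packets/sec")

# Totals are taken as decimal units (3.80M = 3.80e6, 74.4K = 74.4e3).
print(f"bytes/sec : {average_rate(3.80e6, UPTIME):.2f}")
print(f"lines/min : {average_rate(74.4e3, UPTIME, 60):.2f}")
```

Under these assumptions, a key rate of 3-5 per second brackets the 80 and 120 packets/sec quoted in the note, and the recomputed averages land close to the displayed "Ave bps" and "Ave lpm" values, which suggests "bps" here means bytes per second.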
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
560.1 | | OPG::PHILIP | And through the square window... | Mon Jan 16 1995 20:22 | 32 |
Jim, Thanks for the info. You have obviously spent some time putting this together, and it's always interesting to see how people are using the software. A couple of points... 1) Yes, we know reconfigure is a pig. The daemons have undergone a complete rewrite for V2.0; hopefully this will improve the situation. 2) The memory usage for V2.0 has been reduced as well: the various interfaces no longer load the database when they start, they query a server process for the info they require. 3) I don't understand your comment about DECnet packets. Are you setting host to your PCM system and then doing a console monitor? If so, I can't see how we can change the situation, as it is DECnet which controls the size of the packets via the CTERM protocol. 4) We have redesigned the IPC mechanism we use for V2.0 such that we are totally unable to produce a RWMBX situation, so you shouldn't see that problem anymore. Now then, don't ask me when you can have V2.0 on OpenVMS; that is a question better answered by our product manager. However, suffice to say that when we eventually get to ship it, most of your problems will disappear. Finally, you are really close to the "supported" limit on systems connected. Do you see a time when you will want to exceed the 200 system limit? Cheers, Phil | |||||
560.2 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Mon Jan 16 1995 22:34 | 9 |
> NPAGEDYN is expanding again - next reboot will up this to 8,000,000 bytes. It seems that PCM does not deal well with expanding NPAGEDYN. What is the value of DEFMBXBUFQUO? PCM uses mailboxes *very heavily*, and larger values for this parameter will increase pool consumption, especially in your environment! Regs, Dan | |||||
560.3 | DEFMBXBUFQUO at 2048 | BSS::LIND | Jim Lind; 592-4099 CX03-1/N14 CNMC-West | Sat Jan 21 1995 21:20 | 3 |
DEFMBXBUFQUO is 2048 on the CX3PCM at CXO3. Jim Lind | |||||
560.4 | | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Tue Jan 24 1995 20:06 | 6 |
Considering your environment and the number of mailboxes present in it, you could easily consume 2-3 megabytes of non-paged pool just for PCM. Regs, Dan |
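To put Dan's 2-3 megabyte estimate in context, here is a rough back-of-the-envelope sketch. It is not taken from PCM or from the thread: the function names are hypothetical, the number of mailboxes PCM creates is never stated in this discussion, and the model assumes the worst case where every mailbox is filled to a DEFMBXBUFQUO of 2048 bytes, ignoring any per-mailbox overhead. It works backwards from the estimate to the number of full mailboxes that would account for it, and compares the result with the planned 8,000,000-byte NPAGEDYN.

```python
MB = 1024 * 1024
DEFMBXBUFQUO = 2048          # value reported for CX3PCM in 560.3
NPAGEDYN_PLANNED = 8_000_000 # value planned for the next reboot (base note)
ACTIVE_SYSTEMS = 192         # from the Summary Totals display

def mailbox_pool_bytes(mailboxes: int, buffer_quota: int = DEFMBXBUFQUO) -> int:
    """Upper bound on pool used if every mailbox is filled to its buffer quota."""
    return mailboxes * buffer_quota

def mailboxes_for_pool(pool_bytes: int, buffer_quota: int = DEFMBXBUFQUO) -> int:
    """How many completely full mailboxes would add up to pool_bytes."""
    return pool_bytes // buffer_quota

for pool in (2 * MB, 3 * MB):
    count = mailboxes_for_pool(pool)
    per_system = count / ACTIVE_SYSTEMS
    share = pool / NPAGEDYN_PLANNED
    print(f"{pool // MB} MB ~ {count} full mailboxes "
          f"(~{per_system:.1f} per active system, {share:.0%} of planned NPAGEDYN)")
```

Read this way, a handful of full mailboxes per monitored system would be enough to reach the 2-3 megabyte range, which would be a sizeable share of the planned pool. The real figure depends on how many mailboxes PCM actually creates and how full they get.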