| [OpenVMS,PW-VMS] PATHWORKS V5.0 Periodically Consumes Nonpaged Pool
COPYRIGHT (c) 1988, 1993 by Digital Equipment Corporation.
ALL RIGHTS RESERVED. No distribution except as provided under contract.
Copyright (c) Digital Equipment Corporation 1996. All rights reserved.
PRODUCT: OpenVMS VAX, Versions 5.5-2 and above
OpenVMS Alpha, Versions 1.5 and above
PATHWORKS for VMS, Versions 5.0 through 5.0D
SOURCE: Digital Equipment Corporation
SYMPTOM:
OpenVMS VAX and Alpha systems, running PATHWORKS, version 5.0 through
5.0D, can periodically consume nonpaged pool (NPAGEDYN) in large
quantities. This pool consumption occurs at periodic intervals of
approximately 24 days, 20 hours, and may cause a variety of symptoms:
- Significant NPAGEDYN expansion, i.e., 20% or more, consuming
memory resources, inducing extra swapping and paging, and
degrading system performance.
- The following system crashes may occur:
o CPUSPINWAIT bugchecks with the POOL spinlock owned while
reclaiming the lookaside list.
o CLUEXIT bugchecks, often with "maintenance timer expiration"
errors in ERRLOG.SYS.
o SSRVEXCEPT bugchecks, due to RFDRIVER exhausting its NPAGEDYN
resources.
o INVEXCEPTN bugchecks may also be possible, for products unable
to handle the shortage of NPAGEDYN.
- System hangs with subsequent analysis revealing:
o Many processes in FPG, RWMPB, PFW, or COLPG state, with
pagefile resources consumed and NPAGEDYN expanded.
o Errorlog entries indicating the inability of certain devices
to initialize, due to insufficient NPAGEDYN.
o SYS-W-POOLEXPF errors on the console.
- On OpenVMS VAX systems you may encounter an oversized lookaside
list for 64-byte packets, caused by earlier PATHWORKS-induced
expansion that is recorded in SYS$SYSTEM:LISTPREPOP.DAT.
Note:
Subsequent reboots use LISTPREPOP.DAT to pre-populate the
lookaside lists. This effectively wastes pool space, which
can cause NPAGEDYN expansion, and performance degradation,
long after the problem has occurred. If this occurs, it is
recommended that LISTPREPOP.DAT;* be deleted just prior to
a reboot.
- On OpenVMS VAX systems, AUTOGEN runs can oversize NPAGEDYN and
NPAGEVIR based on the expansion caused by PATHWORKS. This wastes
pool space and can cause performance degradation long after
the problem has occurred.
Note:
After this occurs on an OpenVMS VAX system, it's recommended
that you delete LISTPREPOP.DAT;* just prior to a reboot,
allow the system to run for 2-3 hours, and then re-run
AUTOGEN.
SOLUTION:
Engineering has acknowledged this problem and plans to address it in
a future release.
\
\
\ CSC NOTE - 20-Dec-1996
\
\ All IPMT cases against this problem are being closed with the status
\ 'fixed-next-release'.
\
\ +-----------------+---------------+--------------------------------+
\ | PATHWORKS       | TIMEFRAME     | Approx. IPMTs Addressed        |
\ +-----------------+---------------+--------------------------------+
\ | ECO3 for V5.0D  | Late December | fixes most cases - but not all |
\ +-----------------+---------------+--------------------------------+
\ | V5.0E           | Early-mid Jan | fixes most cases - but not all |
\ +-----------------+---------------+--------------------------------+
\ | ECO1 for V5.0E  | 2 weeks later | fixes all cases                |
\ +-----------------+---------------+--------------------------------+
\
\ The fix is to change the counter from a 1ms longword counter to a 10ms
\ longword counter, and to start that counter when the PATHWORKS Server
\ is started. Thus it is very unlikely to recur, unless the system and
\ the PATHWORKS server are up continuously for 497.1 days.
\
\ Please comment on this article when any of the above versions are
\ released.
WORKAROUND:
Ensure PATHWORKS isn't running during the projected problem time, i.e.,
shut down PATHWORKS before the projected problem time and restart it
after the time has passed.
Projected Problem Times
-----------------------
31-OCT-1996 11:52:21.01
25-NOV-1996 07:35:08.31
20-DEC-1996 03:17:55.61
13-JAN-1997 23:00:42.91
7-FEB-1997 18:43:30.21
4-MAR-1997 14:26:17.51
29-MAR-1997 10:09:04.81
23-APR-1997 05:51:52.11
18-MAY-1997 01:34:39.41
11-JUN-1997 21:17:26.71
6-JUL-1997 17:00:14.01
31-JUL-1997 12:43:01.31
25-AUG-1997 08:25:48.61
19-SEP-1997 04:08:35.91
13-OCT-1997 23:51:23.21
7-NOV-1997 19:34:10.51
2-DEC-1997 15:16:57.81
. .
. .
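The list above can be extended past the last entry by repeatedly adding
the overflow period quoted in the ANALYSIS section ("24 19:42:47.30") to
the first listed time. The following Python sketch is only an
illustration of that arithmetic; the function and variable names are
hypothetical and are not part of PATHWORKS or OpenVMS.

```python
from datetime import datetime, timedelta

# Overflow period quoted in the ANALYSIS section: "24 19:42:47.30"
PERIOD = timedelta(days=24, hours=19, minutes=42, seconds=47, milliseconds=300)

# First projected problem time from the list: 31-OCT-1996 11:52:21.01
FIRST = datetime(1996, 10, 31, 11, 52, 21, 10000)

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def vms_time(t):
    """Format a datetime like an OpenVMS absolute time (hundredths of a second)."""
    return "%2d-%s-%04d %02d:%02d:%02d.%02d" % (
        t.day, MONTHS[t.month - 1], t.year,
        t.hour, t.minute, t.second, t.microsecond // 10000)


def projected_times(count, first=FIRST, period=PERIOD):
    """Return the first `count` projected problem times as datetimes."""
    return [first + n * period for n in range(count)]


if __name__ == "__main__":
    # Print the 17 listed times plus the next three.
    for t in projected_times(20):
        print(vms_time(t))
```

The first 17 values produced this way match the times listed above, so
later entries can be taken from the same sequence.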
\
\
\ Similar problems with NPAGEDYN expansion have been seen:
\
\ - When PATHWORKS is using an older version of DECnet/OSI, i.e.,
\ V5.6 with ECO 10 or earlier. Upgrading to the latest version
\ of PATHWORKS and DECnet/OSI should address this potential problem.
\
\ [PW-VMS] PWRKV50D_E03050 PATHWORKS V5.0D ECO3 (LAN Manager)
\
\ - With PATHWORKS and Cisco MultiNet 4.0A. MultiNet reportedly has a
\ patched PWIP driver to address this problem.
\
\ [PW-VMS]V5 NPAGEDYN Current at NPAGEVIR on V5.5-2 System/Multinet 4.0A
ANALYSIS:
The problem is caused by the overflow of a longword counter, which is
incrementing at a 1 millisecond interval. Due to the way PATHWORKS
implements the counter, the overflow appears at a periodic rate of
"24 19:42:47.30".
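As a rough cross-check of these figures, the nominal wrap intervals can
be computed directly. The sketch below assumes that a longword is 32
bits and that the original problem appears once bit 31 of the 1 ms
counter becomes set (2^31 ticks); that second assumption is an inference
from the observed period, not something stated by Engineering. The
names are hypothetical.

```python
from datetime import timedelta


def wrap_interval(bits, tick_ms):
    """Time for a counter with `bits` usable bits and a `tick_ms` tick to wrap."""
    return timedelta(milliseconds=(2 ** bits) * tick_ms)


# Assumption: the 1 ms counter misbehaves after 2**31 ticks.
nominal_1ms = wrap_interval(31, 1)

# Full 32-bit wrap of the 10 ms counter described in the fix.
nominal_10ms = wrap_interval(32, 10)

# Period actually observed, as quoted above ("24 19:42:47.30").
observed = timedelta(days=24, hours=19, minutes=42, seconds=47, milliseconds=300)

if __name__ == "__main__":
    print("1 ms counter, 2**31 ticks :", nominal_1ms)
    print("observed period           :", observed)
    print("10 ms counter, 2**32 ticks: %.1f days"
          % (nominal_10ms.total_seconds() / 86400))
```

Under these assumptions the nominal 1 ms interval is slightly longer
than the observed period, which is consistent with the statement that
the exact rate depends on how PATHWORKS implements the counter. The
10 ms figure comes to 497.1 days, the uptime quoted for the fix.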
The problem occurs only if there are active NETBEUI connections at the
time the PATHWORKS counter overflows.
An analysis of the system, or crash, typically shows significant
NPAGEDYN expansion, with the lookaside list for the smallest packets
(typically the first) encountering the majority of the growth, and
being populated with PATHWORKS packets.
Following is an example of using SDA to determine the current, initial,
and maximum allowable size for NPAGEDYN:
SDA> EVALUATE @MMG$GL_NPAGNEXT-@MMG$GL_NPAGEDYN ! Current
SDA> EVALUATE @SGN$GL_NPAGEDYN ! Initial
SDA> EVALUATE @SGN$GL_NPAGEVIR ! Maximum
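The three values returned by these commands can be compared to judge
how far NPAGEDYN has expanded, for example against the "20% or more"
figure given under SYMPTOM. The Python sketch below shows only the
arithmetic; the function name and the sample byte counts are
hypothetical and do not come from a real system.

```python
def pool_expansion(current, initial, maximum):
    """Summarize NPAGEDYN growth from the three SDA EVALUATE results (bytes)."""
    growth = current - initial
    return {
        # Growth relative to the initial size (SGN$GL_NPAGEDYN)
        "expansion_pct": 100.0 * growth / initial,
        # Share of the allowed maximum (SGN$GL_NPAGEVIR) currently in use
        "of_maximum_pct": 100.0 * current / maximum,
        # Bytes still available before the maximum is reached
        "headroom": maximum - current,
    }


if __name__ == "__main__":
    # Hypothetical sample values, in bytes, for illustration only.
    report = pool_expansion(current=0x280000, initial=0x200000, maximum=0x800000)
    for name, value in report.items():
        print(name, value)
```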
On OpenVMS VAX systems you can get an idea of the size of the first
lookaside list with the following command:
SDA> VALIDATE QUEUE/SELF @EXE$AR_NPOOL_DATA+40+(8*0)
Queue is complete, total of 404667 elements in the queue
You could then walk the list to see if a lot of the packets contain
PATHWORKS text, e.g., mblk, buff, and dblk text:
SDA> EXAMINE @EXE$AR_NPOOL_DATA+40 ! Front of queue
SDA> EXAMINE .+@.;10 ! Repeat to traverse the queue
Example of the data indicating packets containing PATHWORKS text:
6B6C6264 00000040 6B725750 81CFBB40  @...PWrk@...dblk  81CFBBC0
6B6C626D 00000040 6B725750 81CFBBC0  ....PWrk@...mblk  81CFBB80
66667562 00000040 6B725750 81CFBB80  ....PWrk@...buff  81CFBCC0
6B6C626D 00000040 00000040 00000080  ....@...@...mblk  88941300
66667562 00000040 FFFFFF80 00000080  ........@...buff  889412C0
6B6C6264 00000040 FFFFFF80 FFFFFFC0  ........@...dblk  88941340
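The PATHWORKS text can also be read from the hexadecimal columns alone.
Each longword is stored low-order byte first, so its bytes have to be
reversed before they are read as ASCII. The following Python sketch
illustrates this with longwords taken from the example above; the
function name is hypothetical.

```python
def longword_to_text(value):
    """Render a 32-bit longword as the 4 ASCII characters it holds in memory."""
    raw = value.to_bytes(4, "little")  # low-order byte is at the lowest address
    # Show non-printable bytes as "." in the same way as a dump's text column.
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# Longwords taken from the leftmost and third hex columns of the example.
SAMPLES = [0x6B6C6264, 0x6B6C626D, 0x66667562, 0x6B725750]

if __name__ == "__main__":
    for value in SAMPLES:
        print("%08X" % value, longword_to_text(value))
```

For the longwords shown, this yields dblk, mblk, buff, and PWrk
respectively.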
NOTE:
If an OpenVMS VAX system has been rebooted, and the LISTPREPOP.DAT
file was not deleted just prior to the reboot, the lookaside list
has been filled with prepopulated zeroed entries.
On an OpenVMS Alpha system you can see whether one of the lookaside
lists is large by using the following command:
SDA> CLUE MEMORY/LOOKASIDE
Listhead Addr: 81BA1180 Size: 64 Status: Invalid, possible loop
Possible loop detected, after tracing 10000 elements
You could then walk the list to see if a lot of the packets contain
PATHWORKS text, e.g., mblk, buff, and dblk text:
SDA> EXAMINE @EXE$AR_NPOOL_DATA+40 ! Front of queue
SDA> EXAMINE @.;10 ! Repeat to traverse the queue
\
\
\ References:
\
\ VAXAXP::VMSNOTES #1841
\
\
\ CONTRIBUTORS:
\
\ Technical:
\ Mark Morris (140062)
\ Jeff Chisholm (184480)
\
\ Editorial:
\ Reg Hunter (172692)
\
\\ BUGCHK
\\ PROD=OPENVMS-VAX CAT=OPSYS GRP=OPENVMS-VAX OS=OPENVMS-VAX SPD=25.01
\\ PROD=OPENVMS-AXP GRP=OPENVMS-AXP OS=OPENVMS-AXP SPD=41.87
\\ PROD=PW-VMS CAT=COMM GRP=MULTIVENDOR GRP=PATHWORKS OS=OPENVMS-VAX
\\ SPD=30.50 VEND=DEC OFFER=DSKTP-HELP
\\ 140062 184480
\\ SRC961210003429
\\ EDIT_SRQ=C961210-3429 EDIT_SRQ=C961217-2757 EDIT_SRQ=C961220-526
\\ TYPE=TECH_TIPS TYPE=SYMPTOM_SOLUTION
|