T.R | Title | User | Personal Name | Date | Lines |
---|
1509.1 | The story here hasn't changed. | WTFN::SCALES | Despair is appropriate and inevitable. | Mon Mar 24 1997 13:41 | 16 |
| I believe you are running into a limitation in the amount of memory available to
your process. (I can't construct a good, solid justification for this, but I'm
fairly sure that this is the deal.)
You are hitting this either because your process genuinely requires more memory
than is available to it or because the process is leaking memory.
Fixing the first case should be as simple as reconfiguring your system (upping
the vm-vpagemax parameter). If it's the second case, you need to locate the
memory leak (which might not even be in your code). By the way, if you
reconfigure your system for more space, it will make the program fail less often
in either case (which might make it even more frustrating... :-} ).
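
One quick way to see which per-process limits are in play is to ask the system directly. The following is only a minimal sketch, assuming a POSIX system where Python's standard `resource` module is available; the helper names `memory_limits` and `show` are hypothetical, and which limits exist varies by platform. These rlimits are separate from the vm-vpagemax kernel attribute, so whether they are the limit being hit here is an assumption.

```python
import resource


def memory_limits():
    """Return {limit name: (soft, hard)} for the memory-related limits found."""
    names = ("RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_AS", "RLIMIT_RSS")
    limits = {}
    for name in names:
        # Not every platform defines all of these constants.
        which = getattr(resource, name, None)
        if which is not None:
            limits[name] = resource.getrlimit(which)
    return limits


def show(value):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for name, (soft, hard) in memory_limits().items():
        print(f"{name}: soft={show(soft)} hard={show(hard)}")
```

If the soft limits turn out to be far above what the process actually uses when it fails, the rlimits are probably not the culprit and a leak or a kernel-side limit becomes the more likely explanation.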
Webb
|
1509.2 | | SMURF::DENHAM | Digital UNIX Kernel | Mon Mar 24 1997 14:56 | 27 |
| RE: running out of memory. There should be a DECthreads patch for
an apparent memory leak. This is from the V3.2C patch README file:
PROBLEM: (Case ID: MGO101698 ) (Patch ID: OSF350-105)
********
Applications linked with DECthreads will behave as if they have no more
memory available to them when they are not even close to the operating
system limit.
FILE(s):
/usr/shlib/libpthreads.so subset OSFBASE350
CHECKSUM: 34727 568 RCS: cma_vm.c Revision: 4.2.20.2
/usr/include/dce/exc_handling.h subset OSFPGMR350
CHECKSUM: 33034 30 RCS: exc_handling.h Revision: 4.2.16.2
/usr/ccs/lib/libpthreads.a subset OSFPGMR350
CHECKSUM: 32274 438 RCS: cma_vm.c Revision: 4.2.20.2
----------------------------------------------------------------------
As a refresher, this was the bug relating to the bad in/out argument
to the vm_allocate call causing the address space to become
highly fragmented.
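
To illustrate why a fragmented address space can look like "out of memory" long before the limit is reached, here is a toy model. It is not the DECthreads or vm_allocate implementation; the function names and the 16-page address space are made up for the example.

```python
def free_holes(total_pages, allocations):
    """Return the free gaps as (start, length) given (start, length) allocations."""
    holes = []
    cursor = 0
    for start, length in sorted(allocations):
        if start > cursor:
            holes.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < total_pages:
        holes.append((cursor, total_pages - cursor))
    return holes


def can_allocate(total_pages, allocations, request):
    """A contiguous request succeeds only if a single hole is large enough."""
    return any(length >= request for _, length in free_holes(total_pages, allocations))


# Four 1-page allocations scattered across a 16-page space.
scattered = [(0, 1), (4, 1), (8, 1), (12, 1)]
total_free = sum(length for _, length in free_holes(16, scattered))
print("free pages:", total_free)
print("4-page request fits:", can_allocate(16, scattered, 4))
```

In this model 12 of the 16 pages are free, yet a contiguous 4-page request fails because no single hole is larger than 3 pages.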
Certainly, any 3.2 system with unexplained memory failures should
be running this patch.
|
1509.3 | Thank you very much | AZUR::ANTEUNIS | If it's possible it's not interesting | Tue Mar 25 1997 04:46 | 11 |
| Thanks a lot for all the good recommendations. I'm already setting
up a memadvise run to see whether my code leaks.
I will make sure the patches are installed; we are running V3.2C of Digital UNIX.
I hope I can make the problems stay away long enough to make the software
look like it runs 24 * 365, with "refreshers" overnight or during the weekend;
by that I mean shutting the program down and restarting it.
Dirk
|
1509.4 | Patch did not help, here some more info | AZUR::ANTEUNIS | If it's possible it's not interesting | Thu Mar 27 1997 04:32 | 121 |
| Together with our system manager I managed to get the patch mentioned
in 1509.2 installed. Because we are running V3.2G he had to install the
patch on his workstation first, and then manually extract the 3 files.
The problem still reproduces, at a slightly different limit.
Here are the non-comment contents of /etc/sysconfigtab:
# OSF/1 1.2
proc:
max-proc-per-user = 200
vm:
vm-vpagemax = 32768
ipc:
msg-max = 32768
msg-mnb = 65535
sem-mni = 30
sem-msl = 150
sem-opm = 30
sem-ume = 30
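
For reference, the vm-vpagemax value above can be turned into a size with a short calculation. This is a sketch under two assumptions that are not stated in this thread: that vm-vpagemax counts pages, and that the page size is 8 KB (the Alpha page size). The parser and its function names are hypothetical and only handle the simple stanza layout shown above.

```python
PAGE_SIZE = 8192  # assumption: 8 KB pages, as on Alpha systems


def parse_sysconfigtab(text):
    """Parse 'subsystem:' stanzas with 'attribute = value' lines into nested dicts."""
    config = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.endswith(":") and "=" not in line:
            current = config.setdefault(line[:-1].strip(), {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return config


def vpagemax_bytes(config, page_size=PAGE_SIZE):
    """Convert the vm-vpagemax page count into bytes."""
    return int(config["vm"]["vm-vpagemax"]) * page_size


SAMPLE = """\
proc:
max-proc-per-user = 200
vm:
vm-vpagemax = 32768
"""

config = parse_sysconfigtab(SAMPLE)
print(vpagemax_bytes(config) // (1024 * 1024), "MB")
```

Under those assumptions, 32768 pages works out to 256 MB of virtual address space per process.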
setld tells me:
OSFBASE350 installed Base System (- Required -)
OSFBASE375 installed Base System - V3.2G (- Required -)
OSFBIN350 installed Standard Kernel Objects (Kernel Build Environment)
OSFBIN375 installed Standard Kernel Objects - V3.2G (Kernel Build Environment)
OSFBINCOM350 installed Kernel Header and Common Files (Kernel Build Environment)
OSFBINCOM375 installed Kernel Header and Common Files - V3.2G (Kernel Build Environment)
uerf output for the reboot we did after the 3 files were copied, using the
above sysconfigtab:
********************************* ENTRY 1. *********************************
----- EVENT INFORMATION -----
EVENT CLASS OPERATIONAL EVENT
OS EVENT TYPE 300. SYSTEM STARTUP
SEQUENCE NUMBER 0.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Wed Mar 26 17:28:49 1997
OCCURRED ON SYSTEM lipa1
SYSTEM ID x00060009 CPU TYPE: DEC 2100
SYSTYPE x00000000
MESSAGE PCXAL keyboard, language English
_(American)
Alpha boot: available memory from
_0xe42000 to 0x17ffe000
Digital UNIX V3.2G (Rev. 62); Thu Dec
_26 10:47:06 MET 1996
physical memory = 384.00 megabytes.
available memory = 369.74 megabytes.
using 1466 buffers containing 11.45
_megabytes of memory
Master cpu at slot 0.
Firmware revision: 4.5
PALcode: OSF version 1.45
ibus0 at nexus
AlphaServer 2100 4/275
cpu 0 EV-45 4mb b-cache
cpu 1 EV-45 4mb b-cache
gpc0 at ibus0
pci0 at ibus0 slot 0
tu0: DECchip 21040-AA: Revision: 2.3
tu0 at pci0 slot 0
tu0: DEC TULIP Ethernet Interface,
_hardware address: 08-00-2B-E5-1E-CB
tu0: console mode: selecting 10Base5
_(AUI) port
psiop0 at pci0 slot 1
Loading SIOP: script 1001b00, reg
_81000000, data 406fdad0
scsi0 at psiop0 slot 0
rz0 at scsi0 bus 0 target 0 lun 0 (DEC
_ RZ28 (C) DEC D41C)
rz2 at scsi0 bus 0 target 2 lun 0 (DEC
_ RZ28B (C) DEC 0006)
rz3 at scsi0 bus 0 target 3 lun 0 (DEC
_ RZ28B (C) DEC 0003)
rz4 at scsi0 bus 0 target 4 lun 0 (DEC
_ RRD43 (C) DEC 1084)
rz5 at scsi0 bus 0 target 5 lun 0 (DEC
_ RZ28M (C) DEC 0466)
rz6 at scsi0 bus 0 target 6 lun 0 (DEC
_ RZ26 (C) DEC T386)
tz1 at scsi0 bus 0 target 1 lun 0 (DEC
_ TLZ06 (C)DEC 0491)
eisa0 at pci0
ace0 at eisa0
ace1 at eisa0
lp0 at eisa0
fdi0 at eisa0
fd0 at fdi0 unit 0
ln0 at eisa0
ln0: DEC LANCE Ethernet Interface,
_hardware address: 08-00-2B-BE-F9-AB
vga0 at eisa0
1024x768 (QVision )
lvm0: configured.
lvm1: configured.
dli: configured
SuperLAT. Copyright 1993 Meridian
_Technology Corp. All rights
_reserved.
So now I go hunting for memory leaks and then we'll see.
Dirk
|
1509.5 | I think the patch should have just installed, if it was appropriate... | WTFN::SCALES | Despair is appropriate and inevitable. | Thu Mar 27 1997 09:36 | 15 |
| Dirk,
I'm really nervous about your description of unpacking the kit to install the
patch -- that suggests to me that the patch was inappropriate for the version
that you're running (and maybe installing it was a bad idea -- you shouldn't
have to take the patch kit apart to install it!).
Jeff, if the patch was done on or for V3.2C, wouldn't it already be in V3.2G?
(I.e., didn't he just back out some stuff??)
(BTW, Dirk, a number of our high-performance computing folks have found that
a vm-vpagemax of even 32K is low for large data sets.)
Webb
|
1509.6 | | SMURF::DENHAM | Digital UNIX Kernel | Thu Mar 27 1997 10:08 | 3 |
| You're right, Webb. That allocation bug fix is in V3.2G.
So it's probably the vpagemax issue...
|