Title: DECthreads Conference
Moderator: PTHRED::MARYS TE ON
Created: Mon May 14 1990
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1553
Total number of notes: 9541
1520.0. "Getting Insufficient Virtual Memory Error Under 4.0b" by HYDRA::BRYANT () Wed Apr 09 1997 13:19
I've got a partner who can't run his threaded program, which runs fine on 3.2,
under 4.0b. He's getting a "forrtl: severe (41): insufficient virtual memory"
error. He believes it to be threads related. I'm not sure, so I'm just looking
for any hints on what may be wrong before I receive his libraries to reproduce
this.
Thanks.
Pat Bryant
Software Partners Engineering
Here are the stats
------------------
This is what he's using to link the program DEMO:
decunix> cat linkdbg
f77 -v -o DEMO demo.o \
fmsaut.dbg \
fmsnoshr.a fmslib.a fmsint.a fmslib.a blas.a \
-lpthread -lmach -lexc -lc
decunix> ./linkdbg
/usr/bin/cc -v -o DEMO /usr/lib/cmplrs/fort/for_main.o -O4 demo.o fmsaut.dbg fmsnoshr.a fmslib.a fmsint.a fmslib.a blas.a -lpthread -lmach -lexc -lc -lUfor -lfor -lFutil -lm_4sqrt -lm -lots
/usr/lib/cmplrs/cc/ld -o DEMO -g0 -O4 -call_shared /usr/lib/cmplrs/cc/crt0.o /usr/lib/cmplrs/fort/for_main.o demo.o fmsaut.dbg fmsnoshr.a fmslib.a fmsint.a fmslib.a blas.a -lpthread -lmach -lexc -lc -lUfor -lfor -lFutil -lm_4sqrt -lm -lots -lc
/usr/lib/cmplrs/cc/ld:
1.80u 1.90s 0:09 38% 0+107k 0+429io 0pf+0w 107stk+17704mem
decunix>
Here's a dump:
decunix> dbx -r ./DEMO
forrtl: severe (41): insufficient virtual memory
thread 0xa signal IOT/Abort trap at >*[nxm_thread_kill, 0x3ff8053eab0] ret r31, (r26), 1
(dbx) where
> 0 nxm_thread_kill(0x4, 0x140150860, 0x3ff80193d3c, 0x980, 0x14015c018) [0x3ff8053eab0]
1 pthread_kill(0x3ffc0082590, 0x20, 0x0, 0x0, 0x11fffffb5) [0x3ff8056ed4c]
2 (unknown)() [0x3ff805756ec]
3 __tis_raise(0x11fffffb5, 0x3ffc0080310, 0x3ff8010fb04, 0x3ffc0080c50, 0x3ff80159f44) [0x3ff8010fb00]
4 raise(0x3ff8010fb04, 0x3ffc0080c50, 0x3ff80159f44, 0x3ff80575618, 0x3ff80170a6c) [0x3ff80159f40]
5 abort(0x3ffc0560c30, 0x3ffc05655d0, 0x3ff80d13180, 0x0, 0x600000000) [0x3ff80170a68]
6 for__issue_diagnostic(0x29, 0x2, 0x6, 0x11ffff830, 0x0) [0x3ff80d0b614]
7 for__io_return(0x0, 0x0, 0x0, 0x0, 0x0) [0x3ff80d0baec]
8 for_write_seq_lis(0x3ffc00802a0, 0x140142a00, 0x11ffffca0, 0x120009fd0, 0x14002f760) [0x3ff80d4b0bc]
9 fms$_fmsaut(NOWDAT = [1] 2
[2] 4
[3] 1997
, NOWTIM = [1] 10
[2] 47
[3] 0
, SERIAL = 0.0) ["d5/fmsaut.f":4, 0x1200165bc]
10 fms$_fmsini(0x0, 0x474e414c, 0x400000002, 0xa000007cd, 0x2f) ["d5/fmsini2.f":1951, 0x12001a9c4]
11 fmsini(0x120016900, 0x120016940, 0x120016980, 0x1200169c0, 0x120016a10) ["d5/fmsini.f":1716, 0x1200166a4]
12 demo(0x120016980, 0x1200169c0, 0x120016a10, 0x8008460d, 0x1200164e8) ["d5/demo.f":2, 0x12001653c]
13 main() ["for_main.c":203, 0x1200164e4]
(dbx) quit
Place in code where it's failing:
decunix> cat fmsaut.f
SUBROUTINE FMS$_FMSAUT (NOWDAT, NOWTIM, SERIAL)
INTEGER*4 NOWDAT(3), NOWTIM(3)
REAL*8 SERIAL
print *,'Hello' <--- This is where it fails
return
end
I had him bump up his limits to unlimited:
decunix> limit
cputime unlimited
filesize unlimited
datasize 1048576 kbytes
stacksize 32768 kbytes
coredumpsize unlimited
memoryuse 58944 kbytes
descriptors 4096 files
addressspace 1048576 kbytes
Unix 4.0B sysconfig -q proc
===========================
proc:
max-proc-per-user = 64
max-threads-per-user = 256
per-proc-stack-size = 2097152
max-per-proc-stack-size = 33554432
per-proc-data-size = 134217728
max-per-proc-data-size = 1073741824
max-per-proc-address-space = 1073741824
per-proc-address-space = 1073741824
autonice = 0
autonice-time = 600
autonice-penalty = 4
open-max-soft = 4096
open-max-hard = 4096
ncallout_alloc_size = 8192
round-robin-switch-rate = 0
round_robin_switch_rate = 0
sched-min-idle = 0
sched_min_idle = 0
give-boost = 1
give_boost = 1
maxusers = 32
task-max = 277
thread-max = 552
num-wait-queues = 64
Unix 4.0B sysconfig -q vm
=========================
vm:
ubc-minpercent = 10
ubc-maxpercent = 100
ubc-borrowpercent = 20
ubc-maxdirtywrites = 5
ubc-nfsloopback = 0
vm-max-wrpgio-kluster = 32768
vm-max-rdpgio-kluster = 16384
vm-cowfaults = 4
vm-mapentries = 200
vm-maxvas = 1073741824
vm-maxwire = 16777216
vm-heappercent = 7
vm-vpagemax = 32768
vm-segmentation = 1
vm-ubcpagesteal = 24
vm-ubcdirtypercent = 10
vm-ubcseqstartpercent = 50
vm-ubcseqpercent = 10
vm-csubmapsize = 1048576
vm-ubcbuffers = 256
vm-syncswapbuffers = 128
vm-asyncswapbuffers = 4
vm-clustermap = 1048576
vm-clustersize = 65536
vm-zone_size = 0
vm-kentry_zone_size = 16777216
vm-syswiredpercent = 80
vm-inswappedmin = 1
vm-page-free-target = 128
vm-page-free-min = 20
vm-page-free-reserved = 10
vm-page-free-optimal = 74
vm-page-prewrite-target = 256
dump-user-pte-pages = 0
kernel-stack-guard-pages = 1
vm-min-kernel-address = 18446744071562067968
contig-malloc-percent = 20
vm-aggressive-swap = 0
new-wire-method = 1
vm-segment-cache-max = 50
vm-page-lock-count = 0
gh-chunks = 0
gh-min-seg-size = 8388608
gh-fail-if-no-mem = 1
T.R | Title | User | Personal Name | Date | Lines |
---|
1520.1 | | DCETHD::BUTENHOF | Dave Butenhof, DECthreads | Wed Apr 09 1997 14:01 | 14 |
| Well, it sounds like the program ran out of virtual memory. I don't see any
connection to threads except that, under POSIX, the old ANSI C raise(), which
is used by abort() [which is called by FORTRAN to report the error], is
defined to call pthread_kill() rather than the old kill() -- and DECthreads
implements pthread_kill(). That is why we show up at the bottom of the call
stack.
I have no idea WHY or HOW the program ran out of virtual memory, or any
evidence on which to speculate. The space used by libraries, both at load
time and at runtime, changes all the time -- the fact that it hit a limit on
4.0 and not earlier (even assuming the system was configured identically)
doesn't mean much in itself.
/dave
|
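One way to gather the missing evidence is to have the failing process report its own resource limits at startup, since the limits a shell shows are not necessarily those the process inherits. The following is only a sketch under assumptions: `report_limits` and its helpers are hypothetical names, the labels merely mirror the csh `limit` output quoted above, and `RLIMIT_AS` is guarded because not every system defines it.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Print one limit value, using "unlimited" for RLIM_INFINITY. */
static void print_value(const char *label, rlim_t v)
{
    if (v == RLIM_INFINITY)
        printf(" %s=unlimited", label);
    else
        printf(" %s=%llu", label, (unsigned long long)v);
}

/* Print soft and hard values (in bytes) for one resource; 0 on success. */
static int print_limit(const char *name, int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0) {
        perror(name);
        return -1;
    }
    printf("%-12s", name);
    print_value("soft", rl.rlim_cur);
    print_value("hard", rl.rlim_max);
    printf("\n");
    return 0;
}

/* Report the limits most likely to matter for an out-of-memory failure.
 * Hypothetical helper; the label-to-resource mapping is an assumption. */
int report_limits(void)
{
    int rc = 0;

    rc |= print_limit("datasize", RLIMIT_DATA);
    rc |= print_limit("stacksize", RLIMIT_STACK);
#ifdef RLIMIT_AS
    /* RLIMIT_AS is not defined on every system, so it is guarded. */
    rc |= print_limit("addressspace", RLIMIT_AS);
#endif
    return rc;
}
```

Calling `report_limits()` early in the program on both 3.2 and 4.0B would at least show whether the two runs start from the same limits.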
1520.2 | Look at 1469.26 | EDSCLU::GARROD | IBM Interconnect Engineering | Wed Apr 09 1997 14:26 | 8 |
| Take a look at note 1469.26 and others in that string. Maybe
that is related to the problem.
We had terrible problems getting threaded programs that used a lot of
threads running on Digital UNIX V4. Seems like some system vm
parameters need tweaking.
Dave
|
1520.3 | Still can't get this to work | HYDRA::BRYANT | | Fri Apr 18 1997 12:25 | 32 |
| I've boosted several system parameters and still can't get a non-shared version
of this app to work. The application is reporting the error:
fms$_fork: 12 = pthread_create(0,4831836840,fms$_io,0)
When the same app is built against a shared library, it works. Built
non-shared, it fails with the above message. Here is the build which
produces a shared library:
LIBS="-lUfor -lfor -lm -lpthread -lmach -lexc -lc"
ld \
-shared \
-o fmslib.so \
-all \
fmslib.a \
-none \
fmsint.a \
blas.a \
$LIBS \
-set_version fmslib.51
#
f77 -call_shared -o DEMO_SHARE demo.o \
fmsnoshr.a fmslib.so
Here's the one that doesn't:
f77 -o DEMO_NOSHARE demo.o \
fmsnoshr.a fmslib.a fmsint.a fmslib.a \
blas.a \
-lpthread -lmach -lexc -lc
Any thoughts on this?
|
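The leading `12` in that message appears to be the return value of pthread_create(). The pthread routines return an error number directly rather than setting errno, and 12 is conventionally ENOMEM on Unix systems (check `<errno.h>` on the target to confirm). A small sketch for decoding such a return value follows; `pthread_rc_name` and `report_pthread_rc` are hypothetical helper names, not part of DECthreads or the FMS library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map a pthread_create() return value to a symbolic name.
 * Only a few commonly documented codes are covered. */
const char *pthread_rc_name(int rc)
{
    switch (rc) {
    case 0:      return "success";
    case EAGAIN: return "EAGAIN";
    case EINVAL: return "EINVAL";
    case ENOMEM: return "ENOMEM";
    case EPERM:  return "EPERM";
    default:     return "other";
    }
}

/* Print the numeric code, its symbolic name, and the system's message. */
void report_pthread_rc(const char *what, int rc)
{
    fprintf(stderr, "%s: %d (%s: %s)\n",
            what, rc, pthread_rc_name(rc), strerror(rc));
}
```

For example, `report_pthread_rc("pthread_create", rc)` prints the code next to its name and message, which is easier to act on than the bare number.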
1520.4 | | DCETHD::BUTENHOF | Dave Butenhof, DECthreads | Mon Apr 21 1997 07:22 | 10 |
| Well, for one thing, we don't HAVE a non-shared libpthread for 4.0B, so you
cannot link a threaded application non-shared. (Mixing shared libraries with
static archives is not really supported.) I have no idea whether that could
be related to your problem.
> fms$_fork: 12 = pthread_create(0,4831836840,fms$_io,0)
What, exactly, are you passing for the second argument here? That big integer
is, I hope, the address of an attributes object, but your display certainly
doesn't make that obvious.
|
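For reference, 4831836840 is 0x11ffffaa8 in hexadecimal, which at least looks like an address, though that alone does not show it points at an initialized attributes object. A minimal sketch of what a well-formed call looks like is below, assuming the final POSIX 1003.1c interface. The names `start_worker` and `worker` and the stack size are illustrative only, not taken from the FMS code.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body for the sketch: hands its argument straight back. */
static void *worker(void *arg)
{
    return arg;
}

/* Create one joinable thread using an explicitly initialized attributes
 * object. Returns 0 on success, otherwise the error number reported by
 * the failing pthread call. */
int start_worker(size_t stack_bytes, pthread_t *tid)
{
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);      /* must precede any use of attr */
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        /* Printing the address in hex makes the second argument obvious. */
        fprintf(stderr, "attributes object at %p\n", (void *)&attr);
        rc = pthread_create(tid, &attr, worker, NULL);
    }

    pthread_attr_destroy(&attr);
    if (rc != 0)
        fprintf(stderr, "thread setup failed: %d (%s)\n", rc, strerror(rc));
    return rc;
}
```

A caller would do something like `pthread_t t; if (start_worker(1024 * 1024, &t) == 0) pthread_join(t, NULL);`. Requesting the stack size explicitly also makes it clear how much memory each thread asks for, which may matter when creation fails for lack of memory.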