Title:                   Digital Fortran
Notice:                  Read notes 1.* for important information
Moderator:               QUARK::LIONEL
Created:                 Thu Jun 01 1995
Last Modified:           Fri Jun 06 1997
Last Successful Update:  Fri Jun 06 1997
Number of topics:        1333
Total number of notes:   6734
1329.0. "Determining available allocatable memory on UNIX" by QUARK::LIONEL (Free advice is worth every cent) Wed Jun 04 1997 10:13
[I got the following request from a DF UNIX customer - I don't know what to
suggest to him - any assistance would be appreciated. - Steve]
Hi
A while ago I posted a Usenet question asking how to get the process size
from within a fortran program. You replied:
On Tue, 6 May 1997, Steve Lionel wrote:
> Nothing "system related" is going to help, because what you need to know is how
> much space the Fortran libraries have available for the allocation pool.
> And even a Fortran library call won't be able to tell you how much you MIGHT
> be able to allocate - the only way to know for sure is to try it. Another
> problem is that if you take the approach of allocating a lot and then freeing
> it, you expand your virtual address space, consuming pagetable entries and
> other system resources, impacting performance.
>
> Do you REALLY need this info? Even if you could get it, it would not be
> reliable. As a subset, it would be useful to know how much memory the Fortran
> libraries have ALLOCATEd - you could use that to detect memory leaks.
My post explained our current hack to get this info, and yes I REALLY need
to know: our target machine has very limited memory. Basically, at the start
of execution we allocate and re-allocate a set of arrays until the system
refuses to give us any more, and note the sum of the sizes of all these
arrays. Later we do the same again; this time the system lets us allocate
less space, and the difference from the original size is assumed to be how
much we have used dynamically. This is calculated at lots of different places
throughout the code, and we see the dynamic amount grow and shrink in a
very sensible way. However, for any of a dozen reasons this approach is
pretty unreliable, but since I received no suggestions for a better method
it's what we're stuck with.
Part of the reliability problem is that I get output like:
###
calling W25IOW22
memory_size= 40.1774480000001 2041.54696800000
calling W2IOW22
memory_size= 40.1933120000003 2041.53096000000
calling W15IOW22
memory_size= 40.2097440000002 2041.51393600000
calling WTIOW22
memory_size= 40.2261920000003 2041.49804000000
calling IOW567
Segmentation fault
###
(The two numbers are the current size, and what size_left1 reports as
still usable.)
The seg fault is occurring in the get_memory_size routine (size_left1,
actually) itself, and is unpredictable. Sometimes the code runs without any
problem; other times it crashes as shown, quite often in completely
different places in the code. The code contains a mixture of allocatable
arrays and pointer arrays, which are allocated and deallocated repeatedly.
I'm not sure if I can count the seg fault as a bug, but it's certainly
undesired behavior.
I've attached the routine we use, and am looking for suggestions to improve
reliability.
Cheers,
Kevin
--
Kevin Maguire __ __ __ __ Daresbury Laboratory
[email protected] / \ / \ / \ / \ Warrington, Cheshire, UK
____________________/ __\/ __\/ __\/ __\_____________________________
___________________/ /__/ /__/ /__/ /________________________________
| / \ / \ / \ / \ \____ voice: (01925) 603221
|/ \_/ \_/ \_/ \ o \ FAX: (01925) 603634
\_____/--<
      MODULE MEM
      PRIVATE
      PUBLIC :: SIZE_LEFT, SIZE_LEFT1
cyfh probing the amount of heap space left
      CONTAINS

      FUNCTION SIZE_LEFT()
cyfh size left in MegaBytes of contiguous heap space
      REAL SIZE_LEFT
      INTEGER lb, ub, len, err, mid
      INTEGER i, j
      INTEGER, POINTER :: qq(:)
cyfh lower bound and upper bound of memory left in words
      lb = 1
      ub = 1000000000
      SIZE_LEFT = ub*KIND(1)/1000000.0
      ALLOCATE (qq(ub), STAT=err)
      IF (err .GT. 0) THEN
         IF (ASSOCIATED(qq)) THEN
            DEALLOCATE (qq)
         ENDIF
         CALL BISECTION(lb, ub, SIZE_LEFT)
      ENDIF
      END FUNCTION SIZE_LEFT

      FUNCTION SIZE_LEFT1()
cyfh size left in MegaBytes of all (not necessarily contiguous) heap space
      REAL SIZE_LEFT1
      INTEGER size, err
      INTEGER i, j
      INTEGER, POINTER :: qq(:)
      SIZE_LEFT1 = 0
cyfh
cyfh find the largest size that can be allocated
      size = 2**10
      ALLOCATE (qq(size), STAT=err)
      DO WHILE (err .LE. 0)
         size = size*2
         ALLOCATE (qq(size), STAT=err)
         IF (ASSOCIATED(qq)) THEN
            DEALLOCATE (qq)
         ENDIF
      ENDDO
      CALL GREEDY_FILL(size, SIZE_LEFT1)
      SIZE_LEFT1 = SIZE_LEFT1*KIND(1)/1000000.0
      END FUNCTION SIZE_LEFT1

      RECURSIVE SUBROUTINE GREEDY_FILL(size, size_left)
      INTEGER, INTENT(INOUT) :: size
      REAL, INTENT(INOUT) :: size_left
      INTEGER, POINTER :: qq(:)
      INTEGER err
CYFH
CYFH IF ALL IS FILLED APART FROM SMALL POTS OF SPACE THEN RETURN
CYFH
      IF (size .LT. 100) RETURN
      ALLOCATE (qq(size), STAT=err)
CYFH
CYFH IF SUCCESSFUL THEN EXPAND
CYFH
      IF (err .LE. 0) THEN
         size_left = size_left + size
         size = size*1.1
         CALL GREEDY_FILL(size, size_left)
CYFH
CYFH ELSE SHRINK
CYFH
      ELSE
         size = size/1.1
         CALL GREEDY_FILL(size, size_left)
      ENDIF
      IF (ASSOCIATED(qq)) THEN
         DEALLOCATE (qq)
      ENDIF
      END SUBROUTINE GREEDY_FILL

      RECURSIVE SUBROUTINE BISECTION(lb, ub, size_left)
      REAL size_left
      INTEGER lb, ub, mid, err, i, j
      INTEGER, POINTER :: qq(:)
      IF (ub - lb .LT. 1000) RETURN
      mid = (lb + ub)/2
      ALLOCATE (qq(mid), STAT=err)
      IF (err .GT. 0) THEN
         IF (ASSOCIATED(qq)) THEN
            DEALLOCATE (qq)
         ENDIF
         ub = mid
         size_left = lb*KIND(1)/1000000.0
         CALL BISECTION(lb, ub, size_left)
      ELSE
         lb = mid
         size_left = mid*KIND(1)/1000000.0
         IF (ASSOCIATED(qq)) THEN
            DEALLOCATE (qq)
         ENDIF
         CALL BISECTION(lb, ub, size_left)
      ENDIF
      END SUBROUTINE BISECTION
      END MODULE MEM
1329.1. "Can you find out what "limited memory" means" by SUBPAC::FARICELLI Wed Jun 04 1997 10:55
>>My post explained our current hack to get this info, and yes I REALLY need
>>to know, our target machine has very limited memory.
I'm puzzled by the mention of "very limited memory". In this context,
people usually mean REAL memory.
But if allocations are done on the heap, then wouldn't you be limited by
the size of virtual memory, not real memory (ignoring that once you
exceed real memory, you might page like crazy)?
So what the program measures is the amount of dynamic virtual memory
available, which is limited by a number of system-related things: the
amount of page/swap file, shell-imposed limits on "datasize", the kernel
parameter for maximum virtual address space, lazy or eager swap-file
policy, etc. But real memory isn't one of them.
Almost always, when a process runs out of dynamic memory, it is due to
the shell imposed limit on "datasize" being too low. Next in line is
not enough page/swap file. (One could look at the output of
/usr/sbin/swapon -s to see the total available swap/page file as the
process runs).
-- John Faricelli
1329.2. "Ah, a T3D! That explains a lot..." by QUARK::LIONEL (Free advice is worth every cent) Wed Jun 04 1997 11:59
Date: 4-JUN-1997 10:58:06.09
From: SMTP%"[email protected]"
Subj: Re: On memory allocation...
Hi Steve
On Wed, 4 Jun 1997, Steve Lionel wrote:
> [this is from another engineer...]
Understood. May I add that a DUnix OS-specific solution is fine; I don't
expect anything remotely portable. I tried playing around with getrusage
calls, but some of the results didn't appear to make much sense (to me at
least!)
> >> My post explained our current hack to get this info, and yes I REALLY need
> >> to know, our target machine has very limited memory.
>
> I'm puzzled by the mention of "very limited memory". In this context,
> people usually mean REAL memory.
very limited memory == 64 MB real memory on a node of a T3D (which has no
virtual memory at all!) I have a code with loads of data sets, some of which
will run on a node of a T3D, some will not. I'm trying to squeeze things so
that the code fits on a node for as many cases as possible, with dynamic
allocation / deallocation, dumping/reading some things to/from disk, etc. I
can obviously use size to get the static size of my process at the outset,
but I need to track how much dynamic memory I'm using as the process
progresses to find out the high-water mark for the whole code. Any
tool/technique which can accomplish this solves my problem. Unfortunately
the code contains hundreds (if not thousands) of calls to F90 allocate /
deallocate, so tracking from within the code seems impractical. Something
that tracks F90 allocate / deallocate calls would probably give me all the
info I need. Is there an ATOM tool to do this?
> But if allocations are done on the heap, then wouldn't you be limited by
> the size of virtual memory, not real memory (ignoring that once you
> exceeded real memory, you might page like crazy).
Yes. I should have explained this more fully.
> So what the program measures is the amount of dynamic virtual memory
> available, which is limited by a number of system related things:
> amount of page/swap file, shell imposed limits on "datasize",
> kernel parameter max. virtual address space, lazy or eager swap file
> policy, etc. etc. But real memory isn't one of them.
My process must have a virtual memory size; on the T3D this will be the real
memory size (a pretty crude approximation, I know). And on a T3D node (or
T3E for that matter) there is only one user process plus a microkernel, so no
resource sharing at all. I could of course do all my development on the T3D
itself, but then I must contend with long turnaround times, slow compilers,
limited access, etc. I prefer the DEC compilers / tools / environment
anyway, and the 8400 I use is lightning quick on this code.
> Almost always, when a process runs out of dynamic memory, it is due to
> the shell imposed limit on "datasize" being too low. Next in line is
> not enough page/swap file. (One could look at the output of
> /usr/sbin/swapon -s to see the total available swap/page file as the
> process runs).
The system I use for development is a 6-processor DEC 8400 with 2GB
physical memory. Limits are:
% limit
cputime 30:00
filesize unlimited
datasize 2097152 kbytes
stacksize 32768 kbytes
coredumpsize 0 kbytes
memoryuse 1048576 kbytes
descriptors 4096
% limit -h
cputime 30:00
filesize unlimited
datasize 2097152 kbytes
stacksize 32768 kbytes
coredumpsize unlimited
memoryuse 1048576 kbytes
descriptors 4096
% /usr/sbin/swapon -s
Swap partition /dev/rz12b (default swap):
Allocated space: 12800 pages (100MB)
In-use space: 8046 pages ( 62%)
Free space: 4754 pages ( 37%)
Swap partition /dev/rz24b:
Allocated space: 261877 pages (2045MB)
In-use space: 7960 pages ( 3%)
Free space: 253917 pages ( 96%)
Swap partition /dev/rz32b:
Allocated space: 261877 pages (2045MB)
In-use space: 7971 pages ( 3%)
Free space: 253906 pages ( 96%)
Swap partition /dev/rz40b:
Allocated space: 261877 pages (2045MB)
In-use space: 7964 pages ( 3%)
Free space: 253913 pages ( 96%)
Swap partition /dev/rz49b:
Allocated space: 261877 pages (2045MB)
In-use space: 7971 pages ( 3%)
Free space: 253906 pages ( 96%)
Swap partition /dev/rz57b:
Allocated space: 261877 pages (2045MB)
In-use space: 8006 pages ( 3%)
Free space: 253871 pages ( 96%)
Total swap allocation:
Allocated space: 1322185 pages (10329MB)
Reserved space: 133735 pages ( 10%)
In-use space: 47918 pages ( 3%)
Available space: 1188450 pages ( 89%)
I hope that supplies some helpful info.
Thanks,
Kevin
--
Kevin Maguire __ __ __ __ Daresbury Laboratory
[email protected] / \ / \ / \ / \ Warrington, Cheshire, UK
____________________/ __\/ __\/ __\/ __\_____________________________
___________________/ /__/ /__/ /__/ /________________________________
| / \ / \ / \ / \ \____ voice: (01925) 603221
|/ \_/ \_/ \_/ \ o \ FAX: (01925) 603634
\_____/--<
1329.3. "is interfacing to C kosher?" by SMURF::PETERT (rigidly defined areas of doubt and uncertainty) Wed Jun 04 1997 15:04
Hmmm, I think I can tell you how to find the total size of the
current process, but I'm not sure how to go about finding the
amount left to allocate. Oops, wait, just figured that out.
It involves a few C calls, or you can write a C subroutine
and just call that from the Fortran routine. The C calls are
system calls and involve C structures that are returned via
pointers so it might be easiest to go the C subroutine called
from fortran main program route. The system calls involved are
getrlimit and one of the procfs subroutines. Getrlimit is
in the manpages (which is how I found it after I started this
reply) and gives you the current and maximum limits available
to the process. I take this to mean that it gives you the maximum
number of bytes you can grab in total. The proc call is
an ioctl call in which you request PIOCPSINFO, and among the
data returned are pr_size, which gives the total size of the
process in PAGES, and pr_rssize, the resident set size, also
in pages. So you have to convert that by multiplying the returned
value by whatever the page size is, to get a comparable figure.
In order to do the proc ioctl, you have to open up the process in
/proc, and then request the info via the ioctl. Here's a test
program I wrote up in C to do this. It should be easily modifiable
to do what they want:
#include <malloc.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/signal.h>
#include <sys/fault.h>
#include <sys/syscall.h>
#include <sys/procfs.h>
#include <sys/ioctl.h>
int main (int argc, char **argv)
{
    char fileName[32];          /* room for "/proc/" plus any pid */
    int fd;
    int i;
    prpsinfo_t pInfo;
    void *buff;
    struct rlimit *rlp;

    sprintf (fileName, "/proc/%d", getpid ());
    fprintf (stdout, "Opening %s\n", fileName);
    fd = open (fileName, O_RDWR);
    if (fd < 0)                 /* open returns -1 on failure, not 0 */
    {
        fprintf (stderr, "Failed to open %s\n", fileName);
        return (-1);
    }
    rlp = (struct rlimit *) malloc (sizeof (struct rlimit));
    getrlimit (RLIMIT_AS, rlp);
    printf ("Current limit %ld, maximum limit %ld\n", rlp->rlim_cur,
            rlp->rlim_max);
    for (i = 0; i < 10; i++)
    {
        if (ioctl (fd, PIOCPSINFO, &pInfo) == -1)
        {
            fprintf (stderr, "Failed ioctl errno: %d\n", errno);
            return (-1);
        }
        fprintf (stdout, "RSS: %d\n", pInfo.pr_rssize);
        fprintf (stdout, "Size: %d\n", pInfo.pr_size);
        buff = malloc (50000);  /* deliberately kept, to grow the process */
    }
    close (fd);
    return (0);
}
Here's the output (Note that the size is the same in a few subsequent
runs through the loop. I think the system grabbed enough so that
the subsequent alloc, while increasing the resident set size,
did not cause the overall size of the process to be increased.)
petert@dippikill 71> rss
Opening /proc/1245
Current limit 1073741824, maximum limit 1073741824
RSS: 15
Size: 199
RSS: 16
Size: 206
RSS: 17
Size: 206
RSS: 19
Size: 213
RSS: 21
Size: 220
RSS: 23
Size: 227
RSS: 25
Size: 234
RSS: 27
Size: 241
RSS: 28
Size: 248
RSS: 29
Size: 248
PeterT