
Conference turris::fortran

Title: Digital Fortran
Notice: Read notes 1.* for important information
Moderator: QUARK::LIONEL
Created: Thu Jun 01 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1333
Total number of notes: 6734

1329.0. "Determining available allocatable memory on UNIX" by QUARK::LIONEL (Free advice is worth every cent) Wed Jun 04 1997 10:13

[I got the following request from a DF UNIX customer - I don't know what to
suggest to him - any assistance would be appreciated. - Steve]

Hi

A while ago I posted a Usenet question asking how to get the process size
from within a Fortran program.  You replied:

On Tue, 6 May 1997, Steve Lionel wrote:
> Nothing "system related" is going to help, because what you need to know is how
> much space the Fortran libraries have available for the allocation pool.
> And even a Fortran library call won't be able to tell you how much you MIGHT
> be able to allocate - the only way to know for sure is to try it.  Another
> problem is that if you take the approach of allocating a lot and then freeing
> it, you expand your virtual address space, consuming pagetable entries and
> other system resources, impacting performance.
> 
> Do you REALLY need this info?  Even if you could get it, it would not be
> reliable.  As a subset, it would be useful to know how much memory the Fortran
> libraries have ALLOCATEd - you could use that to detect memory leaks.

My post explained our current hack to get this info, and yes I REALLY need
to know: our target machine has very limited memory.  Basically, at the start
of execution we allocate and re-allocate a set of arrays until the system
refuses to give us any more, and note the sum of the sizes of all these
arrays.  Later we do the same again; this time the system lets us allocate
less space, and the difference from the original size is assumed to be how
much we have used dynamically.  This is calculated at lots of different
places throughout the code, and we see the dynamic amount grow and shrink in
a very sensible way.  However, for any of a dozen reasons this approach is
pretty unreliable, but since I received no suggestions for a better method
it's what we're stuck with.
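[For illustration only: the probing idea described above can be sketched in C, with malloc standing in for F90 ALLOCATE.  The function and its cap argument are names invented here, not part of the routine attached below:]

```c
#include <stdlib.h>

/* Illustrative sketch only: bisect for the largest single block the
   allocator will currently grant, with malloc standing in for F90
   ALLOCATE.  'cap' bounds the search.  Note that, as Steve's reply
   warns, even allocating and immediately freeing can grow the
   process's virtual address space. */
size_t largest_block(size_t cap)
{
    size_t lo = 0, hi = cap;
    while (hi - lo > 4096) {             /* stop once within a page */
        size_t mid = lo + (hi - lo) / 2;
        void *p = malloc(mid);
        if (p != NULL) {
            free(p);
            lo = mid;                    /* mid bytes were available */
        } else {
            hi = mid;                    /* mid bytes were refused */
        }
    }
    return lo;
}
```

[The answer is only valid at the instant of the probe, which is part of why the approach is unreliable.]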

Part of the reliability problem is that I get output like:

###
 calling W25IOW22
 memory_size=    40.1774480000001        2041.54696800000     
 calling W2IOW22
 memory_size=    40.1933120000003        2041.53096000000     
 calling W15IOW22
 memory_size=    40.2097440000002        2041.51393600000     
 calling WTIOW22
 memory_size=    40.2261920000003        2041.49804000000     
 calling IOW567
Segmentation fault
###

( The 2 numbers are current size, and what size_left1 reports as still
usable )

The seg fault is occurring in the get_memory_size (size_left1 actually)
routine itself, and is unpredictable.  Sometimes the code runs without any
problem, other times it crashes as shown, quite often in completely
different places in the code.  The code contains a mixture of allocatable
arrays & pointer arrays, which are allocated and deallocated repeatedly. 
I'm not sure if I can count the seg fault as a bug, but it's certainly
undesired behavior.

I've attached the routine we use, and am looking for suggestions to improve
reliability.

Cheers,
Kevin
--
Kevin Maguire         __    __    __    __           Daresbury Laboratory
[email protected]   /  \  /  \  /  \  /  \      Warrington, Cheshire, UK
____________________/  __\/  __\/  __\/  __\_____________________________
___________________/  /__/  /__/  /__/  /________________________________
                   | / \   / \   / \   / \  \____   voice: (01925) 603221
                   |/   \_/   \_/   \_/   \    o \    FAX: (01925) 603634
                                           \_____/--<                    

      MODULE MEM
      PRIVATE
      PUBLIC::SIZE_LEFT , SIZE_LEFT1
 
cyfh probing the amount of heap size left
      CONTAINS
      FUNCTION SIZE_LEFT()
cyfh size left in MegaBytes of contiguous heap space
      REAL SIZE_LEFT
      INTEGER lb , ub , len , err , mid
      INTEGER i , j
      INTEGER  , pointer::qq(:)
 
cyfh low bound and upper bound of memory left in words
      lb = 1
      ub = 1000000000
      SIZE_LEFT = ub*KIND(1)/1000000.0
      ALLOCATE(qq(ub),STAT=err)
      IF ( err.GT.0 ) THEN
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
         CALL BISECTION(lb,ub,SIZE_LEFT)
      ENDIF
      END FUNCTION SIZE_LEFT
 
      FUNCTION SIZE_LEFT1()
cyfh size left in MegaBytes of all (not necessarily contiguous) heap space
      REAL SIZE_LEFT1
      INTEGER size , err
      INTEGER i , j
      INTEGER  , pointer::qq(:)
 
      SIZE_LEFT1 = 0
cyfh
cyfh find the largest size that can be allocated
      size = 2**10
      ALLOCATE(qq(size),STAT=err)
      DO WHILE ( err.LE.0 )
         size = size*2
         ALLOCATE(qq(size),STAT=err)
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
      ENDDO
      CALL GREEDY_FILL(size,SIZE_LEFT1)
 
      SIZE_LEFT1 = SIZE_LEFT1*KIND(1)/1000000.0
      END FUNCTION SIZE_LEFT1
 
      RECURSIVE SUBROUTINE GREEDY_FILL(size,size_left)
      INTEGER  , intent(inout)::size
      REAL  , intent(inout)::size_left
      INTEGER  , pointer::qq(:)
      INTEGER err
CYFH
CYFH IF ALL IS FILL APART FROM SMALL POTS OF SPACE THEN RETURN
CYFH
      IF ( size.LT.100 ) RETURN
 
      ALLOCATE(qq(size),STAT=err)
CYFH
CYFH IF SUCCESSFUL THEN EXPAND
CYFH
      IF ( err.LE.0 ) THEN
         size_left = size_left + size
         size = size*1.1
         CALL GREEDY_FILL(size,size_left)
CYFH
CYFH ELSE SHRINK
CYFH
      ELSE
         size = size/1.1
         CALL GREEDY_FILL(size,size_left)
      ENDIF
      IF ( ASSOCIATED(qq) ) THEN
         DEALLOCATE(qq)
      ENDIF
      END SUBROUTINE  GREEDY_FILL

      RECURSIVE SUBROUTINE BISECTION(lb,ub,size_left)
      REAL size_left
      INTEGER lb , ub , mid , err , i , j
      INTEGER  , pointer::qq(:)
 
      IF ( ub-lb.LT.1000 ) RETURN
      mid = (lb+ub)/2
      ALLOCATE(qq(mid),STAT=err)
      IF ( err.GT.0 ) THEN
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
         ub = mid
         size_left = lb*KIND(1)/1000000.0
         CALL BISECTION(lb,ub,size_left)
      ELSE
         lb = mid
         size_left = mid*KIND(1)/1000000.0
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
         CALL BISECTION(lb,ub,size_left)
      ENDIF
 
      END SUBROUTINE  BISECTION
      END MODULE MEM
1329.1. "Can you find out what "limited memory" means" by SUBPAC::FARICELLI Wed Jun 04 1997 10:55
>>My post explained our current hack to get this info, and yes I REALLY need
>>to know, our target machine has very limited memory.

  I'm puzzled by the mention of "very limited memory". In this context,
  people usually mean REAL memory.

  But if allocations are done on the heap, wouldn't you be limited by
  the size of virtual memory, not real memory (ignoring that once you
  exceed real memory, you might page like crazy)?

  So what the program measures is the amount of dynamic virtual memory
  available, which is limited by a number of system related things:
  amount of page/swap file, shell imposed limits on "datasize",
  kernel parameter max. virtual address space, lazy or eager swap file
  policy, etc. etc. But real memory isn't one of them.

  Almost always, when a process runs out of dynamic memory, it is due to
  the shell imposed limit on "datasize" being too low. Next in line is
  not enough page/swap file. (One could look at the output of
  /usr/sbin/swapon -s to see the total available swap/page file as the
  process runs).
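  [The first suspect above can also be checked from inside the process with
  getrlimit.  A minimal sketch; the helper name is invented here:]

```c
#include <sys/resource.h>

/* Sketch: read the soft "datasize" limit the shell imposed on this
   process.  Returns 0 on success and stores the limit in bytes
   (possibly RLIM_INFINITY) into *bytes; returns -1 on error. */
int datasize_limit(unsigned long *bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    *bytes = (unsigned long)rl.rlim_cur;
    return 0;
}
```

  [Comparing this value against the sum of allocations at the point of
  failure would distinguish a shell limit from swap exhaustion.]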

  -- John Faricelli
1329.2. "Ah, a T3D! That explains a lot..." by QUARK::LIONEL (Free advice is worth every cent) Wed Jun 04 1997 11:59
Date:	 4-JUN-1997 10:58:06.09
From:	SMTP%"[email protected]"
Subj:	Re: On memory allocation...

Hi Steve

On Wed, 4 Jun 1997, Steve Lionel wrote:
> [this is from another engineer...]

Understood.  May I add that a DUnix-specific solution is fine; I don't
expect anything remotely portable.  I tried playing around with getrusage
calls, but some of the results didn't appear to make much sense (to me at
least!)

> >> My post explained our current hack to get this info, and yes I REALLY need
> >> to know, our target machine has very limited memory.
> 
>   I'm puzzled by the mention of "very limited memory". In this context,
>   people usually mean REAL memory.

very limited memory == 64 MB real memory on a node of a T3D (which has no
virtual memory at all!)  I have a code with loads of data sets, some of which
will run on a node of a T3D, some will not.  I'm trying to squeeze things so
that the code fits on a node for as many cases as possible, with dynamic
allocation / deallocation, dumping/reading some things to/from disk, etc.  I
can obviously use size to get the static size of my process at the outset,
but I need to track how much dynamic memory I'm using as the process
progresses to find out the high-water mark for the whole code.  Any
tool/technique which can accomplish this solves my problem.  Unfortunately
the code contains hundreds (if not thousands) of calls to F90 allocate /
deallocate, so tracking from within the code seems impractical.  Something
that tracks F90 allocate / deallocate calls would probably give me all the
info I need.  Is there an ATOM tool to do this?
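[For illustration, the bookkeeping such a tool would need can be sketched in C: wrappers that stamp each block with its size and maintain a running total plus a high-water mark.  All names here are invented; the real code would need its F90 ALLOCATE calls, or the underlying allocator, routed through something equivalent:]

```c
#include <stddef.h>
#include <stdlib.h>

/* Sketch of allocation tracking: each block carries a small header
   recording its size, so a matching free can subtract it again.
   current_bytes is the live total; high_water is the peak. */
typedef union {
    size_t n;
    max_align_t pad;   /* keeps the payload after the header aligned */
} header_t;

size_t current_bytes = 0;
size_t high_water    = 0;

void *tracked_alloc(size_t n)
{
    header_t *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->n = n;
    current_bytes += n;
    if (current_bytes > high_water)
        high_water = current_bytes;
    return h + 1;                /* hand back the bytes past the header */
}

void tracked_free(void *p)
{
    header_t *h = (header_t *)p - 1;
    current_bytes -= h->n;
    free(h);
}
```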

>   But if allocations are done on the heap, then wouldn't you be limited by
>   the size of virtual memory, not real memory (ignoring that once you
>   exceeded real memory, you might page like crazy).

Yes.  I should have explained this more fully...

>   So what the program measures is the amount of dynamic virtual memory
>   available, which is limited by a number of system related things:
>   amount of page/swap file, shell imposed limits on "datasize",
>   kernel parameter max. virtual address space, lazy or eager swap file
>   policy, etc. etc. But real memory isn't one of them.

My process must have a virtual memory size; on the T3D this will be the real
memory size (a pretty crude approximation, I know).  And on a T3D node (or
T3E for that matter) there is only one user process plus a microkernel, so no
resource sharing at all.  I could of course do all my development on the T3D
itself, but then I must contend with long turnaround times, slow compilers,
limited access, etc.  I prefer the DEC compilers / tools / environment
anyway, and the 8400 I use is lightning quick on this code.

>   Almost always, when a process runs out of dynamic memory, it is due to
>   the shell imposed limit on "datasize" being too low. Next in line is
>   not enough page/swap file. (One could look at the output of
>   /usr/sbin/swapon -s to see the total available swap/page file as the
>   process runs).

The system I use for development is a 6-processor DEC 8400 with 2 GB
physical memory.  Limits are:

% limit
cputime         30:00
filesize        unlimited
datasize        2097152 kbytes
stacksize       32768 kbytes
coredumpsize    0 kbytes
memoryuse       1048576 kbytes
descriptors     4096 
% limit -h
cputime         30:00
filesize        unlimited
datasize        2097152 kbytes
stacksize       32768 kbytes
coredumpsize    unlimited
memoryuse       1048576 kbytes
descriptors     4096 

% /usr/sbin/swapon -s
Swap partition /dev/rz12b (default swap):
    Allocated space:        12800 pages (100MB)
    In-use space:            8046 pages ( 62%)
    Free space:              4754 pages ( 37%)
Swap partition /dev/rz24b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7960 pages (  3%)
    Free space:            253917 pages ( 96%)
Swap partition /dev/rz32b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7971 pages (  3%)
    Free space:            253906 pages ( 96%)
Swap partition /dev/rz40b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7964 pages (  3%)
    Free space:            253913 pages ( 96%)
Swap partition /dev/rz49b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7971 pages (  3%)
    Free space:            253906 pages ( 96%)
Swap partition /dev/rz57b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            8006 pages (  3%)
    Free space:            253871 pages ( 96%)
Total swap allocation:
    Allocated space:      1322185 pages (10329MB)
    Reserved space:        133735 pages ( 10%)
    In-use space:           47918 pages (  3%)
    Available space:      1188450 pages ( 89%)

I hope that supplies some helpful info.

Thanks,
Kevin
1329.3. "Is interfacing to C kosher?" by SMURF::PETERT (rigidly defined areas of doubt and uncertainty) Wed Jun 04 1997 15:04
    Hmmm, I think I can tell you how to find the total size of the
    current process, but I'm not sure how to go about finding the 
    amount left to allocate.  Oops, wait, just figured that out.
    It involves a few C calls, or you can write a C subroutine
    and just call that from the Fortran routine.  The C calls are
    system calls and involve C structures that are returned via
    pointers so it might be easiest to go the C subroutine called
    from fortran main program route.  The system calls involved are
    getrlimit and one of the procfs subroutines.  Getrlimit is
    in the manpages (which is how I found it after I started this 
    reply) and gives you the current and maximum limits available
    to the process.  I take this to mean that it gives you the maximum
    number of bytes you can grab in total.  The proc call is
    an ioctl call in which you request PIOCPSINFO, and among the 
    data returned are pr_size, which gives the total size of the 
    process in PAGES, and pr_rssize, the resident set size, also
    in pages.  So you have to convert that by multiplying the returned
    value by whatever the page size is, to get a comparable figure.
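    [Rather than hard-coding a page size (Alpha uses 8 KB pages, many
    other systems 4 KB), the conversion can ask the running system, as
    in this small sketch:]

```c
#include <unistd.h>

/* Convert a page count (e.g. pr_size or pr_rssize from PIOCPSINFO)
   to bytes, using the page size the running system reports. */
long pages_to_bytes(long pages)
{
    return pages * sysconf(_SC_PAGESIZE);
}
```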
    In order to do the proc ioctl, you have to open up the process in 
    /proc, and then request the info via the ioctl.  Here's a test
    program I wrote up in C to do this.  It should be easily modifiable
    to do what they want:
    
    
    #include <malloc.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <sys/resource.h>
    #include <sys/signal.h>
    #include <sys/fault.h>
    #include <sys/syscall.h>
    #include <sys/procfs.h>
    #include <sys/ioctl.h>
    
    int main (int argc, char **argv)
    {
       char fileName[32];   /* "/proc/" + pid + NUL can exceed 12 bytes */
       int fd;
       int i;
       prpsinfo_t pInfo;
       void *buff;
       struct rlimit *rlp;
    
       sprintf (fileName, "/proc/%d", getpid ());
       fprintf (stdout, "Opening %s\n", fileName);
       fd = open (fileName, O_RDWR);
       if (fd < 0)           /* open returns -1 on failure, not 0 */
       {
          fprintf (stderr, "Failed to open %s\n", fileName);
          return (-1);
       }
    
       rlp = (struct rlimit *) malloc (sizeof(struct rlimit));
       getrlimit(RLIMIT_AS, rlp);
       printf("Current limit %ld, maximum limit %ld\n", rlp->rlim_cur,
    	  rlp->rlim_max);
    
       for (i = 0; i < 10; i++)
       {
          if (ioctl (fd, PIOCPSINFO, &pInfo) == -1)
          {
             fprintf (stderr, "Failed ioctl errno: %d\n", errno);
             return (-1);
          }
          fprintf (stdout, "RSS: %d\n", pInfo.pr_rssize);
          fprintf (stdout, "Size: %d\n", pInfo.pr_size);
          buff = malloc (50000);
       }
    
       close (fd);
       return (0);
    }
    
    
    Here's the output.  (Note that the size is the same in a few subsequent
    runs through the loop.  I think the system grabbed enough so that
    the subsequent malloc, while increasing the resident set size,
    did not cause the overall size of the process to be increased.)
    
    petert@dippikill 71> rss
    Opening /proc/1245
    Current limit 1073741824, maximum limit 1073741824
    RSS: 15
    Size: 199
    RSS: 16
    Size: 206
    RSS: 17
    Size: 206
    RSS: 19
    Size: 213
    RSS: 21
    Size: 220
    RSS: 23
    Size: 227
    RSS: 25
    Size: 234
    RSS: 27
    Size: 241
    RSS: 28
    Size: 248
    RSS: 29
    Size: 248
    
    
    PeterT