
Conference turris::fortran

Title: Digital Fortran
Notice: Read notes 1.* for important information
Moderator: QUARK::LIONEL
Created: Thu Jun 01 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1333
Total number of notes: 6734

1329.0. "Determining available allocatable memory on UNIX" by QUARK::LIONEL (Free advice is worth every cent) Wed Jun 04 1997 10:13

[I got the following request from a DF UNIX customer - I don't know what to
suggest to him - any assistance would be appreciated. - Steve]

Hi

A while ago I posted a Usenet question asking how to get the process size
from within a Fortran program.  You replied:

On Tue, 6 May 1997, Steve Lionel wrote:
> Nothing "system related" is going to help, because what you need to know is how
> much space the Fortran libraries have available for the allocation pool.
> And even a Fortran library call won't be able to tell you how much you MIGHT
> be able to allocate - the only way to know for sure is to try it.  Another
> problem is that if you take the approach of allocating a lot and then freeing
> it, you expand your virtual address space, consuming pagetable entries and
> other system resources, impacting performance.
> 
> Do you REALLY need this info?  Even if you could get it, it would not be
> reliable.  As a subset, it would be useful to know how much memory the Fortran
> libraries have ALLOCATEd - you could use that to detect memory leaks.

My post explained our current hack to get this info, and yes I REALLY need
to know: our target machine has very limited memory.  Basically, at the start
of execution we allocate and re-allocate a set of arrays until the system
refuses to give us any more, and note the sum of the sizes of all these
arrays.  Later we do the same again; this time the system lets us allocate
less space, and the difference from the original size is assumed to be how
much we have used dynamically.  This is calculated at lots of different
places throughout the code, and we see the dynamic amount grow and shrink in
a very sensible way.  However, for any of a dozen reasons this approach is
pretty unreliable, but since I received no suggestions for a better method
it's what we're stuck with.
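[For illustration only: the probing idea described above can be sketched in C, with malloc standing in for F90 ALLOCATE.  The function and its cap argument are names invented here, not part of the routine attached below:]

```c
#include <stdlib.h>

/* Illustrative sketch only: bisect for the largest single block the
   allocator will currently grant, with malloc standing in for F90
   ALLOCATE.  'cap' bounds the search.  Note that, as Steve's reply
   warns, even allocating and immediately freeing can grow the
   process's virtual address space. */
size_t largest_block(size_t cap)
{
    size_t lo = 0, hi = cap;
    while (hi - lo > 4096) {             /* stop once within a page */
        size_t mid = lo + (hi - lo) / 2;
        void *p = malloc(mid);
        if (p != NULL) {
            free(p);
            lo = mid;                    /* mid bytes were available */
        } else {
            hi = mid;                    /* mid bytes were refused */
        }
    }
    return lo;
}
```

[The answer is only valid at the instant of the probe, which is part of why the approach is unreliable.]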

Part of the reliability problem is that I get output like:

###
 calling W25IOW22
 memory_size=    40.1774480000001        2041.54696800000     
 calling W2IOW22
 memory_size=    40.1933120000003        2041.53096000000     
 calling W15IOW22
 memory_size=    40.2097440000002        2041.51393600000     
 calling WTIOW22
 memory_size=    40.2261920000003        2041.49804000000     
 calling IOW567
Segmentation fault
###

( The 2 numbers are current size, and what size_left1 reports as still
usable )

The seg fault is occurring in the get_memory_size (size_left1 actually)
routine itself, and is unpredictable.  Sometimes the code runs without any
problem, other times it crashes as shown, quite often in completely
different places in the code.  The code contains a mixture of allocatable
arrays & pointer arrays, which are allocated and deallocated repeatedly. 
I'm not sure if I can count the seg fault as a bug, but it's certainly
undesired behavior.

I've attached the routine we use, and am looking for suggestions to improve
reliability.

Cheers,
Kevin
--
Kevin Maguire         __    __    __    __           Daresbury Laboratory
[email protected]   /  \  /  \  /  \  /  \      Warrington, Cheshire, UK
____________________/  __\/  __\/  __\/  __\_____________________________
___________________/  /__/  /__/  /__/  /________________________________
                   | / \   / \   / \   / \  \____   voice: (01925) 603221
                   |/   \_/   \_/   \_/   \    o \    FAX: (01925) 603634
                                           \_____/--<                    

      MODULE MEM
      PRIVATE
      PUBLIC::SIZE_LEFT , SIZE_LEFT1
 
cyfh probing the amount of heap size left
      CONTAINS
      FUNCTION SIZE_LEFT()
cyfh size left in MegaBytes of contiguous heap space
      REAL SIZE_LEFT
      INTEGER lb , ub , len , err , mid
      INTEGER i , j
      INTEGER  , pointer::qq(:)
 
cyfh low bound and upper bound of memory left in words
      lb = 1
      ub = 1000000000
      SIZE_LEFT = ub*KIND(1)/1000000.0
      ALLOCATE(qq(ub),STAT=err)
      IF ( err.GT.0 ) THEN
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
         CALL BISECTION(lb,ub,SIZE_LEFT)
      ENDIF
      END FUNCTION SIZE_LEFT
 
      FUNCTION SIZE_LEFT1()
cyfh size left in MegaBytes of all (not necessarily contiguous) heap space
      REAL SIZE_LEFT1
      INTEGER size , err
      INTEGER i , j
      INTEGER  , pointer::qq(:)
 
      SIZE_LEFT1 = 0
cyfh
cyfh find the largest size that can be allocated
      size = 2**10
      ALLOCATE(qq(size),STAT=err)
      DO WHILE ( err.LE.0 )
         size = size*2
         ALLOCATE(qq(size),STAT=err)
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
      ENDDO
      CALL GREEDY_FILL(size,SIZE_LEFT1)
 
      SIZE_LEFT1 = SIZE_LEFT1*KIND(1)/1000000.0
      END FUNCTION SIZE_LEFT1
 
      RECURSIVE SUBROUTINE GREEDY_FILL(size,size_left)
      INTEGER  , intent(inout)::size
      REAL  , intent(inout)::size_left
      INTEGER  , pointer::qq(:)
      INTEGER err
CYFH
CYFH IF ALL IS FILL APART FROM SMALL POTS OF SPACE THEN RETURN
CYFH
      IF ( size.LT.100 ) RETURN
 
      ALLOCATE(qq(size),STAT=err)
CYFH
CYFH IF SUCCESSFUL THEN EXPAND
CYFH
      IF ( err.LE.0 ) THEN
         size_left = size_left + size
         size = size*1.1
         CALL GREEDY_FILL(size,size_left)
CYFH
CYFH ELSE SHRINK
CYFH
      ELSE
         size = size/1.1
         CALL GREEDY_FILL(size,size_left)
      ENDIF
      IF ( ASSOCIATED(qq) ) THEN
         DEALLOCATE(qq)
      ENDIF
      END SUBROUTINE  GREEDY_FILL

      RECURSIVE SUBROUTINE BISECTION(lb,ub,size_left)
      REAL size_left
      INTEGER lb , ub , mid , err , i , j
      INTEGER  , pointer::qq(:)
 
      IF ( ub-lb.LT.1000 ) RETURN
      mid = (lb+ub)/2
      ALLOCATE(qq(mid),STAT=err)
      IF ( err.GT.0 ) THEN
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
         ub = mid
         size_left = lb*KIND(1)/1000000.0
         CALL BISECTION(lb,ub,size_left)
      ELSE
         lb = mid
         size_left = mid*KIND(1)/1000000.0
         IF ( ASSOCIATED(qq) ) THEN
            DEALLOCATE(qq)
         ENDIF
         CALL BISECTION(lb,ub,size_left)
      ENDIF
 
      END SUBROUTINE  BISECTION
      END MODULE MEM
1329.1. "Can you find out what "limited memory" means" by SUBPAC::FARICELLI Wed Jun 04 1997 10:55
>>My post explained our current hack to get this info, and yes I REALLY need
>>to know, our target machine has very limited memory.

  I'm puzzled by the mention of "very limited memory". In this context,
  people usually mean REAL memory.

  But if allocations are done on the heap, wouldn't you be limited by
  the size of virtual memory, not real memory (ignoring that once you
  exceed real memory, you might page like crazy)?

  So what the program measures is the amount of dynamic virtual memory
  available, which is limited by a number of system related things:
  amount of page/swap file, shell imposed limits on "datasize",
  kernel parameter max. virtual address space, lazy or eager swap file
  policy, etc. etc. But real memory isn't one of them.

  Almost always, when a process runs out of dynamic memory, it is due to
  the shell imposed limit on "datasize" being too low. Next in line is
  not enough page/swap file. (One could look at the output of
  /usr/sbin/swapon -s to see the total available swap/page file as the
  process runs).
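  [The first suspect above can also be checked from inside the process with
  getrlimit.  A minimal sketch; the helper name is invented here:]

```c
#include <sys/resource.h>

/* Sketch: read the soft "datasize" limit the shell imposed on this
   process.  Returns 0 on success and stores the limit in bytes
   (possibly RLIM_INFINITY) into *bytes; returns -1 on error. */
int datasize_limit(unsigned long *bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    *bytes = (unsigned long)rl.rlim_cur;
    return 0;
}
```

  [Comparing this value against the sum of allocations at the point of
  failure would distinguish a shell limit from swap exhaustion.]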

  -- John Faricelli
1329.2. "Ah, a T3D! That explains a lot..." by QUARK::LIONEL (Free advice is worth every cent) Wed Jun 04 1997 11:59
Date:	 4-JUN-1997 10:58:06.09
From:	SMTP%"[email protected]"
Subj:	Re: On memory allocation...

Hi Steve

On Wed, 4 Jun 1997, Steve Lionel wrote:
> [this is from another engineer...]

Understood.  May I add that a DUnix-specific solution is fine; I don't
expect anything remotely portable.  I tried playing around with getrusage
calls, but some of the results didn't appear to make much sense (to me at
least!)

> >> My post explained our current hack to get this info, and yes I REALLY need
> >> to know, our target machine has very limited memory.
> 
>   I'm puzzled by the mention of "very limited memory". In this context,
>   people usually mean REAL memory.

very limited memory == 64 MB real memory on a node of a T3D (which has no
virtual memory at all!)  I have a code with loads of data sets, some of which
will run on a node of a T3D, some will not.  I'm trying to squeeze things so
that the code fits on a node for as many cases as possible, with dynamic
allocation / deallocation, dumping/reading some things to/from disk, etc.  I
can obviously use size to get the static size of my process at the outset,
but I need to track how much dynamic memory I'm using as the process
progresses to find out the high-water mark for the whole code.  Any
tool/technique which can accomplish this solves my problem.  Unfortunately
the code contains hundreds (if not thousands) of calls to F90 allocate /
deallocate, so tracking from within the code seems impractical.  Something
that tracks F90 allocate / deallocate calls would probably give me all the
info I need.  Is there an ATOM tool to do this?
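[For illustration, the bookkeeping such a tool would need can be sketched in C: wrappers that stamp each block with its size and maintain a running total plus a high-water mark.  All names here are invented; the real code would need its F90 ALLOCATE calls, or the underlying allocator, routed through something equivalent:]

```c
#include <stddef.h>
#include <stdlib.h>

/* Sketch of allocation tracking: each block carries a small header
   recording its size, so a matching free can subtract it again.
   current_bytes is the live total; high_water is the peak. */
typedef union {
    size_t n;
    max_align_t pad;   /* keeps the payload after the header aligned */
} header_t;

size_t current_bytes = 0;
size_t high_water    = 0;

void *tracked_alloc(size_t n)
{
    header_t *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->n = n;
    current_bytes += n;
    if (current_bytes > high_water)
        high_water = current_bytes;
    return h + 1;                /* hand back the bytes past the header */
}

void tracked_free(void *p)
{
    header_t *h = (header_t *)p - 1;
    current_bytes -= h->n;
    free(h);
}
```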

>   But if allocations are done on the heap, then wouldn't you be limited by
>   the size of virtual memory, not real memory (ignoring that once you
>   exceeded real memory, you might page like crazy).

Yes.  I should have explained this more fully...

>   So what the program measures is the amount of dynamic virtual memory
>   available, which is limited by a number of system related things:
>   amount of page/swap file, shell imposed limits on "datasize",
>   kernel parameter max. virtual address space, lazy or eager swap file
>   policy, etc. etc. But real memory isn't one of them.

My process must have a virtual memory size; on the T3D this will be the real
memory size (a pretty crude approximation, I know).  And on a T3D node (or
T3E for that matter) there is only one user process plus a microkernel, so no
resource sharing at all.  I could of course do all my development on the T3D
itself, but then I must contend with long turnaround times, slow compilers,
limited access, etc.  I prefer the DEC compilers / tools / environment
anyway, and the 8400 I use is lightning quick on this code.

>   Almost always, when a process runs out of dynamic memory, it is due to
>   the shell imposed limit on "datasize" being too low. Next in line is
>   not enough page/swap file. (One could look at the output of
>   /usr/sbin/swapon -s to see the total available swap/page file as the
>   process runs).

The system I use for development is a 6-processor DEC 8400 with 2 GB
physical memory.  Limits are:

% limit
cputime         30:00
filesize        unlimited
datasize        2097152 kbytes
stacksize       32768 kbytes
coredumpsize    0 kbytes
memoryuse       1048576 kbytes
descriptors     4096 
% limit -h
cputime         30:00
filesize        unlimited
datasize        2097152 kbytes
stacksize       32768 kbytes
coredumpsize    unlimited
memoryuse       1048576 kbytes
descriptors     4096 

% /usr/sbin/swapon -s
Swap partition /dev/rz12b (default swap):
    Allocated space:        12800 pages (100MB)
    In-use space:            8046 pages ( 62%)
    Free space:              4754 pages ( 37%)
Swap partition /dev/rz24b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7960 pages (  3%)
    Free space:            253917 pages ( 96%)
Swap partition /dev/rz32b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7971 pages (  3%)
    Free space:            253906 pages ( 96%)
Swap partition /dev/rz40b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7964 pages (  3%)
    Free space:            253913 pages ( 96%)
Swap partition /dev/rz49b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            7971 pages (  3%)
    Free space:            253906 pages ( 96%)
Swap partition /dev/rz57b:
    Allocated space:       261877 pages (2045MB)
    In-use space:            8006 pages (  3%)
    Free space:            253871 pages ( 96%)
Total swap allocation:
    Allocated space:      1322185 pages (10329MB)
    Reserved space:        133735 pages ( 10%)
    In-use space:           47918 pages (  3%)
    Available space:      1188450 pages ( 89%)

I hope that supplies some helpful info.

Thanks,
Kevin
1329.3. "Is interfacing to C kosher?" by SMURF::PETERT (rigidly defined areas of doubt and uncertainty) Wed Jun 04 1997 15:04
    Hmmm, I think I can tell you how to find the total size of the
    current process, but I'm not sure how to go about finding the 
    amount left to allocate.  Oops, wait, just figured that out.
    It involves a few C calls, or you can write a C subroutine
    and just call that from the Fortran routine.  The C calls are
    system calls and involve C structures that are returned via
    pointers so it might be easiest to go the C subroutine called
    from fortran main program route.  The system calls involved are
    getrlimit and one of the procfs subroutines.  Getrlimit is
    in the manpages (which is how I found it after I started this 
    reply) and gives you the current and maximum limits available
    to the process.  I take this to mean that it gives you the maximum
    number of bytes you can grab in total.  The proc call is
    an ioctl call in which you request PIOCPSINFO, and among the 
    data returned are pr_size, which gives the total size of the 
    process in PAGES, and pr_rssize, the resident set size, also
    in pages.  So you have to convert that by multiplying the returned
    value by whatever the page size is, to get a comparable figure.
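    [Rather than hard-coding a page size (Alpha uses 8 KB pages, many
    other systems 4 KB), the conversion can ask the running system, as
    in this small sketch:]

```c
#include <unistd.h>

/* Convert a page count (e.g. pr_size or pr_rssize from PIOCPSINFO)
   to bytes, using the page size the running system reports. */
long pages_to_bytes(long pages)
{
    return pages * sysconf(_SC_PAGESIZE);
}
```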
    In order to do the proc ioctl, you have to open up the process in 
    /proc, and then request the info via the ioctl.  Here's a test
    program I wrote up in C to do this.  It should be easily modifiable
    to do what they want:
    
    
    #include <malloc.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <sys/resource.h>
    #include <sys/signal.h>
    #include <sys/fault.h>
    #include <sys/syscall.h>
    #include <sys/procfs.h>
    #include <sys/ioctl.h>
    
    int main (int argc, char **argv)
    {
       char fileName[32];   /* "/proc/" + pid + NUL can exceed 12 bytes */
       int fd;
       int i;
       prpsinfo_t pInfo;
       void *buff;
       struct rlimit *rlp;
    
       sprintf (fileName, "/proc/%d", getpid ());
       fprintf (stdout, "Opening %s\n", fileName);
       fd = open (fileName, O_RDWR);
       if (fd < 0)           /* open returns -1 on failure, not 0 */
       {
          fprintf (stderr, "Failed to open %s\n", fileName);
          return (-1);
       }
    
       rlp = (struct rlimit *) malloc (sizeof(struct rlimit));
       getrlimit(RLIMIT_AS, rlp);
       printf("Current limit %ld, maximum limit %ld\n", rlp->rlim_cur,
    	  rlp->rlim_max);
    
       for (i = 0; i < 10; i++)
       {
          if (ioctl (fd, PIOCPSINFO, &pInfo) == -1)
          {
             fprintf (stderr, "Failed ioctl errno: %d\n", errno);
             return (-1);
          }
          fprintf (stdout, "RSS: %d\n", pInfo.pr_rssize);
          fprintf (stdout, "Size: %d\n", pInfo.pr_size);
          buff = malloc (50000);
       }
    
       close (fd);
       return (0);
    }
    
    
    Here's the output.  (Note that the size is the same in a few subsequent
    runs through the loop.  I think the system grabbed enough so that
    the subsequent malloc, while increasing the resident set size,
    did not cause the overall size of the process to be increased.)
    
    petert@dippikill 71> rss
    Opening /proc/1245
    Current limit 1073741824, maximum limit 1073741824
    RSS: 15
    Size: 199
    RSS: 16
    Size: 206
    RSS: 17
    Size: 206
    RSS: 19
    Size: 213
    RSS: 21
    Size: 220
    RSS: 23
    Size: 227
    RSS: 25
    Size: 234
    RSS: 27
    Size: 241
    RSS: 28
    Size: 248
    RSS: 29
    Size: 248
    
    
    PeterT