T.R | Title | User | Personal Name | Date | Lines |
---|
9418.1 | | BLAZER::MIKELIS | Software Partner's Eng. MR01-3/F26 | Mon Apr 07 1997 20:00 | 72 |
| Here is a reproducer:
Compile it on a 3.2 system:
cxx -g main.cc -D_REENTRANT -o main.x -lc_r -threads
Run it on a 4.0A system and you should notice it opening
(but not closing) UDP sockets.
I have been using lsof (public domain software) to track
the open files. What I am seeing on 4.0A is this:
main.x 6576 bobby 4u inet 0x5cf4ba40 0t0 UDP *:4633
main.x 6576 bobby 5u inet 0x12308180 0t0 UDP *:4640
main.x 6576 bobby 6u inet 0x12309200 0t0 UDP *:4645
main.x 6576 bobby 7u inet 0x12309a40 0t0 UDP *:4651
On our 4.0 system:
These sockets are not released until the process exits.
Each time a new thread is created and getgrgid_r is
called, a new socket is opened and subsequently not released.
Eventually, system limits are hit and bad things happen.
On our 3.2 system:
Only ONE socket is opened. It remains open until the process
exits, but that's OK (it's only one file handle per process).
It appears that getgrgid_r is properly recycling/reusing this
file handle in 3.2, but in 4.0 it opens a new socket, somehow
forgetting that it had one open already.
CODE:
------
#include <stdlib.h>
#include <stream.h>
#include <grp.h>
#include <pthread.h>
static void* start_routine (void *) {
    gid_t gid;
    group grp;
    char buff[1024];
    cout << "\tThread is running..." << endl;
    grp.gr_name = 0;
    grp.gr_passwd = 0;
    grp.gr_gid = 0;
    grp.gr_mem = 0;
    gid = getgrgid_r(500, &grp, buff, sizeof(buff));
    cout << "\tgetgrgid_r returned = " << gid << endl;
    sleep (5);
    return (0);
}
void main (void) {
    for (int i=0; i < 5; i++) {
        cout << "Starting thread number " << i << endl;
        pthread_t thread;
        pthread_create(&thread, pthread_attr_default, start_routine, 0);
        pthread_addr_t status;
        pthread_join(thread, &status);
        cout << "thread number " << i << " terminated with status " \
             << status << endl;
    }
}
|
9418.2 | there are some anomalies in the supplied code. | SMURF::GAF | Jerry Feldman, Unix Dev. Environment, DTN:381-2970 | Tue Apr 08 1997 09:34 | 18 |
| gid = getgrgid_r(500, &grp, buff, sizeof(buff));
The above call to getgrgid_r is missing a parameter.
The man page for getgrgid_r is:
int getgrgid_r(
gid_t gid,
struct group *grp,
char *buffer,
size_t len,
struct group **result);
Also, the endgrent_r function should be called before exiting the
thread.
void endgrent_r(
FILE **gr_fp);
Note that in V4.0, the standard functions are all thread safe.
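For reference, here is a minimal sketch of a call to the five-argument form shown above, written against the POSIX description of getgrgid_r rather than tested on Digital UNIX. The helper name lookup_group_name, the starting buffer size, and the 1 MB cap are assumptions made for illustration. The point it shows is that the return value is an error number (0 on success), not a gid, and that "no such group" is reported through the result pointer.

```cpp
#include <grp.h>
#include <sys/types.h>
#include <cerrno>
#include <string>
#include <vector>

// Hypothetical helper: looks up a group name with the five-argument
// getgrgid_r(). Returns the error number from getgrgid_r (0 on success).
// `found` is set to true only when a matching entry was returned.
int lookup_group_name(gid_t gid, std::string& name, bool& found) {
    std::vector<char> buf(1024);      // starting size is an assumption
    struct group grp;
    struct group* result = nullptr;
    int err;

    // ERANGE means the caller's buffer was too small, so retry with a
    // larger one, up to an arbitrary 1 MB cap.
    while ((err = getgrgid_r(gid, &grp, buf.data(), buf.size(), &result)) == ERANGE
           && buf.size() < (1u << 20)) {
        buf.resize(buf.size() * 2);
    }

    // Per POSIX, a missing entry is signalled by result == NULL.
    found = (err == 0 && result != nullptr);
    name = found ? result->gr_name : "";
    return err;
}
```

A caller would check `found` before using `name`, and treat a nonzero return as a lookup failure rather than as a group ID.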
|
9418.3 | | BLAZER::MIKELIS | Software Partner's Eng. MR01-3/F26 | Wed Apr 09 1997 11:22 | 34 |
| I passed on your input and got the following back from my customer:
Yes, this is true for the new 4.0 interface but according to the man
pages, DEC still supports the 3.2 interface for backward compatibility.
Notice that our code is not new design. We are just bringing binary
executable code built on 3.2 to a 4.0 system and running it on 4.0.
This is the system call in question here.
[Digital] The following obsolete functions are supported in order to
maintain backward compatibility with previous versions of the
operating system. You should not use them in new designs.
int getgrgid_r(
gid_t gid,
struct group *grp,
char *buffer,
int len);
>
> Also, the endgrent_r function should be called before exiting the
> thread.
>
> void endgrent_r(
> FILE **gr_fp);
>
> Note that in V4.0, the standard functions are all thread safe.
If in fact this system call must be made, then which FILE should be
passed to the system call? This file was not opened by the user,
it was opened by the operating system, so it's the operating system's
responsibility to close it or provide the handle to it (if the user
is expected to close it).
|
9418.4 | Could be a binary incompatibility | SMURF::GAF | Jerry Feldman, Unix Dev. Environment, DTN:381-2970 | Wed Apr 09 1997 19:03 | 10 |
| I erred when I posted .2, so I marked it hidden. Somehow it became
unhidden. In any case, the getgrgid_r call always does a setgrent and
endgrent.
I have not looked at the V3.2 sources, but in V4.0, changes were made
to all of libc to eliminate the libc_r as a separate library, and to
make all of libc thread-safe. If the customer's code were built
-non_shared, then there could be a problem. I would suggest recompiling
and relinking the code on 4.0 to see if that corrects the problem.
|
9418.5 | getgrgid_r() still a problem | BLAZER::MIKELIS | Software Partner's Eng. MR01-3/F26 | Wed Apr 16 1997 12:45 | 78 |
| Here's another response from my ISV. UDP sockets are still
remaining open. A test case is included below. Could you
please take a look and see if there is some kind of solution
I can give my ISV? Thanks/james
------------
As you suggested in a prior emailing, I am now running
my test program on a 4.0A system using the new posix
entry point for getgrgid_r() with the 5 argument signature.
The problem is present in this version of the OS/RTL
as well.
Further, the program will hang after about 3950 iterations,
due to having so many UDP sockets open (I suspect). Each
iteration thru the loop creates a thread which calls getgrgid_r,
which opens a UDP socket but never closes it. Just run lsof
to see how many UDP sockets are open when it's wedged, you'll
see!
At this point we have no way of working around this problem
and it appears to exist in both the 3.2G and the 4.0A versions
of the Digital Unix Operating System.
As a reference point, we did run this test program on an
SGI system running Irix 6.4, which also supports the latest
pthreads standard. The program worked there with no problem.
Can you provide a work-around or a patch for this problem?
Stuck in neutral,
-- bobby --
Here's the code we're running:
------------------------------
#include <stdio.h>
#include <grp.h>
#include <pthread.h>
static void* start_routine (void *) {
    gid_t gid;
    group grp;
    group *grp_ptr;
    char buff[1024];
    fprintf(stderr, "\tThread is running...");
    grp.gr_name = 0;
    grp.gr_passwd = 0;
    grp.gr_gid = 0;
    grp.gr_mem = 0;
    gid = getgrgid_r(500, &grp, buff, sizeof(buff), &grp_ptr);
    fprintf(stderr, "\tgetgrgid_r returned = %d\n", gid);
    // sleep (1);
    return (0);
}
void main (void) {
    for (int i=0; i < 5000; i++) {
        fprintf(stderr, "Starting thread number %d\n", i);
        pthread_attr_t attrs;
        pthread_attr_init(&attrs);
        pthread_attr_setscope(&attrs, PTHREAD_SCOPE_SYSTEM);
        pthread_t thread;
        pthread_create(&thread, &attrs, start_routine, 0);
        void* status;
        pthread_join(thread, &status);
        fprintf(stderr, "thread number %d terminated with status %p\n",
                i, status);
    }
    fprintf(stderr, "End of test program\n");
}
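The leak described above can also be measured from inside the process instead of with lsof. What follows is a rough sketch, not part of the ISV's test case: count_open_fds and fd_delta are hypothetical helper names, and the 65536 cap on the scan is an arbitrary assumption to keep the loop short on systems with a very large descriptor limit. It only uses sysconf and fcntl with F_GETFD, which are standard POSIX calls.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: counts the descriptors currently open in this
// process by probing each descriptor number with fcntl(F_GETFD), which
// fails with -1 for numbers that are not open.
long count_open_fds() {
    long limit = sysconf(_SC_OPEN_MAX);
    // Assumption: fall back to / cap at 65536 so the scan stays cheap.
    if (limit < 0 || limit > 65536) limit = 65536;
    long open_count = 0;
    for (long fd = 0; fd < limit; ++fd) {
        if (fcntl(static_cast<int>(fd), F_GETFD) != -1) ++open_count;
    }
    return open_count;
}

// Hypothetical helper: runs `body` and returns how many more descriptors
// are open afterwards than before. A positive value suggests a leak.
template <typename Body>
long fd_delta(Body body) {
    long before = count_open_fds();
    body();
    return count_open_fds() - before;
}
```

Wrapping the create/join loop from the test program in fd_delta would give a single number to compare between the 3.2 and 4.0A systems. Based on the lsof output reported earlier in this note, one would expect it to grow with the thread count on the affected system, but that has not been confirmed with this sketch.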
|
9418.6 | Please file a QAR or CLD. | SMURF::GAF | Jerry Feldman, Unix Dev. Environment, DTN:381-2970 | Wed Apr 16 1997 18:23 | 2 |
| You really need to submit either a QAR or CLD. This is the only way
that the engineering groups can respond or provide a solution.
|