| Hi,
Please note:
This problem, Log #1997-3439 on ucx bind(), is CRITICAL for us and our
large semiconductor manufacturer customers!!!
Thank you for doing the search. The following server and client programs,
which are modifications of ucx$tcp_server_ipc.c and ucx$tcp_client_ipc.c,
illustrate the socket stealing problem with the bind service when
setsockopt SO_REUSEADDR is done on the tcp ports.
Our problem is that the server is really a daemon on the local machine and
has one port for tcp connections and one for udp access to the network
(udp port is not created in this tcpsrv program). If a second copy of
tcpsrv is started, it steals the tcp port from the first copy, but the udp
port is still owned by the first copy. Thus the second copy cannot
function (i.e., access the network). A complete restart after killing all
programs is the only recovery due to this bug in UCX.
The use of setsockopt SO_REUSEADDR is required by our application because
if the tcpsrv process dies, it must be immediately restarted in order to
lose a minimum number of network packets. To wait two minutes is to wait
forever. The two-minute wait is the behavior if the setsockopt is not
used and the server has any client connections.
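As an illustration of the option discussed above, here is a minimal POSIX-style sketch of a listener that requests SO_REUSEADDR before bind(). The helper name open_listener and the choice of INADDR_ANY are assumptions made for this sketch; it is not code from the daemon or from the UCX examples. It passes the option value as an int and checks the result of setsockopt().

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * open_listener is a hypothetical helper, not part of tcpsrv.c.
 * It creates a TCP socket, requests SO_REUSEADDR, binds to the given
 * port on all local addresses and starts listening.
 * Returns the socket descriptor, or -1 on failure.
 */
int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;                 /* option value is an int, not a char */
    int sock;

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1) {
        perror("socket");
        return -1;
    }
    /* Ask for immediate reuse of the port after the previous owner dies. */
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) == -1) {
        perror("setsockopt");
        close(sock);
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);            /* 0 lets the stack choose */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) == -1) {
        perror("bind");
        close(sock);
        return -1;
    }
    if (listen(sock, 5) == -1) {
        perror("listen");
        close(sock);
        return -1;
    }
    return sock;
}
```

Calling open_listener(7200) would correspond to the default port used by the tcpsrv program below.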
This bug was introduced in UCX v.4.0. I have tested with UCX v.3.3 and it
worked correctly. It also fails under UCX v.4.1 ECO-2, the latest
version I have been able to find in-house. This problem is CRITICAL!!!
Please supply a patch that we may redistribute to our very big DEC
customers as soon as possible.
Incidentally, I first noticed this problem when I was connecting a very
large number of clients to the server (daemon). The server ran out of
byte count quota, causing it to hang. Our networking API responds to that
failure (correctly) by starting a new server, creating a collection of
non-functioning programs that all have to be killed and restarted by
hand.
If the issue is not clear, or you need to have the udp port added to the
tcpsrv, or you need any more information, please contact me.
Louise
~~~~~~~~~~~~~~~~~~~~~~~~~ Have a nice day ~~~~~~~~~~~~~~~~~~~~~~~~~~~
----------------------------------------------------------------------
Louise Wholey | e-mail: [email protected]
TIBCO - The Information Bus Company | http://www.tibco.com/
(formerly Teknekron Software Systems) | direct: 415-846-5262
3165 Porter Drive | main: 415-846-5000
Palo Alto, CA 94304 | fax: 415-846-5005
----------------------------------------------------------------------
======================================================================
Server - from SYS$COMMON:[SYSHLP.EXAMPLES.UCX]UCX$TCP_SERVER_IPC.C
tcpsrv.c program - note: If SO_REUSEADDR setsockopt is used, the tcp
socket will be stolen by another instance of this program. But that
socket option is required to enable another copy to be started immediately
(rather than after waiting 2 minutes) after this one dies.
The reason this is a problem for us is that the tcpsrv uses a UDP port to
access the network (not included in this example). If a second copy of
tcpsrv is started, the first instance of tcpsrv keeps the UDP port, but
all subsequent clients connect to the second copy of tcpsrv. Thus all
network activity is killed until both copies of tcpsrv and all clients are
restarted.
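The UDP side of the problem, which this example leaves out, can be sketched as follows. This is an assumption-laden illustration: bind_udp_exclusive and bound_port are hypothetical names, and the loopback address is used only to keep the sketch self-contained. The socket is bound without SO_REUSEADDR, so on a stack that keeps bound ports exclusive a second bind to the same port is expected to fail, which is why a second copy of the server cannot obtain the UDP port.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * bind_udp_exclusive is a hypothetical helper, not taken from the
 * daemon. It binds a UDP socket to loopback:port WITHOUT setting
 * SO_REUSEADDR. Returns the descriptor, or -1 with errno set.
 */
int bind_udp_exclusive(unsigned short port)
{
    struct sockaddr_in addr;
    int saved;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);            /* 0 lets the stack choose */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) == -1) {
        saved = errno;                      /* keep bind's error code */
        close(sock);
        errno = saved;
        return -1;
    }
    return sock;
}

/* Return the local port a socket is bound to, or 0 on error. */
unsigned short bound_port(int sock)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(sock, (struct sockaddr *)&addr, &len) == -1)
        return 0;
    return ntohs(addr.sin_port);
}
```

A first call such as bind_udp_exclusive(0) plays the role of the first server; passing the port reported by bound_port() to a second call plays the role of the second copy.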
Build and run program using DEC C:
$ CC /L_DOUBLE=64 /FLOAT=IEEE /PREFIX=ALL tcpsrv
$ link tcpsrv
$ run tcpsrv
/*====================================================================
*
* COPYRIGHT (C) 1989 BY
* DIGITAL EQUIPMENT CORPORATION, MAYNARD, MASS.
*
* This software is furnished under a license and may be used and copied
* only in accordance with the terms of such license and with the
* inclusion of the above copyright notice. This software or any other
* copies thereof may not be provided or otherwise made available to any
* other person. No title to and ownership of the software is hereby
* transferred.
*
* The information in this software is subject to change without notice
* and should not be construed as a commitment by DIGITAL EQUIPMENT
* CORPORATION.
*
* DIGITAL assumes no responsibility for the use or reliability of its
* software on equipment which is not supplied by DIGITAL.
*
*
*
* FACILITY:
* INSTALL
*
*
* ABSTRACT:
* This is an example of a TCP/IP server using the IPC
* socket interface.
*
*
* ENVIRONMENT:
* UCX V1.2 or higher, VMS V5.2 or higher
*
* This example is portable to Ultrix. The include
* files are conditionally defined for both systems, and
* "perror" is used for error reporting.
*
* To link in VAXC/VMS you must have the following
* entries in your .opt file:
* sys$library:ucx$ipc.olb/lib
* sys$share:vaxcrtl.exe/share
*
* AUTHORS:
* UCX Developer
*
* CREATION DATE: May 23, 1989
*
* MODIFICATION HISTORY:
*
*/
/*
*
* INCLUDE FILES
*
*/
#ifdef VMS
#include <errno.h>
#include <types.h>
#include <stdio.h>
#include <socket.h>
#include <in.h>
#include <netdb.h> /* change hostent to comply with BSD 4.3 */
#include <inet.h>
#include <ucx$inetdef.h> /* INET symbol definitions */
#include <stdlib.h>
#include <unixio.h>
#else
#include <errno.h>
#include <sys/types.h>
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/uio.h>
#endif
/*
* Functional Description
*
* This examples creates a socket of type SOCK_STREAM (TCP),
* binds and listens on the socket, receives a message
* and closes the connection.
* Error messages are printed to the screen.
*
* IPC calls used:
* accept
* bind
* close
* gethostbyname
* listen
* recv
* shutdown
* socket
*
*
* Formal Parameters
* The server program expects one parameter:
* portnumber ... port number where it will listen
*
*
* Routine Value
*
* Status
*/
void cleanup();
/*--------------------------------------------------------------------*/
main(int argc, char **argv)
{
int sock_2, sock_3; /* sockets */
static char message[BUFSIZ];
static struct sockaddr_in sock2_name; /* Address struct for socket2.*/
static struct sockaddr_in retsock2_name; /* Address struct for socket2.*/
struct hostent hostentstruct; /* Storage for hostent data. */
struct hostent *hostentptr; /* Pointer to hostent data. */
static char hostname[256]; /* Name of local host. */
int flag;
int retval; /* helpful for debugging */
int namelength;
int port = 7200;
int level = SOL_SOCKET;
int optname = SO_REUSEADDR;
int optval = 1; /* SO_REUSEADDR takes an int option value */
int optlen = sizeof optval;
/*
* Check input parameters.
*/
if (argc != 2 )
printf("Will use port 7200.\n");
else
port = atoi(argv[1]);
/*
* Open socket 2: AF_INET, SOCK_STREAM.
*/
if ((sock_2 = socket (AF_INET, SOCK_STREAM, 0)) == -1)
{
perror( "socket");
exit(vaxc$errno);
}
/*
* THIS SETSOCKOPT WITH SO_REUSEADDR ALLOWS SOCKET STEALING!!!
*/
if (setsockopt(sock_2, level, optname, (char *)&optval, optlen) == -1)
{
perror("setsockopt");
cleanup(1, sock_2, 0);
}
/*
* Get the host local name.
*/
retval = gethostname(hostname,sizeof hostname);
if (retval)
{
perror ("gethostname");
cleanup (1, sock_2, 0);
}
/*
* Get pointer to network data structure for socket 2.
*/
if ((hostentptr = gethostbyname (hostname)) == NULL)
{
perror( "gethostbyname");
cleanup(1, sock_2, 0);
}
/*
* Copy hostent data to safe storage.
*/
hostentstruct = *hostentptr;
/*
* Fill in the name & address structure for socket 2.
*/
sock2_name.sin_family = hostentstruct.h_addrtype;
sock2_name.sin_port = htons(port);
sock2_name.sin_addr = * ((struct in_addr *) hostentstruct.h_addr);
/*
* Bind name to socket 2.
*/
retval = bind (sock_2, (struct sockaddr *)&sock2_name, sizeof sock2_name);
if (retval)
{
perror("bind");
cleanup(1, sock_2, 0);
}
/*
* Listen on socket 2 for connections.
*/
retval = listen (sock_2, 5);
if (retval)
{
perror("listen");
cleanup(1, sock_2, 0);
}
/*
* Accept connection from socket 2:
* accepted connection will be on socket 3.
*/
namelength = sizeof (sock2_name);
sock_3 = accept (sock_2, (struct sockaddr *)&sock2_name, &namelength);
if (sock_3 == -1)
{
perror ("accept");
cleanup( 2, sock_2, sock_3);
}
while (1)
{
/*
* Receive message from socket 1 in client.
*/
flag = 0; /* maybe 0 or MSG_OOB or MSG_PEEK */
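retval = recv(sock_3, message, sizeof (message) - 1, flag);
if (retval == -1)
{
perror ("receive");
cleanup( 2, sock_2, sock_3);
}
else if (retval == 0)
{
break; /* client closed the connection */
}
else
{
message[retval] = '\0'; /* recv does not null-terminate the data */
printf (" %s\n", message);
}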
}
/*
* Call cleanup to shutdown and close sockets.
*/
cleanup(2, sock_2, sock_3);
} /* end main */
/*-----------------------------------------------------------*/
void cleanup(int how_many, int sock1, int sock2)
{
int retval;
/*
* Shutdown and close sock1 completely.
*/
retval = shutdown(sock1,2);
if (retval == -1)
perror ("shutdown");
retval = close (sock1);
if (retval)
perror ("close");
/*
* If given, shutdown and close sock2.
*/
if (how_many == 2)
{
retval = shutdown(sock2,2);
if (retval == -1)
perror ("shutdown");
retval = close (sock2);
if (retval)
perror ("close");
}
exit(vaxc$errno);
} /* end cleanup*/
======================================================================
======================================================================
Program from SYS$COMMON:[SYSHLP.EXAMPLES.UCX]UCX$TCP_CLIENT_IPC.C;1
tcpcli.c program - If this program has a connection to tcpsrv (above),
and another instance of tcpsrv is started, this program continues to run
fine, but all new clients (copies of this program) connect to the new
tcpsrv, which cannot process any udp activity. The udp port is still
owned by the first server.
Build and run program using DEC C:
$ CC /L_DOUBLE=64 /FLOAT=IEEE /PREFIX=ALL tcpcli
$ link tcpcli
$ run tcpcli
/*====================================================================
*
* COPYRIGHT (C) 1989 BY
* DIGITAL EQUIPMENT CORPORATION, MAYNARD, MASS.
*
* This software is furnished under a license and may be used and copied
* only in accordance with the terms of such license and with the
* inclusion of the above copyright notice. This software or any other
* copies thereof may not be provided or otherwise made available to any
* other person. No title to and ownership of the software is hereby
* transferred.
*
* The information in this software is subject to change without notice
* and should not be construed as a commitment by DIGITAL EQUIPMENT
* CORPORATION.
*
* DIGITAL assumes no responsibility for the use or reliability of its
* software on equipment which is not supplied by DIGITAL.
*
*
*
* FACILITY:
* INSTALL
*
*
* ABSTRACT:
* This is an example of a TCP/IP client using the IPC
* socket interface.
*
*
* ENVIRONMENT:
* UCX V1.2 or higher, VMS V5.2 or higher
*
* This example is portable to Ultrix. The include
* files are conditionally defined for both systems, and
* "perror" is used for error reporting.
*
* To link in VAXC/VMS you must have the following
* entries in your .opt file:
* sys$library:ucx$ipc.olb/lib
* sys$share:vaxcrtl.exe/share
*
* AUTHORS:
* UCX Developer
*
* CREATION DATE: May 23, 1989
*
* MODIFICATION HISTORY:
*
*/
/*
*
* INCLUDE FILES
*
*/
#ifdef VMS
#include <errno.h>
#include <types.h>
#include <stdio.h>
#include <socket.h>
#include <in.h>
#include <netdb.h> /* change hostent to comply with BSD 4.3*/
#include <inet.h>
#include <ucx$inetdef.h> /* INET symbol definitions */
#include <stdlib.h>
#include <unixio.h>
#include <string.h>
#include <signal.h>
#else
#include <errno.h>
#include <sys/types.h>
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/uio.h>
#endif
/*
*
* MACRO DEFINITIONS
*
*/
#ifndef vms
#define TRUE 1
#define FALSE 0
#endif
/*
* Functional Description
*
* This example creates a socket of type SOCK_STREAM (TCP),
* initiates a connection to the remote host, sends
* a message to the remote host, and closes the connection.
* Error messages are printed to the screen.
*
* IPC calls used:
* close
* connect
* gethostbyname
* send
* shutdown
* socket
*
*
* Formal Parameters
* The client program expects two parameters:
* hostname ... name of remote host
* portnumber ... port where remote host(server) is listening
*
*
* Routine Value
*
* Status
*/
void cleanup();
/*--------------------------------------------------------------------*/
main(argc,argv)
int argc;
char **argv;
{
int sock_1; /* socket */
static char message[] = "Hi there.";
static struct sockaddr_in sock2_name; /* Address struct for socket2.*/
struct hostent hostentstruct; /* Storage for hostent data. */
struct hostent *hostentptr; /* Pointer to hostent data. */
static char hostname[256]; /* Name of local host. */
int flag;
int retval; /* helpful for debugging */
int shut = FALSE; /* flag to cleanup */
int port = 7200;
char host[100];
/*
* Check input parameters.
*/
if (argc != 3 )
{
printf("Will use local host and port 7200.\n");
gethostname( host, sizeof host);
} else
{
strncpy(host, argv[1], sizeof host - 1); /* do not overrun host[] */
host[sizeof host - 1] = '\0';
port = atoi(argv[2]);
}
/*
* Open socket 1: AF_INET, SOCK_STREAM.
*/
if ((sock_1 = socket (AF_INET, SOCK_STREAM, 0)) == -1)
{
perror( "socket");
exit(vaxc$errno);
}
/*
*Get pointer to network data structure for socket 2 (remote host).
*/
if ((hostentptr = gethostbyname (host)) == NULL)
{
perror( "gethostbyname");
cleanup(shut, sock_1);
}
/*
* Copy hostent data to safe storage.
*/
hostentstruct = *hostentptr;
/*
* Fill in the name & address structure for socket 2.
*/
sock2_name.sin_family = hostentstruct.h_addrtype;
sock2_name.sin_port = htons(port);
sock2_name.sin_addr = * ((struct in_addr *) hostentstruct.h_addr);
/*
* Connect socket 1 to sock2_name.
*/
retval = connect(sock_1, (struct sockaddr *)&sock2_name, sizeof (sock2_name));
if (retval)
{
perror("connect");
cleanup(shut, sock_1);
}
/*
* Send message to socket 2.
*/
flag = 0; /* maybe 0 or MSG_OOB */
retval = send(sock_1, message ,sizeof (message), flag);
if (retval < 0)
{
perror ("send");
shut = TRUE;
}
sleep(30);
/*
* Call cleanup to shutdown and close socket.
*/
cleanup(shut, sock_1);
} /* end main */
/*-----------------------------------------------------------*/
void cleanup(shut, socket)
int shut;
int socket;
{
int retval;
/*
* Shutdown socket completely -- only if it was connected
*/
if (shut) {
retval = shutdown(socket,2);
if (retval == -1)
perror ("shutdown");
}
/*
* Close socket.
*/
retval = close (socket);
if (retval)
perror ("close");
exit(vaxc$errno);
} /* end cleanup */
|
| #1 29-APR-1997 02:51:20.40
MAIL
From: HYDRA::AXPDEVELOPER "[email protected]"
To: shen
CC: AXPDEVELOPER
Subj: FWD: Re: ucx bind() behavior incompatibility problem
From: SMTP%"[email protected]" 28-APR-1997 17:34:51.86
To: "[email protected]" <[email protected]>
CC: alpha-developer <[email protected]>,
"[email protected]"
<[email protected]>, "[email protected]"
<[email protected]>
Subj: Re: ucx bind() behavior incompatibility problem
Return-Path: [email protected]
Received: by asimov.mro.dec.com (UCX V4.1-12, OpenVMS V6.2 VAX);
Mon, 28 Apr 1997 17:34:49 -0400
Received: from pobox1.pa.dec.com by fluid.mro.dec.com
(5.65v4.0/1.1.8.2/19Nov96-0448PM)
id AA09866; Mon, 28 Apr 1997 17:34:46 -0400
Received: by pobox1.pa.dec.com; id AA14831; Mon, 28 Apr 97 14:34:45
-0700
Received: by pkohub1.athena.pko.dec.com with SMTP (Microsoft Exchange
Server
Internet Mail Connector Version 4.0.994.63)
id <[email protected]>; Mon, 28 Apr
1997
17:36:52 -0400
Received: from mail11.digital.com by mrohub1.mro.dec.com with SMTP
(Microsoft
Exchange Internet Mail Connector Version 4.0.995.52)
id JW664KDM; Mon, 28 Apr 1997 17:36:52 -0400
Received: from flash.tibco.com by mail11.digital.com (8.7.5/UNX
1.5/1.0/WV)
id RAA04515; Mon, 28 Apr 1997 17:25:38 -0400 (EDT)
Received: by flash.tibco.com (4.1/1.37)
id AA02122; Mon, 28 Apr 97 14:17:32 PDT
Received: from tssgate.tibco.com(160.101.20.20) by flash.tibco.com via
smap
(V1.3)
id sma002118; Mon Apr 28 14:17:24 1997
Received: from keylargo.tibco.com by tekbspa.tibco.com (4.1/SMI-4.1)
id AA28004; Mon, 28 Apr 97 14:17:24 PDT
Received: from condor.23net by keylargo.tibco.com (4.1/SMI-4.1)
id AA25594; Mon, 28 Apr 97 14:17:22 PDT
Received: by condor.23net (SMI-8.6/SMI-SVR4)
id OAA12779; Mon, 28 Apr 1997 14:17:22 -0700
Message-Id:
<c=US%a=_%p=Digital%[email protected]>
From: Louise Wholey <[email protected]>
To: "[email protected]" <[email protected]>
Cc: alpha-developer <[email protected]>,
"[email protected]"
<[email protected]>,
"[email protected]"
<[email protected]>
Subject: Re: ucx bind() behavior incompatibility problem
Date: Mon, 28 Apr 1997 17:17:22 -0400
X-Mailer: Microsoft Exchange Server Internet Mail Connector Version
4.0.994.63
Encoding: 150 TEXT
Hi,
Thank you for your reply about the SO_REUSEADDR problem in UCX v.4.0.
Our application needs the normal UNIX behavior for SO_REUSEADDR,
especially on TCP sockets, which is that an application can immediately
reuse a port after the original owner dies. It is also important to us
to maintain the normal behavior of ports, that is, once a port has been
bound, it cannot be accessed by another application.
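Whether a given stack keeps a bound port exclusive in this sense can be checked with a small probe. The sketch below is hypothetical (the names bind_tcp_reuse and probe_port_stealing are assumptions, and it uses the loopback address and a stack-chosen port rather than port 7500). It sets SO_REUSEADDR on two TCP sockets, puts the first into the listening state, and then tries to bind the second to the same port.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a TCP socket with SO_REUSEADDR set and bind it to
 * loopback:port. Returns the descriptor, or -1 on any failure. */
static int bind_tcp_reuse(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) == -1 ||
        bind(sock, (struct sockaddr *)&addr, sizeof addr) == -1) {
        close(sock);
        return -1;
    }
    return sock;
}

/*
 * probe_port_stealing is a hypothetical diagnostic.
 * Returns  1 if a second socket could bind to a port that already has
 *            a live listener (the behavior reported for UCX v.4.0),
 *          0 if the second socket could not be bound,
 *         -1 if the probe itself could not be set up.
 */
int probe_port_stealing(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int first, second, stolen;

    first = bind_tcp_reuse(0);              /* port 0: stack picks one */
    if (first == -1)
        return -1;
    if (listen(first, 5) == -1 ||
        getsockname(first, (struct sockaddr *)&addr, &len) == -1) {
        close(first);
        return -1;
    }
    second = bind_tcp_reuse(ntohs(addr.sin_port));
    stolen = (second != -1);
    if (second != -1)
        close(second);
    close(first);
    return stolen;
}
```

A stack that keeps the port exclusive while the first listener is alive returns 0 from this probe; a return of 1 corresponds to the stealing behavior described in this report.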
We are using SO_REUSEADDR on TCP sockets to enable immediate access to
the same port after a connection is broken. This is important for our
API in order for it to re-establish communication with the Rendezvous
daemon if the daemon dies. Our API will attempt to start a new daemon if
it cannot communicate with the daemon on the TCP port (default is 7500)
used for the connection.
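The reachability test implied here can be sketched as a plain connect() attempt. This is only an illustration under stated assumptions: daemon_is_reachable is a hypothetical name, the daemon is assumed to listen on the loopback address, and the step that actually starts a new daemon is left out.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * daemon_is_reachable is a hypothetical helper, not the Rendezvous API.
 * It tries to open a TCP connection to loopback:port and closes it
 * again. Returns 1 if the connection was accepted, 0 otherwise.
 */
int daemon_is_reachable(unsigned short port)
{
    struct sockaddr_in addr;
    int sock, ok;

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1)
        return 0;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    /* A refused connection means nothing is listening on the port. */
    ok = (connect(sock, (struct sockaddr *)&addr, sizeof addr) == 0);
    close(sock);
    return ok;
}
```

A caller would start a new daemon only when daemon_is_reachable(7500) returns 0. Note that this check cannot tell which daemon owns the port, so it does not by itself detect the stealing problem.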
Under UCX v.4.0, we have a completely broken environment. Here is why.
Our Rendezvous daemon starts up and listens on TCP port 7500 for client
connections. If a client connects to the daemon and requests network
service on UDP port 7500 (which may be for either broadcast or multicast
messages), then the daemon will be using TCP port 7500 and UDP port
7500, the default Rendezvous service, as well as the socket created by
accept() that connects it to the client.
Under UCX v.4.0, if another daemon starts running, it steals the TCP
port 7500 from the first daemon. That is, the first daemon can no longer
be accessed by new clients on TCP port 7500 because the second daemon
now owns that port. But the second daemon is disabled because it cannot
use UDP port 7500 (our default service) to access UDP traffic on the
network.
The first daemon owns UDP port 7500. We do not use SO_REUSEADDR for UDP
sockets. The solution to this dilemma is to stop all Rendezvous programs
and start again.
The problem appeared when we were trying to increase the number of
allowed daemon connections for a customer who wants 500 or more
connections to our daemon. The second daemon was started by the API when
the first one ran out of BYTLM quota. At that point the daemons need to
be stopped and all applications need to be restarted because of the
confused port ownership described above.
We would like to see SO_REUSEADDR work as it did under UCX v.3.x. If you
invent a new option such as SO_SHAREPORT, it will not break pre-existing
applications.
Please let me know what you plan to do so that I may advise customers
how best to deal with this problem.
Louise
--
~~~~~~~~~~~~~~~~~~~~~~~~ Have a nice day ~~~~~~~~~~~~~~~~~~~~~~~~~~~
----------------------------------------------------------------------
Louise Wholey | e-mail: [email protected]
TIBCO - The Information Bus Company | http://www.tibco.com/
(formerly Teknekron Software Systems) | direct: 415-846-5262
3165 Porter Drive | main: 415-846-5000
Palo Alto, CA 94304 | fax: 415-846-5005
----------------------------------------------------------------------
> Cc: [email protected], [email protected]
> Date: Mon, 28 Apr 97 12:43:04 -0400
> From: [email protected]
> X-Mts: smtp
>
>
> Louise,
>
> An IPMT report was submitted to Digital Customer Support center
> for the ucx v4.x bind() incompatible behavior problem with v3.3
> reported by you. The CSC/IPMT tracking number is:
>
> C970408-2160
>
> Attached is the explanation from the UCX developer. Please let us
> know what you think about it.
>
> Regards,
>
> Alpha Developer Support.
>
> From: csc32::[email protected] (Customer Support, Software
> 22-Apr-1997 1555 -0400)
> To: [email protected]
> Cc: [email protected]
> Subject: Answer on UCX problem
> Content-Type: text/plain; charset=US-ASCII
> Content-Transfer-Encoding: 7bit
>
>
> Robert,
>
> This is the response from UCS Engineering on a problem submitted
> by you through UCX notes conference note #5406. The response
> follows. If this does not work for you or the customer please
> explain in detail why and answer the question asked by the
> Engineer. I'll forward your answer to the Engineer. Otherwise
> both Engineering and the CSC will consider this problem solved.
>
>
> Thank you,
> Jayna Rabke
>
>
> "This change was made as part of the work for UCX V4.0 to support
> IP multicasting. In a multicast environment, it is important for
> several sockets, owned by different processes, to be able to share
> a single local port number. For example, consider the case of a
> video receiver application, where several users of a large
> timesharing system may wish to view the same video source. The
> video packets are encapsulated within multicast frames; each frame
> is delivered to many different hosts, and potentially to several
> different processes within that host.
>
> "Can you explain why your application is setting SO_REUSEADDR
> when, in fact, it would prefer not to reuse the local address
> and port? Offhand, this sounds like a bug in the application.
>
> "One possible resolution is that, for a future version of UCX,
> we are considering whether to split the functionality of the
> current SO_REUSEADDR up into two separate options: SO_REUSEADDR
> and SO_REUSEPORT. That way, SO_REUSEADDR could work (or not
> work, depending upon your point of view) as it always has, and
> SO_REUSEPORT could provide for newer applications which actually
> do want to reuse local port numbers."
>
>
> Mark "MyTH"
>
|