
Conference bulova::decw_jan-89_to_nov-90

Title:DECWINDOWS 26-JAN-89 to 29-NOV-90
Notice:See 1639.0 for VMS V5.3 kit; 2043.0 for 5.4 IFT kit
Moderator:STAR::VATNE
Created:Mon Oct 30 1989
Last Modified:Mon Dec 31 1990
Last Successful Update:Fri Jun 06 1997
Number of topics:3726
Total number of notes:19516

3654.0. "Problem with connections during loginout" by WSTWLD::KENT (I'm not cynical, just experienced...) Tue Nov 13 1990 18:40

  Our environment consists of VMS 5.4 ( DECW v2.1? ) and Ada v2.0-18
  running on an 8-plane VS3520.

  We're using a somewhat unique DECwindows environment in which many of
  the standard DECwindows command files have been altered.  The problem
  we are encountering is that after logging out of our application 28 times
  ( ten times if we are running all of our usual processes, 28 times if we
  only run our login process ), the cursor changes to the wristwatch symbol 
  and stays that way.  We have to restart DECwindows in order to get the 
  login box back.  A review of the server error log shows that every 
  connection but one has been closed.  Three connections are open when we get 
  the "Cannot allocate..." message in the server log.  The last four lines 
  from the log are shown below.  Note that this output log was generated
  by running our login process 28 times.  The other processes that we usually
  run were not started, since they take a LONG time to bring up.

     6-NOV-1990 08:37:40.7 Connection b7168 is accepted by Txport
     6-NOV-1990 08:37:40.9 Cannot allocate clientrec, or bigbuffer
     6-NOV-1990 08:37:41.4 Connection b7130 is closed by Txport
     6-NOV-1990 08:43:37.3 Connection b2538 is closed by Txport

  I've read in other notes that the "Cannot allocate..." message indicates
  a lack of virtual memory, but it seems odd that our application can be
  run a specific number of times before the server fails.

  The DECW$SM.LOG file displays a message that shows that our login
  process was unable to open the display.  However, according to the 
  server logfile, there are only three connections open, so I don't 
  know why it can't open the display.

  Our complete application actually consists of a login process and 
  several other processes which all use DECwindows.  The other processes stay 
  up ( connected to the server ) when our login process is terminated.  When
  our login process terminates, it causes DECW$STARTLOGIN to be run, so we
  can use VMS's login/password features for our application.  Our login
  process is really just a front end for our application that gets started
  when the user logs in from the normal login box.  This whole mess was
  necessary because we had to maintain the connections from our processes
  to the DECwindows server, since all of the processes take a long time to
  build all of their windows and so forth.  We were also required to use 
  VMS's login features.

  To accomplish this "feat", the following changes were made to the DECwindows
  command files.

    - DECW$SYLOGIN.COM

      $!
      $ run USER_DISK:OUR_LOGIN       ! Our login front end

    - DECW$STARTSM.COM
    
      .
      .
      $! run SYS$SYSTEM:DECW$SESSION  ! We disable the SESSION MANAGER

    - SYLOGICALS.COM
      
      .
      .
      $define/sys/exec DECW$LOGIN_MULTIPLE 1  

    - RELOGIN.COM 

      $! This command file is executed when we exit from our login process
      $!
      $define DECW$DISPLAY WSA0
      $run SYS$SYSTEM:DECW$STARTLOGIN
      $exit

  There, that should be confusing enough. :-)  Can anyone give me any clues
  about why we can only run ( and exit ) our login process a certain number
  of times before the server refuses to talk anymore?  I don't think we've
  used up all of our connections, since there are only three in use when
  we get the error message.  Is there something that the session manager
  does upon exiting that our application also needs to do?  I know it seems
  strange not to be using the session manager, but that was necessary to
  prevent the user from doing anything but running our application and to 
  prevent it from closing all of our connections.  I'm also not the one who 
  devised this scheme or wrote any of the code, but they have all left our 
  group and I have to help find a solution so we don't have to restart the 
  server ( and all of our processes ) after ten logins.

  Thanks for any help,

  ...Scott

T.R  Title  User  Personal Name  Date  Lines
3654.1  Hard choices ahead  STAR::VATNE  Peter Vatne, VMS Development  Wed Nov 14 1990 11:41  24
I hate to say this, but any time you do unsupported things to a system,
you are occasionally going to run up against bad and unsolved problems.
It is those problems that are the very reason we don't support something!

In your case, you are probably running up against a server "memory leak".
Memory leaks are a class of server bug where the memory allocated for
a client is not correctly released.  Therefore, after a very long time
of creating and destroying server objects, the server eventually runs
out of virtual memory, and you have to restart the server.
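
A leak of this kind would also explain why the failure shows up after a
repeatable number of logins rather than at random.  The following is only a
toy model, not server code: the budget, the per-login leak, and the amount
held by long-lived clients are all made-up numbers, picked so that the
arithmetic lands on the 28 and 10 logins reported in .0.  The assumption is
that each login/logout cycle loses a fixed amount, and that clients which
stay connected leave less headroom to lose.

```ada
with Text_IO;

procedure Leak_Model is
   --  Hypothetical model: every figure here is an assumption chosen to
   --  reproduce the 28 and 10 login counts from the base note.  None of
   --  them is a measured server value.
   Server_Budget  : constant Natural := 2_800;  --  units the server can allocate
   Leak_Per_Login : constant Natural := 100;    --  units lost per login/logout cycle

   --  Count the login cycles that fit before an allocation would fail,
   --  given the memory already held by long-lived clients.
   function Logins_Before_Failure (Held_By_Clients : Natural) return Natural is
      Used   : Natural := Held_By_Clients;
      Logins : Natural := 0;
   begin
      while Used + Leak_Per_Login <= Server_Budget loop
         Used   := Used + Leak_Per_Login;  --  leaked memory is never returned
         Logins := Logins + 1;
      end loop;
      return Logins;
   end Logins_Before_Failure;

begin
   Text_IO.Put_Line ("Login process only:" &
                     Natural'Image (Logins_Before_Failure (0)));
   Text_IO.Put_Line ("All processes up:  " &
                     Natural'Image (Logins_Before_Failure (1_800)));
end Leak_Model;
```

With these invented figures the model reports 28 logins when nothing else is
connected and 10 when the other clients already hold 1_800 units.  A larger
leak per login with the full application running would fit the observations
just as well; the model cannot tell the two apart.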

The reason regular users don't see this problem very often is that when
they logout (quit session), the server performs a "reset", which will
cause all memory to be logically deallocated, including "lost" memory.

Of course, we should find and fix memory leaks in the server.  However,
this is a long and time-consuming process.  In the meantime, the only
thing I can suggest for you is to close down all connections every
ten times, and have your application rebuild all its windows, as
painful as that may be.

I can't help wondering whether you should instead be concentrating
on why your application takes so long to start up.  If that could be
solved, you could then let the server naturally reset when the user
logs out.
3654.2  More info on the problem  GSRC::KENT  I'm not cynical, just experienced...  Mon Nov 26 1990 22:22  58
  Re: .1

  I realize that our configuration is an odd one, but how many of our 
  changes are considered to be unsupported?  I don't mean to say that
  we heartily approved of the changes to the DECwindows command files,
  but we were driven by some system requirements.  Apparently, Digital's
  loginout process was certified for our DoD application, and the project
  wasn't permitted to develop its own login/logout verification
  program.  Although the login front end developed for the project does do 
  additional user verification, Digital's loginout and security verification 
  processing must be successfully completed first.  The time it takes to
  start all of the graphical processes and the fact that they are required 
  to accumulate data even when no one is logged in are reasons why we 
  couldn't allow the server connections to be broken by normal logoff 
  processing.  

  I asked how much is unsupported because of the logical
  DECW$LOGIN_MULTIPLE.  It seems that to make use of this logical, it was
  necessary to modify some of the DECwindows command files.  Otherwise,
  you would end up with multiple session managers.  This can be fun, but not
  overly useful!  Or maybe that is useful to some people!?  Well, we changed
  the command files and now, of course, we seem to be finding ( causing? )
  problems that other people aren't having.  So, when you spoke of
  unsupported things, did you have in mind certain aspects of our 
  environment or is the whole mess unsupported?  Or is it simply non-standard?
  I know, I'm playing games with semantics.  It just seemed that we'd be
  more likely to get help for a "non-standard" environment than an 
  "unsupported" one!
  
  I did appreciate your explanation of what the problem is most likely to
  be.  Since we are not using the session manager to logoff, is there any
  other way to "reset" the server in such a way that the server recovers
  what it may have lost and still maintain our connections?  ( And can I
  have my cake and eat it, too...?  I know this is a silly question, but
  I had to ask anyway! ) 

  Unfortunately, we simply CAN'T close down all the connections after ten
  logins.  Users will be logging on and off frequently, and forcing a 
  restart of everything after every ten logins is unacceptable and might
  even be considered dangerous considering the mission-critical environment 
  in which this is running.  The time required to restart everything is 
  important, but what could be happening while these processes are 
  unavailable is also important.

  We've been able to duplicate the problem with a reduced version of our
  login process and the command file changes that I mentioned in the base
  note.  This program ( in Ada, but should be simple to convert to any
  other language ) is available in the next reply.  Basically, all it does
  is open a display connection, call add_host with "* * *" so anyone can
  use the X server, close the connection, and then execute a do_command
  to restart the login box.  On our system ( a VS3520 ) we need to log in
  30 times ( with this version of our login program ) before we can no 
  longer open any connections.  Can we get anyone to look at this?  If it 
  appears to be a problem with the X server, should we QAR it?  

  ...Scott

3654.3  Source code for the previous reply  GSRC::KENT  I'm not cynical, just experienced...  Mon Nov 26 1990 22:24  58
with 
  system, x, text_io, lib;

use
  system;

procedure OUR_LOGIN is
  --
  --  This program demonstrates "locking up" the X server after a certain
  --  number of logins have been performed.  Modifications to some of the
  --  DECwindows command files are also required before this program can 
  --  be used.
  --

  Display   : X.Display_Id_Type;
  Do_Status : System.Unsigned_Longword;

  Host_Access_String  : Constant String := "* * *";
  Host_Network_Record : X.Host_Address_Type :=
        ( Host_Family  => X.C_Family_Generic,
          Host_Length  => Host_Access_String'Length,
          Host_Address => Host_access_String'Address );

begin

  --
  --  Open the display
  --
  X.OPEN_DISPLAY ( Result => Display );
  if Display = System.Address_Zero then
    TEXT_IO.PUT_LINE("Could not open display!");
    return;
  end if;

  --
  --  Make the X server available to anyone
  --
  X.ADD_HOST( Display => Display,
              Host    => Host_Network_Record );

  --
  --  Close the display connection
  --
  X.CLOSE_DISPLAY( Display => Display );

  --
  --  Execute the relogin.com file so it will bring up a new login box.  
  --  Executing the DO_COMMAND should also terminate this program.
  --  
  LIB.DO_COMMAND( Status         => Do_Status,
                  Command_String => "@SYS$MANAGER:RELOGIN.COM" );

  --
  --  If we get to this location, then the DO_COMMAND must have failed.
  --
  TEXT_IO.PUT_LINE("DO_COMMAND must have failed..."); 

end OUR_LOGIN;
3654.4  Explanation and a suggestion  STAR::VATNE  Peter Vatne, VMS Development  Tue Nov 27 1990 22:56  33
There is a big semantic difference between "non-standard" and "unsupported".
A standard system is the DECwindows system you get out of the box with no
modifications.  A non-standard system is one that you have customized.
A non-standard system can be either supported or unsupported.

A supported system is one for which we guarantee that we will answer QARs.
You can certainly use an unsupported system, but if you find any problems,
we don't guarantee we will attempt to fix them.  We will certainly listen
with interest, as unsupported systems may get supported in the future if
business needs warrant it.  We document in the installation guide those
customizations that are supported.  Anything not documented is probably
not supported, although there have been documentation oversights.

Now down to nuts and bolts: the use of DECW$LOGIN_MULTIPLE is unsupported
in general.  It is put there as a hook for development groups to develop
their own prototypes.  Whether or not this hook is supported for specific
uses is negotiated between the development group and VMS DECwindows.

My suggestion is for you to redesign this system.  It is best to let
the server reset between user sessions.  I would suggest that the
application be modified to somehow detect that the user is logging
out, and close down the display, but keep on collecting data.  The
application should then detect when the user is logging back in
again, and re-open the display.

There are lots of techniques that you could use to implement such
a scheme.  You could probably run a program that then talks to the
application via a mailbox, and lets the application know that the
user has logged in.  If this is truly a mission-critical environment,
I would suggest that the display programs be run in a separate
process from the data collection programs.  If you need lots of
bandwidth, I suggest using global sections to pass the data
between the two processes.
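
To make the shape of that redesign concrete, here is a rough sketch, in Ada
since that is what the application is written in.  It is a model only: the
task and entry names are invented, Ada tasks in one program stand in for
what would really be separate VMS processes, and the entry calls stand in
for whatever notification channel (a mailbox, for instance) is actually
used.  The comments mark where the real X.OPEN_DISPLAY and X.CLOSE_DISPLAY
calls would go.  The point it illustrates is that data collection keeps
running while the display connection comes and goes with the user session.

```ada
with Text_IO;

procedure Detachable_Display is
   --  Hypothetical sketch: Collector and its entries are invented names.
   --  Real code would use separate processes and a mailbox or similar.

   task Collector is
      entry User_Logged_In;    --  ask the application to re-open the display
      entry User_Logged_Out;   --  ask it to close the display
      entry Shut_Down;         --  end the demonstration
   end Collector;

   task body Collector is
      Display_Open : Boolean := False;
      Samples      : Natural := 0;
      Running      : Boolean := True;
   begin
      while Running loop
         select
            accept User_Logged_In;
            if not Display_Open then
               Display_Open := True;    --  X.OPEN_DISPLAY would go here
               Text_IO.Put_Line ("Display opened; samples so far:" &
                                 Natural'Image (Samples));
            end if;
         or
            accept User_Logged_Out;
            if Display_Open then
               Display_Open := False;   --  X.CLOSE_DISPLAY would go here
               Text_IO.Put_Line ("Display closed; still collecting");
            end if;
         or
            accept Shut_Down;
            Running := False;
         else
            Samples := Samples + 1;     --  collection never depends on the display
            delay 0.01;
         end select;
      end loop;
   end Collector;

begin
   --  Simulate two user sessions with a logged-out gap between them.
   delay 0.1;
   Collector.User_Logged_In;
   delay 0.1;
   Collector.User_Logged_Out;
   delay 0.1;
   Collector.User_Logged_In;
   Collector.Shut_Down;
end Detachable_Display;
```

Because the display is closed at every logout, the server sees no client
from the application between sessions and is free to reset, while the
sample count keeps climbing in the background.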