T.R | Title | User | Personal Name | Date | Lines |
---|
3654.1 | Hard choices ahead | STAR::VATNE | Peter Vatne, VMS Development | Wed Nov 14 1990 11:41 | 24 |
| I hate to say this, but any time you do unsupported things to a system,
you are occasionally going to run up against bad and unsolved problems.
It is those problems that are the very reason we don't support something!
In your case, you are probably running up against a server "memory leak".
Memory leaks are a class of server bug where the memory allocated for
a client is not correctly released. Therefore, after a very long time
of creating and destroying server objects, the server eventually runs
out of virtual memory, and you have to restart the server.
The reason regular users don't see this problem very often is that when
they logout (quit session), the server performs a "reset", which will
cause all memory to be logically deallocated, including "lost" memory.
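To make the mechanism concrete, here is a minimal sketch of a leak followed by a reset. It is a toy model, not DECwindows server code: the pool size, the per-client cost, the per-client leak, and the names Open_And_Close_Client and Reset_Server are all assumptions made up for illustration.

```ada
with Text_IO;

procedure Leak_Demo is
   --  Hypothetical model, not DECwindows server code: the "server" owns a
   --  fixed pool of memory units and leaks a little on every client session.
   Pool_Size       : constant := 300;  --  assumed total memory units
   Cost_Per_Client : constant := 50;   --  units allocated per connection
   Leak_Per_Client : constant := 10;   --  units never released on close

   In_Use : Natural := 0;  --  units currently unavailable to new clients
   Logins : Natural := 0;  --  successful connections so far

   --  Simulates one client connecting and disconnecting.
   --  Returns False when the pool cannot satisfy a new connection.
   function Open_And_Close_Client return Boolean is
   begin
      if In_Use + Cost_Per_Client > Pool_Size then
         return False;
      end if;
      In_Use := In_Use + Cost_Per_Client;                      --  allocate
      In_Use := In_Use - (Cost_Per_Client - Leak_Per_Client);  --  buggy release
      return True;
   end Open_And_Close_Client;

   --  A reset frees everything, including memory nobody tracks any more.
   procedure Reset_Server is
   begin
      In_Use := 0;
   end Reset_Server;

begin
   --  Keep connecting until the leaked memory exhausts the pool.
   while Open_And_Close_Client loop
      Logins := Logins + 1;
   end loop;
   Text_IO.Put_Line
     ("Connections before exhaustion:" & Natural'Image (Logins));

   Reset_Server;
   if Open_And_Close_Client then
      Text_IO.Put_Line ("After reset the server accepts clients again.");
   end if;
end Leak_Demo;
```

Each simulated connection looks harmless on its own; only the accumulated leak stops new connections, and only the reset clears it.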
Of course, we should find and fix memory leaks in the server. However,
this is a long and time-consuming process. In the meantime, the only
thing I can suggest is to close down all connections after every
ten logins, and have your application rebuild all its windows, as
painful as that may be.
I can't help wondering whether you should be concentrating
on why your application takes so long to start up. If that could be
solved, you could then let the server naturally reset when the user
logs out.
|
3654.2 | More info on the problem | GSRC::KENT | I'm not cynical, just experienced... | Mon Nov 26 1990 22:22 | 58 |
|
Re: .1
I realize that our configuration is an odd one, but how many of our
changes are considered to be unsupported? I don't mean to say that
we heartily approved of the changes to the DECwindows command files,
but we were driven by some system requirements. Apparently, Digital's
loginout process was certified for our DoD application, and it wasn't
permitted for the project to develop its own login/logout verification
program. Although the login front end developed for the project does do
additional user verification, Digital's loginout and security verification
processing must be successfully completed first. The time it takes to
start all of the graphical processes and the fact that they are required
to accumulate data even when no one is logged in are reasons why we
couldn't allow the server connections to be broken by normal logoff
processing.
The reason I asked about how much is unsupported is due to the logical
DECW$LOGIN_MULTIPLE. It seems that to make use of this logical, it was
necessary to modify some of the DECwindows command files. Otherwise,
you would end up with multiple session managers. This can be fun, but not
overly useful! Or maybe that is useful to some people!? Well, we changed
the command files and now, of course, we seem to be finding ( causing? )
problems that other people aren't having. So, when you spoke of
unsupported things, did you have in mind certain aspects of our
environment or is the whole mess unsupported? Or is it simply non-standard?
I know, I'm playing games with semantics. It just seemed that we'd be
more likely to get help for a "non-standard" environment than an
"unsupported" one!
I did appreciate your explanation of what the problem is most likely to
be. Since we are not using the session manager to logoff, is there any
other way to "reset" the server in such a way that the server recovers
what it may have lost and still maintain our connections? ( And can I
have my cake and eat it, too...? I know this is a silly question, but
I had to ask anyway! )
Unfortunately, we simply CAN'T close down all the connections after ten
logins. Users will be logging on and off frequently, and forcing a
restart of everything after every ten logins is unacceptable and might
even be considered dangerous considering the mission-critical environment
in which this is running. The time required to restart everything is
important, but what could be happening while these processes are
unavailable is also important.
We've been able to duplicate the problem with a reduced version of our
login process and the command file changes that I mentioned in the base
note. This program ( in Ada, but should be simple to convert to any
other language ) is available in the next reply. Basically, all it does
is open a display connection, call add_host with "* * *" so anyone can
use the X server, close the connection, and then execute a do_command
to restart the login box. On our system ( a VS3520 ) we need to login
30 times ( with this version of our login program ) before we can no
longer open any connections. Can we get anyone to look at this? If it
appears to be a problem with the X server, should we QAR it?
...Scott
|
3654.3 | Source code for the previous reply | GSRC::KENT | I'm not cynical, just experienced... | Mon Nov 26 1990 22:24 | 58 |
| with
    system, x, text_io, lib;
use
    system;
procedure OUR_LOGIN is
    --
    -- This program demonstrates "locking up" the X server after a certain
    -- number of logins have been performed. Modifications to some of the
    -- DECwindows command files are also required before this program can
    -- be used.
    --
    Display             : X.Display_Id_Type;
    Do_Status           : System.Unsigned_Longword;
    Host_Access_String  : constant String := "* * *";
    Host_Network_Record : X.Host_Address_Type :=
        ( Host_Family  => X.C_Family_Generic,
          Host_Length  => Host_Access_String'Length,
          Host_Address => Host_Access_String'Address );
begin
    --
    -- Open the display
    --
    X.OPEN_DISPLAY ( Result => Display );
    if Display = System.Address_Zero then
        TEXT_IO.PUT_LINE("Could not open display!");
        return;
    end if;
    --
    -- Make the X server available to anyone
    --
    X.ADD_HOST( Display => Display,
                Host    => Host_Network_Record );
    --
    -- Close the display connection
    --
    X.CLOSE_DISPLAY( Display => Display );
    --
    -- Execute the relogin.com file so it will bring up a new login box.
    -- Executing the DO_COMMAND should also terminate this program.
    --
    LIB.DO_COMMAND( Status         => Do_Status,
                    Command_String => "@SYS$MANAGER:RELOGIN.COM" );
    --
    -- If we get to this location, then the DO_COMMAND must have failed.
    --
    TEXT_IO.PUT_LINE("DO_COMMAND must have failed...");
end OUR_LOGIN;
|
3654.4 | Explanation and a suggestion | STAR::VATNE | Peter Vatne, VMS Development | Tue Nov 27 1990 22:56 | 33 |
| There is a big semantic difference between "non-standard" and "unsupported".
A standard system is the DECwindows system you get out of the box with no
modifications. A non-standard system is one that you have customized.
A non-standard system can be either supported or unsupported.
A supported system is one for which we guarantee that we will answer QARs.
You can certainly use an unsupported system, but if you find any problems,
we don't guarantee we will attempt to fix them. We will certainly listen
with interest, as unsupported systems may get supported in the future if
business needs warrant it. We document in the installation guide those
customizations that are supported. Anything not documented is probably
not supported, although there have been documentation oversights.
Now down to nuts and bolts: the use of DECW$LOGIN_MULTIPLE is unsupported
in general. It was put there as a hook for development groups to develop
their own prototypes. Whether or not this hook is supported for specific
uses is negotiated between the development group and VMS DECwindows.
My suggestion is for you to redesign this system. It is best to let
the server reset between user sessions. I would suggest that the
application be modified to somehow detect that the user is logging
out, and close down the display, but keep on collecting data. The
application should then detect when the user is logging back in
again, and re-open the display.
There are lots of techniques that you could use to implement such
a scheme. You could probably run a program that then talks to the
application via a mailbox, and lets the application know that the
user has logged in. If this is truly a mission-critical environment,
I would suggest that the display programs be run in a separate
process from the data collection programs. If you need lots of
bandwidth, I suggest using global sections to pass the data
between the two processes.
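As a rough illustration of the shape of such a design, here is a minimal sketch. It makes several assumptions: two Ada tasks in one program stand in for the separate display and data collection processes, and entry calls stand in for mailbox messages. All names (Session_Demo, Display, User_Logged_In, User_Logged_Out, Show, Shut_Down) are hypothetical, and no real X or VMS calls are made.

```ada
with Text_IO;

procedure Session_Demo is
   --  Hypothetical sketch: names and structure are illustrative only and
   --  no real X or VMS calls are made.

   --  Stands in for the display process. It holds a "connection" only
   --  while a user is logged in.
   task Display is
      entry User_Logged_In;    --  stands in for a mailbox message
      entry User_Logged_Out;   --  stands in for a mailbox message
      entry Show (Sample : in Integer);
      entry Shut_Down;
   end Display;

   task body Display is
      Connected : Boolean := False;
   begin
      loop
         select
            accept User_Logged_In do
               Connected := True;
               Text_IO.Put_Line ("display: opened connection");
            end User_Logged_In;
         or
            accept User_Logged_Out do
               Connected := False;
               Text_IO.Put_Line ("display: closed connection");
            end User_Logged_Out;
         or
            accept Show (Sample : in Integer) do
               --  Samples that arrive while disconnected are not drawn.
               if Connected then
                  Text_IO.Put_Line
                    ("display: sample" & Integer'Image (Sample));
               end if;
            end Show;
         or
            accept Shut_Down;
            exit;
         end select;
      end loop;
   end Display;

begin
   --  The main procedure plays the data collector: it keeps producing
   --  samples regardless of the login state.
   for Sample in 1 .. 6 loop
      if Sample = 2 then
         Display.User_Logged_In;
      elsif Sample = 5 then
         Display.User_Logged_Out;
      end if;
      Display.Show (Sample);
   end loop;
   Display.Shut_Down;
end Session_Demo;
```

The point of the split is that the collector never depends on the display connection: while no user is logged in, the display side holds no connection at all, so the server is free to reset between sessions.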
|