T.R | Title | User | Personal Name | Date | Lines |
---|
402.1 | See Auditing, See INSTALL/REPLACE Message(s) | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Mon Mar 31 1997 13:05 | 16 |
|
We'll need some evidence around this -- this sounds like there are
events occurring that the user is not telling us about or is unaware of,
or that there is a bug here. Without evidence, I'd assume the
former is the case.
Use auditing, and enable and watch the installed image audits.
It _is_ possible that the INSTALL/REPLACE on V6.1 might fail; this
is usually a result of insufficient contiguous global pages. There
are messages displayed that indicate this event occurred, and there
are f$getsyi lexical function keywords that can be used to detect
sufficient global pages before the command, and there are f$file
keywords and error handling sequences that can be used to detect an
image that was not reinstalled as expected.
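A minimal sketch of the checks described above. It assumes the F$GETSYI items CONTIG_GBLPAGES and FREE_GBLSECTS and the F$FILE_ATTRIBUTES item KNOWN are available on the version in use. The image name and the NEEDED_PAGES threshold are hypothetical placeholders, not values from the customer's system:

```dcl
$! Sketch only: IMAGE and NEEDED_PAGES are hypothetical placeholders.
$ image = "DISK$APP:[APP]MYSHR.EXE"
$ needed_pages = 100
$!
$! Check for contiguous global pages and free global section slots first.
$ contig = F$GETSYI("CONTIG_GBLPAGES")
$ free_sects = F$GETSYI("FREE_GBLSECTS")
$ IF (contig .LT. needed_pages) .OR. (free_sects .LT. 2)
$ THEN
$     WRITE SYS$OUTPUT "Too few global pages/sections: ", contig, "/", free_sects
$     EXIT
$ ENDIF
$!
$! Attempt the replace and capture the status rather than aborting.
$ SET NOON
$ INSTALL REPLACE 'image'
$ status = $STATUS
$ SET ON
$ IF .NOT. status THEN WRITE SYS$OUTPUT "INSTALL REPLACE failed: ", F$MESSAGE(status)
$!
$! Confirm afterwards that the image is still a known (installed) file.
$ IF F$FILE_ATTRIBUTES(image,"KNOWN") .EQS. "FALSE" THEN -
        WRITE SYS$OUTPUT image, " is not installed"
```

The KNOWN item only reports whether the file is installed; it does not show which attributes are set, so INSTALL LIST/FULL is still needed for that.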
|
402.2 | INSTALL known issues | SIOG::PKIRK | I wonder if I'm on the right planet......? | Tue Apr 01 1997 09:20 | 27 |
|
Re -1
Hello Steve
I am working with John Dooley on this issue.
You say that :
It _is_ possible that the INSTALL/REPLACE on V6.1 might fail; this
is usually a result of insufficient contiguous global pages. There
are messages displayed that indicate this event occurred, and there
are f$getsyi lexical function keywords that can be used to detect
sufficient global pages before the command, and there are f$file
keywords and error handling sequences that can be used to detect an
image that was not reinstalled as expected.
Are there any known issues around INSTALL where its expected
behaviour may not match the actual behaviour, in particular with
INSTALL/REPLACE and INSTALL/REMOVE followed by INSTALL/ADD?
Could you give me some examples of uses of the f$file function to
get more info on installed images?
Many Thanks
Paul Kirk
|
402.3 | More info | CSC32::M_ANTRY | | Tue Apr 01 1997 13:57 | 101 |
| I might as well add my .02 worth. I'm working this same customer issue
from the CSC side.
I'll add some more background. The customer is running a huge fab lab
(OK, it's Intel). The application is quite large, and production is
heavily impacted when it is down (I'm not quite sure what it does).
They are in the process of moving the application from some old RA73
disks to some newer disks. They do this with a backup/image/ignore to
move the contents of the disk; then they supposedly take down the
application and run a command procedure that REMOVEs the installed
images and then ADDs them back in after changing the logicals that
point to where they are. They think this goes OK, only to find that
users start complaining about the application. Then they start looking,
and the errors are showing up as "Shareable images must be
installed..." They look and the image is installed (in one instance)
but the open/head/share attributes had not been set. They do a remove
and then an add using the same syntax and all is well.
I know this may not be the best way to move an application, but it is
absolutely critical for them to keep it up. Is there a better way to
do this? Are there any issues with, or ways of, checking that it is
OK to remove the installed image, verifying that it has been removed,
and then adding it back in?
The customer sent me some logs today with a different but related
question:
They did an install list/full/glob and one of their shareable images
shows up as:
>F10AP214:<WSMSTAP1.PCMON53>.EXE
> COMSHR;39 Open Hdr Shar Lnkbl
> Entry access count = 10
> Current / Maximum shared = 9 / 10
> Global section count = 2
>
>
>
> COMSHR;39 Open Hdr Shar Lnkbl
>
> Delete Pending Global Sections
>
>COMSHR_002 (32000001) TMP SYS Pagcnt/Refcnt=89/890
>COMSHR_001 (32000001) TMP SYS Pagcnt/Refcnt=1/10
>
>
>DSA214:<WSMSTAP1.PCMON53>.EXE List head adr/siz/ref = B1DFEC80/46/1
>
> COMSHR;39 Open Hdr Shar Lnkbl
I assume that this image has been "REMOVED" hence the delete pending
global sections.
Then the customer says that he has done a REMOVE followed by an ADD and
now a list/full/glob shows:
>F10AP214:<WSMSTAP1.PCMON53>.EXE
> COMSHR;39 Open Hdr Shar Lnkbl
>
> System Global Sections
>
>COMSHR_002 (32000001) PRM SYS Pagcnt/Refcnt=89/534
>COMSHR_001 (32000001) PRM SYS Pagcnt/Refcnt=1/6
>
> Delete Pending Global Sections
>
>COMSHR_002 (32000001) TMP SYS Pagcnt/Refcnt=89/2047
>COMSHR_001 (32000001) TMP SYS Pagcnt/Refcnt=1/23
>COMSHR_002 (32000001) TMP SYS Pagcnt/Refcnt=89/890
>COMSHR_001 (32000001) TMP SYS Pagcnt/Refcnt=1/10
>
>
>DSA214:<WSMSTAP1.PCMON53>.EXE List head adr/siz/ref = B1E2CAA0/46/1
>
> COMSHR;39 Open Hdr Shar Lnkbl
We can tell that this is a different instance of the shareable
image because the list head address is different. We still show the
delete pending global sections along with the new ones.
I guess my question (because I don't recall what the customer's
complaint was in this case) is: is it normal to see this type of
listing, and can you further explain what is happening behind the
scenes with how INSTALL keeps track of this installed shareable image?
Thanks a lot
Mark Antry
801-294-7527
Bottom line for the customer is the issue on how to move these
installed images from one disk to another with minimum down time.
|
402.4 | Big Tradeoffs Here | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Tue Apr 01 1997 14:19 | 64 |
| re: .2
: Are there any known issues around INSTALL where its expected
: behaviour may not match the actual, in particular INSTALL/REPLACE
: and INSTALL/REMOVE-INSTALL/ADD?
It is quite possible that an INSTALL REPLACE sequence or an INSTALL
DELETE followed by an INSTALL CREATE will fail when insufficient
contiguous global page free space exists for the replacement operation.
(Replacement images are typically larger than the original image.
See .3 for an example of how delete-pending pages can also tie up room.)
Prior to V6.2, one should seriously avoid having two images -- each
with the same image name -- installed shareable on a system. Prior
to V6.2, the image name was used to generate the section name, and
this prevented the same image name from being installed shareable
more than once, regardless of directory and device. As of V6.2,
this longstanding (but poorly documented) restriction was lifted.
See NOTED::HACKERS 1748.* for a detailed discussion of this topic.
re: .3
...
: Know this may not be the best way to move an application but it
: absolutely critical for them to keep it up. Is there a better way to
: do this?
I had a discussion with an ambassador/partner back at the last Nashua
meeting that sounded very similar to this situation, and I indicated
that this "hot backup" sequence was not something that I'd recommend.
(I was primarily concerned with the contents of the data files...
If the application is "crash-worthy" or can recover from crashes,
it can *probably* contend with this situation, but it's not something
I would freely recommend without also having a high level of comfort
around the behaviour of the application(s) in use.)
: Any issues or ways around looking at things to make sure they
: are OK to remove the installed image, verify that it has been removed
: and then adding it back in?
One thing I *think* I had suggested to the ambassador/partner was a
configuration with multiple system disks, and multiple application
disks -- this would allow a rolling OpenVMS upgrade, and -- assuming
the application is coded for it -- a rolling application upgrade.
Having multiple system disks around is key...
: The customer sent me some logs today with a different but related
: question:
:
:...
:
: I guess my question (because I don't recall what the customers
: complaint was in this case) is this normal to see this type of listing
: and can you further explain what is happening behind the scenes with
: how install is keeping track of this installed shareable image.
There are images currently mapped against these image sections,
and these sections cannot be deleted until the image(s) are
restarted. This would be a normal artifact of the "hot backup".
(This also ties up global pages -- the customer should re-gen and
reboot with sufficient pages for this sort of activity, if this is
normal.) And I've seen hot-swaps "underneath" running applications
cause the applications to fail...
|
402.5 | suspected problem doesn't match error message | GIDDAY::GILLINGS | a crucible of informative mistakes | Tue Apr 01 1997 23:09 | 32 |
| re .3:
Mark,
Something doesn't add up here...
> Then they
> start looking and the errors are showing up as "Shareable images must
> be installed..." They look and the image is installed(in one instance)
> but the open/head/share attributes had not been set. They do a remove
> and then a add using the same syntax and all is well.
The PRIVINSTALL error results when a privileged or protected image tries
to activate a shareable image that is not installed. The attributes
of the target image are irrelevant; the only thing that matters is that
it be installed. So, the problem cannot be that "open/head/share" are
missing. Note that there are numerous images installed by OpenVMS without
any attributes, purely so they can be called from privileged images (see
VMSIMAGES.DAT or just INSTALL LIST on any system).
To diagnose this problem, you would need to have the customer provide you
with the failing command, the *exact* error message(s), the output from
INSTALL LIST/FULL, a SHOW LOGICAL/FULL on the image logical name(s) and
DIRECTORY/FULL of the image name. It would also be helpful to know the
exact commands issued to REMOVE and ADD the images and change logical names.
Finally, have the customer enable auditing for INSTALL commands:
$ SET AUDIT/AUDIT/ENABLE=INSTALL
in order to check for stray INSTALL commands.
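Once the audit is enabled, the recorded events can be reviewed with something along these lines. This is a sketch only: it assumes the audit journal is in its default location and that the INSTALL event type keyword is accepted by ANALYZE/AUDIT on the version in use.

```dcl
$! Confirm that the INSTALL event class is now being audited.
$ SHOW AUDIT
$!
$! List today's INSTALL events from the security audit journal.
$! The journal file specification below is the assumed default location.
$ ANALYZE/AUDIT/EVENT_TYPE=INSTALL/SINCE=TODAY/FULL -
        SYS$MANAGER:SECURITY.AUDIT$JOURNAL
```

Each record should identify the process and user that issued the INSTALL operation, which is what is needed to spot a stray command procedure.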
John Gillings, Sydney CSC
|
402.6 | I did have the error message wrong | CSC32::M_ANTRY | | Wed Apr 02 1997 09:36 | 33 |
| re: -1, OK I was wrong on the error message, here is a message from the
customer that describes in his words what is happening:
>Mark,
>
>The problem was that new processes that tried to use this installed
>image during process activation failed with the following error.
>
>-SYSTEM-W-NOSUCHSEC, no such (global) section
>
>When we re-installed the comshr image by using the following sequence
>of commands the new processes started successfully.
>
>$IMAGE1 = "PCMON_ROOT_DIRECTORY:COMSHR.EXE"
>$INSTALL:=$INSTALL/COMMAND_MODE
>INSTALL REMOVE 'IMAGE1'
>$! Install the COMSHR shareable image
>$INSTALL ADD/OPEN/SHARED/HEADER 'IMAGE1'
>
>Hope that this helps
>
>Regards
>
>Karl Brennan
I'll see about having the customer gather more data as described in .5.
Thanks
Mark Antry
|
402.7 | things to look (out) for | STAR::DAVIDSON | Stu Davidson - OpenVMS Engineering | Wed Apr 02 1997 15:04 | 30 |
|
Reading this note string, it occurs to me that INSTALL has had a history
of not completely protecting itself from multiple copies of INSTALL
trying to modify known images at the same time.
Also, note what Steve pointed out about collision of global section
names. INSTALL was quite happy to install 2 images of the same name
from different device/directory combinations. However, there was only
one set of global sections. If the two images were not identical,
behavior tended to be, well, unexpected. If you then remove one of the
images, the global section(s) would be deleted - which could explain
the image activator reporting
"-SYSTEM-W-NOSUCHSEC, no such (global) section"
All these problems were fixed in V6.2
I don't think any of the known problems explain an image 'losing'
attributes. The most likely explanation there is some incorrect
command procedure.
Perhaps they use f$file to observe that a file is installed, so
they use a simple INSTALL REPLACE, creating a new known file entry
for a different device/directory. Depending on logical names and
just exactly what commands were used, the REPLACE could have been
turned into a CREATE - and the attributes get lost.
I've never seen any evidence that installed images have attributes
change except by explicit use of INSTALL, nor INSTALL failing to
create known file entries (with requested attributes) without an
appropriate message.
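One way to avoid the REPLACE-turned-CREATE scenario above is to state the attributes explicitly on whichever command ends up being used. A minimal sketch, reusing the image name from .6 and assuming the F$FILE_ATTRIBUTES item KNOWN behaves as documented; it is not the customer's actual procedure:

```dcl
$! Sketch only: image name taken from .6, attributes as the customer used.
$ image = "PCMON_ROOT_DIRECTORY:COMSHR.EXE"
$!
$! Choose REPLACE or CREATE, but give the attributes in both cases so
$! a silent switch from one to the other cannot drop them.
$ IF F$FILE_ATTRIBUTES(image,"KNOWN") .EQS. "TRUE"
$ THEN
$     INSTALL REPLACE/OPEN/SHARED/HEADER_RESIDENT 'image'
$ ELSE
$     INSTALL CREATE/OPEN/SHARED/HEADER_RESIDENT 'image'
$ ENDIF
$!
$! Show the resulting known file entry so the attributes can be checked.
$ INSTALL LIST/FULL 'image'
```

The KNOWN check depends on how the logical name translates at the time it runs, so it should be made after any logical name changes, not before.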
|
402.8 | INSTALL is now a verb (since V5.0) | GIDDAY::GILLINGS | a crucible of informative mistakes | Wed Apr 02 1997 22:58 | 27 |
| Mark,
This is probably not relevant to the problem at hand, but please ask
your customer to remove the archaic usage of INSTALL:
>$IMAGE1 = "PCMON_ROOT_DIRECTORY:COMSHR.EXE"
>$INSTALL:=$INSTALL/COMMAND_MODE
>INSTALL REMOVE 'IMAGE1'
>$! Install the COMSHR shareable image
>$INSTALL ADD/OPEN/SHARED/HEADER 'IMAGE1'
Delete the line "$INSTALL:=$INSTALL/COMMAND_MODE" since INSTALL is now
a full DCL VERB. Nothing else needs to change. There is at least one
obscure problem with INSTALL when using the old interface.
I would also recommend always using a logical name to install an image.
For example:
$ DEFINE/SYSTEM/EXEC COMSHR PCMON_ROOT_DIRECTORY:COMSHR.EXE
$ INSTALL REMOVE COMSHR
$ INSTALL ADD/OPEN/SHARED/HEADER COMSHR
KFE lookup is very picky about filespecs. The filespec used to activate
the image must be textually *identical* to that used to INSTALL the image.
Since you need the logical name anyway, it makes sense to guarantee a
match by defining the logical name first and using it to INSTALL the image.
John Gillings, Sydney CSC
|