
Conference vaxaxp::vmsnotes

Title:VAX and Alpha VMS
Notice:This is a new VMSnotes, please read note 2.1
Moderator:VAXAXP::BERNARDO
Created:Wed Jan 22 1997
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:703
Total number of notes:3722

402.0. "Installed images dropping out - 6.1" by SIOG::DOOLEY () Mon Mar 31 1997 06:41

    
    OpenVMS Vax 6.1 + patches - Installed Images
    
    Hi,
    	I have seen a problem where an image was installed with 
    /open/header/shared/priv qualifiers. The customer checked installed
    images later and discovered that the image was no longer "installed"
    with the /header/shared qualifiers.
    
    Is it possible for an image to "drop out" of the installed image list
    after an INSTALL/REPLACE? The customer has verified that the image is
    not installed shared, as he sees a huge number of image activations
    per user.
    
    The above note may have holes in it, but I hope you understand the
    question. Briefly, are there any circumstances in which an installed
    image may not stay installed? For example, a shortage of global
    sections or global pages?
    
    					Thanks,
    						John
402.1. "See Auditing, See INSTALL/REPLACE Message(s)" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Mon Mar 31 1997 13:05 (16 lines)

   We'll need some evidence around this -- this sounds like there are
   events occurring that the user is not telling us or is unaware of,
   or that there is a bug here.  Without evidence, I'd assume the
   former is the case.

   Use auditing, and enable and watch the installed image audits.

   It _is_ possible that the INSTALL/REPLACE on V6.1 might fail; this
   is usually a result of insufficient contiguous global pages.  There
   are messages displayed that indicate this event occurred, there
   are f$getsyi lexical function keywords that can be used to check for
   sufficient global pages before the command, and there are f$file
   keywords and error handling sequences that can be used to detect an
   image that was not reinstalled as expected.
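
   As a minimal sketch of the checks described above (not a tested
   procedure): the image name is taken from later in this string, and the
   size estimate and the figure of 2 global sections are assumptions.  On
   VAX a page is 512 bytes, so the file size in blocks serves as a rough
   upper bound on the pages needed.  The KNOWN item only reports whether
   the file is installed at all; the attributes still have to be checked
   with INSTALL LIST/FULL.

```dcl
$ ! Sketch only: image name, size estimate and section count are assumptions
$ SET NOON                                   ! handle errors ourselves
$ IMAGE  = "PCMON_ROOT_DIRECTORY:COMSHR.EXE"
$ NEEDED = F$FILE_ATTRIBUTES(IMAGE,"EOF")    ! size in blocks, rough page estimate
$ CONTIG = F$GETSYI("CONTIG_GBLPAGES")       ! free contiguous global pages
$ SECTS  = F$GETSYI("FREE_GBLSECTS")         ! free global section table entries
$ IF (CONTIG .LT. NEEDED) .OR. (SECTS .LT. 2)
$ THEN
$     WRITE SYS$OUTPUT "Too few global pages/sections: ''CONTIG' pages, ''SECTS' sections"
$     EXIT 44                                ! SS$_ABORT
$ ENDIF
$ ! State the qualifiers explicitly rather than relying on the old entry
$ INSTALL REPLACE 'IMAGE' /OPEN/HEADER_RESIDENT/SHARED
$ STATUS = $STATUS
$ IF .NOT. STATUS THEN WRITE SYS$OUTPUT "INSTALL REPLACE failed: ", F$MESSAGE(STATUS)
$ ! KNOWN returns TRUE only if the file is currently installed
$ IF .NOT. F$FILE_ATTRIBUTES(IMAGE,"KNOWN") THEN WRITE SYS$OUTPUT "''IMAGE' is not installed"
```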

402.2. "INSTALL known issues" by SIOG::PKIRK (I wonder if I'm on the right planet......?) Tue Apr 01 1997 09:20 (27 lines)

	Re -1
    
    Hello Steve 
    
    I am working with John Dooley on this issue.
    
    You say that : 
    
   It _is_ possible that the INSTALL/REPLACE on V6.1 might fail, this
   is usually a result of insufficient contiguous global pages.  There
   are messages displayed that indicate this event occured, and there
   are f$getsyi lexical function keywords that can be used to detect
   sufficient global pages before the command, and there are f$file
   keywords and error handling sequences that can be used to detect an
   image that was not reinstalled as expected.

    Are there any known issues around INSTALL where its expected 
    behaviour may not match the actual behaviour, in particular with
    INSTALL/REPLACE and INSTALL/REMOVE followed by INSTALL/ADD?
    	
    Could you give me some examples of uses of the f$file function to 
    get more info on installed images?
             
    Many Thanks 
    
    Paul Kirk 
402.3. "More info" by CSC32::M_ANTRY () Tue Apr 01 1997 13:57 (101 lines)

    I might as well add my .02 worth.  I'm working this same customer issue 
    from the CSC side.
    
    I'll add some more background.  The customer is running a huge FAB LAB
    (OK, it's INTEL).  The application is quite large and impacts production
    heavily when it is down (I'm not quite sure what it does).  They are in
    the process of moving the application from some old RA73 disks
    to some newer disks.  They do this with a backup/image/ignore to move
    the contents of the disk; then they supposedly take down the
    application and run a command procedure that REMOVES the installed
    images and ADDS them back in after changing the logicals that point to
    where they are.  They think this goes OK, only to find that users start
    complaining about the application.  Then they start looking, and the
    errors are showing up as "Shareable images must be installed..."  They
    look and the image is installed (in one instance), but the
    open/head/share attributes had not been set.  They do a remove and then
    an add using the same syntax and all is well.
    
    I know this may not be the best way to move an application, but it is
    absolutely critical for them to keep it up.  Is there a better way to
    do this?  Are there any issues to watch for, or ways of checking that
    it is OK to remove the installed image, verifying that it has been
    removed, and then adding it back in?
    
    The customer sent me some logs today with a different but related
    question:
    They did an install list/full/glob and one of their shareable images
    shows up as:
    >F10AP214:<WSMSTAP1.PCMON53>.EXE
    >   COMSHR;39        Open Hdr Shar          Lnkbl 
    >        Entry access count         = 10
    >        Current / Maximum shared   = 9 / 10
    >        Global section count       = 2
    >
    >
    >
    >   COMSHR;39        Open Hdr Shar          Lnkbl 
    > 
    >        Delete Pending Global Sections
    >
    >COMSHR_002     (32000001)              TMP SYS       Pagcnt/Refcnt=89/890
    >COMSHR_001     (32000001)              TMP SYS       Pagcnt/Refcnt=1/10
    >
    >
    >DSA214:<WSMSTAP1.PCMON53>.EXE            List head adr/siz/ref = B1DFEC80/46/1
    >
    >   COMSHR;39        Open Hdr Shar          Lnkbl 
    
    I assume that this image has been "REMOVED" hence the delete pending
    global sections.
    
    Then the customer says that he has done a REMOVE followed by an ADD and
    now a list/full/glob shows:
    
    >F10AP214:<WSMSTAP1.PCMON53>.EXE
    >   COMSHR;39        Open Hdr Shar          Lnkbl 
    > 
    >        System Global Sections
    >
    >COMSHR_002     (32000001)              PRM SYS       Pagcnt/Refcnt=89/534
    >COMSHR_001     (32000001)              PRM SYS       Pagcnt/Refcnt=1/6
    > 
    >        Delete Pending Global Sections
    >
    >COMSHR_002     (32000001)              TMP SYS       Pagcnt/Refcnt=89/2047
    >COMSHR_001     (32000001)              TMP SYS       Pagcnt/Refcnt=1/23
    >COMSHR_002     (32000001)              TMP SYS       Pagcnt/Refcnt=89/890
    >COMSHR_001     (32000001)              TMP SYS       Pagcnt/Refcnt=1/10
    >
    >
    >DSA214:<WSMSTAP1.PCMON53>.EXE            List head adr/siz/ref = B1E2CAA0/46/1
    >
    >   COMSHR;39        Open Hdr Shar          Lnkbl 
    
    We can tell that yes this is a different instance of the shareable
    image due to the List head adr being different.  We still show the
    delete pending global sections along with the new ones.
    
    I guess my question (because I don't recall what the customer's
    complaint was in this case) is: is it normal to see this type of
    listing, and can you further explain what is happening behind the
    scenes with how install keeps track of this installed shareable image?
    
    Thanks a lot
    
    Mark Antry
    801-294-7527
    
    Bottom line for the customer is the issue on how to move these
    installed images from one disk to another with minimum down time.
    
402.4. "Big Tradeoffs Here" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Tue Apr 01 1997 14:19 (64 lines)

   re: .2

:    Are there any known issues around INSTALL where its expected
:    behaviour may not match the actual,In particular INSTALL /REPLACE
:    and INSTALL/REMOVE-INSTALL/ADD?

   It is quite possible that an INSTALL REPLACE sequence or an INSTALL
   DELETE followed by an INSTALL CREATE will fail when insufficient
   contiguous global page free space exists for the replacement operation. 
   (Replacement images are typically larger than the original image, too.
   See .3 for an example of how delete-pending pages can tie up room.)

   Prior to V6.2, one should seriously avoid having two images -- each
   with the same image name -- installed shareable on a system.  Prior
   to V6.2, the image name was used to generate the section name, and
   this prevented the same image name from being installed shareable
   more than once, regardless of directory and device.  As of V6.2,
   this longstanding (but poorly documented) restriction was lifted.
   See NOTED::HACKERS 1748.* for a detailed discussion of this topic.

   re: .3
...
:    Know this may not be the best way to move an application but it
:    absolutely critical for them to keep it up.  Is there a better way to
:    do this?

   I had a discussion with an ambassador/partner back at the last Nashua
   meeting that sounded very similar to this situation, and I indicated
   that this "hot backup" sequence was not something that I'd recommend.

   (I was primarily concerned with the contents of the data files...
   If the application is "crash-worthy" or can recover from crashes,
   it can *probably* contend with this situation, but it's not something
   I would freely recommend without also having a high level of comfort
   around the behaviour of the application(s) in use.)
    
:  Any issues or ways around looking at things to make sure they
:    are OK to remove the installed image, verify that it has been removed
:    and then adding it back in?

   One thing I *think* I had suggested to the ambassador/partner was a
   configuration with multiple system disks, and multiple application
   disks -- this would allow a rolling OpenVMS upgrade, and -- assuming
   the application is coded for it -- a rolling application upgrade.
   Having multiple system disks around is key...
    
:    The customer sent me some logs today with a different but related
:    question:
:
:...
:    
:    I guess my question (because I don't recall what the customers
:    complaint was in this case) is this normal to see this type of listing
:    and can you further explain what is happening behind the scenes with
:    how install is keeping track of this installed shareable image.

   There are images currently mapped against these image sections,
   and these sections cannot be deleted until the image(s) are
   restarted.  This would be a normal artifact of the "hot backup".
   (This also ties up global pages -- the customer should re-gen and
   reboot with sufficient pages for this sort of activity, if this is
   normal.)  And I've seen hot-swaps "underneath" running applications
   cause the applications to fail...
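
   For illustration only, raising the global page and section limits is
   usually done through MODPARAMS.DAT and AUTOGEN rather than by editing
   parameters directly.  The numbers below are placeholders, not
   recommendations; suitable values depend on the images involved and on
   how many delete-pending sections are expected to accumulate.

```dcl
! Additions to SYS$SYSTEM:MODPARAMS.DAT (placeholder values, assumptions only)
ADD_GBLPAGES = 2000
ADD_GBLSECTIONS = 50
! Then, from a privileged account, regenerate parameters and reboot:
! $ @SYS$UPDATE:AUTOGEN GETDATA REBOOT NOFEEDBACK
```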

402.5. "suspected problem doesn't match error message" by GIDDAY::GILLINGS (a crucible of informative mistakes) Tue Apr 01 1997 23:09 (32 lines)

  re .3:

  Mark,

    Something doesn't add up here...  

>  Then they
>    start looking and the errors are showing up as "Shareable images must
>    be installed..."  They look and the image is installed(in one instance)
>    but the open/head/share attributes had not been set.  They do a remove
>    and then a add using the same syntax and all is well.

  The PRIVINSTALL error results when a privileged or protected image tries
  to activate a shareable image which is not installed. The attributes
  of the target image are irrelevant; the only thing that matters is that
  it be installed. So, the problem cannot be that "open/head/share" are
  missing. Note that there are numerous images installed by OpenVMS without
  any attributes, purely so they can be called from privileged images (see
  VMSIMAGES.DAT or just INSTALL LIST on any system).

  To diagnose this problem, you would need to have the customer provide you
  with the failing command, the *exact* error message(s), the output from
  INSTALL LIST/FULL, a SHOW LOGICAL/FULL on the image logical name(s) and
  DIRECTORY/FULL of the image name. It would also be helpful to know the
  exact commands issued to REMOVE and ADD the images and change logical names.
  Finally, have the customer enable auditing for INSTALL commands:

	$ SET AUDIT/AUDIT/ENABLE=INSTALL

  in order to check for stray INSTALL commands.
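
  A possible way to review what the audit captured, as a sketch: this
  assumes the default audit journal location, and the /SINCE value is
  only an example, so adjust both if the site has moved the journal or
  needs a different time window.

```dcl
$ ! Confirm that INSTALL events are enabled as security audits
$ SHOW AUDIT
$ ! List INSTALL events from the journal (default location assumed)
$ ANALYZE/AUDIT/EVENT_TYPE=INSTALL/FULL/SINCE=YESTERDAY -
        SYS$MANAGER:SECURITY.AUDIT$JOURNAL
```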

						John Gillings, Sydney CSC
402.6. "I did have the error message wrong" by CSC32::M_ANTRY () Wed Apr 02 1997 09:36 (33 lines)

    re: -1, OK I was wrong on the error message, here is a message from the
    customer that describes in his words what is happening:
    
    >Mark,
    >
    >The problem was that new processes that tried to use this installed
    >image during process activation failed with the following error.
    >
    >-SYSTEM-W-NOSUCHSEC, no such (global) section
    >
    >When we re-installed the comshr image by using the following sequence
    >of commands the new processes started successfully.
    >
    >$IMAGE1 = "PCMON_ROOT_DIRECTORY:COMSHR.EXE"
    >$INSTALL:=$INSTALL/COMMAND_MODE
    >INSTALL REMOVE 'IMAGE1'
    >$!  Install the COMSHR shareable image
    >$INSTALL  ADD/OPEN/SHARED/HEADER 'IMAGE1'
    >
    >Hope that this helps
    >
    >Regards
    >
    >Karl Brennan
    
    I'll see about having the customer gather more data as described in
    .5.
    
    Thanks
    Mark Antry
    
402.7. "things to look (out) for" by STAR::DAVIDSON (Stu Davidson - OpenVMS Engineering) Wed Apr 02 1997 15:04 (30 lines)

    
    Reading this note string, it occurs to me that INSTALL has had a history
    of not completely protecting itself from multiple copies of INSTALL
    trying to modify known images at the same time.
    
    Also, note what Steve pointed out about collision of global section
    names. INSTALL was quite happy to install 2 images of the same name
    from different device/directory combinations.  However, there was only
    one set of global sections. If the two images were not identical,
    behavior tended to be, well, unexpected. If you then remove one of the
    images, the global section(s) would be deleted - which could explain
    the image activator reporting 
       "-SYSTEM-W-NOSUCHSEC, no such (global) section"
    
    All these problems were fixed in V6.2
    
    I don't think any of the known problems explain an image 'losing' 
    attributes. The most likely explanation there is some incorrect 
    command procedure.
    
    Perhaps they use f$file to observe that a file is installed, so
    they use a simple INSTALL REPLACE, creating a new known file entry
    for a different device/directory. Depending on logical names and
    just exactly what commands were used, the REPLACE could have been
    turned into a CREATE - and the attributes get lost.
    
    I've never seen any evidence that installed images have attributes
    change except by explicit use of INSTALL, nor of INSTALL failing to
    create known file entries (with requested attributes) without an
    appropriate message.

402.8. "INSTALL is now a verb (since V5.0)" by GIDDAY::GILLINGS (a crucible of informative mistakes) Wed Apr 02 1997 22:58 (27 lines)

  Mark,
    This is probably not relevant to the problem at hand, but please ask
  your customer to remove the archaic usage of INSTALL:

    >$IMAGE1 = "PCMON_ROOT_DIRECTORY:COMSHR.EXE"
    >$INSTALL:=$INSTALL/COMMAND_MODE
    >INSTALL REMOVE 'IMAGE1'
    >$!  Install the COMSHR shareable image
    >$INSTALL  ADD/OPEN/SHARED/HEADER 'IMAGE1'

  Delete the line "$INSTALL:=$INSTALL/COMMAND_MODE" since INSTALL is now
  a full DCL VERB. Nothing else needs to change. There is at least one
  obscure problem with INSTALL when using the old interface.

  I would also recommend always using a logical name to install an image.
  For example:

	$ DEFINE/SYSTEM/EXEC COMSHR PCMON_ROOT_DIRECTORY:COMSHR.EXE
	$ INSTALL REMOVE COMSHR
	$ INSTALL ADD/OPEN/SHARED/HEADER COMSHR

  KFE lookup is very picky about filespecs. The filespec used to activate
  the image must be textually *identical* to that used to INSTALL the image.
  Since you need the logical name anyway, it makes sense to guarantee a
  match by defining the logical name first and using it to INSTALL the image.
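
  Extending the example above with a verification step might look like
  the following sketch. It assumes the application has already been shut
  down, and it has not been checked whether the KNOWN item still returns
  TRUE while old global sections are delete-pending, so treat the
  messages as hints rather than proof.

```dcl
$ ! Sketch only: assumes the application is down before this runs
$ SET NOON
$ DEFINE/SYSTEM/EXEC COMSHR PCMON_ROOT_DIRECTORY:COMSHR.EXE
$ IF F$FILE_ATTRIBUTES("COMSHR","KNOWN") THEN INSTALL REMOVE COMSHR
$ IF F$FILE_ATTRIBUTES("COMSHR","KNOWN")
$ THEN
$     WRITE SYS$OUTPUT "COMSHR is still installed; not adding it again"
$     EXIT 44                                ! SS$_ABORT
$ ENDIF
$ INSTALL ADD/OPEN/SHARED/HEADER COMSHR
$ IF .NOT. F$FILE_ATTRIBUTES("COMSHR","KNOWN") THEN WRITE SYS$OUTPUT "COMSHR did not install"
$ ! KNOWN says nothing about attributes; check Open/Hdr/Shar by eye
$ INSTALL LIST/FULL COMSHR
```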

						John Gillings, Sydney CSC