
Conference help::decnet-osi_for_vms

Title: DECnet/OSI for OpenVMS
Moderator: TUXEDO::FONSECA
Created: Thu Feb 21 1991
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3990
Total number of notes: 19027

3904.0. "clusterwide NET$CONFIGURE APPLICATION_DELETE/ADD" by ZUR01::SCHNEIDER (Kurt Schneider, SW-Support Zurich) Fri Mar 21 1997 04:00

    Hello,
    
        My customer runs DECnet/OSI V6.3 in a cluster with a common
    NET$APPLICATION_STARTUP.NCL, which is rather large. If several
    satellite nodes boot at the same time, each one calls
    NET$CONFIGURE.COM to do APPLICATION_DELETE and APPLICATION_ADD,
    so these run simultaneously across the cluster. After all the
    startups have completed, the final NET$APPLICATION_STARTUP.NCL
    is much smaller (most of the applications have disappeared). This
    happened during RSM$CLIENT_STARTUP.COM.
    
        Several DEC products call the NET$CONFIGURE interface from
    their startup procedures to define their applications. Of course
    it is not necessary to execute APPLICATION_DELETE followed by
    APPLICATION_ADD at every startup, but if they want to be sure the
    application is defined, they need to do that. (APPLICATION_TEST
    or similar does not exist...)
    
    Question: Are NET$CONFIGURE APPLICATION_DELETE and APPLICATION_ADD
              supposed to work in a cluster environment with
              a SYS$COMMON:[SYSMGR]NET$APPLICATION_STARTUP.NCL?
    
    
    Thanks for any feedback.
    
    Regards
    Kurt         
T.R  Title  User  Personal Name  Date  Lines
3904.1. by BULEAN::BANKS (Saturn Sap) Wed Apr 02 1997 10:21 (32 lines)
Not entirely.

Unfortunately, the files aren't well interlocked for simultaneous access
from multiple cluster nodes (or even multiple instances of NET$CONFIGURE on
the same node).

The config procedure merely creates a .TMP file to build the new script
into, then renames it to be the active script when it's done.  It's
entirely possible that it's not even careful about which .TMP file it picks
up, so it's possible that when it decides to make a .TMP into the active
script, it's actually picking up a work in progress from some other node.
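
The temp-file-then-rename pattern described above can be made safe against picking up someone else's work in progress by giving each writer its own uniquely named temporary file. The following is only a minimal Python sketch of that general idea, not what NET$CONFIGURE does; the function name `write_script_atomically` and its parameters are hypothetical. It addresses only the "wrong .TMP" hazard and does not by itself stop two writers from overwriting each other's changes.

```python
import os
import tempfile


def write_script_atomically(path, lines):
    """Write lines to path via a uniquely named temp file, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates a file whose name no other process is using, so this
    # writer can never promote another writer's half-built temp file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.writelines(line + "\n" for line in lines)
        # Replace the active script in one step (atomic on one filesystem).
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers of the active script then see either the old version or the complete new one, never a partial file.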

I can think of at least two ways to fix this bug, but obviously haven't
done it yet.  For starters, I'd suggest a QAR (or equivalent) to make sure
the problem doesn't get forgotten.

I will say that the best we can probably do right now is just make sure
something catastrophic doesn't happen, although it's still a terribly bad
idea to try to have multiple systems simultaneously update a cluster common
application script.  By this, I mean that a "fix" (as suggested in the
previous paragraph) would not eliminate all your problems, because short of
going to some formally interlocked database, I think we're always going to
have problems with two NET$CONFIGUREs banging on the same config file at
the same time.

One question in return:  Simply because I'm so ignorant of what goes on
outside the DECnet software proper, it comes as a surprise to me that
something does application delete/add operations on startup.  Is this a
first time (configuration) thing, or something more ongoing?  What
processes are involved, and what's supposed to happen?  (I ask so that any
future edits can better serve software whose needs the procedure doesn't
currently meet.)
3904.2. by MUNICH::KLOEPPER (Vera Kloepper/Net&Comms-Support) Thu Apr 03 1997 02:57 (18 lines)
    
    The RSM (Remote System Manager) startup always does the delete/add
    at system startup, which can create lots of problems in a cluster
    environment.
    I've seen this in Germany:
    problem 1 : system startup is very slow because of the several
                checksumming actions,
    problem 2 : sometimes the net$*.dat files are corrupted after a
                cluster reboot.
    
    The .dat file corruption did not happen every time, but when it did,
    DECnet failed to start at the next reboot. So I modified the RSM
    startup at this site: I included a normal NCL check to see whether the
    object is already defined. If not, I add it; if so, no further action
    is taken.
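
The check-before-add logic can be sketched as follows. This is only an illustration in Python of the idempotent pattern, not the actual RSM startup change: the real check is done with NCL, whose commands are not shown in this note, and the names `ensure_application`, `defined` and `example_app` are hypothetical.

```python
def ensure_application(defined, name, definition):
    """Add name only if it is missing; return True if something was added."""
    if name in defined:
        # Already defined: touch nothing, so the common script is not rewritten.
        return False
    defined[name] = definition
    return True


# Hypothetical in-memory stand-in for the set of defined applications.
applications = {}
first = ensure_application(applications, "example_app", "definition text")
second = ensure_application(applications, "example_app", "definition text")
```

Here `first` is True and `second` is False: only the first boot modifies anything, and every later startup is a read-only check.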
    
    Kurt - maybe this is also a solution for your customer ..
    
    servus  Vera
3904.3. "use distributed locks on a common file" by ZUR01::SCHNEIDER (Kurt Schneider, SW-Support Zurich) Thu Apr 17 1997 04:45 (23 lines)
    Thanks for the replies.
    
        I just assume that the RSM folks did not trust the net object
    database held in NCL scripts, so they made sure that their applications
    get defined at startup of their product (this happens on each reboot).
    
        Of course (Vera, you are right!) we also run edited RSM startup
    command files. But as you know, whenever you forget to change all
    the necessary startup files, you can get the same problem again.
    This happens when new systems are installed from the original
    product files.
    
        Another idea to prevent concurrent modifications by multiple
    NET$CONFIGURE processes would be to maintain an RMS file in the common
    area and take a clusterwide lock on it for synchronisation.
    The disadvantage would be that it might slow down the boot process
    even more. But a meaningful OPCOM message might help to check this
    behaviour.
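
The serialisation idea can be sketched roughly as below. This is a Python analogy only: it uses exclusive creation of a lock file as a stand-in for a clusterwide lock, and a `notify` callback as a stand-in for an OPCOM message. The names `exclusive_lock`, `lock_path`, `timeout`, `poll` and `notify` are all hypothetical, and a real implementation on VMS would presumably rely on RMS or lock manager facilities instead.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path, timeout=30.0, poll=0.1, notify=print):
    """Hold lock_path exclusively, waiting at most timeout seconds for it."""
    deadline = time.monotonic() + timeout
    announced = False
    while True:
        try:
            # O_EXCL makes creation fail if another holder's file exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {lock_path}")
            if not announced:
                # One message per wait, so a slow boot can be explained.
                notify(f"waiting for lock on {lock_path}")
                announced = True
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)
```

A caller would wrap the whole read, modify and write sequence in `with exclusive_lock(...)`. One weakness of this stand-in is that a holder that crashes leaves the lock file behind, which a lock that is released automatically when its holder goes away would avoid.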
    
    	To complete this note I will open an IPMT.
    
    Regards
    Kurt