
Conference vaxaxp::vmsnotes

Title: VAX and Alpha VMS
Notice: This is a new VMSnotes, please read note 2.1
Moderator: VAXAXP::BERNARDO
Created: Wed Jan 22 1997
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 703
Total number of notes: 3722

437.0. "availability-based license isn't available to autostart queue after system crash" by HYDRA::NEWMAN (Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26) Wed Apr 09 1997 11:54

A software vendor has a symbiont with an availability-based license PAK.
It starts fine, but if the node on which the queue is running fails, the
license doesn't become available.  The next available AUTOSTART queue fails
(since the license wasn't freed).

They say that they have to restart the crashed node and unload the license.

This sort of eliminates the benefit of AUTOSTART queues :-(

Q:  What needs to be done to make the license available when the crash occurs?

The vendor didn't say what version of OpenVMS.
								-- Chuck Newman
437.1. "More Info Needed..." by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Wed Apr 09 1997 12:20 (10 lines)
   Licenses are automatically freed up on node crash.

   Licenses can be tied (intentionally or otherwise) to
   a single node.

   Please post an example of the LMF command(s) used to
   register the PAK, and please post an example -- without
   the checksum, of course -- of the failing PAK.  (Also
   indicate which LMF database file contains the PAK.)
437.2. "Should probably be an ACTIVITY PAK" by GIDDAY::GILLINGS (a crucible of informative mistakes) Wed Apr 09 1997 20:27 (28 lines)
  Chuck,

    A bit of guesswork. I'm assuming you have a PAK with only sufficient units
  for one node in the cluster. Since it's an AVAILABILITY PAK, the unit
  requirements are tested at the time the PAK is LOADed. So, assuming there
  are no /EXCLUDE or /INCLUDE lists, the first node to attempt to LOAD the
  PAK will succeed, subsequent node(s) will fail.

>They say that they have to restart the crashed node and unload the license.

    There shouldn't be any need to restart the crashed node; however, it
  will probably be necessary to LOAD the PAK on the node which is supposed to
  run the symbiont.

>Q:  What needs to be done to make the license available when the crash occurs?

    I can see three options:

  1) Manually LOAD the PAK on one of the surviving nodes. If you're on a
     recent enough version of OpenVMS, you may be able to automate this
     using the new cluster event services.

  2) Get an ACTIVITY based PAK which can be LOADed on all systems in the
     cluster. Failover will then be automatic.

  3) Purchase sufficient AVAILABILITY units to cover all nodes in the cluster.

						John Gillings, Sydney CSC
437.3. by AUSS::GARSON (DECcharity Program Office) Wed Apr 09 1997 23:10 (7 lines)
    In addition to .2:
    
    A possible (untested) hack would be to change the print queue so that
    it does not autostart but to create a batch queue that does autostart.
    The sole permanent job on the batch queue would load the licence and
    start the real printer queue on the current node (and then wait for the
    system to crash). The batch job would need to be restartable.
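The ordering of steps in that batch job can be sketched as below. This is only an illustration in Python, not DCL and not tested on a cluster: `failover_job` and the three callables are hypothetical stand-ins for whatever the site would actually run (for example a LICENSE LOAD followed by starting the real printer queue). Skipping the queue start when the load fails is an assumption of this sketch, not something stated above.

```python
def failover_job(node, load_license, start_print_queue, wait_until_stopped):
    """Body of the single permanent job on the autostart batch queue.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    # Step 1: try to load the licence on the node the job restarted on.
    if not load_license(node):
        # Assumption: without the licence there is no point starting the queue.
        return "license-unavailable"
    # Step 2: start the real (non-autostart) printer queue on this node.
    start_print_queue(node)
    # Step 3: sit idle until the node goes away; because the job is
    # restartable, the queue manager would rerun it on another node.
    wait_until_stopped()
    return "stopped"


# Small demonstration with fake steps that only record what was called.
events = []


def fake_load(node):
    events.append(("load", node))
    return True


def fake_start(node):
    events.append(("start", node))


def fake_wait():
    events.append(("wait",))


result = failover_job("NODEB", fake_load, fake_start, fake_wait)
```

The point of the ordering is that the licence is claimed before the printer queue exists, so the symbiont's own licence check never runs on a node that has not loaded the PAK.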
437.4. "ACTIVITY pak with 1 UNIT?" by HYDRA::NEWMAN (Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26) Fri Apr 11 1997 15:27 (20 lines)
I heard back from the developer.  He confirmed that the problem has nothing
to do with nodes crashing and that they are using a common license
database file.

He says (among other things):
--------------------------------------------------------------------------------
The license is an availability license, with the availability
table code set to CONSTANT=1, and the units set to 1.  This will allow
the license to be loaded on a single node only.

Perhaps we are generating this type of license incorrectly.  I've been checking
out the LMF and the License Management Utility docs, but I'm not finding what
I'm looking for.  What we want (and thought we had) was a license PAK that
allowed the symbiont to run on a single node of a cluster, and would work with
failover.  That is, our customers could use this license on a cluster, and have
their queues be AUTOSTART queues.  
--------------------------------------------------------------------------------
Sounds like (thanks, John) an ACTIVITY pak with 1 UNIT will do it [Y/N]?

								-- Chuck Newman
437.5. by HYDRA::NEWMAN (Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26) Fri Apr 11 1997 18:06 (35 lines)
Looks like an ACTIVITY pak won't work for one of their models (multiple
processes, but only on one node in the cluster).

Here's what he writes:
----------------------------------------
What we want is a license PAK that will allow unlimited processes (queues)
running on a single processor.  If we went with an ACTIVITY license PAK, it
would require us to limit the number of processes (the UNITS number), or allow
unlimited processes running on ALL nodes in a cluster.  We offer customers
ACTIVITY based licenses that do just this, but we also offer a single processor,
unlimited queues license.  

If it is possible to generate an ACTIVITY license that limits use to a single
processor, and unlimited processes, I'm all ears.  From what I know, this isn't
possible - we need an AVAILABILITY license.  
----------------------------------------

He implies that he needs to unload the license from the system that was using
it before he can load it onto another system.  If that's the case, should I
enter a QAR?

I'll propose the alternative in reply .3.  Would the following also work:

Before doing the license check, his stuff could attempt to load the license.
If the license load fails, it would just exit.  As long as there is a valid license
on the cluster, one of the systems should be able to load it and start up.

Alternatively, his stuff could start a queue on every node, and do the following:
1)  Do some T.B.D. locking stuff
2)  Do license check
3)  If failure, do locking stuff so that it will get notified when
	a functional (i.e., passed license check) queue fails.
4)  If success, go to town.

Sound reasonable?
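The four steps above can be sketched as a small single-process simulation. It is only a model of the idea: `StandbyGroup`, its methods, and the injected `license_check` callable are hypothetical names, and the "lock" here is just a field, where a real implementation would presumably use the cluster lock manager.

```python
from collections import deque


class StandbyGroup:
    """Toy model: one active queue per cluster, the rest wait to be notified."""

    def __init__(self, license_check):
        self.license_check = license_check  # hypothetical callable(node) -> bool
        self.active = None                  # node whose queue passed the check
        self.waiting = deque()              # nodes waiting for a failure notice

    def start(self, node):
        # Steps 1 and 2: take the (simulated) lock, then do the license check.
        if self.active is None and self.license_check(node):
            self.active = node              # step 4: success, go to town
            return "active"
        self.waiting.append(node)           # step 3: wait to be notified
        return "waiting"

    def fail(self, node):
        """Report that a node's queue (or the node itself) has failed."""
        if node in self.waiting:
            self.waiting.remove(node)
            return self.active
        if node != self.active:
            return self.active
        self.active = None
        # Notify every waiter; the first one to pass the check takes over
        # and the others go back to waiting.
        pending = list(self.waiting)
        self.waiting.clear()
        for candidate in pending:
            self.start(candidate)
        return self.active


group = StandbyGroup(license_check=lambda node: True)
states = [group.start(n) for n in ("NODEA", "NODEB", "NODEC")]
new_active = group.fail("NODEA")
```

In this model the waiters are retried in the order they started, which is an arbitrary choice; a real cluster would need some policy for which node takes over.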
437.6. by AUSS::GARSON (DECcharity Program Office) Mon Apr 14 1997 00:10 (66 lines)
re .5
    
    In summary, I don't think that either AVAILABILITY licences or ACTIVITY
    licences will do what the customer wants. To some extent, the licence
    behaviours available reflect Digital's licensing model and if that
    departs too radically from the customer's then the customer will have
    to implement in whole or in part his own licensing system. In practice
    only a small amount of coding should be needed to get the customer's
    desired licensing behaviour. Suggestions below.
    
    I am not convinced that licences tied to a single node (at any one
    time) make sense for a queue failover situation if there are more
    than two nodes in the cluster. That is, it must be assumed that any
    customer of your customer who purchases such a licence configures all
    queues to fail over identically.
    
    AVAILABILITY licences just don't fail over in the way the customer would
    like. Even if they did fail over, either automatically or manually, there
    is a race condition that would be tricky to handle. And in a cluster of
    more than two nodes, additional sophistication would be needed to
    ensure that a licence failed over to the right node. These are not
    problems that apply to the customer's specific situation but do apply
    if Digital were to provide a general failover capability for AVAILABILITY
    licences and may explain why we do not provide it now.
    
    ACTIVITY licences don't work the way the customer would like either.
    The model for such a licence is that it allows a certain amount of
    activity and this activity can occur without restraint anywhere in the
    cluster.
    
    It is important not to lose sight of the fact that LMF is not an
    enforcement tool. At a certain point the customer might find it more
    practical to encode the Ts&Cs in the licence agreement and not in the
    software.
    
>He implies that he needs to unload the license from the system that was using
>it before he can load it onto another system.  If that's the case, should I
>enter a QAR?
    
    I would suggest getting a log showing exactly what the customer is
    doing, what the licences look like (sans checksums of course) and what
    is loaded where - if this can't be resolved.
    
    Certainly, if a node has a licence loaded and the node does not fail
    then no other node can load the licence (assuming only sufficient units
    for one node at a time). If the node with the licence loaded fails then
    any other node should be able to load the licence and it doesn't make
    sense to say that the licence must be unloaded from the node that had
    the licence loaded (before the node crashed) since that node may never
    again live.
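The expected behaviour in the paragraph above can be written down as a toy model. This is a sketch of the rules as described in this thread, not of LMF internals: `AvailabilityLicense`, `units_per_node`, and the node names are all made up for illustration.

```python
class AvailabilityLicense:
    """Toy model of a PAK with a fixed number of availability units."""

    def __init__(self, units, units_per_node):
        self.units = units                    # units on the PAK
        self.units_per_node = units_per_node  # units each loading node consumes
        self.loaded_on = set()                # nodes currently holding the licence

    def load(self, node):
        if node in self.loaded_on:
            return True
        # Units are tested at load time, against everything already loaded.
        needed = (len(self.loaded_on) + 1) * self.units_per_node
        if needed > self.units:
            return False
        self.loaded_on.add(node)
        return True

    def node_crashed(self, node):
        # Per .1, a crashed node's licence is freed; no explicit unload on
        # the dead node is modelled or needed here.
        self.loaded_on.discard(node)


pak = AvailabilityLicense(units=1, units_per_node=1)
first = pak.load("NODEA")         # first node to load gets the licence
second = pak.load("NODEB")        # not enough units for a second node
pak.node_crashed("NODEA")
after_crash = pak.load("NODEB")   # survivor can now load it
```

If the customer's log shows the last load failing after the first node has really left the cluster, that would be the behaviour worth reporting.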

>Before doing the license check, his stuff could attempt to load the license.
>If the license load fails, it would just exit.
    
    Reasonably creative. Does LMF give an interface for loading licences?
    Was the customer intending to SPAWN the LIC LOAD? I would suggest
    ignoring the result of loading the licence and just trying to grab it. In
    this case, the AVAILABILITY licence would be appropriate.
    
    Another option would be to introduce a master process of which there is
    zero or one per node - one when failover to that node occurs. Any
    symbionts would ask the master process whether the licence was OK. The
    master process could itself be implemented as a symbiont so that it
    failed over with all the others. In this case, the ACTIVITY licence
    with units enough for one concurrent activity would be appropriate.
               
437.7. "a currently out of reach solution" by STAR::ABIS (I come in peace) Mon Apr 14 1997 12:48 (29 lines)
All the previous replies offer good ideas and suggestions.  As a matter of
completeness, I thought I'd mention that LMF V1.2 (shipping with VMS 7.1) has
some features that could  be useful to this application.  Unfortunately, many
of these features are currently unusable due to Digital Business restrictions.

Whatever solution is chosen, LMF alone can't make this work.  Additional code
must be added to the customer's product to make it work.

LMF V1.2 has a new license type, which I called a USER license, that allows
unlimited use by one user.  When that user is done with the license, it
automatically becomes available (like a failover) to another user in the cluster.
Unlike the currently available personal use license, the new USER license
doesn't have to be manually assigned via a reserve list to a specific user.

For this queueing product to use this license type, it would have to pass in
the name of the node it's running on into the LMF during its license check. 
LMF would then allow unlimited use on that node.  In this situation, we would
size the license for just one node (user) and you would have your ideal
solution.

Unfortunately, since the Digital Business folks didn't want to do the work
required to offer this type of license, I'm the only one that can generate the
USER type license PAK.  For third parties to use this license type, a new
release of the PAKGEN product would be required and Digital doesn't want to do
that either. 

Maybe if there was customer demand ...

Eric