T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
437.1 | More Info Needed... | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Wed Apr 09 1997 12:20 | 10 |
|
Licenses are automatically freed up on node crash.
Licenses can be tied (intentionally or otherwise) to
a single node.
Please post an example of the LMF command(s) used to
register the PAK, and please post an example -- without
the checksum, of course -- of the failing PAK. (Also
indicate which LMF database file contains the PAK.)
|
437.2 | Should probably be an ACTIVITY PAK | GIDDAY::GILLINGS | a crucible of informative mistakes | Wed Apr 09 1997 20:27 | 28 |
| Chuck,
A bit of guesswork. I'm assuming you have a PAK with only sufficient units
for one node in the cluster. Since it's an AVAILABILITY PAK, the unit
requirements are tested at the time the PAK is LOADed. So, assuming there
are no /EXCLUDE or /INCLUDE lists, the first node to attempt to LOAD the
PAK will succeed; subsequent node(s) will fail.
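The load-time behaviour described above can be sketched as a toy model. This is only an illustration in Python, not LMF itself: the class `AvailabilityPak`, its methods and the node names are all hypothetical, and the release-on-crash step simply mirrors the statement in .1 that licenses are freed when a node crashes.

```python
from dataclasses import dataclass, field


@dataclass
class AvailabilityPak:
    """Toy model of one availability PAK in a shared license database."""
    units: int                                   # units on the PAK
    loaded: dict = field(default_factory=dict)   # node name -> units charged

    def load(self, node: str, units_required: int) -> bool:
        """Charge units at load time; refuse if too few remain."""
        if node in self.loaded:
            return True                          # already loaded on this node
        if sum(self.loaded.values()) + units_required > self.units:
            return False
        self.loaded[node] = units_required
        return True

    def node_left_cluster(self, node: str) -> None:
        """Units held by a departed node are released."""
        self.loaded.pop(node, None)


pak = AvailabilityPak(units=1)
first = pak.load("NODEA", 1)     # first node to load gets the only unit
second = pak.load("NODEB", 1)    # second node finds no units left
pak.node_left_cluster("NODEA")   # NODEA crashes
retry = pak.load("NODEB", 1)     # a fresh load attempt on NODEB now works
```

Note that in this model nothing reloads the PAK on NODEB automatically; a new load attempt is needed after the crash, which is the gap the options below try to close.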
>They say that they have to restart the crashed node and unload the license.
There shouldn't be any need to restart the crashed node; however, it
will probably be necessary to LOAD the PAK on the node which is supposed to
run the symbiont.
>Q: What needs to be done to make the license available when the crash occurs?
I can see three options:
1) Manually LOAD the PAK on one of the surviving nodes. If you're on a
recent enough version of OpenVMS, you may be able to automate this
using the new cluster event services.
2) Get an ACTIVITY based PAK which can be LOADed on all systems in the
cluster. Failover will then be automatic.
3) Purchase sufficient AVAILABILITY units to cover all nodes in the cluster.
John Gillings, Sydney CSC
|
437.3 | | AUSS::GARSON | DECcharity Program Office | Wed Apr 09 1997 23:10 | 7 |
| additional to .2
A possible (untested) hack would be to change the print queue so that
it does not autostart but to create a batch queue that does autostart.
The sole permanent job on the batch queue would load the licence and
start the real printer queue on the current node (and then wait for the
system to crash). The batch job would need to be restartable.
|
437.4 | ACTIVITY pak with 1 UNIT? | HYDRA::NEWMAN | Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26 | Fri Apr 11 1997 15:27 | 20 |
| I heard back from the developer. He confirmed that the problem has nothing
to do with nodes crashing and that they are using a common license
database file.
He says (among other things):
--------------------------------------------------------------------------------
The license is an availability license, with the availability
table code set to CONSTANT=1, and the units set to 1. This will allow
the license to be loaded on a single node only.
Perhaps we are generating this type of license incorrectly. I've been checking
out the LMF and the License Management Utility docs, but I'm not finding what
I'm looking for. What we want (and thought we had) was a license PAK that
allowed the symbiont to run on a single node of a cluster, and would work with
failover. That is, our customers could use this license on a cluster, and have
their queues be AUTOSTART queues.
--------------------------------------------------------------------------------
Sounds like (thanks, John) an ACTIVITY pak with 1 UNIT will do it [Y/N]?
-- Chuck Newman
|
437.5 | | HYDRA::NEWMAN | Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26 | Fri Apr 11 1997 18:06 | 35 |
| Looks like an ACTIVITY pak won't work for one of their models (multiple
processes, but only on one node in the cluster).
Here's what he writes:
----------------------------------------
What we want is a license PAK that will allow unlimited processes (queues)
running on a single processor. If we went with an ACTIVITY license PAK, it
would require us to limit the number of processes (the UNITS number), or allow
unlimited processes running on ALL nodes in a cluster. We offer customers
ACTIVITY based licenses that do just this, but we also offer a single processor,
unlimited queues license.
If it is possible to generate an ACTIVITY license that limits use to a single
processor, and unlimited processes, I'm all ears. From what I know, this isn't
possible - we need an AVAILABILITY license.
----------------------------------------
He implies that he needs to unload the license from the system that was using
it before he can load it onto another system. If that's the case, should I
enter a QAR?
I'll propose the alternative in reply .3. Would the following also work:
Before doing the license check, his stuff could attempt to load the license.
If the license load fails, it would just exit. As long as there is a valid license
on the cluster, one of the systems should be able to load it and start up.
Alternatively, his stuff could start a queue on every node, and do the following:
1) Do some T.B.D. locking stuff
2) Do license check
3) If failure, do locking stuff so that it will get notified when
a functional (i.e., passed license check) queue fails.
4) If success, go to town.
Sound reasonable?
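One possible reading of the four steps above, as a minimal Python sketch. The "T.B.D. locking stuff" is filled in here with a hypothetical `ClusterLock` class standing in for a cluster-wide exclusive lock; it is not a real lock manager interface, and `license_check` is a placeholder for whatever check the product really makes. In this version the lock is taken first, so only the lock holder performs the license check and the other nodes wait in a queue.

```python
from collections import deque


class ClusterLock:
    """Hypothetical stand-in for a cluster-wide exclusive lock."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()

    def request(self, node):
        """Grant the lock if it is free; otherwise queue the node."""
        if self.holder is None:
            self.holder = node
        elif node != self.holder and node not in self.waiters:
            self.waiters.append(node)
        return self.holder == node

    def release(self, node):
        """Holder exited or crashed: pass the lock to the next waiter."""
        if self.holder == node:
            self.holder = self.waiters.popleft() if self.waiters else None
        elif node in self.waiters:
            self.waiters.remove(node)
        return self.holder


def start_symbiont(node, lock, license_check):
    """Take the lock, check the license, then run or stand by."""
    if not lock.request(node):
        return "standby"          # queued behind the current holder
    if not license_check(node):
        lock.release(node)        # let the next waiter try
        return "no license"
    return "running"


lock = ClusterLock()
always_ok = lambda node: True     # placeholder for the real license check
states = [start_symbiont(n, lock, always_ok) for n in ("NODEA", "NODEB", "NODEC")]
next_holder = lock.release("NODEA")             # NODEA crashes
after_failover = start_symbiont(next_holder, lock, always_ok)
```

The sketch leaves out how a waiting node learns that it has been granted the lock; on a real cluster that notification is the part that still has to be designed.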
|
437.6 | | AUSS::GARSON | DECcharity Program Office | Mon Apr 14 1997 00:10 | 66 |
| re .5
In summary, I don't think that either AVAILABILITY licences or ACTIVITY
licences will do what the customer wants. To some extent, the licence
behaviours available reflect Digital's licensing model and if that
departs too radically from the customer's then the customer will have
to implement in whole or in part his own licensing system. In practice
only a small amount of coding should be needed to get the customer's
desired licensing behaviour. Suggestions below.
I am not convinced that licences tied to a single node (at any one
time) make sense for a queue failover situation if there are more
than two nodes in the cluster. That is, it must be assumed that any
customer of your customer who purchases such a licence configures all
queues to fail over identically.
AVAILABILITY licences just don't fail over in the way the customer would
like. Even if they did fail over, either automatically or manually, there
is a race condition that would be tricky to handle. And in a cluster of
more than two nodes, additional sophistication would be needed to
ensure that a licence failed over to the right node. These are not
problems that apply to the customer's specific situation but do apply
if Digital were to provide a general failover capability for AVAILABILITY
licences and may explain why we do not provide it now.
ACTIVITY licences don't work the way the customer would like either.
The model for such a licence is that it allows a certain amount of
activity and this activity can occur without restraint anywhere in the
cluster.
It is important not to lose sight of the fact that LMF is not an
enforcement tool. At a certain point the customer might find it more
practical to encode the Ts&Cs in the licence agreement and not in the
software.
>He implies that he needs to unload the license from the system that was using
>it before he can load it onto another system. If that's the case, should I
>enter a QAR?
I would suggest getting a log showing exactly what the customer is
doing, what the licences look like (sans checksums of course) and what
is loaded where - if this can't be resolved.
Certainly, if a node has a licence loaded and the node does not fail
then no other node can load the licence (assuming only sufficient units
for one node at a time). If the node with the licence loaded fails then
any other node should be able to load the licence and it doesn't make
sense to say that the licence must be unloaded from the node that had
the licence loaded (before the node crashed) since that node may never
again live.
>Before doing the license check, his stuff could attempt to load the license.
>If the license load fails, it would just exit.
Reasonably creative. Does LMF give an interface for loading licences?
Was the customer intending to SPAWN the LIC LOAD? I would suggest
ignoring the result of loading the licence and just trying to grab it. In
this case, the AVAILABILITY licence would be appropriate.
Another option would be to introduce a master process of which there is
zero or one per node - one when failover to that node occurs. Any
symbionts would ask the master process whether the licence was OK. The
master process could itself be implemented as a symbiont so that it
failed over with all the others. In this case, the ACTIVITY licence
with units enough for one concurrent activity would be appropriate.
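The master-process idea can be illustrated with a small Python model. Everything here is hypothetical (`ActivityLicense`, `Master`, the node names); it only shows the shape of the arrangement, assuming an activity licence sized for one concurrent activity that the master claims on behalf of its node.

```python
class ActivityLicense:
    """Toy activity licence allowing `concurrent` simultaneous holders."""

    def __init__(self, concurrent):
        self.concurrent = concurrent
        self.holders = set()

    def grant(self, holder):
        if holder in self.holders:
            return True
        if len(self.holders) >= self.concurrent:
            return False
        self.holders.add(holder)
        return True

    def release(self, holder):
        self.holders.discard(holder)


class Master:
    """One per node at most; claims the single activity for that node."""

    def __init__(self, node, licence):
        self.node = node
        self.licensed = licence.grant(node)

    def licence_ok(self, symbiont_node):
        # Symbionts on the master's own node share its single grant.
        return self.licensed and symbiont_node == self.node


licence = ActivityLicense(concurrent=1)
master_a = Master("NODEA", licence)
a_ok = master_a.licence_ok("NODEA")         # any number of symbionts may ask
b_early = Master("NODEB", licence).licence_ok("NODEB")  # NODEA still holds it
licence.release("NODEA")                    # NODEA fails, activity is freed
b_late = Master("NODEB", licence).licence_ok("NODEB")   # master fails over
```

Because the symbionts only ask the master, the number of queues on the licensed node is not limited by the licence units.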
|
437.7 | a currently out of reach solution | STAR::ABIS | I come in peace | Mon Apr 14 1997 12:48 | 29 |
| All the previous replies offer good ideas and suggestions. As a matter of
completeness, I thought I'd mention that LMF V1.2 (shipping with VMS 7.1) has
some features that could be useful to this application. Unfortunately, many
of these features are currently unusable due to Digital Business restrictions.
Whatever solution is chosen, LMF alone can't make this work. Additional code
must be added to the customer's product to make it work.
LMF V1.2 has a new license type, which I called a USER license, that allows
unlimited use by a single user. When that user is done using the license, it
automatically becomes available (like a failover) to another user in the cluster.
Unlike the currently available personal use license, the new USER license
doesn't have to be manually assigned via a reserve list to a specific user.
For this queueing product to use this license type, it would have to pass
the name of the node it's running on to LMF during its license check.
LMF would then allow unlimited use on that node. In this situation, we would
size the license for just one node (user) and you would have your ideal
solution.
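As a rough illustration of the behaviour described above, here is a toy Python model in which the "user" name is the node name. It is not the LMF V1.2 interface; `UserLicense`, `check` and `done` are hypothetical names, and the use counting is an assumption about how "done using the license" might be detected.

```python
from collections import Counter


class UserLicense:
    """Toy model: unlimited use for up to `size` distinct names at once."""

    def __init__(self, size):
        self.size = size
        self.uses = Counter()     # name -> number of active uses

    def check(self, name):
        """Allow any number of uses by a name that already holds a slot."""
        if name not in self.uses and len(self.uses) >= self.size:
            return False
        self.uses[name] += 1
        return True

    def done(self, name):
        """Drop one use; the slot is freed when the last use ends."""
        if self.uses[name] > 1:
            self.uses[name] -= 1
        else:
            self.uses.pop(name, None)


lic = UserLicense(size=1)         # sized for just one node
queue1 = lic.check("NODEA")       # first queue on NODEA
queue2 = lic.check("NODEA")       # second queue on the same node
other = lic.check("NODEB")        # a different node is refused
lic.done("NODEA")
lic.done("NODEA")                 # NODEA has finished with the license
moved = lic.check("NODEB")        # the license is now available to NODEB
```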
Unfortunately, since the Digital Business folks didn't want to do the work
required to offer this type of license, I'm the only one that can generate the
USER type license PAK. For third parties to use this license type, a new
release of the PAKGEN product would be required and Digital doesn't want to do
that either.
Maybe if there was customer demand ...
Eric
|