Conference vaxaxp::vmsnotes

Title: VAX and Alpha VMS
Notice: This is a new VMSnotes, please read note 2.1
Moderator: VAXAXP::BERNARDO
Created: Wed Jan 22 1997
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 703
Total number of notes: 3722

421.0. "Entry numbers in cluster" by TAV02::GALIA (Galia Reznik, Israel Software Support) Sun Apr 06 1997 08:38

    Hi,
    
    I would like to understand how queue entries are numbered in a cluster.
    As I read it, entry numbers for the first Queue Manager in the cluster
    are 1..9999. When those are occupied, entry numbers are raised to
    1000000 through 1009999 (10000..19999 are reserved for the second
    Queue Manager, 20000..29999 for the third, and so on). A minimal
    number of entries in the range 1..9999 must remain free to prevent
    going up to 1000000.
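
    As a rough illustration of the ranges described above, here is a
    minimal Python sketch. It only encodes the numbers quoted in this
    note; the names (normal_range, is_expanded and so on) are
    hypothetical, and nothing here is taken from the queue manager
    itself, whose behaviour in this area may differ.

```python
# Sketch of the entry-number ranges as described in this note.
# Assumption: these figures are the poster's reading, not documented behaviour.
BLOCK_SIZE = 10_000                            # entries per queue manager
FIRST_MANAGER_OVERFLOW = (1_000_000, 1_009_999)  # expanded range, first manager


def normal_range(manager_index):
    """Return (low, high) for a queue manager; index 0 is the first one."""
    low = manager_index * BLOCK_SIZE
    high = low + BLOCK_SIZE - 1
    # The first manager's range is quoted as 1..9999, so entry 0 is excluded.
    return (max(low, 1), high)


def is_expanded(entry_number):
    """True if an entry number is at or above the quoted expansion base."""
    return entry_number >= FIRST_MANAGER_OVERFLOW[0]


if __name__ == "__main__":
    for index in range(3):
        print(index, normal_range(index))
    print(is_expanded(4_000_000))
```

    Under these assumptions, any entry for which is_expanded() is true
    would no longer fit in a 4-digit field.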
    
    Our customer has the following cluster:
    2* VAX V6.1 + VAX V6.2   on CI +
    VAXstation 3100 GPX V6.2 + Alphaserver 2000 4/200 V6.2-1H3 +
    VAXstation 4000 VLC V6.2
    Each computer has its own system disk. The Queue Manager's database
    and master file are on the same disk and all computers point to it 
    (according to $SHOW QUE/MANAGER/FULL). 
    
    Some time ago they had a problem with a looping process submitting
    batch jobs, and they reached entry numbers over 4000000. Since then,
    they claim to go over 1000000 all the time. They don't have many
    'occupied' entries between 1 and 9999, but those empty entries are
    still not used - they see entries over 1000000.
    
    Do I understand the entry mechanism correctly? Why does this happen
    only after they had that looping process, and how can they get rid of
    this problem? (They REALLY want those entries below 1000000.)
    
    Thanks,
    Galia Reznik,
    MCS, Israel.
T.R  Title  User  Personal Name  Date  Lines
421.1  AUSS::GARSON  DECcharity Program Office  Mon Apr 07 1997 00:13  13
    re .0
    
    Suggestions...
    
    Use DIR/TIT
    
    Read Topic 282.
    
    Follow the pointer to the conference referenced therein.
    
    No, I don't know how to persuade the queue manager to go back to using
    small entry numbers once it has gone to big entry numbers. It seems
    like a reasonable thing to want to do though.
421.2  TAV02::GALIA  Galia Reznik, Israel Software Support  Mon Apr 07 1997 03:12  15
    Hi,
    
    Thanks for the prompt answer.
    I read the topics you pointed to, and some others, but I would like
    to make some points clear.
    I understand that if the Queue Manager uses more than 90% of the
    entries in 1..9999, it will expand to 1000000. As I mentioned in .0,
    the customer does not have that many entries in use. They had it ONCE,
    in the PAST, and since then it seems the Queue Manager 'remembers' its
    way up over the millions. At some point it does go back, and it is
    not clear when. 
    Will it help to delete the *journal* file in the database?
    Thanks,
    Galia. 
    
421.3  AUSS::GARSON  DECcharity Program Office  Mon Apr 07 1997 04:32  11
    re .2
    
    The previously referenced conference may be a better place to ask.
    
    I would not delete the journal file unless I didn't care about *any*
    data in the queue system and in that case I would delete the other
    queue files as well and go for a completely clean start. However for all I
    know there may be a very simple and non-risky solution to the problem.
    
    Why does the customer care what the entry number is?
    
421.4  TAV02::GALIA  Galia Reznik, Israel Software Support  Mon Apr 07 1997 09:21  10
    Hi,
    
    The customer has software limited to 4 digits for an entry number - he is
    very much interested in a solution for this, since changing the s/w is
    not applicable.
    
    I will try to ask this in batch-print notes-file.
    
    Thanks,
    Galia. 
421.5  Contact Vendor  XDELTA::HOFFMAN  Steve, OpenVMS Engineering  Mon Apr 07 1997 10:55  10
:    The customer has software limited to 4 digits for an entry number - he is
:    very much interested in a solution for this, since changing the s/w is
:    not applicable.

   Short answer: The software is broken -- it has a bad assumption.

   If memory serves, Computer Associates had a package with this bug.
   In any case, contact the vendor.

421.6  AUSS::GARSON  DECcharity Program Office  Mon Apr 07 1997 19:59  14
re .4
    
>    The customer has software limited to 4 digits for an entry number - he is
>    very much interested in a solution for this, since changing the s/w is
>    not applicable.
    
    As Steve says, this is not a Digital problem. However I would hope that
    if someone within Digital knows how to work around the application's
    problem, you can find it.
    
>    I will try to ask this in batch-print notes-file.
    
    Please remember to read first and ask second. I don't imagine that you
    are the only person to experience this problem.
421.7  some thoughts about simple spool management  COL01::VSEMUSCHIN  Duck and Recover !  Tue Apr 08 1997 04:47  17
    First of all, I would say it is not much work to keep a command
    file that [re]defines all your queues, even if you don't need it
    for SYSTARTUP_VMS. At the very least you can save the output of
    $ SHOW QUEUE/FULL and $ SHOW QUEUE/FORM/FULL each time. Listings
    with the output of those commands could be written in SYSHUTDOWN,
    for example. Those listings could be converted to a command
    procedure with little effort (just a couple of easy TPU procedures).
    With such a tool you can $ START/QUEUE/MANAGER/NEW at any time
    without fear of losing anything. Or you could $ DELETE/QUEUE a queue
    filled with rubbish and redefine it easily.
    
    And if they could be hurt by entry numbers over 9999, they could 
    pay us to write a command file that every 5 (3, 1) minutes
    counts the entries and warns if there are more than 500 of them 
    (yellow alert) or 750 (red alert).
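
    The threshold check itself is simple. A minimal Python sketch of
    the classification, assuming the entry count has already been
    obtained somehow (the function name alert_level and the "ok"
    label are made up for illustration; a real monitor on VMS would
    more likely be a DCL command file):

```python
# Sketch of the yellow/red alert thresholds suggested above.
# Assumption: entry_count is supplied by whatever counts the queue entries.
YELLOW_LIMIT = 500   # more than this many entries: yellow alert
RED_LIMIT = 750      # more than this many entries: red alert


def alert_level(entry_count):
    """Classify a queue entry count as 'ok', 'yellow' or 'red'."""
    if entry_count > RED_LIMIT:
        return "red"
    if entry_count > YELLOW_LIMIT:
        return "yellow"
    return "ok"


if __name__ == "__main__":
    for count in (120, 501, 900):
        print(count, alert_level(count))
```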
    
    =Seva
421.8  TAV02::GALIA  Galia Reznik, Israel Software Support  Tue Apr 08 1997 05:10  20
    Hi,
    
    I DO understand, that it is unwise to rely on the assumption, that
    entry number should be up to 4 digits, but I still did not get an answer
    to my questions: 
    Why the Queue manager climbs up to the millions, when there are plenty
    empty entries in 1..9999? Shouldn't it do so when over 90% of 1..9999
    entries are occupied? Why didn't the customer have such problem BEFORE
    he had his looping process creating entries? Only since that incident,
    when, by mistake, he had lots of entries in the system and had entries
    over a million, does this situation repeat, even though they HAVE
    available entries. It is as if the Queue Manager 'remembers' the last
    entry it had. Where does it 'remember' it? 
    In one of the notes I read a suggestion to shut down the cluster with
    *NO* entries in any queue - then the first entry after the reboot will
    be number 1. Will this clear the problem in your opinion?
    
    Thanks,
    Galia.
                                                            
421.9  BSS::JILSON  WFH in the Chemung River Valley  Tue Apr 08 1997 09:48  18
>    Why the Queue manager climbs up to the millions, when there are plenty
>    empty entries in 1..9999?

If you *REALLY* want to know then examine the source code.  Since the 
software is working and behaviour in this area is not documented, you 
have no valid reason to IPMT the behaviour; you could IPMT the 
documentation to get the behaviour defined.  

If you would like to start completely from scratch I would suggest
1) Stop all queues
2) Use SHOW QUEUE/FULL...  to save all queue, form & characteristic 
   definitions and all pending jobs.
3) STOP/QUEUE/MANAGER/CLUSTER
4) START/QUEUE/MANAGER/NEW
5) Recreate all your queues and resubmit all jobs.


Jilly
421.10  See Previous Discussions...  XDELTA::HOFFMAN  Steve, OpenVMS Engineering  Tue Apr 08 1997 10:23  38
     See 282.* for a previous discussion.  (DIRECT/TITL=QUEUE)

:    I DO understand, that it is unwise to rely on the assumption, that
:    entry number should be up to 4 digits, but I still did not get an answer
:    to my questions: 

    This whole area is largely undocumented, and is subject to change.

    If you're going to IPMT something, IPMT the documentation around
    the need to deal with more than four digits, to try to avoid this
    sort of programming error in the future.  (I'd have the customer
    log the equivalent of an IPMT against the package vendor, however.)

:    Why the Queue manager climbs up to the millions, when there are plenty
:    empty entries in 1..9999? Shouldn't it do so when over 90% of 1..9999
:    entries are occupied?

    The upper digits are currently reserved for the identification of
    the queue manager -- each queue manager has access to ten thousand
    (10,000) entries.   I don't know that this behaviour is documented,
    and I certainly would not depend on it.

:    Why didn't the customer have such problem BEFORE
:    he had his looping process creating entries?

    Based on this, it would appear the looping process "spammed" the
    batch queues, and invoked the expansion processing.  If so, just
    clean out the queues, get rid of the existing queue database, and
    recreate it.  (And complain to the package vendor!)

:    In one of the notes I read a suggestion to shut down the cluster with
:    *NO* entries in any queue - then the first entry after the reboot will
:    be number 1. Will this clear the problem in your opinion?

    I'd recreate the queue data files.  You can try a test of the
    above "clean queue reset", but the above suggestion is not likely
    documented behaviour, and is subject to change.

421.11  Thanks!  TAV02::GALIA  Galia Reznik, Israel Software Support  Tue Apr 08 1997 11:25  3
    Thanks to you all for your answers.
    
    Galia.
421.12  No real help, only a theory here  EVMS::DAVIDB::DMILLER  This bug fix broke what???????  Tue Apr 08 1997 15:17  14
	Are there still entries left in the queues after you get rid
	of all the entries above 9999 and restart the system?

	If so, I'll take a SWAG at why you can't get back down under 10000...

	Perhaps the queue manager starts at the next available number
	when it first starts up.  If you have an entry 9200 still in
	the queue, and the queue manager starts using numbers at 9201,
	you'll quickly (?) get above 10000 again.
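
	To put a number on that guess, here is a small Python sketch. It
	assumes, purely for illustration, that numbering resumes at the
	highest existing entry plus one and never reuses lower free
	numbers; that is the theory above, not known queue manager
	behaviour, and the function name is made up.

```python
# Sketch of the "resume after the highest existing entry" theory.
# Assumption: lower free entry numbers are never reused after a restart.
def entries_before_overflow(highest_existing, limit=9999):
    """Count how many new entries fit before numbering passes `limit`."""
    return max(limit - highest_existing, 0)


if __name__ == "__main__":
    # With entry 9200 left in the queue, new entries would start at 9201.
    print(entries_before_overflow(9200))
```

	Under that assumption, a leftover entry 9200 leaves room for only
	799 more entries (9201 through 9999) before the next one gets
	number 10000.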

	-Dave

	SWAG:	Scientific Wild-Assed Guess