
Conference iosg::all-in-1_v30

Title:*OLD* ALL-IN-1 (tm) Support Conference
Notice:Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:IOSG::PYE
Created:Thu Jan 30 1992
Last Modified:Tue Jan 23 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:4343
Total number of notes:18308

2429.0. "XF- Your current document cannot be established" by LEMAN::64652::REGINA (Carrie's in the carrot land) Wed Mar 17 1993 17:53

ALL-IN-1 V3.0 unpatched. 3 node cluster. On 2 of the machines, all of a 
sudden we can no longer do an XF. It works through the first mail, and 
then fails with "Your current document cannot be established". There 
haven't been any changes to the cluster, all systems up for the same 
time. Node A was rebooted today to see if it helps, to no avail.

We have checked obvious things like global sections/pages, file 
buffering etc. By comparing trace logs it seems that the failing nodes 
go elsewhere in the code than the working one (even if the trace 
doesn't reveal a difference until then). It checks for being a maildrawer 
or not. The logs are in GVPROD::NODEA.LOG (failing one) and NODEC.LOG.
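
A minimal sketch of that kind of log comparison, assuming both traces have already been read into lists of lines. The function name and the sample lines are invented for illustration; they are not taken from NODEA.LOG or NODEC.LOG.

```python
from itertools import zip_longest


def first_divergence(lines_a, lines_b):
    """Return (line_number, line_a, line_b) for the first differing line, or None."""
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a != b:
            return number, a, b
    return None


# Invented placeholder lines, standing in for a failing and a working trace.
failing = ["step one", "step two", "error path"]
working = ["step one", "step two", "normal path", "step four"]

print(first_divergence(failing, working))
```

Real traces carry document numbers and keys that differ from run to run, so in practice the lines would probably need normalising before a comparison like this is useful.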

Thanks for any hint.
Regina
T.R  Title  User  Personal Name  Date  Lines
2429.6. "Working . . ." by IOSG::SHOVE (Dave Shove -- REO2-G/M6) Mon Mar 22 1993 17:38 (9 lines)
    There's no special .EXE or anything that XF uses. (In some cases it'll
    use the file cab server, but you reckon there's no problem with that.)
    It's all done with the (somewhat involved!) Named Data on form
    EM$INDEX$OPTIONS. Basically it just loops through your selections doing
    MAIL FORWARDs.
    
    I'm copying your .log files now . . .
    
    D.
2429.10. "No official support line available, desperate..." by LEMAN::PUNKIE::REGINA (Carrie's in the carrot land) Wed Mar 31 1993 11:42 (21 lines)
Ok,

we tried. But: official channels say no (I guess they don't have the 
resources) because we are not a customer. How I loved the hotline in Valbonne 
some years ago!

The problem appeared after an upgrade to VMS 5.5-2. The problem appears on 2 
nodes which are 6000-430. The problem does not appear on the third node 8800.

Has anyone anywhere a 6000-430 with 5.5-2 and ALL-IN-1 V3.0 unpatched?  

As a side info, the cluster *HAS* been rebooted after the VMS upgrade and the 
uptime of 370 + days is a feature somewhere (dixit the system manager). In 
the meantime I also recompiled and reinstalled OA$MAIN, as the *MTHRTL.EXE 
has a new version in VMS 5.5-2 (lots of shots in the dark ...). To no avail, 
you guessed it.

Sorry to be a pain, but this notes file seems to be the only available 
resource for our problem.
/rhr

2429.11. "works for 1 only (unpatched 3.0)" by FORTY2::ASH (Grahame Ash @REO) Wed Mar 31 1993 12:22 (10 lines)
and I can't be any help either. Our cluster has 3 different 6000s, 
unfortunately not one the same as yours. XF on OUTBOX 'works' in exactly the 
same way on all 3 systems - it processes the first message successfully, and 
then stops, as if I'd only selected F, not XF. No error messages.

I remember during 2.3 development we debated long and hard about what XF would 
do (and if anyone actually wanted it!). I can't remember what we decided, but 
I'm sure it wasn't 'just process one message'.

grahame
2429.12. "OK on our 6000-340 with 5.5-2" by IOSG::SHOVE (Dave Shove -- REO2-G/M6) Wed Mar 31 1993 17:51 (19 lines)
    Well, one of our production systems is a 6000-340 running 5.5-2
    (TRON::); it is however running a "prototype of a possible future
    release" (nudge nudge!)
    
    On that system, XF works fine, as it does on the 8800 in the same
    cluster.
    
    So it appears that whatever it was, we've fixed it.
    
    Unfortunately, there's no record in our change history of our having
    deliberately fixed it, so it must have been as a side-effect of another
    change.
    
    It might be worth seeing if it's reproducible on 3.0-1, or getting
    someone to do so. Then at least you'd know if you'll be able to fix it
    by installing the patch.
    
    Sorry again,
    D.
2429.14. "I don't think it's VMS or VAX" by FORTY2::ASH (Grahame Ash @REO) Thu Apr 08 1993 11:08 (16 lines)
Hi Regina,

Yes, we're all on 5.5-2.

In practice it's very rare for ALL-IN-1 to react as differently as this after 
a VMS or CPU upgrade - I can't remember ever seeing a problem that was that 
specific to the operating environment.

I'd say that it's more likely that there's something wrong with the ALL-IN-1 
setup or startup on the failing machines. Are all the relevant images and 
sections installed (check using $INSTALL)? Has A1V3START.COM completed 
successfully? (And there must be other similar things!)

Good luck,

grahame
2429.15. "INSTALL and SYSGEN has been checked for the obvious" by LEMAN::PUNKIE::REGINA (Carrie's in the carrot land) Tue Apr 13 1993 11:36 (15 lines)
Hmmm..

I *DID* check with INSTALL, went one by one through all ALL-IN-1 images with 
their privileges as well as all other installed images, and the only probably 
relevant differences I came up with were MTHRTL and VMTHRTL, which are normally 
different between a 6000 series and an 8000 series machine.

Remember, only XF refuses to work.

One last question: what is the microcode revision of your 6000-430? Do an
ANA/ERR, I am interested in CPU and console revision.

Thanks and regards
Regina

2429.16. "Here's what our (TRON's) ANAL/ERR says" by IOSG::SHOVE (Dave Shove -- REO2-G/M6) Tue Apr 13 1993 12:37 (5 lines)
    TIME STAMP  KA62B  CPU FW REV# 6.  CONSOLE FW REV# 8.0
    
    Hope it helps - means nothing to me.
    
    D.
2429.17. "My system wants to play too - unfortunately!" by SUBURB::A1_CRAN (Caroline (A1_Cran) Croft) Wed Jun 09 1993 13:28 (26 lines)
    
    
    Hi,
    
    I suddenly seem to be getting this problem too on our node SUBURB.
    The cluster has 4 nodes, 2 6520's & 2 6340's. If I link on the 6340
    then the XF will work on the 2 6340's but NOT the 6520's. If I link on
    the 6520's it won't work on any of the nodes.
    
    I am almost positive this happened after I installed the a1_eco01030
    patch, although I also installed the a1_fix009030 at the same time. 
    
    Has anyone got any ideas, because I am at a full stop now!!!
    
    Regards
    Caroline.
2429.18. "XF- Your current document cannot be established" by KERNEL::SIMPSONR (fred) Thu Jun 10 1993 14:43 (85 lines)
 Hello,

 Here is some more information on Caroline's problem (2429.17).
 It is only reproducible on the one cluster; we have performed tests on two
 other machines, one of which has exactly the same hardware setup. The only
 apparent difference is that the cluster that does not work was originally
 V2.4 and then became an Official V3.0 Field Test site. All the ALL-IN-1
 systems are patched to V3.0-1.

 The system has four machines, of which the problem only occurs on two. To
 reproduce the problem, log on to one of the nodes that does not work, create
 three files in WP, then go to EM, bring up an index and select them, then do
 an XF and <RETURN>. This will bring up the following message box:

         +---------------------------------------------------------+
         |Your xf operation failed on document entitled XF Test2 in|
         |folder XF TEST.  Your current document cannot be         |
         |established.                                             | 
         |                                                         |
         |                                                         |
         |Press RETURN to continue or press EXIT SCREEN to abandon.|
         +---------------------------------------------------------+

 On pressing RETURN and 'Y' to send, a mail is sent, but contains only the last
 document to be forwarded and the message text. The ALL-IN-1 system is on the
 network so that if somebody from engineering can find time to log in and see
 the problem at first hand please get in touch with Caroline Croft (774 6254)
 for details.

 I also compared ALL-IN-1 traces and there appear to be no differences except
 for the error messages. Below are some snippets from the A1TRACE.LOG from the
 node that has the problem:


![IO]     Getting field DELETE from DOCDB, Value: Y
![IO]     Getting field MODIFY from DOCDB, Value: Y
![SYMBOL] Symbol: OA$CURDOC, Value: XF TEST                       274897
![IO]     Getting record from CAB$PDAF, Key: [.DOC7]ZUSIMOBLI.WPL, Key-of-ref: D
!               AF_KEY/0
![IO]     Getting field FORWARDABLE from CAB$, Value:
![SYMBOL] Symbol: CAB$.FORWARDABLE[OA$CURDOC], Value:
![A1LOG]  Entry: %OA-I-LOGERROR, %OA-W-CAB_NEED_CURDOC, Your current document ca
!               nnot be established
![IO]     Getting record from DOCDB, Key: CREATED                       725097,
!               Key-of-ref: DOCUMENT/0



![A1LOG]  Entry: %OA-I-LOGFUN, Function: IFNOTSTATUS
![FUNC]   Function: XOP, Cmd line: "~~TEST_FOR_X_ERROR~~"
![A1LOG]  Entry: %OA-I-LOGFUN, Function: XOP             "~~TEST_FOR_X_ERROR~~"
![SYMBOL] Symbol: "~~TEST_FOR_X_ERROR~~", Value: ~~TEST_FOR_X_ERROR~~
![SYMBOL] Symbol: EM$INDEX$OPTIONS, Value:
![FUNC]   Function: GET, Cmd line: #X_TEXT = OA$MSG_TEXT
![A1LOG]  Entry: %OA-I-LOGFUN, Function: GET             #X_TEXT = OA$MSG_TEXT
![SYMBOL] Symbol: #X_TEXT = OA$MSG_TEXT, Value: Your current document cannot be
!               established
![FUNC]   Function: OA$MSG_PURGE, Cmd line:
![A1LOG]  Entry: %OA-I-LOGFUN, Function: OA$MSG_PURGE



![SYMBOL] Symbol: #X_KEY, Value: XF TEST                       274897
![IO]     Getting record from DOCDB, Key: XF TEST                       725103,
!               Key-of-ref: DOCUMENT/0
![IO]     Getting field TYPE from DOCDB, Value: DOCUMENT
![IO]     Getting field MAIL_STATUS from DOCDB, Value:
![IO]     Getting field DELETE from DOCDB, Value: Y
![IO]     Getting field MODIFY from DOCDB, Value: Y
![FUNC]   Function: SCRIPT, Cmd line: GENERAL_CURDOC_ERROR
![A1LOG]  Entry: %OA-I-LOGFUN, Function: SCRIPT          GENERAL_CURDOC_ERROR
!
![SCRIPT] Opening script GENERAL_CURDOC_ERROR (SCRIPT)
![IVP]    Script GENERAL_CURDOC_ERROR opened
![SCRIPT] Function nesting level: 0. Script context follows:
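
 For anyone scanning a long A1TRACE.LOG for this kind of entry, here is a rough sketch that re-joins wrapped lines and pulls out the warning-or-worse %OA codes from [A1LOG] entries. It assumes the layout seen in the snippets above: each entry starts with ![TAG], and a wrapped line continues on a line starting with "!" plus padding. The sample is copied from the first snippet; the function names are made up for this example.

```python
import re

# Lines copied from the first trace snippet above.
SAMPLE = """\
![SYMBOL] Symbol: CAB$.FORWARDABLE[OA$CURDOC], Value:
![A1LOG]  Entry: %OA-I-LOGERROR, %OA-W-CAB_NEED_CURDOC, Your current document ca
!               nnot be established
![IO]     Getting record from DOCDB, Key: CREATED                       725097,
!               Key-of-ref: DOCUMENT/0
"""

ENTRY = re.compile(r"^!\[(\w+)\]\s*(.*)$")   # "![TAG]  text"
CODE = re.compile(r"%OA-([A-Z])-(\w+)")      # "%OA-<severity>-<NAME>"


def join_entries(text):
    """Return (tag, body) pairs with continuation lines glued back on."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
        elif line.startswith("!") and entries:
            # Continuation: drop the "!" and the padding, append with no separator.
            entries[-1][1] += line[1:].lstrip()
    return [(tag, body) for tag, body in entries]


def problem_codes(text, severities="WEF"):
    """Return (code_name, full_entry) for A1LOG entries of the given severities."""
    found = []
    for tag, body in join_entries(text):
        if tag != "A1LOG":
            continue
        for severity, name in CODE.findall(body):
            if severity in severities:
                found.append((name, body))
    return found


for name, entry in problem_codes(SAMPLE):
    print(name, "->", entry)
```

 One caveat: a blank that fell exactly at the wrap column appears to have been lost in this copy of the log, so a joined entry can occasionally run two words together.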


 Finally, if there is anybody who has a similar hardware and software setup and
 was an Official field test site for V3.0, would it be possible for you
 to try and reproduce the problem?

 Thanks in Advance,
 Richard Simpson.

2429.20. "Don't you just hate this when this happens...!!" by SUBURB::CROFTC (Caroline Croft) Mon Jun 14 1993 11:28 (23 lines)
    
    
    Hi,
    
    In answer to .18 Yes - I have - However in absolute desperation on
    Friday I compared sysgen params across the 2 types of Nodes, and
    against the ALL-IN-1 installation guide, and the following parameters
    were set (via autogen) to be too low.
    
    	MAXBUF
    	PIOPAGES
    	PQL_MENQLM
    	TTY_TYPAHDSZ
    
    I reset these to the recommended minimum levels, rebooted, and
    hey-presto - it worked.
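
    The check described above (each node's SYSGEN values against the installation guide's minimums) can be sketched roughly as follows. Every number and node name here is an invented placeholder; the real minimums have to come from the ALL-IN-1 installation guide, and the real values from SYSGEN on each node.

```python
# Placeholder figures only: NOT the installation guide's minimums and NOT
# the values seen on the cluster in this thread.
RECOMMENDED_MIN = {"MAXBUF": 2000, "PIOPAGES": 500, "PQL_MENQLM": 200, "TTY_TYPAHDSZ": 100}

# Hypothetical per-node values, as they might be copied down from SYSGEN.
NODE_VALUES = {
    "NODE_X": {"MAXBUF": 2000, "PIOPAGES": 500, "PQL_MENQLM": 200, "TTY_TYPAHDSZ": 100},
    "NODE_Y": {"MAXBUF": 1000, "PIOPAGES": 500, "PQL_MENQLM": 50, "TTY_TYPAHDSZ": 100},
}


def below_minimum(values, minimums):
    """Return {parameter: (current, minimum)} for parameters that are missing or too low."""
    short = {}
    for name, floor in minimums.items():
        current = values.get(name)
        if current is None or current < floor:
            short[name] = (current, floor)
    return short


for node, values in NODE_VALUES.items():
    print(node, below_minimum(values, RECOMMENDED_MIN))
```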
    
    So I guess you can quite rightly blame us for this one!!
    
    Regards
    Caroline.
    
    
2429.21. "Nasty things these quotas!" by IOSG::CHINNICK (gone walkabout) Mon Jun 14 1993 13:09 (29 lines)
    Hmmm...	very interesting...
    
    I've been looking at this problem and I couldn't figure out what could
    be causing it. I even had to answer the SPR on it as 'Not Reproducible'
    
    It looks like it's caused by ENQLM from the info you've just given. Of
    course, it could just be intermittent and have gone away?
    
    [This is by elimination:
    
    	TTY_TYPAHDSZ - more chance of winning the lottery than this
    	PIOPAGES     - possible but unlikely - I'd expect RMS$_DME if so
    	MAXBUF	     - not doing any buffered I/O here, so v. unlikely
    ]
    
    Could you look at the account(s) where you were having the problem and
    tell us what the ENQLM was set to? These days, I suggest ENQLM=1200
    for ALL-IN-1 because there are so many data-sets and other products
    linked to it that it is very easy to blow this limit.
    
    If it is ENQLM, then the failure is most likely to be when the lock is
    taken out on the document. This should cause an OA$_CAB_DOCLOCKED error
    which should be seen when GOLD\W was used. Can anyone confirm this was
    the observed symptom?
    
    Congratulations on finding it!
    
    Paul.
2429.22. "" by SUBURB::A1_CRAN (Caroline (A1_Cran) Croft) Mon Jun 14 1993 16:41 (19 lines)
    
    
    Hi,
    
    ENQLM is 350 on most of our ALL-IN-1 accounts - sounds like we need to
    up it somewhat.
    
    We couldn't do a GOLD W on the error because it was trapped and then
    the general curdoc error routine was called, which draws the box and
    displays the 'current document could not be established' error. A trace 
    however generated the OA-W-CAB_NEED_CURDOC error.
    
    The funny thing is, I have set the sysgen params on my VAXstation 3100
    and still cannot replicate the problem, so could it be a 'funny' with
    the 6520's perhaps...
    
    Regards
    Caroline
    
2429.23. "The answer is blowing in the wind..." by IOSG::CHINNICK (gone walkabout) Mon Jun 14 1993 17:09 (40 lines)
    
    Well, 1200 is just my recommendation for ENQLM... RMS can get quite
    lock hungry with the bucket, record, global buffer and file locks. Add
    to this the explicit locks which ALL-IN-1 uses and you could get up to
    quite a high number. Normally 350 might seem a lot, but I'd say it's a
    bit too low. I think I've blown 300 before now.
    
    OA$_CAB_NEED_CURDOC is normally an error in response to an attempt to
    validate the current document. The code tries to check if the required
    document is current for various CAB functions and if not tries to get
    the record. If this get fails, you get the error. It can also happen in
    other circumstances where the current document has not been properly
    set and I was thinking that it was along these lines where a problem
    was occurring. Errors such as a failure to obtain the document lock
    cause OA$_CAB_DOCLOCKED to be signalled. In either case, you'll get
    this problem of 'no current document'.
    
    In any case, I think quotas are a good candidate. Of course, RMS
    structure errors, disk problems and the normal myriad of I/O failures
    could all give this message. So ENQLM is not the only possibility!
    
    The nature of this error is likely to be very sensitive to your
    configuration. ALL-IN-1 may be locking a lot of records to process an
    XF operation and depending on the nature of these documents you could
    be locking tens to hundreds of SDAF records alone. Add to this any other
    data-sets which are active or held open and you soon see that it will
    depend on the documents you are forwarding, the file tuning options,
    the size of the system, the values for the quotas and parameters, the
    direction of the wind, which side you got out of bed, etc etc...
    
    And, the SYSGEN parameters - even if dynamic - would only apply at
    login time. You could try the test of using SDA to check the remaining
    ENQLM before an XF operation and then during and after to see what sort
    of numbers are being used. PQL_MENQLM is the minimum value - it only
    overrides the UAF quota value if this value is less than the minimum -
    maybe your account on the 3100 has a higher value?
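
    The PQL_MENQLM rule in that last paragraph comes down to taking the larger of two numbers. A tiny sketch, assuming the rule exactly as stated here and nothing more about process creation; the 350 is the account value mentioned in .22, while both PQL_MENQLM figures are invented for illustration.

```python
def effective_enqlm(uaf_enqlm, pql_menqlm):
    """ENQLM a process ends up with: the UAF value, raised to the system minimum if lower."""
    return max(uaf_enqlm, pql_menqlm)


# 350 is the account quota quoted in .22; the PQL_MENQLM values are hypothetical.
print(effective_enqlm(350, 100))   # minimum is below the UAF value, so the UAF value stands
print(effective_enqlm(350, 600))   # minimum is above the UAF value, so the minimum wins
```

    So two nodes with the same UAF quota can hand a process different lock quotas if their PQL_MENQLM settings differ, and a change to the parameter only shows up for processes created after the change.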
    
    Paul.