T.R | Title | User | Personal Name | Date | Lines |
---|
827.2 | More info | SCOTTC::MARSHALL | Pearl-white, but slightly shop-soiled | Wed Jun 10 1992 16:58 | 8 |
| Hi,
Please can you give the exact error message(s) (including %RMS-F-ERR type stuff,
where appropriate), and say which log file the error appears in.
I'll then try and find what might generate that error...
Scott
|
827.4 | | WAYOUT::ALLAMS | If you're not confused, you're misinformed. | Fri Jun 12 1992 10:26 | 52 |
|
Ok,
My error messages are as per notes 1852 and 2240 in the old conference.
The only answer in there states:
> Firstly the device offline message:
> 1) The customer is lying and a device was offline
> 2) It is a spurious error due to some of the more peculiar signalling
> and error handling abilities of the Sender/Fetcher which can be
> ignored - does it happen at the same time as the Pending message?
> 3) The device was offline and the customer is lying (see number 1).
> As to which device it could be well the one with the Shared Directory
> on is most likely.
I am fairly certain that the customer is not lying.
The message is in oa$mti_err.
The thing that worries the customer is that this error appears to be
associated with a lost message. The message failed at about the same
time that this error message occurred. What happened is that a message was
sent with a list of addressees, and one of those addressees was not translated
to a full address. Therefore the mail failed to be sent to this addressee.
The error message has occurred a couple of times, but he has only lost
one message. It is impossible to link that message with that error (as
far as I know); equally, it is impossible to prove that that error was
not linked to the message.
The problem with telling him that it is a spurious message that the
sender gives out due to the 'more peculiar signalling and error
handling abilities' of the sender is:
If we know where and how the message is being generated, why isn't it
fixed, or at the very least release noted?
(Also, the fact that the previous notes date from Aug 91 tends to
strengthen this.)
If we don't know 'exactly' where the error is generated, how can we be
sure that the error doesn't result in the loss of a message?
Any chance of a more definitive answer?
If I can't get one, I'm going to have to
submit a CLD.
Steve
|
827.5 | Rumours and ideas!!! | AIMTEC::LAMBERSON_M | | Fri Jun 12 1992 13:59 | 23 |
| Steve,
A couple of things you may wish to check for that are rumoured to
cause this type of error.
1. User is logged into ALL-IN-1 via a PC/WORKSTATION, does a send on a
message with a "large" distribution list, then powers off the
PC/WORKSTATION before the cursor returns to "home".
2. Users sending second class or deferred mail to PAPER MAIL
addressees with an invalid PRINTER destination in PROFILE.DAT. For
example, a user sends deferred mail and PRINTER is set to SYS$PRINT
(Default) when SYS$PRINT is undefined on the system, or the field is
set to PORT and the user's terminal printer is disconnected or not
available. Remember to check both the originator's and the POSTMASTER
profile records, since there is now a question about which is used by
the SENDER run.
These ideas have not been tested and verified but several problems that
have been worked by the CSC seem to have been related to these issues.
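For the SYS$PRINT case in item 2, a quick check from DCL might look like the sketch below. It only shows whether SYS$PRINT resolves to anything on the system; it does not tell you which profile record the SENDER run actually uses, which is still an open question.

```dcl
$ ! Sketch only: checks whether SYS$PRINT exists as a logical name
$ ! and/or as a queue on this system.
$ SHOW LOGICAL SYS$PRINT
$ SHOW QUEUE/FULL SYS$PRINT
```

If both commands report that nothing is found, deferred mail to PAPER MAIL addressees with PRINTER set to SYS$PRINT would have nowhere to go.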
HTH
|
827.6 | | WAYOUT::ALLAMS | If you're not confused, you're misinformed. | Fri Jun 19 1992 10:57 | 3 |
|
No, neither of these is the case.
|
827.8 | An educated guess?? | AIMTEC::LAMBERSON_M | | Thu Jun 25 1992 17:14 | 23 |
| There is a very good possibility that this problem is related to file
access "timeouts" on the SDAF or PENDING files. You may want to check
the IO activity to the devices holding these files during the periods
in question, using the SPM or another performance monitoring package. If
the IO rate for that time period is significant (over 1) then the
possibility of file timeouts exists. This probability is significantly
increased if the
indexed files are not regularly compressed by the Reorganize System
Files housekeeping procedure or manual CONVERT/FDL= procedures on these
files. It is also a good practice to create optimized FDLs for these
files if they have grown significantly. You may also want to look at
the IO rates to other devices that are accessed via the same HSC or
disk interface module. If these devices have a higher (hardware)
priority (lower requestor slot in the HSC, etc) then the ALL-IN-1 disk
will have to wait for these disks to complete their IO request
communications before it can complete its own. This can also increase
the probability that a timeout may occur.
This is all conjecture at this point but after working a similar
problem with another customer yesterday it does seem likely that a
timeout was the cause.
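The checks suggested above can be sketched in DCL roughly as follows. This is a minimal sketch, not a tested procedure: the file specification DISK$EXAMPLE:[EXAMPLE]EXAMPLE.DAT is a placeholder for the real SDAF or PENDING file on your system, MONITOR is used here only as a stand-in for SPM, and the tuning steps assume the file is not in use while they run.

```dcl
$ ! Sketch only. DISK$EXAMPLE:[EXAMPLE]EXAMPLE.DAT is a placeholder,
$ ! not a confirmed ALL-IN-1 file name. Substitute the real file.
$ !
$ ! 1. Watch the disk IO queue lengths while the problem is occurring.
$ MONITOR DISK/ITEM=QUEUE_LENGTH/INTERVAL=5
$ !
$ ! 2. Capture the current structure of the indexed file as an FDL.
$ ANALYZE/RMS_FILE/FDL/OUTPUT=CURRENT.FDL DISK$EXAMPLE:[EXAMPLE]EXAMPLE.DAT
$ !
$ ! 3. Let EDIT/FDL produce an optimized FDL from that analysis.
$ EDIT/FDL/ANALYSIS=CURRENT.FDL/NOINTERACTIVE/OUTPUT=OPTIMIZED.FDL CURRENT.FDL
$ !
$ ! 4. Rebuild the file using the optimized FDL (creates a new version).
$ CONVERT/FDL=OPTIMIZED.FDL/STATISTICS -
      DISK$EXAMPLE:[EXAMPLE]EXAMPLE.DAT DISK$EXAMPLE:[EXAMPLE]EXAMPLE.DAT
```

Steps 2 to 4 are the manual equivalent of what the Reorganize System Files housekeeping procedure is described as doing above; whether that procedure uses an optimized FDL is not something I have confirmed.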
Hope this helps
|