
Conference iosg::all-in-1_v30

Title:*OLD* ALL-IN-1 (tm) Support Conference
Notice:Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:IOSG::PYE
Created:Thu Jan 30 1992
Last Modified:Tue Jan 23 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:4343
Total number of notes:18308

827.0. "sender job fails with dev-offline error" by WAYOUT::ALLAMS (If you're not confused, you're misinformed.) Mon Jun 08 1992 12:11


Hi,


	This happens on a really busy ALL-IN-1 system, usually at peak mail
	sending time - the sender will fail on a message with the dev-offline
	error.

	All the logicals and shared directories check out ok.

	I read in an earlier conf. about a similar problem, that was possibly
	put down to the busyness of the system, and the sender producing 
	spurious errors.  This system is 2.4 patched to 603.  

	Any Ideas?


		Steve Allam
T.R  Title  User  Personal Name  Date  Lines
827.2. "More info" by SCOTTC::MARSHALL (Pearl-white, but slightly shop-soiled) Wed Jun 10 1992 16:58, 8 lines
Hi,

Please can you give the exact error message(s) (including %RMS-F-ERR type stuff,
where appropriate), and say which log file the error appears in.

I'll then try and find what might generate that error...

Scott
827.4. by WAYOUT::ALLAMS (If you're not confused, you're misinformed.) Fri Jun 12 1992 10:26, 52 lines
Ok,

	My error messages are as per notes 1852 and 2240 in the old conference.

	The only answer in there states:

>	Firstly the device offline message:
>    1) The customer is lying and a device was offline
>    2) It is a spurious error due to some of the more peculiar signalling
>       and error handling abilities of the Sender/Fetcher which can be
>       ignored - does it happen at the same time as the Pending message?
>    3) The device was offline and the customer is lying (see number 1).
>    As to which device it could be well the one with the Shared Directory
>    on is most likely.

	I am fairly certain that the customer is not lying.

	The message is in oa$mti_err.

	The thing that worries the customer is that this error appears to be 
	associated with a lost message.  The message failed at about the same
	time that this error occurred.  What happened is that a message was sent 
	with a list of addressees, and one of those addressees was not translated
	to a full address, so the mail was not sent to that addressee.

	The error message has occurred a couple of times, but he has only lost 
	one message.  It is impossible to link that message with that error (as
	far as I know), equally it is impossible to prove that that error was 
	not linked to the message.

	The problem with telling him that it is a spurious message that the
	sender gives out due to the 'more peculiar signalling and error 
	handling abilities' of the sender is:

	If we know where and how the message is being generated, why isn't it
	fixed, or at the very least release noted?
	
	(Also, the fact that the previous notes were from Aug 91 tends to
	strengthen this.)

	If we don't know 'exactly' where the error is generated, how can we be
	sure that the error doesn't result in the loss of a message?


		Any chance of a more definitive answer?

		If I can't get one, I'm going to have to 
		submit a CLD.


			Steve 
827.5. "Rumours and ideas!!!" by AIMTEC::LAMBERSON_M Fri Jun 12 1992 13:59, 23 lines
    Steve,
    
     A couple of things you may wish to check for that are rumoured to
    cause this type of error.
    
    1. User is logged into ALL-IN-1 via a PC/WORKSTATION, does a send on a
    message with a "large" distribution list, then powers off the
    PC/WORKSTATION before the cursor returns to "home".
    
    2. Users sending second class or deferred mail to PAPER MAIL
    addressees with an invalid PRINTER destination in PROFILE.DAT. For
    example a user sends deferred mail and PRINTER is set to SYS$PRINT
    (Default) when SYS$PRINT is undefined on the system, or the field is
    set to PORT and the user's terminal printer is disconnected or not
    available. Remember to check both the originator's and the POSTMASTER
    profile records since there is now a question about which is used by
    the SENDER run.
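
    For the second idea, a quick check from DCL might look like the
    sketch below. It assumes the PRINTER field is set to SYS$PRINT, as in
    the example above; it only shows whether that name resolves as a
    logical or a queue, and does not prove that the SENDER uses it.

```
$ ! Is SYS$PRINT defined as a logical name?
$ SHOW LOGICAL SYS$PRINT
$ ! Does a queue of that name exist, and what state is it in?
$ SHOW QUEUE/FULL SYS$PRINT
```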
    
    These ideas have not been tested and verified but several problems that
    have been worked by the CSC seem to have been related to these issues.
    
    HTH
    
827.6. by WAYOUT::ALLAMS (If you're not confused, you're misinformed.) Fri Jun 19 1992 10:57, 3 lines

No, neither of these is the case.
827.8. "An educated guess??" by AIMTEC::LAMBERSON_M Thu Jun 25 1992 17:14, 23 lines
    There is a very good possibility that this problem is related to file
    access "timeouts" on the SDAF or PENDING files. You may want to check
    the IO activity to the devices holding these files during the periods
    when the error occurs, using SPM or another performance monitoring
    package. If the IO rate for that time period is significant (over 1)
    then the possibility of file timeouts exists. This probability is
    significantly increased if the 
    indexed files are not regularly compressed by the Reorganize System
    Files housekeeping procedure or manual CONVERT/FDL= procedures on these
    files. It is also a good practice to create optimized FDLs for these
    files if they have grown significantly. You may also want to look at
    the IO rates to other devices that are accessed via the same HSC or
    disk interface module. If these devices have a higher (hardware)
    priority (lower requestor slot in the HSC, etc) then the ALL-IN-1 disk
    will have to wait for these disks to complete their IO requests
    before it can complete its own. This can also increase
    the probability that a timeout may occur.
    
    This is all conjecture at this point but after working a similar
    problem with another customer yesterday it does seem likely that a
    timeout was the cause.
    
    Hope this helps
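
    As a rough sketch of the manual route mentioned above, the usual
    analyze / optimize / convert sequence for an indexed file is shown
    below. The file name PENDING.DAT and the FDL file names are
    placeholders, not the real ALL-IN-1 file specifications, and the
    sketch assumes ALL-IN-1 is shut down so that the file is not open
    while it is being converted. The MONITOR line is just one way to
    watch disk IO if SPM is not available.

```
$ ! Watch IO rates and queue lengths on the disks during the busy period
$ MONITOR DISK/ITEM=ALL
$ !
$ ! PENDING.DAT is a placeholder; substitute the actual SDAF or PENDING file
$ ! 1. Describe the existing file, including analysis statistics
$ ANALYZE/RMS_FILE/FDL/OUTPUT=PENDING.FDL PENDING.DAT
$ ! 2. Let EDIT/FDL optimize the FDL from that analysis
$ EDIT/FDL/ANALYSIS=PENDING.FDL/NOINTERACTIVE PENDING.FDL
$ ! 3. Rebuild the file using the optimized FDL
$ CONVERT/FDL=PENDING.FDL/STATISTICS PENDING.DAT PENDING.DAT
```

    CONVERT creates a new version of the file, so the previous version
    remains available until it is purged.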