
Conference iosg::all-in-1_v30

Title:*OLD* ALL-IN-1 (tm) Support Conference
Notice:Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:IOSG::PYE
Created:Thu Jan 30 1992
Last Modified:Tue Jan 23 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:4343
Total number of notes:18308

30.0. "pending - duplicate entries on a dq command" by WAYOUT::ALLAMS () Tue Feb 18 1992 11:06

Hi,

	My customer has a very large ALL-IN-1 setup, with an average of
	200 concurrent users.  When he runs DQ, he finds over 150 mails
	waiting to be sent, a large number of them deferred.  The display
	also shows a large number of duplicate entries.

	If you examine the pending file directly, these duplicates do not exist.

	Believing it to be a corruption in the pending file, I performed a 
	fix_pending on the file.  This uncovered one error, which it duly
	fixed.  The customer checked the DQ option again, and all was ok.

	This was all done with no users on the system.  The next morning, he 
	checked again, and the problem had recurred.  The only thing that had
	happened was that ~150 users had logged on and were using ALL-IN-1.

	

		Does anyone have any ideas?



			Steve Allam
			UK CSC
T.R  Title  User  Personal Name  Date  Lines
30.1. "Bad count on the sender queue" by AIMTEC::PORTER_T (Terry Porter, ALL-IN-1 Support, Atlanta CSC) Tue Feb 18 1992 16:35, 16 lines
The problem you are seeing is that the counter at the start of the sender queue
(key MAIL QUEUE in PENDING.DAT) is too high. The only time the value of this
counter has any impact is in DQ.

DQ will read the count and display that many entries from the sender queue. If
there are not enough entries in the queue, it will simply repeat previous 
entries.
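
To make that concrete, here is a rough Python model of the behaviour described above. It is only a sketch, not ALL-IN-1 code: the function name `dq_display`, the message labels, and the choice to repeat the last entry read (rather than some other previous entry) are all assumptions made for illustration.

```python
def dq_display(queue_entries, header_count):
    """Show header_count rows, as DQ is described as doing.

    Assumption: once the real entries run out, the last entry read
    is shown again for each remaining row.
    """
    shown = []
    last = None
    for i in range(header_count):
        if i < len(queue_entries):
            last = queue_entries[i]   # a real entry exists at this position
        shown.append(last)            # otherwise the previous one is repeated
    return shown


# Hypothetical queue of three messages with a header count of five.
entries = ["MSG-A", "MSG-B", "MSG-C"]
rows = dq_display(entries, header_count=5)
duplicates = len(rows) - len(set(rows))
print(rows, duplicates)
```

With a correct count the rows match the queue exactly; duplicates only appear when the count exceeds the number of real entries.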

That explains the symptom but not the cause. Are there any errors in 
OA$MTI_ERR that may point to a problem? If the count is too high, it is
probably failing to be decremented when it should. Normally the sender is
the only thing that decrements the count (canceling deferred mail also
decrements it, but that is not a common action), so it would be worth
checking the sender logs for any errors or warnings.

Terry
30.2. "" by WAYOUT::ALLAMS () Wed Feb 19 1992 14:32, 15 lines
Ok, 


	If we accept that these are the symptoms, how is it possible to fix that 
	counter in the pending record without blowing it away?




		Steve
	

	
	
30.3. "more info:" by WAYOUT::ALLAMS () Wed Feb 19 1992 14:42, 20 lines
Hi,

	I have checked this counter on one customer's system (I now have two 
	customers with the same problem).  The counter is set at 238.  If you 
	count the records from a dump, there are 238.  If you do a DQ, it
	displays 238 records, but some of them are duplicated.  These are NOT
	duplicated in pending.  If you do a DQ when no users are on, all is ok.

	Could it be due to record locking on the pending file?  If certain
	records in the file are locked, they might be skipped over while DQ
	still displays count no. of records, so duplicating some of the
	unlocked records.
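
	To spell out what I mean, here is a small Python model of that guess.
	It is purely hypothetical: the function name, the record labels and
	the idea that a skipped record causes the previous row to be shown
	again are my assumptions, not anything taken from the ALL-IN-1 code.

```python
def dq_display_with_locks(records, count, locked):
    """Display `count` rows; a locked record is skipped and the
    previously read record is shown in its place (assumed behaviour)."""
    shown = []
    last = None
    for rec in records[:count]:
        if rec not in locked:
            last = rec        # record could be read normally
        shown.append(last)    # locked record: previous row is repeated
    return shown


# Hypothetical pending file with four distinct records, one locked by a user.
pending = ["M1", "M2", "M3", "M4"]
rows = dq_display_with_locks(pending, count=len(pending), locked={"M3"})
print(rows)
```

	The file itself holds no duplicates and the count is correct, yet the
	display would repeat one row and omit the locked one.  With nothing
	locked, the display matches the file.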

	The customer also has his default mail type set to 2nd class, which 
	is the only peculiarity.

	

		Steve
30.4. "There is a timeout on PENDING" by AIMTEC::PORTER_T (Terry Porter, ALL-IN-1 Support, Atlanta CSC) Wed Feb 19 1992 15:24, 26 lines
ALL-IN-1 has a 30 second timeout on access to data files (including PENDING).
Therefore, if the disk containing PENDING.DAT is very overloaded, it is possible
that ALL-IN-1 is timing out on access to PENDING.DAT.

These timeouts are not always reported (in fact they are often ignored). I am
not 100% sure that a timeout would lead to the symptoms you are seeing, but
I am willing to believe it.

Can you monitor the disk holding PENDING.DAT during peak usage and see whether
the I/O rates are getting close to the capacity of the hardware and/or you
are seeing a request queue for that disk?

$ MONITOR DISK to look at I/O rates

$ MONITOR DISK/ITEM=QUEUE_LENGTH to look at I/O request queues. You are looking
for very low numbers here; ideally the average should be 0.1 or less. For
example, 0.5 indicates that 50% of the time a process is waiting to request I/O
to that disk (i.e. has not even got to the point of starting the I/O).
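
If you note down the queue figures at intervals during the peak, the check against that 0.1 guideline is simple arithmetic. A minimal Python sketch, assuming the readings are copied by hand from the display; the function name and the sample values are made up for illustration:

```python
from statistics import mean


def disk_queue_looks_busy(samples, threshold=0.1):
    """Return True if the average sampled queue length exceeds the threshold."""
    return mean(samples) > threshold


# Hypothetical queue-length readings taken during peak usage.
peak_samples = [0.0, 0.5, 1.0, 0.5]
print(mean(peak_samples), disk_queue_looks_busy(peak_samples))
```

An average well above 0.1, as in these invented readings, would suggest the disk is worth investigating further.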

Check OA$MTI_ERR log file for timeouts, but remember that the absence of any 
timeout errors does not mean there are no timeouts!

If the problem is timeouts you need to move something from that disk to another
drive.

Terry
30.5. "CAB$PENDING helps." by FULMER::LAAHS (Two Cute Celts are better than one) Wed Feb 19 1992 15:43, 12 lines
    Re .2 Fixing the count.
    
    You can adjust the count by doing the following:-
    
    WRITE CHANGE CAB$PENDING %KEY = "MAIL QUEUE", COUNT = "n"
    
    However, be warned that this is not a documented way of doing things,
    so you had better be careful when using it. Ensure that no one is
    using the system so that you can safely set it to the correct number.
    
    Kevin