
Conference iosg::all-in-1_v30

Title:*OLD* ALL-IN-1 (tm) Support Conference
Notice:Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:IOSG::PYE
Created:Thu Jan 30 1992
Last Modified:Tue Jan 23 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:4343
Total number of notes:18308

29.0. "strange problem - mail messages vanishing" by WAYOUT::ALLAMS () Tue Feb 18 1992 10:58

Hi,

	I have a customer with the following setup:

	VMS 5.4 ALL-IN-1 v2.4/british with v2.0 update.

	They have had one user lose all his mail messages, apparently after a 
	run of trm.  When I say lose, he has only in fact lost the body files
	of the messages, not the pointers to them.  The run of trm reported 
	no errors, and none of the delete flags were set.  The files were not
	all in the same shared directory, and the only common link to the files
	is the user who owned them.  The files have all been recovered from a
	backup, and all appears to be ok, but the customer is unwilling to 
	run trm, until they can be given some reason as to why this may have 
	happened, and given confidence that this will not happen again.

	Could there have been something wrong with the user's DAF, which caused
	his body files to appear as orphans, so that they were then deleted?
	If this was the case, how were they deleted without the delete flag 
	being set? (It was in fact unset - not 0 - could this be the reason?)

		Any ideas??


			Steve Allam
			UK CSC 
T.R  Title  User  Personal Name  Date  Lines
29.1. "" by RTOEU::JGIBBONS () Tue Feb 18 1992 12:08 (9 lines)
Sounds like you might have some corruption in your SDAF.  Try dumping out the
first few records of the SDAF ($ DUMP/REC=(START:1,END:10)).  If the records 
start with a valid filename and contain other header information they are 
probably ok, otherwise they are corrupt.

There are ways to uncorrupt a corrupt SDAF, but not (as far as I know) to
save the bad records.

Jenny
29.2. "Not likely to be TRM" by AIMTEC::PORTER_T (Terry Porter, ALL-IN-1 Support, Atlanta CSC) Tue Feb 18 1992 16:22 (23 lines)
When you say the user still has pointers to the files, I assume you mean in his
DOCDB. Do the SDAF records still exist? (The SDAF records are keyed on the 
filename, so it is easy to check using DCL to attempt to read them.)

If the SDAF records are missing then the most likely cause is incorrect usage
counts. Have past TRM runs shown any low usage counts?

TRM will not delete a document if a user has a pointer to it. There used
to be some problems in this area if the disk containing the user's DOCDB
was offline when TRM ran, but I thought these problems were resolved in V2.4.

Is it just that one user impacted, and if so did he lose ALL his documents,
or just many of them? How many lost documents are we talking about (a few could be
put down to coincidence, a lot points to some problem with TRM processing 
that user)? Does that user appear in the TRM log and is the number of documents
processed correct?

Sorry for so many questions, but other than low usage counts I can not think
of a cause at the moment.

Regards,

Terry
29.3. "Another thing ..." by AIMTEC::PORTER_T (Terry Porter, ALL-IN-1 Support, Atlanta CSC) Tue Feb 18 1992 16:27 (14 lines)
Just thought of something else. If the SDAF records are missing then just
recovering the files from backup will not fully recover the documents.

If the TO and CCs are still on the document then the SDAF record is there, if
the TOs and CCs are missing then you also need to recover the missing SDAF
records by copying them (using DCL) from a backup copy of the SDAF to
the live SDAF, remember that there may be more than one SDAF record for
one document. ALL the SDAF records for the same document will have a key
of the filename in the first 63 bytes, the last 2 bytes of the key being a
counter to make the keys unique.

Regards,

Terry
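
The key layout described in 29.3 (filename in the first 63 bytes, then a
2-byte counter) can be sketched as below. This is only an illustration of
comparing a backup SDAF with the live one to see which documents have lost
records. It assumes the keys have already been extracted into byte strings,
that the filename is space-padded ASCII, and that the counter is a
little-endian binary value; none of those details come from the notes, and
the file names and helper names are made up for the example.

```python
from collections import defaultdict

FILENAME_LEN = 63   # per 29.3: the filename occupies the first 63 bytes of the key
COUNTER_LEN = 2     # per 29.3: a 2-byte counter makes each key unique
KEY_LEN = FILENAME_LEN + COUNTER_LEN


def split_key(key: bytes):
    """Split a 65-byte key into (filename, counter)."""
    if len(key) != KEY_LEN:
        raise ValueError(f"expected {KEY_LEN}-byte key, got {len(key)}")
    # Assumption: the filename is space-padded ASCII.
    filename = key[:FILENAME_LEN].decode("ascii").rstrip()
    # Assumption: the counter is a little-endian binary integer.
    counter = int.from_bytes(key[FILENAME_LEN:], "little")
    return filename, counter


def missing_records(backup_keys, live_keys):
    """Return {filename: [counters]} for keys in the backup but not in the live file."""
    live = set(live_keys)
    missing = defaultdict(list)
    for key in backup_keys:
        if key not in live:
            filename, counter = split_key(key)
            missing[filename].append(counter)
    return {name: sorted(counters) for name, counters in missing.items()}


def make_key(filename: str, counter: int) -> bytes:
    """Build a hypothetical key for the demonstration below."""
    padded = filename.ljust(FILENAME_LEN).encode("ascii")
    return padded + counter.to_bytes(COUNTER_LEN, "little")


# Hypothetical data: one document with two SDAF records is gone from the live file.
backup = [
    make_key("DISK$A:[SHARE1]ZABC123.TXT", 0),
    make_key("DISK$A:[SHARE1]ZABC123.TXT", 1),
    make_key("DISK$A:[SHARE2]ZDEF456.TXT", 0),
]
live = [make_key("DISK$A:[SHARE2]ZDEF456.TXT", 0)]

print(missing_records(backup, live))
```

Grouping by the filename part of the key is what surfaces the "more than one
SDAF record for one document" case: every counter listed for a filename is a
separate record that would need copying back.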
29.4. "some answers" by WAYOUT::ALLAMS () Wed Feb 19 1992 12:02 (44 lines)
Hi,


	To answer all the questions:

	
>Sounds like you might have some corruption in your SDAF.  Try dumping out the
>first few records of the SDAF ($ DUMP/REC=(START:1,END:10)).  If the records 
>start with a valid filename and contain other header information they are 
>probably ok, otherwise they are corrupt.

	All the sdaf records look ok.

>When you say the user still has pointers to the files, I assume you mean in his
>DOCDB. Do the SDAF records still exist? (The SDAF records are keyed on the 
>filename, so it is easy to check using DCL to attempt to read them.)

	yes, his sdaf records were missing for those documents.

>If the SDAF records are missing then the most likely cause is incorrect usage
>counts. Have past TRM runs shown any low usage counts?

	Difficult to say - TRM is only run once a quarter - the previous run did
	not show any bad usage counts.

>TRM will not delete a document if a user has a pointer to it. There used
>to be some problems in this area if the disk containing the user's DOCDB
>was offline when TRM ran, but I thought these problems were resolved in V2.4.

	I didn't think trm would delete a document if it didn't have a pointer to 
	it - if the flags were set to nodelete.

>Is it just that one user impacted, and if so did he lose ALL his documents,
>or just many of them? How many lost documents are we talking about (a few could be
>put down to coincidence, a lot points to some problem with TRM processing 
>that user)? Does that user appear in the TRM log and is the number of documents
>processed correct?

	Yup, just one user affected - he lost ~200 documents.  The number of documents
	in the trm log for that user is about right.


		Steve
29.5. "SWAG...." by IOSG::PYE (Graham - ALL-IN-1 Sorcerer's Apprentice) Wed Feb 19 1992 14:54 (5 lines)
    Could that user have lost his DOCDB or PDAF? Had an old one restored?
    Had a CONVERT/FDL drop some records off the end? Been logged in or
    something such that his files weren't accessible to FCVR?
    
    Graham
29.6. "This is a strange one" by AIMTEC::PORTER_T (Terry Porter, ALL-IN-1 Support, Atlanta CSC) Wed Feb 19 1992 15:05 (21 lines)
Steve,

Either you have discovered a new TRM bug or something caused low usage counts 
on that user's documents prior to the TRM run.

Did the customer run any other housekeeping (e.g. EW) between the time the 
user was last 100% sure the documents were there and the TRM run? For example, 
if the user accidentally deleted these documents at a time when he was not 
the only user referencing them, then restoring the user's DOCDB from backup 
would appear to get the documents back. Unfortunately all those documents 
would have low usage counts and would disappear on a future EW run once other 
users have deleted their references to them. In this case TRM would fix the 
usage counts, but only if it was run before EW.

Could the user have been accidentally deleted and then restored from backup?
This would leave all his mail messages with too low usage counts and could
eventually lead to a similar problem.

Regards,

Terry
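
The low-usage-count scenario in 29.6 can be followed step by step with a toy
model. This is a sketch of the reasoning only, not of how ALL-IN-1 stores or
processes anything: the class, the method names, and the users ALICE and BOB
are invented, "recount" stands in for the TRM usage-count fix, and "sweep"
stands in for the EW run that removes documents whose count has reached zero.

```python
class SharedArea:
    """Toy model of shared documents with usage counts and per-user pointers."""

    def __init__(self):
        self.usage = {}   # document -> recorded usage count
        self.docdb = {}   # user -> set of documents the user points to

    def deliver(self, doc, users):
        """Deliver one shared document to several users."""
        self.usage[doc] = len(users)
        for user in users:
            self.docdb.setdefault(user, set()).add(doc)

    def delete_reference(self, user, doc):
        """A user deletes a document: pointer removed, count decremented."""
        self.docdb[user].discard(doc)
        self.usage[doc] -= 1

    def restore_docdb(self, user, docs):
        """Restore pointers from backup; the usage counts are NOT touched."""
        self.docdb[user] = set(docs)

    def recount(self):
        """TRM-like step (assumption): reset each count to the real pointer total."""
        for doc in self.usage:
            self.usage[doc] = sum(doc in docs for docs in self.docdb.values())

    def sweep(self):
        """EW-like step (assumption): remove documents whose count is zero."""
        removed = [doc for doc, count in self.usage.items() if count <= 0]
        for doc in removed:
            del self.usage[doc]
        return removed

    def dangling(self, user):
        """Pointers the user still holds to documents that no longer exist."""
        return sorted(doc for doc in self.docdb[user] if doc not in self.usage)


def scenario(recount_before_sweep):
    area = SharedArea()
    area.deliver("MSG1", ["ALICE", "BOB"])    # count = 2
    backup = set(area.docdb["ALICE"])
    area.delete_reference("ALICE", "MSG1")    # accidental delete, count = 1
    area.restore_docdb("ALICE", backup)       # pointer is back, count still 1
    area.delete_reference("BOB", "MSG1")      # count = 0, ALICE still points to it
    if recount_before_sweep:
        area.recount()                        # count corrected to 1
    area.sweep()
    return area.dangling("ALICE")


print(scenario(recount_before_sweep=False))   # ['MSG1']
print(scenario(recount_before_sweep=True))    # []
```

In the model, the order of the two housekeeping steps decides the outcome,
which matches the point above that TRM only helps if it runs before EW.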
29.7. "" by WAYOUT::ALLAMS () Thu Feb 20 1992 09:07 (11 lines)
Well:


	The user hasn't lost his DAF or DOCDB, a convert/fdl hadn't been done
	manually - and the records for these lost files were still there, and 
	as far as I know, the rest of his documents were still ok.  The log
	seemed to process him fine - no sign of him being logged in, or files
	inaccessible

		Steve
29.8. "first snow this weekend!" by NCBOOT::HARRIS (ooopps) Wed Oct 14 1992 23:57 (14 lines)
    something similar has happened to 1 (that i know of) of my users.
    
    she has 3 mail messages with header/no text. an SH showed that they
    were in 3 different directories. doing a DIR on each file showed that
    the files were no longer there.  the files were there on Oct 1.  the
    only housekeeping that ran between then and Oct 10 (when she discovered
    the missing files) was: EW and TRM.  EW had no problems, TRM ran, but
    errored when it was opening SMLOG2.TMP. (see note 1558).
    
    any thoughts on what might have caused the files to go away. i've
    suggested the obvious - someone deleted them at DCL level. but could
    anything else have done this?
    
    	thanks - ann
29.9. "" by FORTY2::ASH (Grahame Ash @REO) Thu Oct 15 1992 14:13 (5 lines)
How much of the header is there? If the Subject and addressees are missing, 
then the SDAF record has gone as well - so DCL Delete can't have caused the 
problem.

g
29.10. "go figure...." by NCBOOT::HARRIS (ooopps) Thu Oct 15 1992 16:21 (9 lines)
    just the date: and from: fields are filled.
    no subject
    both dates are the same, but times are different
    
    	1-Oct-1992 04:36pm CDT
    	1-Oct-1992 12:51pm CDT
    
    
    	thanks!
29.11. "same user, different files" by ANGLIN::HARRIS (its seemed like the right thing to do) Mon Jan 18 1993 20:12 (46 lines)
    well, it's been 3 months and this same user has had the same types
    of message "accidentally" deleted from her account.  to make matters
    "worse", its my manager (at the customer site).  this time 4 messages
    were deleted.
    
    ALL-IN-1 2.4 (unpatched), VMS 5.5-1
    TRM is run once a month.
    
    these are some particulars about the messages:
    
    	1. 3 of the messages were sent from a PROFS system to ALL-IN-1
    	2. Only the 1 recipient on each message (cathy)
    	3. The 4th message was sent directly to Cathy's PROFS account.
    	4. In ALL-IN-1, Cathy READ a message, did FM to folder QUATERLY
    	REPORTS and then DELETED from READ folder (into wastebasket).
        5. The 4th message was forwarded from PROFS to ALL-IN-1 by Cathy
    	and then step 4 is performed.
    
    	6. TRM ran on 1/10, EW ran on 1/16. Noticed messages missing on
    	1/18 8AM (what a way to start the day! :) ).
    	7. TRM does not produce a full log because i haven't installed the 
    	K603 patch yet. Scheduled for 2/6.
    	8. TRU ran on 1/16. Nothing unusual about this users account.
    
    	9. So far, this is the only user who has called regarding this
    	problem.
    
    	10. What could be particular about these messages? 2 senders are
    	the same senders whose messages had this problem last quarter.
    
    	11. According to Cathy, the way she files the messages "FM to new
    	folder, then delete from READ folder" is the way she's been doing
    	it for the last 4 years.
    
    	12. TRM did have the delete orphans flag set to 2 (delete).
    
    	13. I had Cathy look thru her other folders for similar docs. But
    	the only ones missing are the 4.
    
    I'm having the files restored from backup, but would like to try to find
    out WHY this happens (supposedly only to this user) and only on the
    first TRM run for the new quarter. 
    
    
		Thanks - ann
    
29.12. "" by FORTY2::ASH (Grahame Ash @REO) Wed Jan 20 1993 10:34 (16 lines)
Hi ann,

Well, this is 'jolly interesting' isn't it?! I'm trying to imagine what might 
be different about those PROFS messages . . . can you get a Show Header output 
of one for us?

It shouldn't be the addresses, as none of the code which deletes messages will 
look at them. I'm wondering if perhaps the message comes in as an empty cover 
note, with the text of the PROFS message as an attachment, and then FM has 
trouble updating the usage counts in that case  . . or something (sorry 
Stuart!).

It'd also be handy if you could find out the usage count of the message, 
after the FM and after the Delete - sorry I can't remember how you do that.

g