
Conference orarep::nomahs::rdb_60

Title:	Oracle Rdb - Still a strategic database for DEC on Alpha AXP!
Notice:	RDB_60 is archived, please use RDB_70..
Moderator:	NOVA::SMITHISON

Created:	Fri Mar 18 1994
Last Modified:	Fri May 30 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5118
Total number of notes:	28246

5030.0. "Any other place to see why distributed transactions aborted?" by BROKE::BASTINE () Fri Feb 14 1997 13:17

I have a customer who has a remote "Headquarters" system that they load data
into on a nightly basis.  The load is issuing a remote connection to do its
work.

Once in a while the load will fail with:

"Distributed transaction was aborted, NOSECERR, unable to determine secondary
error"

This is expected, and documented.  We know that.  However, the only place to
look for why it failed is the NETSERVER.LOG.  The customer knows this, but
has this problem.  If the error occurs at 3:00am, the NETSERVER.LOG for this
process that failed has been purged.  Is there ANY other place to look for
why it failed?  Is there some logical or process they could put in place
to force such errors to be logged in a log file that won't get purged?  
According to the customer, and I don't know much about the NETSERVER stuff,
they are purged as a function of the network and how it works.  Is there
any way to stop this purging, or a way in the RDBSERVER.COM file to capture
remote errors somewhere else so that they will have a record of what happened?

Are there any DECdtm log files that might capture why it aborted the
transaction?

Thanks,
Renee

T.R	Title	User	Personal Name	Date	Lines
5030.1		DUCATI::LASTOVICA	Is it possible to be totally partial?	`Fri Feb 14 1997 13:24`	17
	you could, I suppose, edit SYS$SYSTEM:NETSERVER.COM to remove or alter how the purging is done. Off the top of my head, something along the lines of:

	change:

	$ IF F$SEARCH("SYS$LOGIN:NETSERVER.LOG;-10") .NES. "" THEN -
	    PURGE /KEEP:3 SYS$LOGIN:NETSERVER.LOG

	to:

	$ IF F$SEARCH("SYS$LOGIN:NETSERVER.LOG;-100") .NES. ""
	$ THEN TYPE/OUTPUT=SYS$LOGIN:OLDLOGS.TXT SYS$LOGIN:NETSERVER.LOG.*
	$ PURGE SYS$LOGIN:NETSERVER.LOG/KEEP=3
	$ ENDIF

	this may result in some duplicates in OLDLOGS.TXT;*, but it might help avoid missing stuff. Or, just turn off purging altogether and just do it manually once a day or so.
5030.2	What if they have the log files and there is nothing in them	BROKE::BASTINE		`Fri Feb 14 1997 20:29`	15
	Thanks for the idea... it is worth a try. On a similar note, the customer had the same situation occur today, during the day, so the netserver.logs were still around, and he couldn't find any errors in any of the netserver.log files to indicate why the distributed transaction aborted. He is really getting worried that there is a problem with their code that makes RDB fail, makes the distributed transaction fail, but leaves no trace of what the problem is so that they can fix it! Has anyone ever seen a situation where these errors occurred and there was nothing logged in the netserver.log?

	Renee
5030.3		NOVA::SMITHI	Don't understate or underestimate Rdb!	`Fri Feb 14 1997 22:20`	9
	Another thing to do is something like

	SEARCH NETSERVER.LOG;-1 "RDB"/WINDOW=(1,5)

	if every netserver did this then the log would carry forward the information (if it was logged at one time). This might be easier than just incorporating the older logs in the latest.

	Ian
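
	A minimal sketch of that carry-forward step as it might look inside a command procedure run when the netserver starts. It assumes SYS$OUTPUT is the current NETSERVER.LOG and that the previous version sits in SYS$LOGIN; where exactly to place it, and the marker text, are illustrative assumptions and not anything confirmed in this thread.

```dcl
$ ! Sketch only: copy Rdb-related lines from the previous log into this one.
$ ! Assumption: SYS$OUTPUT is the current NETSERVER.LOG for this process.
$ ! Assumption: the previous version is SYS$LOGIN:NETSERVER.LOG;-1.
$ IF F$SEARCH("SYS$LOGIN:NETSERVER.LOG;-1") .NES. ""
$ THEN
$     ! Marker text deliberately avoids the search string so that the
$     ! markers themselves are not matched on the next pass.
$     WRITE SYS$OUTPUT "--- carried forward from previous log ---"
$     SEARCH SYS$LOGIN:NETSERVER.LOG;-1 "RDB" /WINDOW=(1,5)
$     WRITE SYS$OUTPUT "--- end of carried-forward lines ---"
$ ENDIF
```

	Because the carried-forward lines still contain the search string, each new log would pick them up again from its predecessor, so a message should survive purging for as long as netservers keep starting. The cost is that old matches accumulate until someone trims them by hand.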
5030.4	remote monitor log?	NOVA::BRYDEN		`Sun Feb 16 1997 14:17`	2
	Can you see the remote attach appearing in the monitor log? are there any errors reported there?
5030.5	We did check the monitor logs, nothing abnormal	BROKE::BASTINE		`Mon Feb 17 1997 08:30`	5
	We checked the monitor logs on the remote node. All attaches look normal and there were no unexpected or abnormal terminations. According to the monitor log, all was fine.

	Renee
5030.6		NOVA::BRYDEN		`Mon Feb 17 1997 15:49`	10
	Renee,

	Can you actually see the request coming in and being logged in the remote node's monitor log? I assume from your base note the customer is doing 2 attaches: one local, one remote. If you get the timestamp from the local attach, you should see one in the remote monitor log around the same time... does this attach look correct?

	Dave
5030.7	Will check again...	BROKE::BASTINE		`Tue Feb 18 1997 09:51`	12
	I didn't actually see the monitor log, but the customer looked at it. He knew the time of the failure and didn't see any abnormalities in the attaches right before that failure. They all looked like normal, successful attaches and disconnects. However, the customer did say that they have many, many processes attaching in this manner per day, so it would be difficult to track which one specifically was the failed process. I can ask him to check again to see if the process was registering an attach at all.

	Thanks,
	Renee
5030.8	More information	BROKE::BASTINE		`Tue Feb 18 1997 13:12`	12
	I spoke with the customer again. He just did a blanket check of the monitor log. He didn't know the pid that failed, but said he could probably find it and trace the pid to see if it did a successful attach/disconnect from Rdb. They did find some corruption in the database over the weekend, and think that may have been the reason for the aborted distributed transactions, but he won't be sure until they stop seeing the aborts. If this is the case, the netserver.log didn't seem to log any errors, nor did any other log we looked at, which was kind of disconcerting to the customer.

	Thanks,
	Renee