
Conference 49.910::kav30

Title:	VAX on VMEbus: KAV30
Notice:	Could have been as fast as 68K but it's a VAX!
Moderator:	CSSVMS::KAV30_SUPP

Created:	Thu Apr 18 1991
Last Modified:	Fri Aug 02 1996
Last Successful Update:	Fri Jun 06 1997
Number of topics:	159
Total number of notes:	645

109.0. "KAV$BUS_WRITE problems" by ZYDECO::REDDY () Mon Dec 13 1993 21:24

I would like to know under what conditions calling KAV$BUS_WRITE would result
in the system rebooting itself.

I have a customer who is accessing a VME D/A card.  When he does a bus write
and the card is there, there are no problems.  When he does the KAV$BUS_WRITE
and the card is not there, the KAV$BUS_WRITE sometimes results in a system
reboot, with no exceptions or machine checks.  If the code is compiled with
/noopt, the bus write to a card that is not there works.  If the code is
compiled with /OPT, it works sometimes and causes a reboot other times.

Thanks,

Sumithra

T.R	Title	User	Personal Name	Date	Lines
109.1		ZYDECO::REDDY		`Tue Dec 14 1993 02:45`	19
	KAVSYS_VME.MAR in the eln directory has the following.  What CVAX chip
	problem is it talking about, and how does it impact a kav$bus_write?

	;++
	; This code is a workaround for a CVAX chip problem. Can otherwise cause
	; double error halt!

		BBC	#KAV$V_BLOCK_IRQ, -
			DATA_TYPE(AP), 12$	; Br if not raise IPL
		SETIPL	#IPL$K_POWER		; Block all device interrupts
		BRB	15$
		.
		.
		.

	Thanks,

	Sumithra
109.2	never write to non-existent VMEbus addresses repeatedly...	GOBANG::LEMMER		`Tue Dec 14 1993 11:54`	63
	.re 0

	Under NO circumstances should a KAV$BUS_WRITE cause the system to reboot.

	Nevertheless, if you are accessing 'non-existent' VMEbus addresses there
	may be some problems.  Because of the 'disconnected writes' of the CVAX,
	an error on the VMEbus (which could be a timeout) is detected a long time
	after the write has been posted, so the error is handled in the machine
	check handler (if you use the KAV$BUS_WRITE routine; if you write
	directly, nothing is handled...).

	This is not a problem as long as you do not write many times to the
	non-existent address.  If you do this writing in a tight loop, about 1M
	times and at maximum speed, the rtVAX300 may fail with:

		02 DBL ERR halt

	This is caused by a problem in the rtVAX microcode.

	For 'real' applications this should not be a problem: either your VME
	board is there, and everything is OK, or it is not there, in which case
	'probing' once should be enough to realize that the board is missing, and
	you should never retry the write to this location.

	The difference in behaviour with and without the /OPT switch may be
	caused by the compiler.  In particular, check your return arguments and
	make them 'volatile' or something like this (we have discovered a lot of
	'weird' optimisations with the 'C' compiler...).

	.re 1

	The CVAX bug is the one described above, but we should not discuss the
	details here in public...

	This bug can be seen with all CVAX implementations, but it only causes a
	problem on the KAV30, since only on the VMEbus can you have an
	'outstanding' write waiting for a long time (caused by retries on the
	VME).

	The code path you have shown there is the undocumented qualifier
	'BLOCK_IRQ', which can be used with the KAV$BUS_WRITE service to block any
	other interrupt while a write to the VMEbus is pending.  This qualifier
	was implemented for debug purposes only.

	WE STRONGLY DISCOURAGE YOU FROM USING THIS QUALIFIER, SINCE IT MAY CAUSE
	UNPREDICTABLE BEHAVIOUR OF YOUR APPLICATION.  IT RAISES IPL TO POWER AND
	STAYS THERE FOR A 'LONG' (SEVERAL TENS OF MICROSECONDS) TIME.

	Using this qualifier lowers the probability of getting the 'double error
	halt', but it does not cure the problem.  If you write long enough and at
	high speed to a non-existent VMEbus address, you may see it again.

	The only possible workaround is: once you have detected a non-existent
	address on the VMEbus by getting back the 'VMEbus write error' (I don't
	remember the exact name...) from the KAV$BUS_WRITE service, never write
	to this address again.

	Best regards,

	Thomas
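
As an illustration of the advice in this reply (it is not code from the KAV30 kit), here is a minimal C sketch of a "probe once, never write again" guard. Everything in it is an assumption made for the example: `sim_bus_write` merely stands in for the real bus-write service so the sketch can run on any machine, and the status values, the `vme_slot` structure and `guarded_write` are hypothetical names. The status variable is declared `volatile`, following the suggestion above about optimised builds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes: the real KAV$BUS_WRITE status values are not
   given in this thread. */
enum bus_status { BUS_OK = 1, BUS_WRITE_ERROR = 2 };

/* Counts how many writes actually reached the (simulated) bus. */
static unsigned sim_write_count = 0;

/* Stand-in for the real bus-write service (an assumption for this sketch).
   It pretends that only addresses below 0x1000 are populated. */
static void sim_bus_write(uint32_t vme_addr, uint16_t value, volatile int *status)
{
    (void)value;
    sim_write_count++;
    *status = (vme_addr < 0x1000u) ? BUS_OK : BUS_WRITE_ERROR;
}

/* One guarded VME location: after the first failed write it is latched dead. */
struct vme_slot {
    uint32_t addr;
    bool     dead;
};

/* Returns true only if a write was issued and reported success. */
static bool guarded_write(struct vme_slot *slot, uint16_t value)
{
    /* volatile so an optimising compiler must re-read the status after the
       call instead of assuming it still holds the initial value */
    volatile int status = 0;

    assert(slot != NULL);

    if (slot->dead)
        return false;           /* known-missing address: do not touch the bus */

    sim_bus_write(slot->addr, value, &status);
    if (status != BUS_OK) {
        slot->dead = true;      /* first failure latches the slot for good */
        return false;
    }
    return true;
}
```

With this guard, a tight loop calling `guarded_write` on a missing board produces exactly one bus access; every later call returns `false` without posting a write.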
109.3		ZYDECO::REDDY		`Tue Dec 14 1993 18:06`	36
	Thomas,

	Thanks for responding.  The program KAVSYS_VME.MAR is included in the ELN
	kit that is sent to customers.  My customer looked at the comments and
	wanted to know what this was all about.  The following appears in several
	places in that file:

	; This code is a workaround for a CVAX chip problem. Can otherwise cause
	; double error halt!

	My customer (Corning) was able to get their application to work
	consistently by using KAV$M_BLOCK_IRQ in their KAV$BUS_WRITE.  It is
	possible that while his write to the non-existent board was pending,
	another interrupt came in and caused the system halt.  They have two VME
	D/A boards; one will always be present and the second may or may not be
	there.  Sometimes they have gotten the right status back in their status
	argument.

	They will use the kav$bus_write routine once to check whether the board
	is there or not.  (He tells me that they cannot do reads to determine
	this.)  Do you still think it is dangerous to use KAV$M_BLOCK_IRQ?

	He said that he will use the "volatile" attribute and see what happens.
	If the only workaround for them is to use KAV$M_BLOCK_IRQ with their
	KAV$BUS_WRITE, and they are only going to use this routine once, what
	kind of problems do you think they would run into?

	If we do not want the customers to know about the CVAX problem or
	KAV$M_BLOCK_IRQ, why was the KAVSYS_VME.MAR program included in the kit?

	Thanks for all the help,

	Sumithra
109.4		BAYERN::WOLFF	Conformism is for little minds.	`Wed Dec 15 1993 09:09`	28
	>If we do not want the customers to know about the CVAX problem or
	>KAV$M_BLOCK_IRQ why was the KAVSYS_VME.MAR program included in the kit?

	This is common: you usually document the code as you go along writing it.
	The customer can also buy a VMS listing disk and look at the comments
	there, which contain other 'clues', if you want to call them that.

	What we do not want is to tell the customer exactly why it happens; it is
	sufficient if he knows that there is a problem.  There is nothing wrong
	with the comments, since they only hint at the fact that there is a
	problem.

	If your customer has to probe the second D/A card and he does this once,
	there is no problem, in whatever way he does it.  As Thomas mentioned,
	this specific problem only occurs in tight loops writing to NXM addresses
	on VME.

	As with all undocumented features, you should work with the customer to
	see whether you can solve the problem without using this undocumented
	feature.  However, I can assure you that on KAV30 this won't go away, so
	if there is an absolute need, use BLOCK_IRQ (but don't call us when
	something else, like ethernet, breaks).

	The customer's program should have an INIT routine in which the two
	boards are probed to determine what's there BEFORE any traffic (VME INTs)
	can be posted.  The code then has to set some flags recording what the
	configuration is, and after that you start the full application.  If you
	do it that way you do not need any BLOCK_IRQ modifiers in your code, and
	you won't see the DBLERR problem either.

	It's more a design question than anything else, really.

	Julian.
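
The INIT-routine design described in this reply could be sketched in C roughly as follows. This is an illustration only, under stated assumptions: `da_config`, `da_init`, `da_may_write` and the `probe_fn` callback are hypothetical names, the addresses are made up, and `sim_probe` is a simulated probe so the sketch runs without hardware. In a real application the probe would be a single status-checked bus write, as discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DA_BOARDS 2

/* Hypothetical probe callback: returns true if a single write to the
   given VME address completed without a bus error. */
typedef bool (*probe_fn)(uint32_t vme_addr);

/* Configuration flags filled in once at start-up. */
struct da_config {
    uint32_t addr[NUM_DA_BOARDS];
    bool     present[NUM_DA_BOARDS];
    bool     traffic_enabled;
};

/* Probe each board exactly once, record the result, and only then mark
   the application as ready for normal VME traffic.  Returns the number
   of boards found. */
static int da_init(struct da_config *cfg, probe_fn probe)
{
    int found = 0;

    assert(cfg != NULL && probe != NULL);
    assert(!cfg->traffic_enabled);   /* probing must come before any traffic */

    for (size_t i = 0; i < NUM_DA_BOARDS; i++) {
        cfg->present[i] = probe(cfg->addr[i]);
        if (cfg->present[i])
            found++;
    }
    cfg->traffic_enabled = true;
    return found;
}

/* The run-time path consults the flags instead of writing blindly. */
static bool da_may_write(const struct da_config *cfg, size_t board)
{
    return cfg->traffic_enabled && board < NUM_DA_BOARDS && cfg->present[board];
}

/* Simulated probe for demonstration: only address 0x0100 answers. */
static unsigned sim_probe_calls = 0;

static bool sim_probe(uint32_t vme_addr)
{
    sim_probe_calls++;
    return vme_addr == 0x0100u;
}
```

The point of the split is that the missing board is touched once during start-up and never again: after `da_init` returns, all output code checks `da_may_write` first, so no BLOCK_IRQ modifier is involved.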
109.5		ZYDECO::REDDY		`Wed Dec 15 1993 15:13`	4
	Thanks, Julian. sr