
Conference 49.910::kav30

Title:VAX on VMEbus: KAV30
Notice:Could have been as fast as 68K but it's a VAX!
Moderator:CSSVMS::KAV30_SUPP
Created:Thu Apr 18 1991
Last Modified:Fri Aug 02 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:159
Total number of notes:645

96.0. "PROBLEM ACCESSING UNMAPPED KAV30 MEMORY RAPIDLY, VIA VMEBUS" by GOBANG::LEMMER () Wed Jul 14 1993 17:03

	The following QAR has been reported to VAXELN engineering, together
	with the suggested fix for the problem. I will post a note here
	stating which release of ELN implements it.

Thomas




 
                    VAXELN  QAR Problem Report.
                   =============================

 Date :  13-July-1993 
 ======

 Author : 
 =========
            John Cushnie @PCS,
            VAXELN Sustaining Engineer,
            DECPCS Munich




QAR #  Status  Category     Component        Date in       Date out
-----  ------  --------    --------------  ------------  -------------
  ?      ?        ?        KAV30 Kernel     13-July-1993

PROBLEM ACCESSING UNMAPPED KAV30 MEMORY RAPIDLY, VIA VMEBUS.

Reproducible at will: Yes

CPU        Memory     


 Summary of problem : 
 =====================

          Following a customer-reported problem with accessing a second KAV30
          via the VMEbus from a Master KAV30, a detailed investigation has been
          carried out at DEC Munich to isolate and fix the problem.

          The customer reported that his application suffered from
          unpredictable processor 'freezes' when one KAV30 (Master) processor
          was accessing the RAM of a second KAV30 (Slave) processor via the
          VMEbus.

          The RAM on the Slave KAV30 was not mapped to VME, but the hardware
          had been set to the correct address range (using the A24 rotary
          switch or the A32 system parameter). Therefore, each VME cycle
          initiated by the Master processor aborted with a BERR signal.

          After a freeze, manually halting the Slave KAV30 (using the Break
          key on the console terminal) and re-booting with the command
          >>> boot displayed the following error message :

              >>> boot

               83 BOOT SYS
 
              ?30 UNXINT 20041C21 04150000 00C
                                                         
          The error message was always the same. The problem was still
          present when the Ebuild file for Slave.sys was modified to minimise
          the system, i.e. with the DECnet driver, ECL, Console etc. removed.

          The problem was produced with the following system development 
          software versions :

                 VMS     V5.5-2
                 VAXELN  V4.4

          The state (e.g. halted, VAXELN initialised, ...) of the Slave
          KAV30 processor being accessed via the VMEbus did not affect the
          observed results.

 Solution of problem :
 ======================

          The problem observed is the result of a 'race' condition in the
          KAV30 hardware, which causes it to enter the POWER_FAIL condition
          when the Slave KAV30 is accessed rapidly via the VMEbus.

          Under the conditions described above the following events occur :

             Accessing unmapped Slave KAV30 memory locations over VMEbus       
             generates an 'HW_SLAVE_ERROR' IRQ, which is handled at POWER_FAIL 
             IPL (this is dictated by the HW implementation). 

             If another 'HW_SLAVE_ERROR' arrives while the ISR is still
             executing, the information identifying the reason for the
             interrupt (HW_SLAVE_ERROR) is lost, and the next IRQ is treated
             as a 'real' POWER_FAIL.

             In this situation the VAX goes into the POWER_FAIL loop, thereby
             'freezing' the application.

             Because the IRQ is now handled as a POWER_FAIL, the register that
             showed the 'HW_SLAVE_ERROR' is not cleared, and the IRQ is still
             pending. This is why ?30 UNXINT is shown when a re-boot is
             attempted.
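
             The sequence of events above can be illustrated with a toy
             model. This is only a sketch under stated assumptions, not the
             KAV30 kernel code or a description of the real hardware. The
             class name, the three flags, and in particular the point at
             which the cause information is lost (here: on ISR exit) are
             hypothetical. The report only says the information is lost
             "at some point".

```python
from dataclasses import dataclass


@dataclass
class SlaveKav30Model:
    """Toy model of the Slave KAV30. All names here are hypothetical."""
    irq_pending: bool = False    # interrupt request at POWER_FAIL IPL
    error_status: bool = False   # register that shows HW_SLAVE_ERROR
    cause_known: bool = False    # what the ISR consults to identify the IRQ
    frozen: bool = False         # True once the POWER_FAIL loop is entered

    def unmapped_access(self) -> None:
        """Master touches unmapped Slave memory: raise HW_SLAVE_ERROR."""
        self.error_status = True
        self.cause_known = True
        self.irq_pending = True

    def run_isr(self, second_access_during_isr: bool = False) -> str:
        """Service one pending IRQ and report how it was treated."""
        if not self.irq_pending:
            return "idle"
        if not self.cause_known:
            # Cause was lost: the IRQ is treated as a real POWER_FAIL.
            # error_status is NOT cleared and the IRQ stays pending.
            self.frozen = True
            return "POWER_FAIL"
        # Normal path: acknowledge the slave error.
        self.error_status = False
        self.irq_pending = False
        if second_access_during_isr:
            # Assumed race window: a new error lands before the ISR exits.
            self.unmapped_access()
        # ASSUMPTION: ISR exit clears the cause, wiping the new error's too.
        self.cause_known = False
        return "HW_SLAVE_ERROR"


# Slow accesses: each error is fully serviced before the next one arrives.
slow = SlaveKav30Model()
slow.unmapped_access()
slow.run_isr()

# Rapid accesses: a second error arrives while the ISR is executing.
fast = SlaveKav30Model()
fast.unmapped_access()
fast.run_isr(second_access_during_isr=True)
outcome = fast.run_isr()  # this IRQ no longer has an identifiable cause
```

             In this model the slow case never sets `frozen`, while the rapid
             case ends with `frozen`, `error_status` and `irq_pending` all
             set, which corresponds to the freeze followed by ?30 UNXINT on
             re-boot.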


