
Conference bulova::decw_jan-89_to_nov-90

Title:DECWINDOWS 26-JAN-89 to 29-NOV-90
Notice:See 1639.0 for VMS V5.3 kit; 2043.0 for 5.4 IFT kit
Moderator:STAR::VATNE
Created:Mon Oct 30 1989
Last Modified:Mon Dec 31 1990
Last Successful Update:Fri Jun 06 1997
Number of topics:3726
Total number of notes:19516

2844.0. "SPX, DECW-Server crashes ( MEMMGR/XFREE )" by EIGER::STACHER () Thu May 31 1990 09:22

Hi there

Two different customers have the same problem, shown below.


VAXstations 3100 M38 SPX     Satellite configuration!
VMS 5.3-1

                                                                                
   The window server sometimes stops after the user logs out,
   or the whole workstation "hangs" after a period of time, e.g.
   in the morning.

   Errorlog of DECW$SERVER :                       
                                                                                
   SYSTEM-F-ACCVIO reason mask=05 virtual addr 200000ff, pc=001317f2         
      psl=03c00001                                                             
   Unrecoverable server internal error (code=12), terminating all connections.
                                                                                
   and then: ... free unused fonts                                              
              ... destroying loadable microcode context                         
              ... deallocating Ucode ROM and Ucode FIXED                        
               MEMMGR/XFREE MEMORY block 135244 has invalid header              
               and is not freed.                                                
               Fatal server bug!                                                
                Server runtime error limit exceeded, restart.                   
                                                                                
  


I've read in this conference that there is a patch available for server
crashes. Does this patch also fix this problem? If so, where can I get it
(name and location)?


Thanks for your help.

Cheers 
Christian
T.R  Title  User  Personal Name  Date  Lines
2844.1. "Need more information" by STAR::VATNE (Peter Vatne, VMS Development) Thu May 31 1990 18:04, 5 lines
I'm not promising to look at this crash.  However, if you want anyone to
be able to decode this crash, you will have to post the entire contents
of DECW$SERVER_0_ERROR.LOG.  The stuff we need occurs before the ACCVIO
message.  Everything after that is just the server attempting to continue
as best it can, which unfortunately for your customer is not too far.
2844.2. "Another one" by UTRTSC::HELDEN () Fri Jun 01 1990 08:49, 6 lines
    My customer has the same problem with his 3100 SPX. Also, when his system
    "hangs", he has TWA_ processes in PFW state with Page flts at 2400000.
    
    Hope this will give the solver more information.
    
    Mark
2844.3. "3100 SPX - server crash" by CSC32::JJONES (Jeff Jones) Fri Jun 01 1990 20:25, 182 lines
Here is the full scoop, with work-around, but no solution. I believe the 
solution will require a patch to the appropriate source code - MEMMGR.LIS.

[config]
vaxstation 3100 spx
16 mb main memory
rz55 (only disk)
tk50
vms 5.3-1
Local system disk

[symptom]
Periodically, if a user goes to quit out of a session, the session 
manager goes bye-bye. You will get a black screen if you hit return. 
Bottom half white, top half black. And you will get a USERNAME prompt.
He is running all applications locally.
He logs in interactively and does "@DECW$STARTUP RESTART", logs out,
and then the login box comes back.
There appears to be no pattern as to when it happens and when it doesn't.

I got the customer to recreate the problem (first try) and after 
logging in we discovered that all decwindows processes except a 
DECW$FD are gone. This process is running 
DKB100:[SYS0.SYSCOMMON.][SYSEXE]DECW$DWT_FONT_DAEMON.EXE;1 .

A checksum of the server image is good.

There is a SERVER dump.

ACCVIO, PC=DC01, VA=20D, RM=04, PSL=03C00000
DBG> ex/i 0dc01-20:0dc01
SHARE$DECW$SERVER_DIX+99E1:     BGTRU    SHARE$DECW$SERVER_DIX+9A18
SHARE$DECW$SERVER_DIX+99E3:     MOVZWL   B^02(R1),R3
SHARE$DECW$SERVER_DIX+99E7:     ADDL3    B^04(R1),R2,R0
SHARE$DECW$SERVER_DIX+99EC:     ASHL     R3,R0,R3
SHARE$DECW$SERVER_DIX+99F0:     MOVAB    B^14(R1),R0
SHARE$DECW$SERVER_DIX+99F4:     MOVL     (R0)[R3],R0
SHARE$DECW$SERVER_DIX+99F8:     TSTL     B^0C(R0)
SHARE$DECW$SERVER_DIX+99FB:     BEQL     SHARE$DECW$SERVER_DIX+9A10
SHARE$DECW$SERVER_DIX+99FD:     MOVL     B^0C(R0),R5
SHARE$DECW$SERVER_DIX+9A01:     MOVW     R4,B^04(R5)
DBG> ex r5
0\%R5:  00000209
DBG> EX/W 20D
0000020D:       4420
DBG> EX R4
0\%R4:  00000060
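
As a cross-check of the debugger output above, the arithmetic can be redone in a few lines. This is a minimal Python sketch, not part of the original analysis: the faulting instruction is MOVW R4,B^04(R5), so the write target is R5 + 4 and should equal the reported VA. It also decodes the reason mask using the usual VAX access-violation bit assignments (bit 0 length violation, bit 1 page-table-entry reference, bit 2 write/modify intent). The function and variable names are made up for illustration.

```python
# Values copied from the ACCVIO line and the debugger session above.
R5 = 0x209            # DBG> ex r5
DISPLACEMENT = 0x04   # from MOVW R4,B^04(R5)
REPORTED_VA = 0x20D   # VA in the ACCVIO line
REASON_MASK = 0x04    # RM in the ACCVIO line


def decode_reason_mask(mask):
    """Decode a VAX ACCVIO reason mask into readable flags."""
    return {
        "length_violation": bool(mask & 0x1),
        "pte_reference": bool(mask & 0x2),
        "write_intent": bool(mask & 0x4),
    }


# Effective address of the MOVW destination operand.
target = R5 + DISPLACEMENT
print(hex(target), target == REPORTED_VA)
print(decode_reason_mask(REASON_MASK))
```

The target matches the reported VA, and a mask of 04 indicates a write attempt. Since R5 was loaded from 12(R0) by the preceding MOVL, a value of 209 suggests that the longword it came from was already corrupt when this code ran.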

The "C" code which this macro is emulating is the following:
([SERVER.DIXKERNEL.LIS]MEMMGR.LIS)

17029    2   		 ALLOCATE(amount, memtype, ptr);

                                 56 D4    007C		clrl	r6
                     000001F4 8F 54 D1    007E		cmpl	r4,#500
                                 59 18    0085		bgeq	sym.8
                                 52 D5    0087		tstl	r2
                                 55 13    0089		beql	sym.8
                            51 6744 D0    008B		movl	(r7)[r4],r1
                                 4F 13    008F		beql	sym.8
                           50 24 A1 3C    0091		movzwl	36(r1),r0
                                 49 13    0095		beql	sym.8
                           51 20 A1 D0    0097		movl	32(r1),r1
                                 43 13    009B		beql	sym.8
                           0C A1 52 D1    009D		cmpl	r2,12(r1)
                                 35 1A    00A1		bgtru	sym.7
                           53 02 A1 3C    00A3		movzwl	2(r1),r3
                        50 52 04 A1 C1    00A7		addl3	4(r1),r2,r0
                           53 50 53 78    00AC		ashl	r3,r0,r3
                           50 14 A1 9E    00B0		movab	20(r1),r0
                            50 6043 D0    00B4		movl	(r0)[r3],r0
                              0C A0 D5    00B8		tstl	12(r0)
                                 13 13    00BB		beql	sym.6
                           55 0C A0 D0    00BD		movl	12(r0),r5
                           04 A5 54 B0    00C1		movw	r4,4(r5)
                        0C A0 08 A5 D0    00C5		movl	8(r5),12(r0)
                           55 2C A0 C0    00CA		addl2	44(r0),r5
                                 13 11    00CE		brb	sym.9
                                          00D0	sym.6:
                              56 01 D0    00D0		movl	#1,r6
                                 0E 11    00D3		brb	sym.9
                                 50 D5    00D5		tstl	r0
                                    01    00D7		nop	
                                          00D8	sym.7:
                              56 01 D0    00D8		movl	#1,r6
                                 06 11    00DB		brb	sym.9
                                 50 D5    00DD		tstl	r0
                                    01    00DF		nop	
                                          00E0	sym.8:
                              56 01 D0    00E0		movl	#1,r6
                                          00E3	sym.9:
                                 56 D5    00E3		tstl	r6
                                 24 13    00E5		beql	sym.11
                                 52 DD    00E7		pushl	r2
                                 54 DD    00E9		pushl	r4
                    00000000* EF 02 FB    00EB		calls	#2,ALLOCATE_SLOW
                              55 50 D0    00F2		movl	r0,r5
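
Reading the listing, the fast path appears to pop the first block off a free list: it loads the list head from offset 12 of a descriptor, writes R4 (apparently the memtype) into the block header at offset 4, replaces the head with the block's link at offset 8, and returns the address past the header. The Python sketch below models only that pop over a toy memory. The class, the function name, the addresses, and the header size of 16 are hypothetical; only the offsets 4, 8, 12 and 44 come from the listing. It is an illustration of the failure mode, not the server's code.

```python
import struct

HEAD_OFFSET = 12         # 12(r0): free-list head in the descriptor
TYPE_OFFSET = 4          # 4(r5): word written from r4
NEXT_OFFSET = 8          # 8(r5): link to the next free block
HEADER_SIZE_OFFSET = 44  # 44(r0): amount added to r5 before returning


class ToyMemory:
    """Flat little-endian memory; out-of-range access raises, like an ACCVIO."""

    def __init__(self, base, size):
        self.base = base
        self.data = bytearray(size)

    def _index(self, addr, length):
        off = addr - self.base
        if off < 0 or off + length > len(self.data):
            raise MemoryError(f"access violation at {addr:#x}")
        return off

    def read_long(self, addr):
        return struct.unpack_from("<I", self.data, self._index(addr, 4))[0]

    def write_long(self, addr, value):
        struct.pack_into("<I", self.data, self._index(addr, 4), value)

    def write_word(self, addr, value):
        struct.pack_into("<H", self.data, self._index(addr, 2), value)


def allocate_fast(mem, pool, memtype):
    """Pop the head of the free list; return None when the list is empty."""
    block = mem.read_long(pool + HEAD_OFFSET)
    if block == 0:
        return None  # the listing falls back to ALLOCATE_SLOW here
    mem.write_word(block + TYPE_OFFSET, memtype)  # the faulting MOVW
    mem.write_long(pool + HEAD_OFFSET, mem.read_long(block + NEXT_OFFSET))
    return block + mem.read_long(pool + HEADER_SIZE_OFFSET)


mem = ToyMemory(base=0x100000, size=0x1000)
pool = 0x100000
mem.write_long(pool + HEADER_SIZE_OFFSET, 16)  # hypothetical header size
mem.write_long(pool + HEAD_OFFSET, 0x100100)   # healthy free block
print(hex(allocate_fast(mem, pool, 0x60)))

mem.write_long(pool + HEAD_OFFSET, 0x209)      # corrupted head, as seen in R5
try:
    allocate_fast(mem, pool, 0x60)
except MemoryError as exc:
    print(exc)
```

The point of the model is that the fast path only tests the list head for zero. Any other garbage value, such as 209, is dereferenced without further checks, so the allocator faults at the store even though the corruption happened earlier.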




******************************************************************
Using SDA to look at the running image (all code lines up)
Below is a format of the ICB's (Image Control Blocks) verifying the 
offset of the shareable library.
******************************************************************
SDA> ex .;68
00200103 007F0068 7FFE2684 7FFCFCC0  .....&.h..... .     7FFCFB00
52455652 45532457 43454410 00000240  @....DECW$SERVER     7FFCFB10
00000000 00000000 0000004E 49414D5F  _MAIN...........     7FFCFB20
00000000 00000000 00000000 00000000  ................     7FFCFB30
000007FF 00000200 00000000 00000000  ................     7FFCFB40
00000200 7FFE970C 00000000 00000000  ...............     7FFCFB50
00000000 00000000 00000000 00000000  ................     7FFCFB60
SDA> ex @.;68
00400300 007F0068 7FFCFB00 7FFCFB70  p.......h.....@.     7FFCFCC0
52455652 45532457 4345440F 00000241  A....DECW$SERVER     7FFCFCD0
00000000 00000000 3130305F 5849445F  _DIX_001........     7FFCFCE0
00000000 00000000 00000000 00000000  ................     7FFCFCF0
000553FF 00004200 04000005 00000002  .........B...S..     7FFCFD00
00004200 7FFE967C 00000000 00000000  ........|...B..     7FFCFD10
00000000 00000000 00000000 00000000  ................     7FFCFD20
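
The offset check described above can be reproduced with simple arithmetic. Subtracting the symbolic offset of the faulting instruction (SHARE$DECW$SERVER_DIX+9A01) from the reported PC (DC01) gives the address at which the shareable image must be mapped, and that value can be compared with the 00004200 longword in the second ICB dump. Reading that longword as the image's starting address is an assumption based on where it sits in the dump. The names below are illustrative, and this is a sketch rather than a definitive procedure.

```python
FAULT_PC = 0xDC01            # PC from the ACCVIO line
SYMBOLIC_OFFSET = 0x9A01     # SHARE$DECW$SERVER_DIX+9A01 from the debugger
LISTING_OFFSET = 0x00C1      # offset of the same MOVW in MEMMGR.LIS
ICB_START_LONGWORD = 0x4200  # 00004200 in the DECW$SERVER_DIX_001 ICB dump


def image_base(pc, offset):
    """Load address implied by an absolute PC and its image-relative offset."""
    return pc - offset


base = image_base(FAULT_PC, SYMBOLIC_OFFSET)
# Where the listing's offset 0 would fall, relative to the start of the image.
routine_offset = SYMBOLIC_OFFSET - LISTING_OFFSET

print(hex(base), base == ICB_START_LONGWORD)
print(hex(routine_offset))
```

If the two values agree, the listing and the running image line up, which is what allows the MOVW at listing offset 00C1 to be identified as the faulting instruction.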


**********************************************************************
DECW$SERVER_ERROR.LOG
**********************************************************************
$ ty decw$server_0_error.log;
 1-JUN-1990 11:29:56.2 Hello, this is the X server
Dixmain address=13074
Now attach all known txport images
%DECW-I-ATTACHED, transport DECNET attached to its network
in SetFontPath
Connection 99700 is accepted by Txport
out SetFontPath
ScanProc color support loaded
scn$InitOutput address=12acd0
Connection Prefix: len == 60
 1-JUN-1990 11:30:40.9 Now I call scheduler/dispatcher
 1-JUN-1990 11:30:42.9 Connection 99738 is accepted by Txport
 1-JUN-1990 11:30:47.3 Connection 99700 is closed by Txport
 1-JUN-1990 14:51:27.4 Connection 99738 is closed by Txport
 1-JUN-1990 14:51:29.5 Connection 99700 is accepted by Txport
 1-JUN-1990 14:51:32.2 Connection 99770 is accepted by Txport
 1-JUN-1990 14:51:42.5 Connection 99738 is accepted by Txport
 1-JUN-1990 14:52:03.8 Connection 9adf8 is accepted by Txport
 1-JUN-1990 14:53:12.3 Connection 9adf8 is closed by Txport
 1-JUN-1990 14:53:36.6 Client 3 resets the server
  1-JUN-1990 14:53:36.9 Destroying Loadable Microcode Context
 1-JUN-1990 14:53:37.2 ...Deallocating Ucode ROM
 1-JUN-1990 14:53:37.4 ...Deallocating Ucode FIXED
 1-JUN-1990 14:53:37.8 ScanProc color support loaded
scn$InitOutput address=12acd0
Connection 99700 is accepted by Txport
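
To see which connections were still open at a given point in the log, the accept/close lines can be paired up mechanically. The sketch below is a minimal Python example that assumes the line format shown in this log ("Connection <id> is accepted|closed by Txport"). The function name and the sample list are illustrative only, and connection ids are treated as opaque strings that may be reused.

```python
import re

EVENT = re.compile(r"Connection (\w+) is (accepted|closed) by Txport")


def open_connections(lines):
    """Return the set of connection ids accepted but not yet closed."""
    open_ids = set()
    for line in lines:
        match = EVENT.search(line)
        if not match:
            continue  # not a connection event
        conn, action = match.groups()
        if action == "accepted":
            open_ids.add(conn)
        else:
            open_ids.discard(conn)
    return open_ids


# Connection events from the log above, up to the server reset.
sample = [
    "Connection 99700 is accepted by Txport",
    " 1-JUN-1990 11:30:42.9 Connection 99738 is accepted by Txport",
    " 1-JUN-1990 11:30:47.3 Connection 99700 is closed by Txport",
    " 1-JUN-1990 14:51:27.4 Connection 99738 is closed by Txport",
    " 1-JUN-1990 14:51:29.5 Connection 99700 is accepted by Txport",
    " 1-JUN-1990 14:51:32.2 Connection 99770 is accepted by Txport",
    " 1-JUN-1990 14:51:42.5 Connection 99738 is accepted by Txport",
    " 1-JUN-1990 14:52:03.8 Connection 9adf8 is accepted by Txport",
    " 1-JUN-1990 14:53:12.3 Connection 9adf8 is closed by Txport",
]
print(sorted(open_connections(sample)))
```

On this excerpt, three connections are still open just before "Client 3 resets the server", which fits a session being torn down while clients are still attached.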

**********************************************************************
DECW$SERVER_OUTPUT.LOG
**********************************************************************
$ ty decw$server_0_output.log;

scn_FreeVisual
scn_closeScreen


[solution]
No solution at this point in time.

A workaround is to log in at the console under the SYSTEM account
and issue the following command:

	$ @DECW$STARTUP RESTART

If the console appears hung and does not respond to a tap of the 
RETURN key do the following:

	1) Press the HALT button on the back of the cpu.
	2) Type 'C' and RETURN
	3) Press RETURN on the keyboard and log in.

jjjones/Colorado CSC

p.s. Should I SPR this?
2844.4. "" by KLARGO::HECKM () Mon Jun 04 1990 14:17, 9 lines
I too am having the same problems. I have been rebooting after the
hang so this will help a lot.

Is it known what privileges are needed to restart the server? I would like
the users to be able to do this, but I would rather not give them
all system privileges.

Thanx
Mark
2844.5. "May already have been identified and fixed." by CSC32::M_MURRAY () Mon Jun 04 1990 18:24, 22 lines
It might be worthwhile trying CSCPAT_0183, which includes
some stuff for 3100/SPX memory corruption and server hanging


\                                                                             
\       ECO004          25-APR-1990       R.D.L                               
\                       (actual patch written by someone else)                
\              This patch fixes two problems with DECwindows on the VAXstation
\              3100/SPX:                                                      
\                                                                             
\              1) Running certain applications and then quitting the session  
\                 cause memory corruption in the server.  Once memory is      
\                 corrupted, the server may crash, or applications may see    
\                 XIO errors, or the SPX may hang.                            
\                                                                             
\              2) Logging into a session, AND quitting, four times causes
\                 the SPX to hang on the fourth login.
                                                                        

Cheers,
Mike
2844.6. "Another person with the same problem" by LAIDBK::ELLISON (That is truly a wetbrain concept.) Mon Jun 04 1990 19:05, 15 lines
    
    
    My customer, McDonnell-Douglas, is having this same problem on their
    primary DEMONSTRATION units...
    
    While we've been using the DECW$STARTUP RESTART workaround, this is now
    becoming a major issue inside MD and I'm starting to get more heat.
    
    In the meantime, is anything being done to get this resolved?
    
    	Jan Ellison
    	dtn: 533-7787
    
    If it helps I also have SERVER PROCESS crash dumps available over the
    network...
2844.7. "There's another patch" by UTRTSC::HELDEN () Tue Jun 05 1990 04:31, 188 lines

  I received the following mail from Valbonne when I posted the problem
  there.
  I haven't tried it out yet, but it might be useful.

  -Mark-




From:	BEAGLE::AVIGDOR      "Patricia Avigdor"  1-JUN-1990 17:29:39.52
To:	UTRTSC::HELDEN
CC:	AVIGDOR
Subj:	Log# 01JunL01 - Decwindows crash when quitting session


	Hello Mark,

	The following note should help you.




+---------------------------+TM
|   |   |   |   |   |   |   |
| d | i | g | i | t | a | l |			INTEROFFICE MEMORANDUM
|   |   |   |   |   |   |   |
+---------------------------+


TO: DISTRIBUTION				DATE: April 10, 1990
						FROM: Rodney Boyle
						DEPT: CSSE TIMA Management
						 EXT: 276-8781
					LOC/MAILSTOP: OGO1-2/E16

SUBJECT: TIME DEPENDENT INFORMATION

The attached information is from the CSSE/VMS Support Group. 

It contains important information regarding: Workstation will crash 
when quitting a Decwindows session.

Please distribute to all Branch and Support Engineers ASAP.  

If you have any questions regarding technical issues or the content of the
article, please contact the author.  If you have any questions regarding
distribution or administration of these articles, please contact me
directly.


Regards,

Rodney Boyle

===========================================================================

      +---------------------------+TM
      |   |   |   |   |   |   |   |
      | d | i | g | i | t | a | l |             TIME DEPENDENT CASE	
      |   |   |   |   |   |   |   |
      +---------------------------+


 
      TITLE: Workstation will crash when quitting a Decwindows session 
             due to a corrupted Flink in the SRP lookaside list.

					        DATE: 10-APR-1990

      AUTHOR: Paul Lacombe			TD #: 000259
      DTN: 381-1697
      ENET: VMSSPT::LACOMBE                     CROSS REFERENCE #'s:
      DEPARTMENT: CSSE/VMS Support Group           (SPR's, CLD's, TD's)
							CXO-04711
							AKO-00981
							MST-10351

      INTENDED AUDIENCE: U.S./EUROPE/GIA        PRIORITY LEVEL: 1
						   (1 = Time Critical, 
						    2 = NON-Time Critical)
						   See attachment below 
						   for additional info.

      ---------------------------------------------------------------------

	Author Identification:
	----------------------

	   Name : Tom Carr
	   DTN :  381-1964
	   Mail Stop : ZKO1-1/D19
	   E-net Address : VMSSPT::TCARR
	   Department : CSSE/VMS Support Group

	Article Identification:
	-----------------------

	   Title/Problem Summary : Workstation will crash when quitting
				   a Decwindows session due to a corrupted
				   Flink in the SRP lookaside list.

	   Operating System/Layered Product : VMS

	   Component/Utility : Decwindows

	   Version Information : VMS v5.2(Decwindows V1), v5.3(Decwindows V2)

	   Is the problem reproducible at will? : NO



	DETAILED Problem Information:
	-----------------------------

	   Problem Description/Symptoms :  

		 The user logs out of the workstation by selecting 'quit' in 
		the Session Manager's 'session' menu option. The workstation 
		may crash and the analysis of the dump will show the SRP list 
		corruption.  The corruption is caused when the 'FLINK' of 
		the SRP is decremented after it has been returned to the SRP 
		lookaside list. 


	   Hardware configuration specifics : 

		Workstations running Decwindows


	   Software configuration specifics :

		VMS v5.2(Decwindows V1) or v5.3(Decwindows V2)

           Potential Impact on System Operation :

		If all workstations in a LAVC log out at the same time and
		some percentage of them crash, the system could become slow
		due to workstations rebooting.

	   Frequency of Occurrence :

		Unknown, this is a timing problem and a user may never
		experience it


	DETAILED Resolution Information:
	--------------------------------

	   Problem Resolution/Workaround Description :

		There is a patch that can be obtained by any CSC
		from the CSSE VOID patch distribution system. The
		patch name is 
		
			DECWTRNSPT$PATCH01_530 (for V5.3-n ONLY)


           When is the final fix expected (Version/Timeframe)? : 
		
		VMS v5.4	(AETNA)

	   Can the fix be engineered/applied to any previous 
		versions?  If so - when? : NO


	   Installation Instructions :

		The kit that the CSC will provide when you request the
		patch contains a .README file which has installation 
		instructions

	   Limiting Parameters on Hardware Environment :

		NONE KNOWN

	   Limiting Parameters on Software Environment :

		NONE KNOWN

	   Additional Comments :



                        *** DIGITAL INTERNAL USE ONLY ***


2844.8. "Patch, where?" by ASDS::SYSTEM () Tue Jun 05 1990 10:33, 8 lines
    
    Ok, I'll plead ignorance. I need the patch mentioned in .7 but I have
    no idea what the CSSE VOID Patch Distribution System is. Can anyone
    help me out or point me to where I can obtain the patch over the net?
    
    Thanks!
    
    Jim C.
2844.9. "Please contact your CSC" by VMSSPT::J_OTTERSON () Tue Jun 05 1990 12:32, 4 lines
Hi,
  You can get the patch(es) for SPX workstations from your CSC.

Regards, Jeff.
2844.10. "" by VMSSG::LEMBREE (Just do it.) Tue Jun 05 1990 12:40, 3 lines
If the problem is specific to the SPX, you probably need the patch
DECWSERVER$PATCH02_053, and not the patch for the transport.  This is also
available from the CSCs.  Related notes are 2665 and 2824.
2844.11. "Am I a page fault" by UTRTSC::HELDEN () Wed Jun 06 1990 05:01, 15 lines
    
    
    Well, I haven't much experience with DECwindows, so please correct me
    if I'm wrong.
    
    I thought that decw$transport_local.exe manages the transport
    between the client and the server.
    Because my customer has TWA processes in PFW state and lots of
    page faults, couldn't there be a connection between PFW and
    decw$transport_local.exe?
    
    Thanx in advance for help.
      
    -Mark-
     
2844.12. "more info about crash SPX" by EIGER::STACHER () Wed Jun 06 1990 06:38, 48 lines
Hope this is enough. I only had a hardcopy of the DECW$SERVER_ERROR.LOG file.


Connection 26b230 is accepted by Txport
31-May-1990 17:24:22.0 Now I call scheduler/dispatcher
31-May-1990 17:24:25.5 Connection 26ac18 is accepted by Txport
31-May-1990 17:24:25.8 Connection 26b230 is closed by Txport
 1-Jun-1990 14:05:04.7 Connection 26ac18 is closed by Txport
 1-Jun-1990 14:05:08.6 Connection 26ac50 is accepted by Txport
 1-Jun-1990 14:05:10.9 Connection 26ac88 is accepted by Txport
 1-Jun-1990 14:05:21.5 Connection 26ac18 is accepted by Txport
 1-Jun-1990 14:06:31.5 %SYSTEM-F-ACCVIO, access violation, reason mask=05,
virtual address=639C0062, PC=001317F2, PSL=03C00001

Unrecoverable server error(error code = 12) found, terminating all connections.
Exception Call stack dump follows:
	8f8c5
	1317f2
	131218
	12ab4f
	12a94a
	12ecf4
	220aa
	d5ee
	1083d
	10355
	13343
********** marking the end of call stack dump **********
********************************************************
 1-Jun-1990 14:06:32.3 Destroying Loadable Microcode Context
 1-Jun-1990 14:06:32.8 ...Deallocating Ucode ROM
 1-Jun-1990 14:06:33.1 ...Deallocating Ucode FIXED
 1-Jun-1990 14:06:33.4 MEMMGR/XFREE - memory block 135244 has an invalid
header and is not freed
 1-Jun-1990 14:06:33.8
Fatal server bug!
 1-Jun-1990 14:06:34.1 Server runtime error limit exceeded - type
@SYS$MANAGER:DECW$STARTUP server to restart, or see your system manager.
 1-JUN-1990 14:06:34.5
 


Thanks for analyzing.

cheers
Christian


2844.13. "Sorry, not enough information" by STAR::VATNE (Peter Vatne, VMS Development) Mon Jun 11 1990 20:21, 1 line
The key piece of information required is the InitOutput address.
2844.14. "more info from another customersites" by UTOPIE::GAISHAUSER (She's always a VAX to me) Thu Aug 16 1990 10:14, 22 lines

 	Hi y'all !

	Two customers experience the same problem.
	One sees it only on /SPX systems, the other on both /GPX and /SPX.

	The contents of the SERVER_0_ERROR.LOG file are the same
	(of course not the Connection #)

	The scn$InitOutput address is 12acd0

	Hope this helps.

	BTW: which patch should I apply for the cust with the /GPX and /SPX ?

    ______________
    \            /
     \	Thanks  /
    |-\ Helmut /
    |__\  Hg  / 
        \____/