
Conference bulova::decw_jan-89_to_nov-90

Title:DECWINDOWS 26-JAN-89 to 29-NOV-90
Notice:See 1639.0 for VMS V5.3 kit; 2043.0 for 5.4 IFT kit
Moderator:STAR::VATNE
Created:Mon Oct 30 1989
Last Modified:Mon Dec 31 1990
Last Successful Update:Fri Jun 06 1997
Number of topics:3726
Total number of notes:19516

2814.0. "Problems with ADA and multi-tasking... Server err" by PEACHS::BELDIN () Thu May 24 1990 18:11

	A customer running VMS 5.3 and ADA multitasking is getting a 
	number of very strange errors in their (very large) application.  
	Some of them cause the server to croak, others just kill the
	application - not sure if the process dies too - at least all of
	the widgets disappear from the screen.  Here are some brief
	descriptions of the problems.  Any pointers to some/any of the
	solutions would be helpful... (Note - they may have some code that
	*may* be trying to use the toolkit in a re-entrant manner - they
	are checking on that).

	Rick Beldin
	Atlanta CSC
	
	Here they come....

	1. This one occurs shortly after ApplicationCreateShell, effect	
	   is to 'hose' (what does that mean?) entire application. This
	   happens 1 time in 10.

		X Toolkit Warning: Cannot convert string
			 "-*-MENU-MEDIUM-R-Normal--*-120-*-*-P-ISO8859-1"
			 to type FontList, using fixed font


	2. These two sets of error messages have the effect of 'killing
	   the application' - again not sure if process is deleted or
	   just that windows disappear...  They don't appear together,
	   but are symptomatic of separate crashes.


		XIO: fatal IO error 65535 on XServer "MIP::0.0"
		  after 5053 request 61 events remaining

		XLib: sequence lost ( 0x10020 > 0xb3e ) in reply 0x7!

	3. This third problem appears to be around server memory deallocation.
	   All windows freeze, and it appears that the server has run out
	   of page file quota.  The following error appears when running
	   their application:

	
X error event received from server: BadImplementation - server reported
  implementation error

   Failed request major opcode 53 (X_CreatePixmap)
   Failed request minor opcode 0 (if applicable)
   ResourceID 0x2011e6 in failed request (if applicable)
   Serial number of failed request 44729
   Current serial number in output stream 44730

XIO: fatal IO error 65535 on X server "MIP::0.0"
   after 44732 requests (44730 known processed) with 0 events remaining

XIO: fatal IO error 65535 on X server "MIP::0.0"
   after 44734 requests (44730 known processed) with 0 events remaining

	Here are some stats gathered at the time of problem number 3.

VAX/VMS V5.3  on node DMIP  23-MAY-1990 16:07:12.45   Uptime  142 02:17:25
  Pid    Process Name    State  Pri      I/O       CPU       Page flts Ph.Mem
00000021 SWAPPER         HIB     16        0   0 00:03:22.71         0      0
000093E3 _VTA26:         HIB      7     2039   0 00:03:01.61     20023   1773 
00001B44 _VTA23:         HIB      4    19157   0 00:52:24.65    247262  12000 
00000026 ERRFMT          HIB      8     6333   0 00:00:36.04        82    137 
00000027 OPCOM           HIB      7     1138   0 00:00:13.06       506    202 
00000028 AUDIT_SERVER    HIB     10       30   0 00:00:01.14      1340    243 
00000029 JOB_CONTROL     HIB      8    53212   0 00:01:53.49       132    310 
0000002A CONFIGURE       HIB      8       11   0 00:00:00.13        98    165 
0000002B NETACP          HIB     10    89697   0 00:19:39.23       291    484 
0000002C EVL             HIB      6     1506   0 00:00:12.56    189621     68 
0000002D REMACP          HIB      9      114   0 00:00:00.30        77     73 
0000884E GRIFFIN         HIB      7     1076   0 00:00:05.29      1425    658 
00009A4F _RTA1:          CUR      4      116   0 00:00:01.43       668    429 
00005B51 Len Day         LEF      4      421   0 00:00:05.09      1529    356 
000047F6 DECW$SERVER_0   HIB      8    21296   0 00:18:19.35     43273   3872 
00000057 Window Manager  LEF      4       74   0 00:06:54.14    730331    500 
$ sh proc /all /id=47f6

23-MAY-1990 16:07:25.06   User: RWP_OPS          Process ID:   000047F6
                          Node: DMIP             Process name: "DECW$SERVER_0"

Terminal:
User Identifier:    [RWP,RWP_OPS]
Base priority:      6
Default file spec:  Not available

Devices allocated:  GAA0:
                    NET3772:

Process Quotas:
 Account name:
 CPU limit:                      Infinite  Direct I/O limit:       100
 Buffered I/O byte count quota:     47664  Buffered I/O limit:      60
 Timer queue entry quota:               7  Open file quota:         81
 Paging file quota:                     0  Subprocess quota:         8
 Default page fault cluster:           16  AST quota:               97
 Enqueue quota:                        28  Shared file limit:        0
 Max detached processes:                0  Max active jobs:          0

Accounting information:
 Buffered I/O count:      1775  Peak working set size:       4000
 Direct I/O count:       19521  Peak virtual size:          27195
 Page faults:            43273  Mounted volumes:                0
 Images activated:           0
 Elapsed CPU time:      0 00:18:19.35
 Connect time:          6 00:19:49.88

Process privileges:
 CMKRNL               may change mode to kernel
 SYSNAM               may insert in system logical name table
 PRMMBX               may create permanent mailbox
 WORLD                may affect other processes in the world
 NETMBX               may create network device
 PRMGBL               may create permanent global sections
 SYSGBL               may create system wide global sections
 PFNMAP               may map to specific physical pages
 SYSPRV               may access objects via system protection


There is 1 process in this job:

  DECW$SERVER_0 (*)

	...from sda

SDA> set proc decw$server_0
SDA> sh proc
Process index: 0016   Name: DECW$SERVER_0   Extended PID: 000047F6
------------------------------------------------------------------
Process status:  00140011   RES,PSWAPM,PHDRES,LOGIN

PCB address              802ED330    JIB address              805387D0
PHD address              80D76000    Swapfile disk address    00000000
Master internal PID      023F0016    Subprocess count                0
Internal PID             023F0016    Creator internal PID     00000000
Extended PID             000047F6    Creator extended PID     00000000
State                       HIB      Termination mailbox          0000
Current priority               11    AST's enabled                KESU
Base priority                   6    AST's active                 NONE
UIC                [00400,000171]    AST's remaining                97
Mutex count                     0    Buffered I/O count/limit       58/60
Waiting EF cluster              0    Direct I/O count/limit        100/100
Starting wait time       19001919    BUFIO byte count/limit      47664/47664
Event flag wait mask     0000000C    # open files allowed left      81
Local EF cluster 0       60000001    Timer entries allowed left      7
Local EF cluster 1       80000000    Active page table count         0
Global cluster 2 pointer 00000000    Process WS page count        3772
Global cluster 3 pointer 00000000    Global WS page count          100

Process index: 0016   Name: DECW$SERVER_0   Extended PID: 000047F6
------------------------------------------------------------------
Saved process registers
-----------------------
R0   = 0000000F    R1   = 80193BA0    R2   = 80004BB8    R3   = 00009A24
R4   = 802ED330    R5   = 00008CC0    R6   = 0000600C    R7   = 0004A5B0
R8   = 00008F14    R9   = 0002FA2C    R10  = 0002C85C    R11  = 0004A35C
AP   = 7FF8C640    FP   = 7FF8C61C    PC   = 7FFEDF8A    PSL  = 03C00000
KSP  = 7FFE7800    ESP  = 7FFE9800    SSP  = 7FFED800    USP  = 7FF8C61C
P0BR = 80D89400    P0LR = 00006662    P1BR = 80615E00    P1LR = 001FFC50

2814.1. "..." by GSRC::WEST (Help stamp out and abolish redundancy !) Thu May 24 1990 20:10, 9 lines

   Are you doing Toolkit calls and X calls from more than one task?  If so,
this is a definite no-no.  The toolkit is most definitely NOT re-entrant,
and there is some question as to whether Xlib is either.

  One recommendation is to not use tasking, but if you must, then only one
task should do the X and toolkit calls.

						-=> Jim <=-
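
   One way to follow the "only one task" advice is to have every other task
hand its requests to a single owner.  What follows is a minimal sketch in
Python, not the customer's Ada.  UiTask, call, stop, and fake_toolkit_call are
hypothetical names invented for illustration, and the stand-in function makes
no real X calls.

```python
import queue
import threading


class UiTask:
    """Owns every toolkit call; other tasks only enqueue requests."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Single consumer: requests are executed strictly one at a time.
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                break
            func, args, reply = item
            try:
                reply.put(("ok", func(*args)))
            except Exception as exc:  # hand the failure back to the caller
                reply.put(("error", exc))

    def call(self, func, *args):
        """Run func(*args) on the UI task and wait for its result."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((func, args, reply))
        status, value = reply.get()
        if status == "error":
            raise value
        return value

    def stop(self):
        self._requests.put(None)
        self._thread.join()


def fake_toolkit_call(label):
    # Hypothetical stand-in for an Xlib/toolkit call; reports which thread ran it.
    return (label, threading.get_ident())
```

   With this arrangement, any number of worker tasks can call
ui.call(fake_toolkit_call, ...) and every request still executes on the one
UI thread, one at a time.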

2814.2. "Xlib is re-entrant (except for one bug)" by STAR::VATNE (Peter Vatne, VMS Development) Thu May 24 1990 21:33, 11 lines

Point of clarification: the Xlib is definitely designed to be re-entrant.
There was a bug with Xlib-DECwindows transport interaction that occasionally
messed things up.  The bug is fixed in VMS V5.4.  However, the symptoms of
that problem don't match your symptoms.

What is strange is that your application can't find the menu fonts.  Are
they definitely displaying to a VMS workstation?  Not finding the menu
fonts is symptomatic of displaying to other vendors' workstations.

Any server crashes should be immediately QARed.  What would be of most
interest here is the contents of SYS$MANAGER:DECW$SERVER_0_ERROR.LOG.

2814.3. "Fully re-entrant?" by LEOVAX::TREGGIARI () Fri May 25 1990 10:13, 5 lines

I know VMS Xlib is AST re-entrant, but is it *fully* re-entrant?  If not, I'm
not sure that AST re-entrancy will suffice for Ada multi-tasking.  Anyone
know for sure?

Leo 

2814.4. "" by QUARK::LIONEL (Free advice is worth every cent) Sat May 26 1990 17:38, 6 lines

    Re: .3
    
    AST reentrancy is not sufficient for Ada multithread execution.  I don't 
    know how reentrant Xlib is.
    
    				Steve

2814.5. "More info... Application architecture..." by PEACHS::BELDIN () Thu May 31 1990 17:05, 33 lines

	Some more info on the customer's architecture.  The customer's
	code has a total of 33 different ADA tasks running, without
	timeslice.  There are a total of 6 tasks that execute Xlib
	and X-toolkit calls, all at the same priority (8).  One of these
	is an event dispatcher that basically calls XtPending and 
	XtProcessEvent.  As I understand it, under this strategy,
	none of these should be interrupted by another task at the
	same priority, they should all run to completion.   There is
	a database processing task that runs at a lower priority, 7, and
	a communications something-or-other that runs at priority 12.
	Neither of these has any calls to Xlib or toolkit.  Does this
	kind of setup appear to violate any 'multi-tasking rules'?  I
	know it violates what appears to be a cardinal rule of having
	all the X-toolkit stuff in one task...


	There is also a problem that they call the 'sudden death 
	syndrome.'   It appears that somewhere they come across some
	error that their ADA handler can't handle ( fatal NON-ADA 
	error?) and then their process appears to go into hibernation.
	Looking at the process, it appears that user-mode ASTs
	are somehow disabled - which I think is kind of critical to
	the way that ADA tasking works. He suspects that the ADA 
	tasking mechanism is somehow losing its mind - I don't know
	enough about ADA to know... Maybe someone could shed some light
	on this as a possibility? 

	I should be getting a copy of the DECW$SERVER_0_ERROR.LOG file -
	these guys are working fast and furious and purge things
	before I even get a chance to look at it...

	Rick Beldin
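
	A caution on the run-to-completion assumption above: equal priority
	without timeslicing only prevents preemption.  If a task can reach a
	scheduling point in the middle of a toolkit call (an entry call, a
	delay, or a blocking wait), another task may run before the first one
	finishes.  Whether that actually happens in this application is not
	established here.  The following toy model in Python only shows how
	two cooperative tasks sharing one non-reentrant connection can lose
	track of their sequence numbers.  ToyConnection, round_trip, and
	run_round_robin are invented names, and this is not Xlib's real
	bookkeeping or the Ada runtime.

```python
class ToyConnection:
    """Non-reentrant stand-in for a shared display connection."""

    def __init__(self):
        self.sent = 0   # sequence number of the last request written
        self.log = []   # (task, expected, seen, in_order) per round trip

    def round_trip(self, who):
        """Send a request, then wait for its reply (yields while waiting)."""
        self.sent += 1
        expected = self.sent
        yield  # scheduling point: the task blocks waiting for the reply
        # On resume, another task may have written to the connection.
        self.log.append((who, expected, self.sent, expected == self.sent))


def run_round_robin(tasks):
    """Resume each task in turn at its scheduling points until all finish."""
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


def interleaved():
    # Both tasks share the connection and switch at the blocking point.
    conn = ToyConnection()
    run_round_robin([conn.round_trip("task A"), conn.round_trip("task B")])
    return conn.log


def serialized():
    # Each round trip completes before the next one starts.
    conn = ToyConnection()
    for who in ("task A", "task B"):
        run_round_robin([conn.round_trip(who)])
    return conn.log
```

	In the interleaved run, task A resumes to find the connection's
	sequence number has moved past its own request; in the serialized run
	every round trip sees the number it expects.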

2814.6. "" by PEACHS::BELDIN () Fri Jun 01 1990 10:16, 73 lines

	Following is the server error log.

		-<DECW$SERVER_0_ERROR.LOG>-
30-MAY-1990 14:58:30.4 Hello, this is the X server
Dixmain address=127b0
Now attach all known txport images
%DECW-I-ATTACHED, transport DECNET attached to its network
in SetFontPath
out SetFontPath
GPX color/monochrome support loaded
gpx$InitOutput address=1397f4
Connection Prefix: len == 42
30-MAY-1990 14:59:41.8 Now I call scheduler/dispatcher
30-MAY-1990 15:03:25.3 Using extra todo packet pool...
30-MAY-1990 15:40:34.7 Connection 98d38 is closed by Txport
30-MAY-1990 15:48:36.8 Connection 1b3600 is closed by Txport
30-MAY-1990 16:14:41.7 Using extra todo packet pool...
30-MAY-1990 17:00:08.9 Connection 1b3600 is closed by Txport
30-MAY-1990 17:32:52.8 Connection 1b3600 is closed by Txport
30-MAY-1990 18:39:26.6 Connection 1b3600 is closed by Txport
31-MAY-1990 08:25:09.4 Connection 1b3600 is closed by Txport
31-MAY-1990 08:28:14.7 Connection 1b3600 is closed by Txport
31-MAY-1990 09:56:28.1 Connection 1b3600 is closed by Txport
31-MAY-1990 10:22:10.2 Connection 1b3600 is closed by Txport
31-MAY-1990 10:25:28.6 Connection 1b3600 is closed by Txport
31-MAY-1990 12:39:48.7 Connection 1b3600 is closed by Txport
31-MAY-1990 14:07:25.7 Connection 1b3600 is closed by Txport
31-MAY-1990 15:15:21.3 %LIB-?-INSVIRMEM, insufficient virtual memory
Request opcode 53 is ignored due to internal runtime error 158217 for client 2(#error = 1)
Exception Call stack dump follows: 
	8eec5
	fb4d
	dff4
	dcaa
	13a785
	140bcf
	140c17
	202cf
	d60a
	10f5b
	10a91
	12a7f
	41a
	801bfad3
	801bfa84
********** marking the end of call stack dump **********
********************************************************
31-MAY-1990 15:15:22.7 %LIB-?-INSVIRMEM, insufficient virtual memory
Client 2 has made too many runtime errors(2), its connection is marked for termination
Exception Call stack dump follows: 
	8eec5
	fb4d
	dff4
	dcaa
	13a785
	140bcf
	140c17
	202cf
	d60a
	10f5b
	10a91
	12a7f
	41a
	801bfad3
	801bfa84
********** marking the end of call stack dump **********
********************************************************
31-MAY-1990 15:15:23.0 ..ddx layer returns bad status(17)
31-MAY-1990 15:15:23.3 ..Dispatcher close down connection 2
31-MAY-1990 15:54:08.2 Connection 1b3600 is closed by Txport
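
	Since the log gets purged quickly, it may help to pull out just the
	memory failures and the requests the server dropped.  A rough sketch
	in Python follows; the function name summarize and the patterns are
	assumptions based only on the lines shown in this log, and the single
	opcode mapping (53 = X_CreatePixmap) comes from the client-side error
	text in this topic.

```python
import re

# Only the mapping confirmed in this topic; other opcodes stay unmapped.
OPCODE_NAMES = {53: "X_CreatePixmap"}

TIMESTAMP = r"(\d{2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d)"
INSVIRMEM = re.compile(TIMESTAMP + r" %LIB-\?-INSVIRMEM")
IGNORED = re.compile(r"Request opcode (\d+) is ignored .* for client (\d+)")

# Excerpt copied from the server error log above.
SAMPLE = """\
31-MAY-1990 15:15:21.3 %LIB-?-INSVIRMEM, insufficient virtual memory
Request opcode 53 is ignored due to internal runtime error 158217 for client 2(#
error = 1)
31-MAY-1990 15:15:22.7 %LIB-?-INSVIRMEM, insufficient virtual memory
31-MAY-1990 15:15:23.3 ..Dispatcher close down connection 2
"""


def summarize(log_text):
    """Return (INSVIRMEM timestamps, [(request name, client number)])."""
    out_of_memory = [m.group(1) for m in INSVIRMEM.finditer(log_text)]
    ignored = [
        (OPCODE_NAMES.get(int(m.group(1)), "opcode " + m.group(1)),
         int(m.group(2)))
        for m in IGNORED.finditer(log_text)
    ]
    return out_of_memory, ignored
```

	Run against the excerpt, this reports two INSVIRMEM failures about a
	second apart and one dropped X_CreatePixmap request from client 2.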

2814.7. "" by PEACHS::BELDIN () Fri Jun 01 1990 10:45, 23 lines

	On one of the other errors the customer got - the XIO error
	(which as I found out was also part of the pagefile quota
	 error) was followed with:

	
X error event received from server: BadImplementation - server reported
  implementation error.

  Failed request major opcode 53 ( X_CreatePixmap )
  Failed request minor opcode 0 (if applicable)
  ResourceID 0x2011e6 in failed request (if applicable)
  Serial number of failed request ...

XIO: fatal IO error 65535 on X server "MIP::0.0"
  after ... request ...


	Sounds like he was allocating a lot of pixmaps in the server
	and there wasn't enough memory.  My suspicion is that he should
	try and free them or try and trap this error...   

	Rick Beldin
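
	If the application really is accumulating pixmaps, one way to "free
	them" is to cap how many stay alive and release the oldest before
	creating another.  A minimal sketch in Python, assuming the
	application can tolerate recreating a pixmap it has released: the
	class PixmapCache and its create/free callbacks are hypothetical
	stand-ins (in a real client they would wrap calls such as
	XCreatePixmap and XFreePixmap), and nothing here talks to a server.
	Trapping the error itself would be a separate step, for example an
	error handler installed with XSetErrorHandler.

```python
from collections import OrderedDict


class PixmapCache:
    """Keeps at most `limit` pixmaps alive, freeing the least recently used."""

    def __init__(self, create, free, limit):
        self._create = create       # stand-in for a pixmap-creating call
        self._free = free           # stand-in for a pixmap-freeing call
        self._limit = limit
        self._live = OrderedDict()  # key -> pixmap id, oldest first

    def get(self, key):
        """Return the pixmap for key, creating it (and evicting) if needed."""
        if key in self._live:
            self._live.move_to_end(key)  # mark as most recently used
            return self._live[key]
        while len(self._live) >= self._limit:
            _, oldest = self._live.popitem(last=False)
            self._free(oldest)           # release server memory first
        pixmap = self._create(key)
        self._live[key] = pixmap
        return pixmap

    def clear(self):
        """Free every pixmap still held, oldest first."""
        while self._live:
            _, pixmap = self._live.popitem(last=False)
            self._free(pixmap)
```

	The limit would have to be tuned against the server's available
	memory; the sketch only guarantees that the number of live pixmaps
	never exceeds it.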