
Conference kernel::csguk_systems

Title:CSGUK_SYSTEMS
Notice:No restrictions on keyword creation
Moderator:KERNEL::ADAMS
Created:Wed Mar 01 1989
Last Modified:Thu Nov 28 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:242
Total number of notes:1855

202.0. "United artists and INET software" by COMICS::GLEDHILL () Wed Dec 28 1994 13:40

T.R    Title    User    Personal Name    Date    Lines
202.1    COMICS::GLEDHILL    Sun Jun 04 1995 19:57    429
Company          TELEWEST - UNITED ARTISTS                                      
Department       COMMUNICATIONS (SCOTLAND)               
Street           1 SOUTH GYLE CRESCENT                   
City             EDINBURGH                     
Postal Code      EH12 9EG                    PO No      26-MAY-1995 16:24   

Caller           ANDY THORN                  Title      MR             
Phone            01753 790 470               Extension  D/L   
Service Wish     ** PLEASE SEE 'A' DESC **                                      

---------------------------------Description------------------------------------

--------------------------------------------------------------------------------
Log No             70112.00-54B-1UVO           Desc type      TS
Sequence no        01                          Authr badge no 064234
                                               Creation D/T    3-JUN-1995 15:28
--------------------------------------------------------------------------------
THIS IS FOR THE FIRST DUMP
""""""""""""""""""""""""""

System crash information
------------------------
Time of system crash: 25-MAY-1995 23:35:15.27
Version of system: VAX/VMS VERSION V5.5-2H4
System Version Major ID/Minor ID: 1/0
VAXcluster node: EAGLE, a VAX 7000-640
Crash CPU ID/Primary CPU ID:  02/00
Bitmask of CPUs active/available:  0000000F/0000000F
CPU bugcheck codes:
        CPU 02 -- PGFIPLHI, Pagefault with IPL too high
        3 others -- CPUEXIT, Shutdown requested by another CPU

CPU 02 Processor crash information
----------------------------------
CPU 02 reason for Bugcheck: PGFIPLHI, Pagefault with IPL too high
Process currently executing on this CPU: FMHELP
Current IPL: 8  (decimal)
CPU database address:  804B4000

        ISP    = 804B513C
        KSP    = 0000000A  ......... WHAT HAPPENED TO THE KSP?
        ESP    = 0010C8D7
        SSP    = 8050C020  ......... THIS DOES NOT LOOK RIGHT
        USP    = 804F7C3C  ......... NOR DOES THIS


CPU 03 Processor crash information
----------------------------------
CPU 03 reason for Bugcheck: CPUEXIT, Shutdown requested by another CPU
Process currently executing on this CPU: II_DBMS_3322
Current image file: DSA100:[INGRES.BIN]IIDBMS.EXE;1
Current IPL: 31  (decimal)
CPU database address:  804B2000


** NO PROCESS ON CPU 00 OR 01 **

Current operating stack (INTERRUPT):

                804B511C  00000030
                804B5120  00000030
                804B5124  803BBF8C
                804B5128  804B5170
                804B512C  804B515C
                804B5130  804B5134
                804B5134  8050C014      MMG$MODIFY_FAULT+000EC
                804B5138  04080000

         SP =>  804B513C  00000000
                804B5140  00000000
                804B5144  00000000
                804B5148  8740271B
                804B514C  8037E435
                804B5150  04080008
                804B5154  00000064
                804B5158  8037D8E0
                804B515C  00000000
                804B5160  20000000      CPU$M_CPUSPEC2
                804B5164  804B51A8
                804B5168  804B5190
                804B516C  8037D981
                804B5170  00000003
                804B5174  803A3E0C
                804B5178  00000008
                804B517C  8037D8E0
                804B5180  803A3E0C
                804B5184  FFFFFFFF
                804B5188  00000008
                804B518C  00000001
                804B5190  00000000
                804B5194  28000000
                804B5198  7FFE97B0
                804B519C  7FFE77E4      CTL$GL_KSTKBAS+005E4
                804B51A0  80384A01
                804B51A4  0C040007

SDA> sho page/sys 8740271B;1

System page table
-----------------

         ADDRESS     SVAPTE    PTE       TYPE  PROT  BITS PAGTYP    LOC STATE TYPE  REFCNT   BAK       SVAPTE    FLINK      BLINK

        87402600    BDE0404C 30000000    DZERO ERKW     K


SDA> e/i 8037E435-30;30
8037E405:  HALT
8037E406:  MOVL    04(AP),R0
8037E40A:  CLRL    5A(R0)
8037E40D:  BICL3   #0000007F,04(AP),-(SP)
8037E416:  CALLS   #01,803777C0
8037E41D:  BRB     8037E453
8037E41F:  MOVL    04(AP),R0     ! this is where the original R0 came from <--+
8037E423:  TSTL    10(R0)                                                     |
8037E426:  BEQL    8037E43F                                                   |
8037E428:  PUSHL   0C(AP)                                                     |
8037E42B:  PUSHL   #00000064                                                  |
8037E431:  MOVL    10(R0),R0     ! where did original come from <-------------+
8037E435:  PUSHL   5A(R0)        ! where did R0 come from?  >-----------------+ whose code?

SDA> ex @ap+4
804B5174:  803A3E0C   ".>:."
SDA> ex @.+10
803A3E1C:  874026C1   "Á&@."
SDA> ev @.+5a
Hex = 8740271B   Decimal = -2025838821  ! failing VA
SDA> ex r0
R0:  874026C1   "Á&@."
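
The pointer chase above can be replayed as plain arithmetic. The following is a minimal Python sketch that uses only the values shown in the SDA output; the helper names are made up for illustration, and the 512-byte page size is the standard VAX page size rather than something taken from this dump.

```python
# Values copied from the SDA session above.
R0_AT_FAULT = 0x874026C1   # contents of 803A3E1C, loaded by MOVL 10(R0),R0
DISPLACEMENT = 0x5A        # displacement in PUSHL 5A(R0)
VAX_PAGE_SIZE = 512        # bytes per VAX page (assumption stated in the lead-in)


def to_signed32(value):
    """Interpret a 32-bit value as a signed longword, as SDA's Decimal field does."""
    return value - (1 << 32) if value & 0x80000000 else value


def page_base(va):
    """Round a virtual address down to the start of its page."""
    return va & ~(VAX_PAGE_SIZE - 1)


failing_va = (R0_AT_FAULT + DISPLACEMENT) & 0xFFFFFFFF

print(f"{failing_va:08X}")             # 8740271B
print(to_signed32(failing_va))         # -2025838821
print(f"{page_base(failing_va):08X}")  # 87402600
```

The page base agrees with the 87402600 entry that SHO PAGE/SYS reports as DZERO, so the PUSHL at 8037E435 touched a page that was not valid while running at IPL 8.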


Process index: 0042   Name: FMHELP   Extended PID: 20C09142
-----------------------------------------------------------
Process status:  02040001   RES,PHDRES

PCB address              823574A0    JIB address              83934220
PHD address    *******   A482D600    Swapfile disk address    00000000
Master internal PID      00910042    Subprocess count                0
Internal PID             00910042    Creator internal PID     00000000
Extended PID             20C09142    Creator extended PID     00000000
State                     CUR 02     Termination mailbox          0000
Current priority                2    AST's enabled                KESU
Base priority                   1    AST's active                 NONE
UIC                [00500,000430]    AST's remaining               103
Mutex count                     0    Buffered I/O count/limit       99/100
Waiting EF cluster              0    Direct I/O count/limit        100/100
Starting wait time       1B001E1D    BUFIO byte count/limit      98848/98944
Event flag wait mask     BFFFFFFF    # open files allowed left     148
Local EF cluster 0       C0000001    Timer entries allowed left     40
Local EF cluster 1       80000000    Active page table count         0
Global cluster 2 pointer 00000000    Process WS page count         449
Global cluster 3 pointer 00000000    Global WS page count          117
                                                                          
SDA> show dev

I/O data structures
-------------------
                                DDB list
                                --------

         Address    Controller     ACP       Driver      DPT   DPT size
         -------    ----------     ---       ------      ---   --------
         83F0D480    INET                    INETDRIVER 81EA69E0  0980 
         83F0D100    NTY                     NTYDRIVER  81EA6180  01B0

SDA> SEARCH/STEPS=BYTE/LENGTH=LONG MMG$A_SYS_END:@EXE$GL_RPB 54454E49
Searching from 8000AF92 to 805A1200 in BYTE steps for 54454E49...
Match at 80364A0C
Match at 80383B9F
SDA> ex 80364A0C;10
EF17003C 00015EE2 EF17083C 54454E49  INET<..ïâ^..<..ï     80364A0C
                                       ^
WOLLONGONG-----------------------------|
SDA> ex 80383B9F;10
00535245 56524553 5F54454E 490C0000  ...INET_SERVERS.     80383B9C
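
The search pattern 54454E49 is simply the ASCII string INET read as a little-endian longword, which is why the matches show up as readable text in the dump. A small Python sketch of the conversion (the function names are illustrative only):

```python
def ascii_to_longword(text):
    """Pack four ASCII characters into a little-endian 32-bit value."""
    if len(text) != 4:
        raise ValueError("exactly four characters are required")
    return int.from_bytes(text.encode("ascii"), "little")


def longword_to_ascii(value):
    """Unpack a little-endian 32-bit value back into four ASCII characters."""
    return value.to_bytes(4, "little").decode("ascii")


print(f"{ascii_to_longword('INET'):08X}")  # 54454E49
print(longword_to_ascii(0x54454E49))       # INET
```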

SDA> SHOW PROCESS INET_SERVERS

Process index: 001A   Name: INET_SERVERS   Extended PID: 20C0011A
-----------------------------------------------------------------
Process status:  00148001   RES,NOACNT,PHDRES,LOGIN

--------------------------------------------------------------------------------
Log No             70112.00-54B-1UVO           Desc type      TS
Sequence no        02                          Authr badge no 064234
                                               Creation D/T    4-JUN-1995 09:04
--------------------------------------------------------------------------------
THIS IS FOR THE SECOND CRASH
""""""""""""""""""""""""""""

System crash information
------------------------
Time of system crash: 30-MAY-1995 11:04:36.46
Version of system: VAX/VMS VERSION V5.5-2H4
System Version Major ID/Minor ID: 1/0
VAXcluster node: EAGLE, a VAX 7000-640
Crash CPU ID/Primary CPU ID:  00/00
Bitmask of CPUs active/available:  0000000F/0000000F
CPU bugcheck codes:
        CPU 00 -- KRNLSTAKNV, Kernel stack not valid
        3 others -- CPUEXIT, Shutdown requested by another CPU

CPU 00 Processor crash information
----------------------------------
CPU 00 reason for Bugcheck: KRNLSTAKNV, Kernel stack not valid
Process currently executing on this CPU: II_GCC_1621
Current image file: DSA100:[INGRES.BIN]IIGCC.EXE;1
Current IPL: 31  (decimal)
CPU database address:  84398000
MPB address:   81DD57F0

                Spinlocks currently owned by CPU 00

IOLOCK8                            Address   8058BCF0
Owner CPU ID             00        IPL             08
Ownership Depth        0001        Rank            14
CPUs Waiting           0000        Index           34


CPU 01 Processor crash information
----------------------------------
CPU 01 reason for Bugcheck: CPUEXIT, Shutdown requested by another CPU
Process currently executing on this CPU: II_DBMS_E30
Current image file: DSA100:[INGRES.BIN]IIDBMS.EXE;1
Current IPL: 31  (decimal)
CPU database address:  804B6000

                No spinlocks currently owned by CPU 01

*** NO PROCESS ON CPU 02 ***

CPU 03 Processor crash information
----------------------------------
CPU 03 reason for Bugcheck: CPUEXIT, Shutdown requested by another CPU
Process currently executing on this CPU: II_DBMS_E2F
Current image file: DSA100:[INGRES.BIN]IIDBMS.EXE;1
Current IPL: 31  (decimal)
CPU database address:  804B2000

                No spinlocks currently owned by CPU 03


SDA> E/I @PC-10;10
%SDA-W-INSKIPPED, unreasonable instruction stream - 1 bytes skipped
EXE$COMPAT+00035:  MOVZWL  #042C,-(SP)
EXE$COMPAT+0003A:  PUSHL   #04
EXE$COMPAT+0003C:  BRW     EXE$EXCEPTION
EXE$COMPAT+0003F:  HALT
EXE$KERSTKNV:  BUGW   #020C         ! BUG$_KRNLSTAKNV
EXE$MCHECK:  MOVZWL  #02BC,-(SP)

SDA> sho sta

CPU 00 Processor stack
----------------------
Current operating stack (INTERRUPT):

                843991D8  00000000
                843991DC  83963394
                843991E0  803A4680
                843991E4  7FFE7224      CTL$GL_KSTKBAS+00024
                843991E8  7FFE7208      CTL$GL_KSTKBAS+00008
                843991EC  843991F0
                843991F0  804F7D44      EXE$MCHECK
                843991F4  041F0000

         SP =>  843991F8  823C8272
                843991FC  00080000      UCB$M_MNTVERPND

SDA> sho sta/k

Process stacks (on CPU 00)
--------------------------
KERNEL stack:

                7FFE7200  823C8272
                7FFE7204  00080000      UCB$M_MNTVERPND

         SP =>  7FFE7208  00000000
                7FFE720C  200A0000
                7FFE7210  7FFE7250      CTL$GL_KSTKBAS+00050
                7FFE7214  7FFE7230      CTL$GL_KSTKBAS+00030
                7FFE7218  823C8548
                7FFE721C  C0C00000
                7FFE7220  839632F0
                7FFE7224  00000001
                7FFE7228  7FFE722C      CTL$GL_KSTKBAS+0002C
                7FFE722C  00000000
                7FFE7230  00000000
                7FFE7234  200E0000
                7FFE7238  7FFE72C8      CTL$GL_KSTKBAS+000C8
                7FFE723C  7FFE72A8      CTL$GL_KSTKBAS+000A8
                7FFE7240  81E00EC6      EXDRIVER+01006
                7FFE7244  C0C00000
                7FFE7248  0000007A
                7FFE724C  839632F0
                7FFE7250  00000004
                7FFE7254  C0C00114
                7FFE7258  00000007
                7FFE725C  8000000F      EXE$QIOW_2+00007
                7FFE7260  00000000
                7FFE7264  81E006BB      EXDRIVER+007FB
                7FFE7268  83963301
                7FFE726C  83963304
                7FFE7270  81E00DD5      EXDRIVER+00F15
                7FFE7274  81EA8140      INETDRIVER+00F10
                7FFE7278  00000009
                7FFE727C  80381DFE
                7FFE7280  81E00927      EXDRIVER+00A67
                7FFE7284  81E028E2      EXDRIVER+02A22
                7FFE7288  81E024A9      EXDRIVER+025E9
                7FFE728C  839632F0
                7FFE7290  80381E06
                7FFE7294  0000001C
                7FFE7298  00000000
                7FFE729C  81EA8140      INETDRIVER+00F10
                7FFE72A0  80385A4B
                7FFE72A4  00000008
                7FFE72A8  00000000
                7FFE72AC  20380000
                7FFE72B0  7FFE7320      CTL$GL_KSTKBAS+00120
                7FFE72B4  7FFE72F4      CTL$GL_KSTKBAS+000F4
                7FFE72B8  80369B20
                7FFE72BC  83963394
                7FFE72C0  00000000
                7FFE72C4  00000000
                7FFE72C8  00000002
                7FFE72CC  80381E06
                7FFE72D0  839632F0
                7FFE72D4  00000000
                7FFE72D8  000000B0
                7FFE72DC  000000B0
                7FFE72E0  00000008
                7FFE72E4  80573980
                7FFE72E8  80381E06
                7FFE72EC  81EA8140      INETDRIVER+00F10
                7FFE72F0  80381D88
                7FFE72F4  00000000
                7FFE72F8  2FC00000
                7FFE72FC  7FFE736C      CTL$GL_KSTKBAS+0016C
                7FFE7300  7FFE7348      CTL$GL_KSTKBAS+00148
                7FFE7304  80368C0F
                7FFE7308  00000009
                7FFE730C  80381D88
                7FFE7310  803A46E0
                7FFE7314  7FFE7382      CTL$GL_KSTKBAS+00182
                7FFE7318  803A4680
                7FFE731C  80381D88
                7FFE7320  00000004
                ........  ........

SDA> eval ctl$gl_kstkbas
Hex = 7FFE7200   Decimal = 2147381760           CTL$GL_KSTKBAS

SDA> eval ctl$gl_kspini-1
Hex = 7FFE77FF   Decimal = 2147383295           CTL$GL_KSTKBAS+005FF
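
The two evaluations above can be turned into a quick estimate of how much kernel stack was left at the crash. This Python sketch assumes that CTL$GL_KSTKBAS and CTL$GL_KSPINI-1 bound the kernel stack, that the stack grows downward toward the lower address, and that a VAX page is 512 bytes; the variable names are made up for the example.

```python
VAX_PAGE_SIZE = 512          # bytes per VAX page (assumption)

KSTACK_LOW = 0x7FFE7200      # eval ctl$gl_kstkbas
KSTACK_HIGH = 0x7FFE77FF     # eval ctl$gl_kspini-1
KSP_AT_CRASH = 0x7FFE7208    # "SP =>" line in the KERNEL stack display

size_bytes = KSTACK_HIGH - KSTACK_LOW + 1
size_pages = size_bytes // VAX_PAGE_SIZE
bytes_left = KSP_AT_CRASH - KSTACK_LOW   # room below SP before the low limit
bytes_used = size_bytes - bytes_left

print(f"stack size : {size_bytes} bytes ({size_pages} pages)")  # 1536 bytes (3 pages)
print(f"bytes used : {bytes_used}")                              # 1528
print(f"bytes left : {bytes_left}")                              # 8
```

Under those assumptions only 8 bytes remained below the stack pointer, which is consistent with the KRNLSTAKNV bugcheck.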

SDA> sho proc

Process index: 0021   Name: II_GCC_1621   Extended PID: 21201621
----------------------------------------------------------------
Process status:  00140023   RES,DELPEN,RESPEN,PHDRES,LOGIN

PCB address              81EFD840    JIB address              8398AE30
PHD address              8CF38000    Swapfile disk address    00000000
Master internal PID      00160021    Subprocess count                0
Internal PID             00160021    Creator internal PID     00000000
Extended PID             21201621    Creator extended PID     00000000
State                     CUR 00     Termination mailbox          0000
Current priority               16    AST's enabled                KESU
Base priority                  16    AST's active                 U
UIC                [00035,000001]    AST's remaining              1371
Mutex count                     1    Buffered I/O count/limit     1315/1450
Waiting EF cluster              0    Direct I/O count/limit        435/530
Starting wait time       1C001C1B    BUFIO byte count/limit     ******/1572584
Event flag wait mask     DFFFFFFF    # open files allowed left     728
Local EF cluster 0       E0000001    Timer entries allowed left    489
Local EF cluster 1       C8000000    Active page table count         0
Global cluster 2 pointer 00000000    Process WS page count        9539
Global cluster 3 pointer 00000000    Global WS page count           13

*** WHAT IS WRONG WITH BUFIO byte count/limit ***

SDA> sho proc/chan

Process index: 0021   Name: II_GCC_1621   Extended PID: 21201621
----------------------------------------------------------------


                            Process active channels
                            -----------------------

Channel  Window           Status        Device/file accessed
-------  ------           ------        --------------------
  0010  00000000                        DSA100:
  0020  83FDB780                        DSA100:(2636,2,0)
  0030  83FC6800                        DSA0:(2079,1,0) (section file)
  0040  83FCA080                        DSA0:(1434,1,0) (section file)
  0050  83FC8180                        DSA0:(143,1,0) (section file)
  0060  00000000             Busy       MBA9892:
  0070  00000000             Busy       MBA9897:
  0080  00000000                        NET710:
  00B0  00000000             Busy       INET2420:
  00C0  00000000             Busy       INET2608:
  00D0  00000000             Busy       INET2653:
  00E0  00000000             Busy       INET2574:
  00F0  00000000             Busy       INET2607:
  0100  00000000             Busy       INET2668:
  0110  00000000             Busy       INET2670:
  0120  00000000                        MBA8294:
  0130  00000000             Busy       INET2460:
  0140  00000000             Busy       INET2491:
  0150  00000000             Busy       INET2623:
  0160  00000000                        MBA7477:
  0170  00000000             Busy       MBA7478:
  0180  00000000             Busy       INET2461:
  0190  00000000                        MBA1851:
  01A0  00000000             Busy       MBA1852:
  01B0  00000000                        MBA1853:
  01C0  00000000             Busy       MBA1854:
  01D0  00000000                        MBA7479:
  01E0  00000000                        MBA1875:
  ....  ........                        .......

*** THIS CONTINUES WITH SOMETHING LIKE 144 BUSY MBA CHANNELS ***
***  AND SOMETHING LIKE 114 INET CHANNELS ***
202.2    My conclusions    COMICS::GLEDHILL    Sun Jun 04 1995 20:01    91
--------------------------------------------------------------------------------
Regarding the first dump: that seems to be a coding problem in the inetdriver
code, as we discussed on the phone.
--------------------------------------------------------------------------------

The second one has the following call sequence (I got this from a show call,
show call/next sequence, picking out the Saved PC of each frame to build this
extract).

This shows all the return pcs, newest at the top...

823C8548	
81E00EC6	EXDRIVER+01006
80369B20	
80368C0F	
8036AD78	
8036AEBD	
80368B02	
80370478	
80375CAA	
80376BF8	
80376756	
80378676	
80378357	
81EA7AF3	INETDRIVER+008C3
804F8020	EXE$EXCEPTION+00225 ! system service despatch call frame.
8055C9B2      *	EXE$RUNDWN+001D2    ! call sys$dassgn here.
804F8020	EXE$EXCEPTION+00225 ! system service despatch call frame.
8056DF85	EXE$CREPRC+00C95    ! call sys$rundwn.
8055DD55	EXE$ASTDEL+00003    ! delete ast queue to process.
8055A830	EXE$EXIT+00030      ! sys$exit calls sys$delprc.
804F8020	EXE$EXCEPTION+00225 ! system service despatch call frame.

Process calls sys$exit first of all...

This shows that the process is in deletion and is in the delete process ast,
calling sys$dassgn (at stage *) for each channel the process has (it does them
one at a time, calling sys$dassgn with the channel no as parameter). At this
point we are processing channel 1890 (show proc/chan doesn't show it, as it has
already been deleted; however, we can find what it was as follows).

SDA> form/type = ccb @ctl$gl_ccbbase-1890
7FF3D160   CCB$L_UCB                       821E47E0	UCB
7FF3D164   CCB$L_WIND                      00000000	 
7FF3D168   CCB$B_STS                             00	 
7FF3D169   CCB$B_AMOD                          00	 
7FF3D16A   CCB$W_IOC                       0000	 
7FF3D16C   CCB$L_DIRP                      00000000	 
           CCB$C_LENGTH                    

Note that most stuff in the ccb has been cleared by now; this stops show
proc/chan from recognising it as a valid channel.
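
The address arithmetic behind `@ctl$gl_ccbbase-1890` can be sketched in Python. The CCB base value below is not shown anywhere in the dump; it is inferred from the CCB address in the FORMAT output. The 16-byte CCB length is likewise inferred from the field offsets and from the channel numbers in show proc/chan stepping by 10 (hex). Names are illustrative.

```python
CCB_ADDRESS = 0x7FF3D160   # first address in the FORMAT/TYPE=CCB output
CHANNEL = 0x1890           # channel number being deassigned
CCB_LENGTH = 0x10          # inferred: last field CCB$L_DIRP is a longword at +0C


def ccb_address(ccbbase, channel):
    """CCBs sit below the base address; the channel number is the byte offset."""
    return ccbbase - channel


# Inferred contents of CTL$GL_CCBBASE (not displayed in the dump itself).
ccbbase = CCB_ADDRESS + CHANNEL

print(f"{ccbbase:08X}")                        # 7FF3E9F0
print(f"{ccb_address(ccbbase, CHANNEL):08X}")  # 7FF3D160
print(CHANNEL // CCB_LENGTH)                   # 393, the channel's index
```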

SDA> show dev/addr =@(@ctl$gl_ccbbase-1890)
I/O data structures
-------------------
INET2667                                Unknown           UCB address:  821E47E0

Device status:   00010010 online,deleteucb
Characteristics: 0C140001 rec,avl,mbx,idv,odv
                 00000000 

Owner UIC [000035,000001]   Operation count          1   ORB address    821E4890
      PID        00000000   Error count              0   DDB address    83F4FF00
Class/Type          00/00   Reference count          0   DDT address    81EA730C
Def. buf. size      65535   BOFF                  0000   CRB address    83F4FE80
DEVDEPEND        00000000   Byte count            0000   I/O wait queue    empty
DEVDEPND2        00000000   SVAPTE            00000000                          
FLCK index             34   DEVSTS                0002                          
DLCK address     8058BCF0                                                       
Charge PID       00160021                                                       

	*** I/O request queue is empty ***

So by calling dassgn for that device we end up calling the inetdriver code
(to be expected, as dassgn calls the cancel code for the device we are
deassigning). Once we go into inetdriver we seem to end up going through a
lot of call frames in allocatable system space. This turns out to be a block
of code that, if you examine the text right at the start of it, has the string
INET in it (see the STARS article by searching on INET allocatable system space).

However, all these frames are different, so it doesn't look like we got into an
endless loop (the usual cause of these krnlstaknv crashes); rather, it looks like
the kernel stack is not big enough for what we are trying to do. (This is often
a problem with C code that runs in kernel mode: the stack can get used heavily
if you call a lot of subroutines with lots of arguments/saved registers etc, and
you can easily run out of k-stack space.)
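
The "all these frames are different" check can be done mechanically. This is a rough Python sketch using the return PCs listed earlier, from the newest frame down to the INETDRIVER entry; the helper name is made up, and a repeated return PC is only a heuristic sign of a recursion loop, not proof of one.

```python
from collections import Counter

# Return PCs from the call sequence above, newest first, down to INETDRIVER+008C3.
RETURN_PCS = [
    0x823C8548, 0x81E00EC6, 0x80369B20, 0x80368C0F, 0x8036AD78,
    0x8036AEBD, 0x80368B02, 0x80370478, 0x80375CAA, 0x80376BF8,
    0x80376756, 0x80378676, 0x80378357, 0x81EA7AF3,
]


def repeated_pcs(pcs):
    """Return the PCs that occur more than once, with their counts."""
    return {pc: count for pc, count in Counter(pcs).items() if count > 1}


print(len(RETURN_PCS))           # 14
print(repeated_pcs(RETURN_PCS))  # {}
```

An empty result means no return address recurs, which fits deep but non-looping call nesting rather than runaway recursion.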

On axp there is a sysgen param for the kernel stack size (kstackpages), but I am
not aware of anything on vax. I think it is fixed at 4 pages or whatever.
We will need to get the customer to go back to the inet vendors, as they will
need to modify their code to use the stack less...