
Conference iosg::all-in-1_v30

Title: *OLD* ALL-IN-1 (tm) Support Conference
Notice: Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator: IOSG::PYE
Created: Thu Jan 30 1992
Last Modified: Tue Jan 23 1996
Last Successful Update: Fri Jun 06 1997
Number of topics: 4343
Total number of notes: 18308

1762.0. "File Cabinet Server FCS internal errors" by GIDDAY::LEH () Thu Nov 12 1992 09:57

This error doesn't seem to have been raised in this notesfile:

	OAFC-E-INTERR, Internal error in FCS
	Msg: FCS has been violated, pls submit SPR

It occurred on a newly upgraded system running VMS 5.5-1 and MR 3.2.

Other errors include:

	MCC-E-LERT_TERMREQ, thread termination requested
	Msg: SrvTimeoutSysMan; receive alert to terminate thread

	MCC-E-LERT_TERMREQ, thread termination requested
	Msg: CsiCacheCloseDrwStruct; error closing docdb

	MCC-E-EXISTENCE_ERROR, object doesnot exist

Since the upgrade, this system has been experiencing slow FCS response, e.g. a 
couple of minutes for an end-user account doing drawer indexing. Most of the 
fine-tuning hints given in the manual have been used except one or two 
non-dynamic SYSGEN params.

Thanks for any insights into these problems and fixes.

Hong
CSC Sydney
T.R  Title  User  Personal Name  Date  Lines
1762.1. "in STARS" by GIDDAY::LEH () Thu Nov 12 1992 10:17 (13 lines)
    Found several STARS articles after having posted this note, which
    suggested the use of cross filing to remote drawers from local filecab
    
    I will need to check it tomorrow as the info in the base note was faxed
    to me and I'd no chance to ask questions.
    
    Should have also mentioned the sequence of those msgs in previous note:
    
    Those MCC-E- msgs seemed to occur just prior to FCS startup
    
    while OAFC-E-INTERR msgs seemed to happen randomly
    
    Hong
1762.2. "Acc vio is very bad" by CHRLIE::HUSTON () Thu Nov 12 1992 16:16 (27 lines)
    
    >	OAFC-E-INTERR, Internal error in FCS
    >	Msg: FCS has been violated, pls submit SPR
    
    This one is obviously VERY BAD. It means that one of the FCS threads
    has access violated. We need to know what you did and whether it is
    reproducible.
    
    >	MCC-E-LERT_TERMREQ, thread termination requested
    >	Msg: SrvTimeoutSysMan; receive alert to terminate thread
    >
    >	MCC-E-LERT_TERMREQ, thread termination requested
    >	Msg: CsiCacheCloseDrwStruct; error closing docdb
    >
    >	MCC-E-EXISTENCE_ERROR, object doesnot exist
    
    You say that these show up just BEFORE the startup message. This is 
    probably ok. These messages mean that one of the background threads
    has been told to kill itself. This is normal when shutting down the
    server.
    
    As for the slow performance, did this just start after the upgrade?
    Can you show the server parameters and the process limits for the
    OAFC$SERVER account?
    
    --Bob
    
1762.3. "still happening - more details" by GIDDAY::LEH () Fri Nov 13 1992 05:41 (146 lines)
Bob

Thanks for your suggestions. The customer doesn't believe they have done 
anything sophisticated since the upgrade less than a week ago, mostly indexing 
jobs and refiling across drawers. It is somewhat reproducible, since I saw 
OAFC-E-THREADACCVIO again today, as on previous days.

BTW, doesn't the error OAFC-E-THREADACCVIO apply only to Distributed 
Sharing, as said in a couple of STARS articles? This site hasn't had it yet.

Herewith some more details :

****** Errors in FCS logs ********

More errors in OAFC$SERVER.LOG

Error: %MCC-E-ALERT_TERMREQ, thread termination requested  
Message: CsiCacheBlockAstService; Error from mcc_astevent_receive

Error: %MCC-E-ALERT_TERMREQ, thread termination requested  
Message: SrvTimeoutSysess; receive alert to terminate thread

Error: %MCC-E-ALERT_TERMREQ, thread termination requested  
Message: CsiOpenRmsFC; error opening filecab: DISK$USER15:[HCA2.MUDFOS.A1]

Error: %MCC-E-EXISTENCE_ERROR, object does not exist  
Message: CsiCacheFlushDrawerCache; Error from mcc_mutex_lock (dab)

Error: %MCC-E-EXISTENCE_ERROR, object does not exist  
Message: CsiCacheGarbageCollect; Error from FlushDrawerCache

Error: %MCC-E-IN_USE_ERROR, in use error  
Message: CsiCacheFlushDrawerAccess; Error from mcc_mutex_try_lock

Error: %OAFC-E-DOCRESERVED, Cannot perform requested operation, document is reserved  
Message: OafcGsmCopyJacket; Couldn't delete copied/refiled doc

Error: %RMS-E-RNF, record not found  
Message: CsiOpenRmsFileCab; error getting record: TRAN MAN                      

Error: %MCC-E-ALERT_TERMREQ, thread termination requested  
Message: SrvTimeoutSysMan; receive alert to terminate thread

Error: %OAFC-E-INTERR, Internal error in File Cabinet Server  
Message: FCS has access violated, please submit an SPR.

Error: %OAFC-W-BUFFOVFL, Client buffer not big enough for requested information  
Message: CsiEndFunction; Error putting EOC on out stream

Error: %RMS-F-EXENQLM, exceeded enqueue quota  
Message: SecGetRigthsList; Can't read users rightlist.

Error: %RMS-F-EXENQLM, exceeded enqueue quota  
Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN                    

Error: %RMS-F-EXENQLM, exceeded enqueue quota  
Message: SecGetRigthsList; Can't read users rightlist.

Error: %RMS-W-TMO, timeout period expired  
Message: CsiReserve; unable to get info about reserved doc

Error: %SYSTEM-F-IVLOCKID, invalid lock id  
Message: CsiCacheFlushDrwStruct; Error dequeing VMS lock block

........

I'll look them up in the manuals, although some of them quite obviously point 
to server quota or RMS access problems. Have you spotted any special ones?
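
A minimal sketch, not part of the FCS kit: assuming the log keeps the "Error:" / "Message:" layout shown above, a short Python script can tally the status codes so the frequent ones stand out. The function name and the sample text (copied from the excerpt above) are illustrative only.

```python
import re
from collections import Counter

# Matches lines such as "Error: %RMS-F-EXENQLM, exceeded enqueue quota"
# and captures the facility, severity letter and message ident.
ERROR_LINE = re.compile(r"^Error:\s+%(\w+)-([A-Z])-(\w+),")


def tally_errors(log_text):
    """Count occurrences of each FACILITY-SEVERITY-IDENT code in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            counts["-".join(match.groups())] += 1
    return counts


# Sample copied from the log excerpt in this note.
SAMPLE = """\
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: SecGetRigthsList; Can't read users rightlist.

Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN

Error: %RMS-W-TMO, timeout period expired
Message: CsiReserve; unable to get info about reserved doc

Error: %SYSTEM-F-IVLOCKID, invalid lock id
Message: CsiCacheFlushDrwStruct; Error dequeing VMS lock block
"""

if __name__ == "__main__":
    for code, count in tally_errors(SAMPLE).most_common():
        print(f"{count:3d}  {code}")
```

Run against the full log rather than the sample, the same tally would show at a glance whether the quota errors (EXENQLM) or the MCC thread errors dominate.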

I also picked this up in OAFC$SERVER_ERROR.LOG:

ALL-IN-1 Index Server Internal Error:
    Error locking DAB during cache garbage collection:

****** Other info in FCS management menus ********

o I got error:

Client buffer not big enough for requested information
 %OA-I-LASTLINE, CNB08V::"73=" must be running to perform operation

when trying to do MS MSC   Manage server clients

o Info on server

Server Name:  CNB08V::"73="


Configuration: Process Name:       CNB08V$SRV73
               Type:               LOCAL
               Startup State:      ENABLED

               Configuration File: OA$DATA_SHARE:CNB08V$SERVER73.DAT

Server         Authorized Timeout:    600        Max Client Connects:   512
Attributes:    Distribution:          OFF        Max # of Drawers:      140
               Drawer Cache:           50        Object Number:          73
               Drawer Timeout:        600        Session Timeout:      1200
                                                 Drawer Collisions:       0
                        
  
			        Server Resources

                                CNB08V::"73="

                          CPU Time:           19631
                          Virtual page count: 92138
                          Direct IO:          23849
                          Buffered IO:        30794
                          Page faults:        124433
                          Working set:        8096

                          Server Auditing Information

                               CNB08V::"73="

                     Version: 0               Conn_acc: 309
             Threads_created: 4100            Conn_rej: 0
             Threads_deleted: 4092            Data_rcv: 3567
                  Sess_create: 305          Data_rbytes: 502686
                 Sess_delete: 242            Data_sent: 3668
                 Out_is_sess: 0            Data_sbytes: 279112
                  In_is_sess: 0                  Bytlm: 131113
               Tasks_created: 4100              Maxbuf: 31802
               Tasks_deleted: 4099          Channelcnt: 1024

o partial UAF of the server account:

Username: OAFC$SERVER                      Owner:  FILECABSERVER

Maxjobs:         1  Fillm:       150  Bytlm:        40000
Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
Maxdetach:       0  BIOlm:        50  JTquota:       4096
Prclm:           6  DIOlm:        50  WSdef:          512
Prio:            4  ASTlm:       100  WSquo:         3072
Queprio:         0  TQElm:        50  WSextent:      5120
CPU:        (none)  Enqlm:      1000  Pgflquo:      40000

Authorized Privileges:
  CMKRNL SYSNAM PRMMBX TMPMBX EXQUOTA NETMBX SYSPRV SYSLCK
Default Privileges:
  CMKRNL SYSNAM PRMMBX TMPMBX EXQUOTA NETMBX SYSPRV SYSLCK
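
As a rough aid for reading the Server Auditing Information counters above, the sketch below subtracts each "deleted" counter from its "created" counterpart. Treating the difference as the number of objects still active is an assumption on my part, not documented FCS behaviour, and the dictionary and function names are illustrative.

```python
# Counter values copied from the Server Auditing Information display above.
AUDIT = {
    "Threads_created": 4100, "Threads_deleted": 4092,
    "Sess_create": 305, "Sess_delete": 242,
    "Tasks_created": 4100, "Tasks_deleted": 4099,
}

# (label, created counter, deleted counter) triples. Reading the difference
# as "still active" is an assumption, not something the display states.
PAIRS = [
    ("threads", "Threads_created", "Threads_deleted"),
    ("sessions", "Sess_create", "Sess_delete"),
    ("tasks", "Tasks_created", "Tasks_deleted"),
]


def outstanding(audit):
    """Return created-minus-deleted for each counter pair."""
    return {label: audit[created] - audit[deleted]
            for label, created, deleted in PAIRS}


if __name__ == "__main__":
    for label, count in outstanding(AUDIT).items():
        print(f"{label:10s}{count:6d}")
```

With the figures posted here this gives 8 threads, 63 sessions and 1 task not yet deleted.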

Thanks again for your interest

Hong
1762.4. "Some things to try" by CHRLIE::HUSTON () Mon Nov 16 1992 15:46 (65 lines)
    
    
    re .3
    
>BTW doesn't the error OAFC-E-THREADACCVIO apply only to Distributed 
>Sharing , as said in a couple of STARS articles ? This site hasn't had it yet.
    
    Nope, if you read that in a STARS article, it was wrong. Whenever
    an FCS thread access violates, this message will be logged and the FCS
    will attempt to tell the client that an access violation occurred. 
    
    Can you repost the server log contents with the timestamps in it? They
    should have timestamps, it looks as though you edited it before
    posting.
    
>Error: %MCC-E-ALERT_TERMREQ, thread termination requested  
>Message: CsiOpenRmsFC; error opening filecab: DISK$USER15:[HCA2.MUDFOS.A1]
>
>Error: %MCC-E-EXISTENCE_ERROR, object does not exist  
>Message: CsiCacheFlushDrawerCache; Error from mcc_mutex_lock (dab)
>
>Error: %MCC-E-EXISTENCE_ERROR, object does not exist  
>Message: CsiCacheGarbageCollect; Error from FlushDrawerCache
>
>Error: %MCC-E-IN_USE_ERROR, in use error  
>Message: CsiCacheFlushDrawerAccess; Error from mcc_mutex_try_lock
>
>Error: %OAFC-E-DOCRESERVED, Cannot perform requested operation, document is reserved  
>Message: OafcGsmCopyJacket; Couldn't delete copied/refiled doc
>
>Error: %RMS-E-RNF, record not found  
>Message: CsiOpenRmsFileCab; error getting record: TRAN MAN                      
>
>Error: %MCC-E-ALERT_TERMREQ, thread termination requested  
>Message: SrvTimeoutSysMan; receive alert to terminate thread
>
>Error: %OAFC-E-INTERR, Internal error in File Cabinet Server  
>Message: FCS has access violated, please submit an SPR.
    
    This sequence is somewhat disturbing. It appears that, for one
    reason or another, the user's fc could not be opened and hence no
    drawers were found, but IOS continued to make requests about the fc 
    to the FCS.  Again, the timestamps would help sort out whether this
    all happened very close together in time.
    
    
>
>Error: %OAFC-W-BUFFOVFL, Client buffer not big enough for requested information  
>Message: CsiEndFunction; Error putting EOC on out stream

    This is a known bug in the system management UI to the FCS.
    
>Error: %RMS-F-EXENQLM, exceeded enqueue quota  
>Message: SecGetRigthsList; Can't read users rightlist.
>
>Error: %RMS-F-EXENQLM, exceeded enqueue quota  
>Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN                    
>
>Error: %RMS-F-EXENQLM, exceeded enqueue quota  
>Message: SecGetRigthsList; Can't read users rightlist.
>
    Again, disturbing. Try raising the enqueue limit (ENQLM) of the
    server process.
    
--Bob
    
1762.6. "quotas in OAFC$STARTUP" by GIDDAY::LEH () Wed Nov 18 1992 10:08 (19 lines)
    As shown in the log file in .5, the FCS access violation caused the
    users' filecab to be locked, which has to be cleared by stopping and
    restarting the server.
    
    Herewith the values used in their FCS startup:
    
$       oafc$server_astlm = 2624
$       oafc$server_biolm = 1856
$       oafc$server_bytlm = 1312000
$       oafc$server_diolm = 1856
$       oafc$server_enqlm = 4000 ! was 3136
$       oafc$server_fillm = 2624
$       oafc$server_pgflquota = 180000
$       oafc$server_wsextent = 8096
$       oafc$server_wsquota = 2048
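
To make the difference from the UAF record in .3 easier to see, here is a minimal Python sketch that parses the symbol assignments above and sets them next to the UAF figures. The parsing pattern, the function names and the mapping of UAF field names onto the startup symbol suffixes are my own assumptions; which set of values the running server actually gets depends on how the startup procedure creates the process, which is not shown in this thread.

```python
import re

# Matches DCL assignments such as "$ oafc$server_enqlm = 4000 ! was 3136".
ASSIGNMENT = re.compile(r"^\$\s*oafc\$server_(\w+)\s*=\s*(\d+)", re.IGNORECASE)

# Startup values copied from this note.
STARTUP = """\
$       oafc$server_astlm = 2624
$       oafc$server_biolm = 1856
$       oafc$server_bytlm = 1312000
$       oafc$server_diolm = 1856
$       oafc$server_enqlm = 4000 ! was 3136
$       oafc$server_fillm = 2624
$       oafc$server_pgflquota = 180000
$       oafc$server_wsextent = 8096
$       oafc$server_wsquota = 2048
"""

# UAF values for OAFC$SERVER as posted in .3. The keys are renamed to match
# the startup symbol suffixes (Pgflquo -> pgflquota, WSquo -> wsquota).
UAF = {
    "astlm": 100, "biolm": 50, "bytlm": 40000, "diolm": 50, "enqlm": 1000,
    "fillm": 150, "pgflquota": 40000, "wsextent": 5120, "wsquota": 3072,
}


def parse_startup_quotas(dcl_text):
    """Return {quota_name: value} for every oafc$server_* assignment found."""
    quotas = {}
    for line in dcl_text.splitlines():
        match = ASSIGNMENT.match(line.strip())
        if match:
            quotas[match.group(1).lower()] = int(match.group(2))
    return quotas


def lower_than_uaf(startup, uaf):
    """Names of quotas whose startup value is below the UAF value."""
    return sorted(name for name, value in startup.items()
                  if name in uaf and value < uaf[name])


if __name__ == "__main__":
    startup = parse_startup_quotas(STARTUP)
    for name in sorted(startup):
        print(f"{name:10s} UAF {UAF[name]:>8d}   startup {startup[name]:>8d}")
    print("Lower in startup than in UAF:", lower_than_uaf(startup, UAF))
```

With the figures posted in this thread, every startup value is above its UAF counterpart except wsquota (2048 against 3072).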
    
Thanks for any other advice.
    
    
    Hong
1762.7. "More problems with the SERVER and DDS" by GIDDAY::SETHI (Man from Downunder) Tue Nov 24 1992 06:05 (60 lines)
    Hi All,
    
    The customer was unable to access the WP and EM sub-systems, and the
    system manager's option to shut down ALL-IN-1 just hung the process.
    
    To shutdown the ALL-IN-1 system I asked the customer to do the
    following so that an orderly shutdown could take place :-)
    
    1. From the ALL-IN-1 manager's account, enter the shutdown symbols:
    
    OA$SHUT_FIRST_MSG              1
    OA$SHUT_REASON                test
    OA$SHUT_TIME                  1992112416000000
    
    2. Submit the batch job OA$LIB_SHARE:SM_SCHEDULED_SHUTDOWN.COM, with
       the /params="MANAGER"
    
    3. ALL-IN-1 shut down, and we started ALL-IN-1 up again without a
       problem.
    
    Now some things to note
    
    1. The customer has just adopted DDS as his primary directory
       (OA$DDS_PRIME set to 1).
    
    2. When users went to the WP or EM menus, they got the following
       error message :-(
    
       %RMS-I-CRMP, CRMPSC system service failed to map global buffers
       %OAFC-I-RMSERROR, RMS Error has occurred. Refer to exteneded status
    	                 for RMS error code
    
       I have advised the customer to examine the following parameters,
       as per the STARS article:
    
    	RMS_GBLBUFQUO, GBLPAGFIL, GBLPAGES, GBLSECTIONS, and VIRTUALPAGECNT
    
    3. With OA$DDS_PRIME set to 0, the error messages in point 2 no
       longer appear.
    
    4. Also note that the users and the ALL-IN-1 manager could not get into
       the above-mentioned sub-systems from nodes in the cluster that had
       OA$DDS_PRIME set to 0.  This leads me to believe that the file
       server may have caused the problem.
    
    5. The process OA$FCV was in SUSPO state and swapped out (obvious
       comment); the <nodename>$SRV73 process was in hibernation state.
    
    To me it looks like a combination of file server and DDS problems,
    some related to tuning and others to the file server itself.
    
    Sorry I have not provided more information; I was in fire fighting
    mode because the system was unusable.  Any pointers would be useful
    so I can determine why this has happened.
    
    Thanks
    
    Sunil
1762.8. "WSEXTENT quota, hopefully" by GIDDAY::LEH () Fri Nov 27 1992 06:33 (16 lines)
    A recent problem with insufficient GBLPAGFIL complicated the initial
    FCS internal error problems; however, 2 things have been done:
    
    o increase of GBLPAGFIL 
    o increase of WSEXTENT for FC server process during startup
    
    and the effects have been quite positive: users' DOCDB, PDAF and
    RESERVATIONS are no longer locked by the server process, drawer
    indexing responds decently, and only one internal error was recorded
    in 2 days, compared with 10+ a day previously on the same 2 troubled
    nodes.
    
    I hope it was a customer oversight on the value of WSEXTENT that has
    been causing all the trouble. BTW, this problem has been CLD'ed and
    I'll post the final results.
    
    Hong