
Conference iosg::all-in-1_v30

Title:	OLD ALL-IN-1 (tm) Support Conference
Notice:	Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:	IOSG::PYE

Created:	Thu Jan 30 1992
Last Modified:	Tue Jan 23 1996
Last Successful Update:	Fri Jun 06 1997
Number of topics:	4343
Total number of notes:	18308

1762.0. "File Cabinet Server FCS internal errors" by GIDDAY::LEH () Thu Nov 12 1992 09:57

Doesn't seem this error has been raised in this notesfile:

	OAFC-E-INTERR, Internal error in FCS
	Msg: FCS has been violated, pls submit SPR

which occurred on a newly upgraded system that runs VMS 5.5-1 and MR 3.2

Other errors incl:

	MCC-E-LERT_TERMREQ, thread termination requested
	Msg: SrvTimeoutSysMan; receive alert to terminate thread

	MCC-E-LERT_TERMREQ, thread termination requested
	Msg: CsiCacheCloseDrwStruct; error closing docdb

	MCC-E-EXISTENCE_ERROR, object doesnot exist

Since the upgrade, this system has been experiencing slow FCS response, e.g. a 
couple of minutes for an end-user account doing drawer indexing. Most of the 
fine-tuning hints given in the manual have been used except one or two 
non-dynamic SYSGEN params.

Thanks for any insights into these problems and fixes.

Hong
CSC Sydney

T.R	Title	User	Personal Name	Date	Lines
1762.1	in STARS	GIDDAY::LEH		`Thu Nov 12 1992 10:17`	13
	Found several STARS articles after having posted this note, which
	suggested the use of cross filing to remote drawers from local filecab.

	I will need to check it tomorrow as the info in the base note was faxed
	to me and I'd no chance to ask questions.

	Should have also mentioned the sequence of those msgs in previous note:

	Those MCC-E- msgs seemed to occur just prior to FCS startup while
	OAFC-E-INTERR msgs seemed to happen randomly.

	Hong
1762.2	Acc vio is very bad	CHRLIE::HUSTON		`Thu Nov 12 1992 16:16`	27
	>	OAFC-E-INTERR, Internal error in FCS
	>	Msg: FCS has been violated, pls submit SPR

	This one is obviously VERY BAD, it means that one of the FCS threads has
	access violated, we need to know what you did and if it is reproducible.

	>	MCC-E-LERT_TERMREQ, thread termination requested
	>	Msg: SrvTimeoutSysMan; receive alert to terminate thread
	>
	>	MCC-E-LERT_TERMREQ, thread termination requested
	>	Msg: CsiCacheCloseDrwStruct; error closing docdb
	>
	>	MCC-E-EXISTENCE_ERROR, object doesnot exist

	You say that these show up just BEFORE the startup message. This is
	probably ok. These messages mean that one of the background threads has
	been told to kill itself. This is normal when shutting down the server.

	As for the slow performance, did this just start after the upgrade? Can
	you show the server parameters and the process limits for the OAFC$SERVER
	account?

	--Bob
1762.3	still happening - more details	GIDDAY::LEH		`Fri Nov 13 1992 05:41`	146
	Bob

	Thanks for your suggestions.

	The customer didn't believe doing anything sophisticated since upgrade
	less than a week ago, mostly Indexing jobs and refiling across drawers.
	It was somehow reproducible since I saw OAFC-E-THREADACCVIO again as well
	as previous days.

	BTW doesn't the error OAFC-E-THREADACCVIO apply only to Distributed
	Sharing, as said in a couple of STARS articles? This site hasn't had it yet.

	Herewith some more details:

	**** Errors in FCS logs ****

	More errors in OAFC$SERVER.LOG

	Error: %MCC-E-ALERT_TERMREQ, thread termination requested
	Message: CsiCacheBlockAstService; Error from mcc_astevent_receive

	Error: %MCC-E-ALERT_TERMREQ, thread termination requested
	Message: SrvTimeoutSysess; receive alert to terminate thread

	Error: %MCC-E-ALERT_TERMREQ, thread termination requested
	Message: CsiOpenRmsFC; error opening filecab: DISK$USER15:[HCA2.MUDFOS.A1]

	Error: %MCC-E-EXISTENCE_ERROR, object does not exist
	Message: CsiCacheFlushDrawerCache; Error from mcc_mutex_lock (dab)

	Error: %MCC-E-EXISTENCE_ERROR, object does not exist
	Message: CsiCacheGarbageCollect; Error from FlushDrawerCache

	Error: %MCC-E-IN_USE_ERROR, in use error
	Message: CsiCacheFlushDrawerAccess; Error from mcc_mutex_try_lock

	Error: %OAFC-E-DOCRESERVED, Cannot perform requested operation, document is reserved
	Message: OafcGsmCopyJacket; Couldn't delete copied/refiled doc

	Error: %RMS-E-RNF, record not found
	Message: CsiOpenRmsFileCab; error getting record: TRAN MAN

	Error: %MCC-E-ALERT_TERMREQ, thread termination requested
	Message: SrvTimeoutSysMan; receive alert to terminate thread

	Error: %OAFC-E-INTERR, Internal error in File Cabinet Server
	Message: FCS has access violated, please submit an SPR.

	Error: %OAFC-W-BUFFOVFL, Client buffer not big enough for requested information
	Message: CsiEndFunction; Error putting EOC on out stream

	Error: %RMS-F-EXENQLM, exceeded enqueue quota
	Message: SecGetRigthsList; Can't read users rightlist.

	Error: %RMS-F-EXENQLM, exceeded enqueue quota
	Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN

	Error: %RMS-F-EXENQLM, exceeded enqueue quota
	Message: SecGetRigthsList; Can't read users rightlist.

	Error: %RMS-W-TMO, timeout period expired
	Message: CsiReserve; unable to get info about reserved doc

	Error: %SYSTEM-F-IVLOCKID, invalid lock id
	Message: CsiCacheFlushDrwStruct; Error dequeing VMS lock block
	........

	I'll look them up in manuals although some of them are quite obvious on
	server's quota or RMS access problems. Have you spotted any special ones?

	Also in OAFC$SERVER_ERROR.LOG did I pick up

	ALL-IN-1 Index Server Internal Error:
	Error locking DAB during cache garbage collection:

	** Other info in FCS management menus ******

	o I got error:

		Client buffer not big enough for requested information
		%OA-I-LASTLINE, CNB08V::"73=" must be running to perform operation

	  when trying to do MS MSC Manage server clients

	o Info on server

		Server Name: CNB08V::"73="

		Configuration:
		Process Name: CNB08V$SRV73		Type: LOCAL
		Startup State: ENABLED
		Configuration File: OA$DATA_SHARE:CNB08V$SERVER73.DAT
		Server Authorized Timeout: 600		Max Client Connects: 512

		Attributes:
		Distribution: OFF		Max # of Drawers: 140
		Drawer Cache: 50		Object Number: 73
		Drawer Timeout: 600		Session Timeout: 1200
		Drawer Collisions: 0

		Server Resources CNB08V::"73="
		CPU Time: 19631			Virtual page count: 92138
		Direct IO: 23849		Buffered IO: 30794
		Page faults: 124433		Working set: 8096

		Server Auditing Information CNB08V::"73="
		Version: 0
		Conn_acc: 309			Threads_created: 4100
		Conn_rej: 0			Threads_deleted: 4092
		Data_rcv: 3567			Sess_create: 305
		Data_rbytes: 502686		Sess_delete: 242
		Data_sent: 3668			Out_is_sess: 0
		Data_sbytes: 279112		In_is_sess: 0
		Bytlm: 131113			Tasks_created: 4100
		Maxbuf: 31802			Tasks_deleted: 4099
		Channelcnt: 1024

	o partial UAF of the server account:

		Username: OAFC$SERVER		Owner: FILECABSERVER

		Maxjobs:         1  Fillm:       150  Bytlm:        40000
		Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
		Maxdetach:       0  BIOlm:        50  JTquota:       4096
		Prclm:           6  DIOlm:        50  WSdef:          512
		Prio:            4  ASTlm:       100  WSquo:         3072
		Queprio:         0  TQElm:        50  WSextent:      5120
		CPU:        (none)  Enqlm:      1000  Pgflquo:      40000

		Authorized Privileges:
		  CMKRNL SYSNAM PRMMBX TMPMBX EXQUOTA NETMBX SYSPRV SYSLCK
		Default Privileges:
		  CMKRNL SYSNAM PRMMBX TMPMBX EXQUOTA NETMBX SYSPRV SYSLCK

	Thanks again for your interest

	Hong
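When a log holds many Error:/Message: pairs like the ones quoted in this note, a quick tally by status code can show which condition dominates (here, the repeated RMS-F-EXENQLM). The following is a minimal Python sketch, not a DEC-supplied tool. It assumes the log has been copied off the VMS system as plain text with one `Error: %FACILITY-S-IDENT, text` entry per line; the function name `tally_statuses` and the sample text are illustrative only.

```python
import re
from collections import Counter

# Matches VMS-style status lines as quoted in the note, e.g.
# "Error: %RMS-F-EXENQLM, exceeded enqueue quota".
# Assumption: facility and ident use only A-Z, digits, "$" and "_".
STATUS_RE = re.compile(r"Error:\s*%([A-Z0-9$_]+)-([SIWEF])-([A-Z0-9$_]+),")


def tally_statuses(log_text):
    """Count each (facility, severity, ident) triple found in log_text."""
    return Counter(STATUS_RE.findall(log_text))


# Sample lines taken from the excerpt quoted in the note.
sample = """\
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: SecGetRigthsList; Can't read users rightlist.
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN
Error: %OAFC-E-INTERR, Internal error in File Cabinet Server
Message: FCS has access violated, please submit an SPR.
"""

counts = tally_statuses(sample)
for (facility, severity, ident), n in counts.most_common():
    print(f"{n:3d}  %{facility}-{severity}-{ident}")
```

Sorting by frequency puts quota-type failures at the top when they repeat, which is usually the cheapest thing to rule out first.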
1762.4	Some things to try	CHRLIE::HUSTON		`Mon Nov 16 1992 15:46`	65
	re .3

	>BTW doesn't the error OAFC-E-THREADACCVIO apply only to Distributed
	>Sharing , as said in a couple of STARS articles ? This site hasn't had it yet.

	Nope, if you read that in a STARS article it was wrong. Whenever a FCS
	thread access violates, this message will be logged and the FCS will
	attempt to tell the client that an access violation occurred.

	Can you repost the server log contents with the timestamps in it? They
	should have timestamps, it looks as though you edited it before posting.

	>Error: %MCC-E-ALERT_TERMREQ, thread termination requested
	>Message: CsiOpenRmsFC; error opening filecab: DISK$USER15:[HCA2.MUDFOS.A1]
	>
	>Error: %MCC-E-EXISTENCE_ERROR, object does not exist
	>Message: CsiCacheFlushDrawerCache; Error from mcc_mutex_lock (dab)
	>
	>Error: %MCC-E-EXISTENCE_ERROR, object does not exist
	>Message: CsiCacheGarbageCollect; Error from FlushDrawerCache
	>
	>Error: %MCC-E-IN_USE_ERROR, in use error
	>Message: CsiCacheFlushDrawerAccess; Error from mcc_mutex_try_lock
	>
	>Error: %OAFC-E-DOCRESERVED, Cannot perform requested operation, document is reserved
	>Message: OafcGsmCopyJacket; Couldn't delete copied/refiled doc
	>
	>Error: %RMS-E-RNF, record not found
	>Message: CsiOpenRmsFileCab; error getting record: TRAN MAN
	>
	>Error: %MCC-E-ALERT_TERMREQ, thread termination requested
	>Message: SrvTimeoutSysMan; receive alert to terminate thread
	>
	>Error: %OAFC-E-INTERR, Internal error in File Cabinet Server
	>Message: FCS has access violated, please submit an SPR.

	This sequence is somewhat disturbing, it appears as though for one reason
	or another the user's fc could not be opened and hence no drawers were
	found, but IOS continued to make requests about the fc to the FCS. Again,
	the timestamps would help sort out if this was all very close together
	in time.

	>
	>Error: %OAFC-W-BUFFOVFL, Client buffer not big enough for requested information
	>Message: CsiEndFunction; Error putting EOC on out stream

	This is a known bug in the system management UI to the FCS.

	>Error: %RMS-F-EXENQLM, exceeded enqueue quota
	>Message: SecGetRigthsList; Can't read users rightlist.
	>
	>Error: %RMS-F-EXENQLM, exceeded enqueue quota
	>Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN
	>
	>Error: %RMS-F-EXENQLM, exceeded enqueue quota
	>Message: SecGetRigthsList; Can't read users rightlist.
	>

	Again, disturbing, try bumping the enq limit of the server process.

	--Bob
1762.6	quotas in OAFC$STARTUP	GIDDAY::LEH		`Wed Nov 18 1992 10:08`	19
	As shown in the log file in .5, FCS access violation caused users'
	filecab to be locked, which must be fixed by stopping and restarting
	the server.

	Herewith the values used in their FCS startup:

	$ oafc$server_astlm = 2624
	$ oafc$server_biolm = 1856
	$ oafc$server_bytlm = 1312000
	$ oafc$server_diolm = 1856
	$ oafc$server_enqlm = 4000 ! was 3136
	$ oafc$server_fillm = 2624
	$ oafc$server_pgflquota = 180000
	$ oafc$server_wsextent = 8096
	$ oafc$server_wsquota = 2048

	Thanks for any other advice

	Hong
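To keep track of which startup quotas have been changed during tuning, the symbol assignments can be read into a table together with any "! was" comment. This is a small Python sketch under stated assumptions: each assignment sits on its own line in the form `$ oafc$server_<name> = <value>`, and a previous value, if recorded, follows as `! was <n>`. The function name `parse_quotas` is hypothetical, and the sample reuses three of the lines posted in this note.

```python
import re

# Matches lines such as "$ oafc$server_enqlm = 4000 ! was 3136".
# The "! was <n>" trailing comment is optional.
ASSIGN_RE = re.compile(
    r"^\$\s*oafc\$server_(\w+)\s*=\s*(\d+)\s*(?:!\s*was\s+(\d+))?",
    re.IGNORECASE,
)


def parse_quotas(dcl_text):
    """Return {quota_name: (current, previous_or_None)} from DCL symbol lines."""
    quotas = {}
    for line in dcl_text.splitlines():
        match = ASSIGN_RE.match(line.strip())
        if match:
            name, value, previous = match.groups()
            quotas[name.lower()] = (
                int(value),
                int(previous) if previous else None,
            )
    return quotas


sample = """\
$ oafc$server_astlm = 2624
$ oafc$server_enqlm = 4000 ! was 3136
$ oafc$server_wsextent = 8096
"""

quotas = parse_quotas(sample)
for name, (current, previous) in quotas.items():
    if previous is not None:
        print(f"{name}: {previous} -> {current}")
```

Recording the old value next to the new one, as the poster did for enqlm, makes it easier to tell later which change coincided with an improvement.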
1762.7	More problems with the SERVER and DDS	GIDDAY::SETHI	Man from Downunder	`Tue Nov 24 1992 06:05`	60
	Hi All,

	The customer was unable to access the WP, EM sub-systems and the system
	manager's option to shutdown ALL-IN-1 just hung the process. To shutdown
	the ALL-IN-1 system I asked the customer to do the following so that an
	orderly shutdown could take place :-)

	1. From the ALL-IN-1 manager's account enter the shutdown symbols

		OA$SHUT_FIRST_MSG 1
		OA$SHUT_REASON test
		OA$SHUT_TIME 1992112416000000

	2. Submit the batch job OA$LIB_SHARE:SM_SCHEDULED_SHUTDOWN.COM, with
	   the /params="MANAGER"

	3. ALL-IN-1 shutdown and we started ALL-IN-1 up again without a problem.

	Now some things to note

	1. The customer has just adopted DDS as his primary directory,
	   OA$DDS_PRIME set to 1.

	2. Users when they went to the WP or EM menus got the following error
	   message :-(

		%RMS-I-CRMP, CRMPSC system service failed to map global buffers
		%OAFC-I-RMSERROR, RMS Error has occurred. Refer to exteneded status
		for RMS error code

	   I have advised the customer to examine the following parameters
	   RMS_GBLBUFQUO, GBLPAGFIL, GBLPAGES, GBLSECTIONS, and VIRTUALPAGECNT
	   as per the STARS article.

	3. Set the OA$DDS_PRIME to 0: everything worked, no more error message
	   as in point 2.

	4. Also note that the users and the ALL-IN-1 manager could not get into
	   the above mentioned sub-systems from nodes in the cluster that had
	   OA$DDS_PRIME set to 0. So this leads me to believe that the file
	   server may have caused the problem.

	5. The process OA$FCV was in SUSPO state and swapped out (obvious
	   comment), the <nodename>$SRV73 process was in hibernation state.

	To me it looks like a combination of file server and DDS problems, some
	related to tuning issues and others with the file server. Sorry that I
	have not provided more information; I was in fire fighting mode, the
	system was unusable. Any pointers would be useful so I can determine why
	this has happened.

	Thanks

	Sunil
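The OA$SHUT_TIME value quoted in the shutdown steps is a 16-digit string. Read against the note's own date, it appears to follow a YYYYMMDDHHMMSSCC layout (year, month, day, hour, minute, second, hundredths); that layout is an assumption inferred from the value, not something this thread states. Under that assumption, a short Python sketch for sanity-checking such a value before entering it (the function name `parse_shut_time` is illustrative):

```python
from datetime import datetime


def parse_shut_time(value):
    """Parse a 16-digit YYYYMMDDHHMMSSCC string (assumed layout) to a datetime."""
    if len(value) != 16 or not value.isdigit():
        raise ValueError(f"expected 16 digits, got {value!r}")
    # The first 14 digits carry the date and time down to seconds.
    stamp = datetime.strptime(value[:14], "%Y%m%d%H%M%S")
    # The last two digits are taken as hundredths of a second.
    return stamp.replace(microsecond=int(value[14:]) * 10_000)


shut_time = parse_shut_time("1992112416000000")
print(shut_time.isoformat())
```

A malformed value (wrong length, or a month of 13) raises ValueError here, which is the kind of slip worth catching before a scheduled shutdown job is submitted.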
1762.8	WSEXTENT quota, hopefully	GIDDAY::LEH		`Fri Nov 27 1992 06:33`	16
	Recent problem with insufficient GBLPAGFIL complicated the initial FCS
	internal error problems; however, 2 things have been done:

	o increase of GBLPAGFIL
	o increase of WSEXTENT for FC server process during startup

	and the effects have been quite positive: users DOCDB, PDAF and
	RESERVATIONS were no longer locked by server process, decent responses
	from drawer indexing and only one internal error recorded in 2 days,
	compared with 10+ times a day previously seen on same 2 troubled nodes.

	I hope it's customer oversight on the value of WSEXTENT that has been
	causing all the troubles.

	BTW, this problem has been CLD'ed and I'll post the final results

	Hong