T.R | Title | User | Personal Name | Date | Lines |
---|
1762.1 | in STARS | GIDDAY::LEH | | Thu Nov 12 1992 10:17 | 13 |
| Found several STARS articles after having posted this note, which
suggested the use of cross filing to remote drawers from local filecab.
I will need to check it tomorrow, as the info in the base note was faxed
to me and I had no chance to ask questions.
Should have also mentioned the sequence of those msgs in previous note:
those MCC-E- msgs seemed to occur just prior to FCS startup,
while OAFC-E-INTERR msgs seemed to happen randomly.
Hong
|
1762.2 | Acc vio is very bad | CHRLIE::HUSTON | | Thu Nov 12 1992 16:16 | 27 |
|
> OAFC-E-INTERR, Internal error in FCS
> Msg: FCS has been violated, pls submit SPR
This one is obviously VERY BAD: it means that one of the FCS threads
has access violated. We need to know what you did and whether it is
reproducible.
> MCC-E-LERT_TERMREQ, thread termination requested
> Msg: SrvTimeoutSysMan; receive alert to terminate thread
>
> MCC-E-LERT_TERMREQ, thread termination requested
> Msg: CsiCacheCloseDrwStruct; error closing docdb
>
> MCC-E-EXISTENCE_ERROR, object doesnot exist
You say that these show up just BEFORE the startup message. This is
probably ok. These messages mean that one of the background threads
has been told to kill itself. This is normal when shutting down the
server.
As for the slow performance, did this just start after the upgrade?
Can you show the server parameters and the process limits for the
OAFC$SERVER account?
--Bob
|
1762.3 | still happening - more details | GIDDAY::LEH | | Fri Nov 13 1992 05:41 | 146 |
| Bob
Thanks for your suggestions. The customer doesn't believe they have done
anything sophisticated since the upgrade less than a week ago, mostly
indexing jobs and refiling across drawers. It is somewhat reproducible,
since I saw OAFC-E-THREADACCVIO again, as on previous days.
BTW doesn't the error OAFC-E-THREADACCVIO apply only to Distributed
Sharing, as said in a couple of STARS articles? This site hasn't had it yet.
Herewith some more details :
****** Errors in FCS logs ********
More errors in OAFC$SERVER.LOG
Error: %MCC-E-ALERT_TERMREQ, thread termination requested
Message: CsiCacheBlockAstService; Error from mcc_astevent_receive
Error: %MCC-E-ALERT_TERMREQ, thread termination requested
Message: SrvTimeoutSysess; receive alert to terminate thread
Error: %MCC-E-ALERT_TERMREQ, thread termination requested
Message: CsiOpenRmsFC; error opening filecab: DISK$USER15:[HCA2.MUDFOS.A1]
Error: %MCC-E-EXISTENCE_ERROR, object does not exist
Message: CsiCacheFlushDrawerCache; Error from mcc_mutex_lock (dab)
Error: %MCC-E-EXISTENCE_ERROR, object does not exist
Message: CsiCacheGarbageCollect; Error from FlushDrawerCache
Error: %MCC-E-IN_USE_ERROR, in use error
Message: CsiCacheFlushDrawerAccess; Error from mcc_mutex_try_lock
Error: %OAFC-E-DOCRESERVED, Cannot perform requested operation, document is reserved
Message: OafcGsmCopyJacket; Couldn't delete copied/refiled doc
Error: %RMS-E-RNF, record not found
Message: CsiOpenRmsFileCab; error getting record: TRAN MAN
Error: %MCC-E-ALERT_TERMREQ, thread termination requested
Message: SrvTimeoutSysMan; receive alert to terminate thread
Error: %OAFC-E-INTERR, Internal error in File Cabinet Server
Message: FCS has access violated, please submit an SPR.
Error: %OAFC-W-BUFFOVFL, Client buffer not big enough for requested information
Message: CsiEndFunction; Error putting EOC on out stream
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: SecGetRigthsList; Can't read users rightlist.
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: SecGetRigthsList; Can't read users rightlist.
Error: %RMS-W-TMO, timeout period expired
Message: CsiReserve; unable to get info about reserved doc
Error: %SYSTEM-F-IVLOCKID, invalid lock id
Message: CsiCacheFlushDrwStruct; Error dequeing VMS lock block
........
I'll look them up in the manuals, although some of them obviously point to
server quota or RMS access problems. Have you spotted any special ones?
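For anyone triaging a log like this, one rough way to see which statuses dominate is to tally the "Error:" lines by facility, severity and ident. The following is only a sketch in Python and is not part of any FCS tooling. It assumes the log keeps the "Error: %FACILITY-S-IDENT, text" layout shown above. The names tally_errors and SAMPLE are illustrative, and the sample is a short excerpt of the lines already posted.

```python
import re
from collections import Counter

# Matches status lines such as "Error: %RMS-F-EXENQLM, exceeded enqueue quota".
# The layout is assumed from the log excerpt in this note.
ERROR_RE = re.compile(
    r"^Error:\s+%(?P<facility>[A-Z0-9$_]+)-(?P<severity>[A-Z])-(?P<ident>[A-Z0-9_]+),"
)

def tally_errors(lines):
    """Count (facility, severity, ident) triples across the 'Error:' lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match:
            key = (match["facility"], match["severity"], match["ident"])
            counts[key] += 1
    return counts

# Short excerpt of the log lines posted above.
SAMPLE = """\
Error: %MCC-E-ALERT_TERMREQ, thread termination requested
Message: SrvTimeoutSysMan; receive alert to terminate thread
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: SecGetRigthsList; Can't read users rightlist.
Error: %RMS-F-EXENQLM, exceeded enqueue quota
Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN
Error: %OAFC-E-INTERR, Internal error in File Cabinet Server
Message: FCS has access violated, please submit an SPR.
""".splitlines()

counts = tally_errors(SAMPLE)

# List fatal (-F-) statuses first, then the rest by descending frequency.
ordered = sorted(counts.items(), key=lambda kv: (kv[0][1] != "F", -kv[1]))
for (facility, severity, ident), n in ordered:
    print(f"{n:3d}  %{facility}-{severity}-{ident}")
```

Run against the full OAFC$SERVER.LOG, a tally like this would show at a glance whether the quota-related statuses (such as RMS-F-EXENQLM) outnumber the internal errors.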
Also, in OAFC$SERVER_ERROR.LOG I picked up:
ALL-IN-1 Index Server Internal Error:
Error locking DAB during cache garbage collection:
****** Other info in FCS management menus ********
o I got error:
Client buffer not big enough for requested information
%OA-I-LASTLINE, CNB08V::"73=" must be running to perform operation
when trying to do MS MSC Manage server clients
o Info on server
Server Name: CNB08V::"73="
Configuration:Process Name: CNB08V$SRV73
Type: LOCAL
Startup State: ENABLED
Configuration File: OA$DATA_SHARE:CNB08V$SERVER73.DAT
Server Authorized Timeout: 600 Max Client Connects: 512
Attributes: Distribution: OFF Max # of Drawers: 140
Drawer Cache: 50 Object Number: 73
Drawer Timeout: 600 Session Timeout: 1200
Drawer Collisions: 0
Server Resources
CNB08V::"73="
CPU Time: 19631
Virtual page count: 92138
Direct IO: 23849
Buffered IO: 30794
Page faults: 124433
Working set: 8096
Server Auditing Information
CNB08V::"73="
Version: 0 Conn_acc: 309
Threads_created: 4100 Conn_rej: 0
Threads_deleted: 4092 Data_rcv: 3567
Sess_create: 305 Data_rbytes: 502686
Sess_delete: 242 Data_sent: 3668
Out_is_sess: 0 Data_sbytes: 279112
In_is_sess: 0 Bytlm: 131113
Tasks_created: 4100 Maxbuf: 31802
Tasks_deleted: 4099 Channelcnt: 1024
o partial UAF of the server account:
Username: OAFC$SERVER Owner: FILECABSERVER
Maxjobs: 1 Fillm: 150 Bytlm: 40000
Maxacctjobs: 0 Shrfillm: 0 Pbytlm: 0
Maxdetach: 0 BIOlm: 50 JTquota: 4096
Prclm: 6 DIOlm: 50 WSdef: 512
Prio: 4 ASTlm: 100 WSquo: 3072
Queprio: 0 TQElm: 50 WSextent: 5120
CPU: (none) Enqlm: 1000 Pgflquo: 40000
Authorized Privileges:
CMKRNL SYSNAM PRMMBX TMPMBX EXQUOTA NETMBX SYSPRV SYSLCK
Default Privileges:
CMKRNL SYSNAM PRMMBX TMPMBX EXQUOTA NETMBX SYSPRV SYSLCK
Thanks again for your interest
Hong
|
1762.4 | Some things to try | CHRLIE::HUSTON | | Mon Nov 16 1992 15:46 | 65 |
|
re .3
>BTW doesn't the error OAFC-E-THREADACCVIO apply only to Distributed
>Sharing, as said in a couple of STARS articles? This site hasn't had it yet.
Nope, if you read that in a STARS article it was wrong. Whenever
an FCS thread access violates, this message will be logged and the FCS
will attempt to tell the client that an access violation occurred.
Can you repost the server log contents with the timestamps in them? They
should have timestamps; it looks as though you edited them out before
posting.
>Error: %MCC-E-ALERT_TERMREQ, thread termination requested
>Message: CsiOpenRmsFC; error opening filecab: DISK$USER15:[HCA2.MUDFOS.A1]
>
>Error: %MCC-E-EXISTENCE_ERROR, object does not exist
>Message: CsiCacheFlushDrawerCache; Error from mcc_mutex_lock (dab)
>
>Error: %MCC-E-EXISTENCE_ERROR, object does not exist
>Message: CsiCacheGarbageCollect; Error from FlushDrawerCache
>
>Error: %MCC-E-IN_USE_ERROR, in use error
>Message: CsiCacheFlushDrawerAccess; Error from mcc_mutex_try_lock
>
>Error: %OAFC-E-DOCRESERVED, Cannot perform requested operation, document is reserved
>Message: OafcGsmCopyJacket; Couldn't delete copied/refiled doc
>
>Error: %RMS-E-RNF, record not found
>Message: CsiOpenRmsFileCab; error getting record: TRAN MAN
>
>Error: %MCC-E-ALERT_TERMREQ, thread termination requested
>Message: SrvTimeoutSysMan; receive alert to terminate thread
>
>Error: %OAFC-E-INTERR, Internal error in File Cabinet Server
>Message: FCS has access violated, please submit an SPR.
This sequence is somewhat disturbing. It appears as though, for one
reason or another, the user's fc could not be opened and hence no
drawers were found, but IOS continued to make requests about the fc
to the FCS. Again, the timestamps would help sort out whether this all
happened very close together in time.
>
>Error: %OAFC-W-BUFFOVFL, Client buffer not big enough for requested information
>Message: CsiEndFunction; Error putting EOC on out stream
This is a known bug in the system management UI to the FCS.
>Error: %RMS-F-EXENQLM, exceeded enqueue quota
>Message: SecGetRigthsList; Can't read users rightlist.
>
>Error: %RMS-F-EXENQLM, exceeded enqueue quota
>Message: CsiOpenRmsFileCab; error getting record: BATEUP KEN
>
>Error: %RMS-F-EXENQLM, exceeded enqueue quota
>Message: SecGetRigthsList; Can't read users rightlist.
>
Again, disturbing. Try bumping the enqueue limit (ENQLM) of the server process.
--Bob
|
1762.6 | quotas in OAFC$STARTUP | GIDDAY::LEH | | Wed Nov 18 1992 10:08 | 19 |
| As shown in the log file in .5, the FCS access violation caused users'
filecabs to be locked, which has to be fixed by stopping and restarting the
server.
Herewith the values used in their FCS startup:
$ oafc$server_astlm = 2624
$ oafc$server_biolm = 1856
$ oafc$server_bytlm = 1312000
$ oafc$server_diolm = 1856
$ oafc$server_enqlm = 4000 ! was 3136
$ oafc$server_fillm = 2624
$ oafc$server_pgflquota = 180000
$ oafc$server_wsextent = 8096
$ oafc$server_wsquota = 2048
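To see how these startup values relate to the OAFC$SERVER UAF record posted in .3, the two sets can be compared side by side. The Python sketch below is only an illustration. It parses the symbol assignments above and flags any value that is lower than the UAF one. The mapping of UAF field names (Pgflquo, WSquo) onto the symbol suffixes (pgflquota, wsquota) is my assumption, and nothing here says which of the two sets the running server process actually uses.

```python
import re

# Symbol assignments as posted in this note.
STARTUP = """\
$ oafc$server_astlm = 2624
$ oafc$server_biolm = 1856
$ oafc$server_bytlm = 1312000
$ oafc$server_diolm = 1856
$ oafc$server_enqlm = 4000 ! was 3136
$ oafc$server_fillm = 2624
$ oafc$server_pgflquota = 180000
$ oafc$server_wsextent = 8096
$ oafc$server_wsquota = 2048
"""

# UAF values from .3, keyed by the symbol suffix (the mapping is an assumption).
UAF = {
    "astlm": 100, "biolm": 50, "bytlm": 40000, "diolm": 50, "enqlm": 1000,
    "fillm": 150, "pgflquota": 40000, "wsextent": 5120, "wsquota": 3072,
}

ASSIGN_RE = re.compile(r"^\$\s*oafc\$server_(\w+)\s*=\s*(\d+)", re.IGNORECASE)

def parse_startup(text):
    """Return {quota_name: value} from '$ oafc$server_<name> = <n>' lines."""
    values = {}
    for line in text.splitlines():
        match = ASSIGN_RE.match(line.strip())
        if match:
            values[match.group(1).lower()] = int(match.group(2))
    return values

startup = parse_startup(STARTUP)

# Quotas where the startup value is below the UAF value.
lower_than_uaf = [name for name in startup if startup[name] < UAF[name]]

for name in sorted(startup):
    flag = "  <-- lower than UAF" if name in lower_than_uaf else ""
    print(f"{name:10s} startup={startup[name]:8d} uaf={UAF[name]:6d}{flag}")
```

With the values posted so far, only wsquota comes out lower in the startup file than in the UAF record. Every other quota is raised well above the UAF figure.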
Thanks for any other advice
Hong
|
1762.7 | More problems with the SERVER and DDS | GIDDAY::SETHI | Man from Downunder | Tue Nov 24 1992 06:05 | 60 |
| Hi All,
The customer was unable to access the WP and EM subsystems, and the system
manager's option to shut down ALL-IN-1 just hung the process.
To shut down the ALL-IN-1 system I asked the customer to do the
following so that an orderly shutdown could take place :-)
1. From the ALL-IN-1 manager's account, enter the shutdown symbols
OA$SHUT_FIRST_MSG 1
OA$SHUT_REASON test
OA$SHUT_TIME 1992112416000000
2. Submit the batch job OA$LIB_SHARE:SM_SCHEDULED_SHUTDOWN.COM, with
the /params="MANAGER"
3. ALL-IN-1 shut down, and we started ALL-IN-1 up again without a
problem.
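The OA$SHUT_TIME value in step 1 looks like a 16-digit timestamp: year, month, day, hour, minute, second and hundredths of a second. That layout is only inferred from the single example above (1992112416000000 for 16:00 on 24 Nov 1992), so treat it as an assumption. Under that assumption, a small Python sketch for building and checking such a value could look like this. The function names are made up for illustration.

```python
from datetime import datetime

def to_shut_time(when):
    """Format a datetime as YYYYMMDDHHMMSSCC (CC = hundredths of a second).

    The 16-digit layout is assumed from the example value in this note.
    """
    hundredths = when.microsecond // 10_000
    return when.strftime("%Y%m%d%H%M%S") + f"{hundredths:02d}"

def from_shut_time(value):
    """Parse a 16-digit YYYYMMDDHHMMSSCC string back into a datetime."""
    if len(value) != 16 or not value.isdigit():
        raise ValueError(f"expected 16 digits, got {value!r}")
    base = datetime.strptime(value[:14], "%Y%m%d%H%M%S")
    return base.replace(microsecond=int(value[14:]) * 10_000)

shut_time = to_shut_time(datetime(1992, 11, 24, 16, 0, 0))
print(shut_time)
print(from_shut_time(shut_time))
```

Parsing the value back before submitting the batch job is a cheap way to catch a mistyped digit, since an impossible date or time raises ValueError.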
Now some things to note
1. The customer has just adopted DDS as his primary directory
OA$DDS_PRIME set to 1.
2. When users went to the WP or EM menus, they got the following error
message :-(
%RMS-I-CRMP, CRMPSC system service failed to map global buffers
%OAFC-I-RMSERROR, RMS Error has occurred. Refer to exteneded status
for RMS error code
I have advised the customer to examine the following parameters:
RMS_GBLBUFQUO, GBLPAGFIL, GBLPAGES, GBLSECTIONS, and VIRTUALPAGECNT,
as per the STARS article.
3. After setting OA$DDS_PRIME to 0, everything worked, with no more error
messages as in point 2.
4. Also note that the users and the ALL-IN-1 manager could not get into
the above-mentioned subsystems from nodes in the cluster that had
OA$DDS_PRIME set to 0. So this leads me to believe that the file server
may have caused the problem.
5. The process OA$FCV was in SUSPO state and swapped out (obvious
comment); the <nodename>$SRV73 process was in hibernation state.
To me it looks like a combination of file server and DDS problems,
some related to tuning issues and others to the file server itself.
Sorry that I have not provided more information; I was in
fire-fighting mode and the system was unusable. Any pointers would be
useful so I can determine why this has happened.
Thanks
Sunil
|
1762.8 | WSEXTENT quota, hopefully | GIDDAY::LEH | | Fri Nov 27 1992 06:33 | 16 |
| The recent problem with insufficient GBLPAGFIL complicated the initial FCS
internal error problems; however, 2 things have been done:
o increase of GBLPAGFIL
o increase of WSEXTENT for FC server process during startup
and the effects have been quite positive: users' DOCDB, PDAF and
RESERVATIONS are no longer locked by the server process, drawer indexing
responds decently, and only one internal error was recorded in 2 days,
compared with 10+ times a day previously seen on the same 2 troubled nodes.
I hope it was a customer oversight on the value of WSEXTENT that has been
causing all the trouble. BTW, this problem has been CLD'ed and I'll
post the final results.
Hong
|