
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

1421.0. "Export process dies with QFILE-F-WRITERR or RWMBX" by MAN02::STRUTH (Thomas Struth CS Mannheim) Thu Aug 29 1991 08:22


I am trying to export SNMP interface counters from 29 Wellfleet routers. Each
of them has a different number of interfaces; the total number of 
interfaces is about 100. Our customer wants to run exporting permanently.
We chose a polling interval of 10 min, and one export target database
for all interfaces.  ( Is this a problem ?? )
The exporting is started by an MCC command procedure, using 1 export 
command per SNMP interface.  ( I had a lot of trouble using wildcards.. )
During the first minutes after starting, the background process hangs 
in MWAIT state quite often.  
It works in principle, but after a certain time, which is sometimes 
several hours and sometimes more than a day, the background process
either dies with " QFILE-F-WRITERR " or ends up in RWMBX state.
It never works for more than 2 days. 
The collected data in the RDB database looks OK until the process stops.
I checked this using DECdecision. 
Restarting the background process is no problem... for some hours.

The only poll failure reported by the MCC "sho export" command is
"Failed to call ETP" .
The background process log file contains about 800 lines per day with 
"Tested false for termination alert" before the final
" QFILE... " message.

The hardware config is a VAX 4000/300, 32MB , 3* RF31, 
VT1300, doing nothing other than exporting most of the time.
We use BMS 1.1, MCCTCPIP 1.0 and RDB 4.0.
The SYSGEN params look OK; I checked them against the recommended values.

thanks in advance for any help,    

Thomas

T.R  Title  User  Personal Name  Date  Lines
1421.1  need more information  TOOK::SHMUYLOVICH  Thu Aug 29 1991 14:56  47
	RE: .0

	Let's try to break your problem into several smaller ones.

 	"QFILE-F-WRITERR" error description:

	The Exporter foreground and background processes communicate 
  using a queue file. The background process reads this file every 10 sec,
  trying to receive a new command from the foreground. Every time it reads
  the queue file, it writes the "current time" into the queue. This is the
  only place where the "QFILE-F-WRITERR" error is generated, and it is raised
  for any error returned from RMS services. 
  I believe that you have enough disk space (100 exporting requests with a
  period of 10 min give you 100 * 6 * 24 = 14,400 rows in the rdb file after
  1 day). Could you please check for disk errors?
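
  As a rough illustration of the mechanism described above (read the queue
  file, then write the current time back, and treat any write failure as
  fatal), here is a minimal Python sketch. It is not the Exporter's code: the
  text-file layout, the TIMESTAMP prefix and the names poll_queue_file,
  run_background and QueueFileWriteError are all assumptions made for this
  example, and the real queue file is accessed through RMS rather than as a
  plain text file.

```python
import time
from datetime import datetime, timezone

# Hypothetical marker for the "current time" record; not the real file layout.
TIMESTAMP_PREFIX = "TIMESTAMP "


class QueueFileWriteError(Exception):
    """Hypothetical stand-in for the QFILE WRITERR condition."""


def poll_queue_file(path):
    """Read pending commands, then stamp the queue file with the current time.

    Any OS-level error raised while writing the timestamp is converted to
    QueueFileWriteError, loosely mirroring "any error returned from RMS".
    In this sketch the commands are consumed: the file is rewritten with
    only the timestamp record.
    """
    with open(path, "r", encoding="ascii") as f:
        lines = f.read().splitlines()
    commands = [ln for ln in lines
                if ln and not ln.startswith(TIMESTAMP_PREFIX)]

    stamp = datetime.now(timezone.utc).isoformat()
    try:
        with open(path, "w", encoding="ascii") as f:
            f.write(TIMESTAMP_PREFIX + stamp + "\n")
    except OSError as exc:
        raise QueueFileWriteError(str(exc)) from exc
    return commands


def run_background(path, cycles, interval_s=10.0, handle=print):
    """Poll the queue file `cycles` times, `interval_s` seconds apart."""
    for _ in range(cycles):
        for command in poll_queue_file(path):
            handle(command)
        time.sleep(interval_s)
```

  In this sketch the only fatal path is the timestamp write, so anything that
  makes that single write fail stops the whole polling loop.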
  
   Did you run the Exporter with a smaller number of requests? 
   If you did, what results did you get?
   If not, it would be very helpful to run about 10 exporting requests and 
   see the result.

   Did you try to export entities other than snmp interface? 
   If not, it would be very helpful to do this.

   Would you please send the background log file to me
   (TOOK::SHMUYLOVICH).

   I hope that the answers to these questions will allow us to locate the
   problem: Exporter FM, TCPIP AM or your environment.

>						... the total number of 
>interfaces is about 100. Our customer wants to run exporting permanently.
>We chose a polling interval of 10 min, and one export target database
>for all interfaces.  ( Is this a problem ?? )

	Theoretically one background process can handle up to 500 exporting
  requests.

>The exporting is started by a MCC command procedure, using 1 export 
>command per snmp interface.  ( I had a lot of trouble using wildcards.. )

	Please give more information about the wildcard problems.


	Sam
    
1421.2  more info  MAN02::STRUTH  Thomas Struth CS Mannheim  Mon Sep 02 1991 12:08  84
Some more information:
- The disk with the rdb database has enough free space ( more than 40,000
  blocks free ); there are no errors on any disk and no disk quotas.
- I did not try to export other entities for a longer time. I tried it just 
  for a day, and it worked so far. 
- Today I set up a new export for only 10 interfaces; I am curious
  whether it works. I will let you know about the results immediately. 

  Before I started the 10-interface export, I noticed the following:
  I resubmitted the background process ( it had died two days before ),
  with the result that exporting of all (100) interfaces started again
  ( as in all cases before ). 
  During the first minutes the process spent a lot of time in RWMBX state 
  but always became computable after 5-15 sec.  
  I observed two or three successful cycles for ALL interfaces, and
  then I started my delete_export procedure. It terminated without error,
  reporting successful deletion. Back in DCL, I noticed that the batch 
  job had just died, leaving us the log file listed below.
  It was the first time that the background process died after such a short
  time. Could there be an interaction between my "dele export snmp ....."
  and the process termination / queue file error at the same moment ??

  Now I have resubmitted the job and started exporting for only 10 interfaces.
   ( we will see tomorrow ... ) 

Log file : 

$	set noverify

 tested FALSE for termination alert.
 ...  
 ( I deleted 190 lines ) 
 ...
 ...
 tested FALSE for termination alert.
 tested FALSE for termination alert.
 tested FALSE for termination alert.
 tested FALSE for termination alert.
%QFILE-E-WRITERR, error writing queue file
  MCC          job terminated at  2-SEP-1991 14:11:49.95

  Accounting information:
  Buffered I/O count:           10078         Peak working set size:   11017
  Direct I/O count:              3298         Peak page file size:     21066
  Page faults:                   8878         Mounted volumes:             0
  Charged CPU time:           0 00:01:49.87   Elapsed time:     0 00:29:59.33


The next log file is an older one; I am quite sure that the access violation
was caused by insufficient PQL values: 
 ...
 tested FALSE for termination alert.
 tested FALSE for termination alert.
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=00000000, PC=0000F3A8, PSL=03C00000
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=0005F6A6, PC=000066DA, PSL=03C00004
%QUEUE-F-ALLOCFAIL, virtual memory allocation failed
  MCC          job terminated at 10-AUG-1991 10:47:48.20

  Accounting information:
  Buffered I/O count:          328839         Peak working set size:   16400
  Direct I/O count:            134322         Peak page file size:     55289
  Page faults:                  89353         Mounted volumes:             0
  Charged CPU time:           0 01:17:38.36   Elapsed time:     0 17:13:53.89
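
To compare log files like the two above, one could count the
termination-alert lines and pull out the messages of the form
%FACILITY-S-IDENT, text (S being the severity letter). The following Python
sketch does that; the function name summarize_log and the choice of what to
count are assumptions for illustration, not part of any MCC tooling.

```python
import re

# VMS-style message line: %FACILITY-S-IDENT, text  (S = severity letter)
MESSAGE_RE = re.compile(r"^%([A-Z0-9_$]+)-([SIWEF])-([A-Z0-9_$]+), (.*)$")

# Line the background process writes repeatedly, as seen in the logs above.
TERMINATION_CHECK = "tested FALSE for termination alert."


def summarize_log(text):
    """Return (number of termination-check lines, list of parsed messages).

    Each parsed message is a tuple (facility, severity, ident, detail).
    """
    checks = 0
    messages = []
    for raw in text.splitlines():
        line = raw.strip()
        if line == TERMINATION_CHECK:
            checks += 1
            continue
        match = MESSAGE_RE.match(line)
        if match:
            messages.append(match.groups())
    return checks, messages
```

Dividing the count of termination-check lines by the elapsed time from the
accounting block gives a rough check rate that can be compared between runs.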

  

- Sample command showing how I start exporting:
EXPORT SNMP Z75RT1 INTERFACE 2 EXPORT TARGET SYS_MCC:MCC_EXPORT.RDB,
       SEQUENCE NAME Z75RT1,EXPORT PERIOD= 00:10

- Wildcarding problems: when I first "played" with exporting six weeks ago,
  I tried to start it with : $mcc export snmp * interface * , export target ...
  or $mcc export snmp ROUTER1 interface * , export target ...
  The "sho export " did not give similar results for all interfaces of the same
  router, and there were different results for "sho export snmp ROUTER1 
  interface * " and " sho export snmp ROUTER1 interface 1 " for the same 
  interface ! , especially regarding export state.
  It was also impossible to delete it. ( No exporting for specified entity...)
  However, that was some weeks ago and I have not kept any log file. Can we 
  discuss this separately? I would like to reproduce it first, and I don't 
  think that it is related to my current problem.
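
For reference, the one-command-per-interface approach from the sample command
above (no wildcards) could be generated rather than typed by hand. This is
only a sketch: the function name export_commands and the router-to-interface
mapping are hypothetical, and the interface numbers have to be supplied
explicitly because they are not necessarily contiguous on a given router.

```python
def export_commands(interfaces_by_router,
                    target="SYS_MCC:MCC_EXPORT.RDB",
                    period="00:10"):
    """Build one EXPORT command per (router, interface) pair.

    interfaces_by_router maps a router name to a list of interface numbers.
    The command text follows the sample command in this note; the sequence
    name is set to the router name, as in that sample.
    """
    commands = []
    for router in sorted(interfaces_by_router):
        for index in interfaces_by_router[router]:
            commands.append(
                f"EXPORT SNMP {router} INTERFACE {index} "
                f"EXPORT TARGET {target}, "
                f"SEQUENCE NAME {router},EXPORT PERIOD= {period}"
            )
    return commands


# Example: print the commands for one router with two interfaces.
for line in export_commands({"Z75RT1": [1, 2]}):
    print(line)
```

Generating the delete commands from the same mapping would keep the start
and delete procedures in step with each other.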
 
Thomas


1421.3  it's past "tomorrow", your results?  GOSTE::CALLANDER  Sat Sep 07 1991 10:01  2
    and your results....
    
1421.4  polling 10- interfaces  MUNICH::HILGER  Mon Sep 09 1991 10:41  7
    
    The test with 10 interfaces has been done and ran successfully over the
    weekend. The problem only showed up with a large number of interfaces
    to be polled.
    
            regards         Peter