Title: SCHEDULER
Notice: Welcome to the Scheduler Conference on node HUMANE ril
Moderator: RUMOR::FALEK
Created: Sat Mar 20 1993
Last Modified: Tue Jun 03 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1240
Total number of notes: 5017
1094.0. "Jobs stuck in QUEUED state V2.1B-9" by CSC32::WATERS (The Agony of Delete) Mon May 13 1996 14:13
We are currently seeing a problem at one customer site where
jobs go into a queued state and never run. At times there
have been 10-15 jobs in a queued state. This seems to happen
three or four times a week.
Scheduler version V2.1B-9 on 5 VAX nodes, no Alphas, no remote
agents.
EMJVX2 V2.1B-9 29-APR-1996 06:46:22 2 100 5 4 8188 <--Default
EMJVX1 V2.1B-9 29-APR-1996 06:46:07 3 100 5 4 4596
EMJVX0 V2.1B-9 5-MAY-1996 11:10:45 0 100 5 4 5020
EMJVX3 V2.1B-9 5-MAY-1996 11:23:19 0 100 5 4 11549
EMJVX4 V2.1B-9 7-MAY-1996 14:24:12 1 100 5 4 7601
o There is nothing in the Vermont_creamery.log file to show that
the job was queued. You can see the previous run of the job:
queued, started, finished.
o There is no entry left in the queue, even though all queues are
initialized with /RETAIN=ERROR.
o There is nothing in the accounting file, so the job did not try to
run and then abort.
Below, I have included part of the log file from one of the nodes.
There are a lot of delete_qentry entries in this file. Any idea
what they are?
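To put numbers on "a lot", something like the following could tally the log once it has been copied off the node as plain text. This is only a sketch: the function name summarize and the embedded sample are illustrative, the patterns are taken from the log lines quoted further down in this note, and the script counts lines without saying anything about what delete_qentry actually does.

```python
import re
from collections import Counter

# Patterns copied from the log lines quoted in this note. The optional
# "<CR>" suffix covers captures where the terminal log kept that marker.
DELETE_RE = re.compile(r"delete_qentry:\s*entry=\s*(\d+)\s+q=(\S+?)(?:<CR>)?\s*$")
STATUS_RE = re.compile(r"sys\$dequeue returned bad status:\s*(\d+)")
QUOTA_TEXT = "Process quota exceeded"


def summarize(lines):
    """Count delete_qentry lines per queue, bad statuses, and quota errors."""
    per_queue = Counter()
    statuses = Counter()
    quota_errors = 0
    for line in lines:
        match = DELETE_RE.search(line)
        if match:
            per_queue[match.group(2)] += 1
            continue
        match = STATUS_RE.search(line)
        if match:
            statuses[int(match.group(1))] += 1
            continue
        if QUOTA_TEXT in line:
            quota_errors += 1
    return per_queue, statuses, quota_errors


# A few lines in the same shape as the excerpt in this note (illustrative).
SAMPLE = """\
delete_qentry: entry= 6151 q=WIN$SLS3
delete_qentry: entry= 6159 q=WIN$SLS4
delete_qentry: entry= 6162 q=WIN$SLS2
Error creating space for command interpreter symbol table
Process quota exceeded
sys$dequeue returned bad status: 2544
delete_qentry: entry= 6168 q=WIN$SLS2
""".splitlines()

if __name__ == "__main__":
    queues, statuses, quota = summarize(SAMPLE)
    for name, count in sorted(queues.items()):
        print(f"{name}: {count}")
    print("bad statuses:", dict(statuses))
    print("quota errors:", quota)
```

Run against the full log instead of SAMPLE (for example summarize(open(path)), with path being wherever the copy was saved), this would show whether the delete_qentry lines cluster on particular queues and how often they sit next to the quota errors.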
I tried to find good output of a job in the queued state, but the
dial-in log is trashed. Here is one of the jobs that was stuck at one
time, but not at the time of this output.
Job Name Entry User_name State Next Run Time
-------- ----- --------- ----- -------------
WIN_ROSI_UPD_465 3830 1OPERATOR Scheduled 3-JUN-1996 22:00
VMS_Command : @SCH:UBS ITA WIN_BRN_ROSI_UPD
Group : ROSI_UPD Type : MONTHEND
Comment : WIN_465_ROSI_U2-APR-1996 06:00 Last Exit Status :UNKNOWN
Schedule Interval : M 22:00 Mode : Batch
Mail to : @SCH:INV.DIS (on ERROR)
Days : (MON,TUE,WED,THU,FRI)
Output File : WIN$LOG:WIN_ROSI_UPD_465.LOG
Cluster_CPU : <Ignored> Notify user upon completion
Submit Queue : WIN$PRT0
CPULimit (x100ms) : 0 QPriority : 100
Max_Time Warning : None Job Always retained
Stall Notify : None No Retry on Error
Success Count : 0 Failure Count : 7
Owner UIC : [400,0] Restart on Crash
Read Identifier : SCH$READ
Pre Function : @SCH:PRE2 WIN 00465 INV ROSI_UPD RPT, Last Exit Status : SUCCESS
Post Function : @SCH:OP, Last Exit Status : UNKNOWN
No local jobs depend upon this job.
All dependencies must successfully complete after: 12-APR-1996 06:00:28.75
Job Dependencies: ([WIN_END_OF_BATCH])
We have checked process quotas; the scheduler is started under SYSTEM.
EMJVX0_field_>type EMJVX2.log
$ VERIFY = F$VERIFY(0) ! Set NOVERIFY and save current state
Nsched Version V2.1B-9 starting...
Setting Debugging OFF
Setting Job Max to 100
Setting Logging to 5
Setting Default Job Priority to 4
Setting Load Balancing ON
Setting Restart Params to CLEAR on job completition
Setting Remote Jobs ENABLED
Setting brkthru/notify wait to 300 seconds
%DCL-I-SUPERSEDE, previous value of NSCHED$MAILBOX has been superseded
%DCL-I-SUPERSEDE, previous value of NSCHED$TERM_MAILBOX has been superseded
CPUtype= 23 CPU_count= 4 Total pages= 3727846 Meg= 1908 VUPS= 132
had to open fresh log file....
Check Requested
Default was GAINED
Current log file name: NSCHED$:VERMONT_CREAMERY.LOG
Renaming to: NSCHED$:VERMONT_CREAMERY.OLD
Closing log file.
... spawning SCHED_SUMMARIZE_LOG.EXE...
Return status ==> 1 for spawning Process # 1082354416
Calling vss$get_history to summarize success records.....done,status=1
Calling get_history to summarize failure records.....done, status=1
delete_qentry: entry= 1007442 q=WIN$PAY0
delete_qentry: entry= 1007454 q=WIN$PAY3
%MAIL-E-OPENIN, error opening DSA45:[USER.WIN.CLIENT.UBS.SCH]PEJPER.DIS;
as input
-RMS-E-FNF, file not found
sys$dequeue returned bad status: 2544
delete_qentry: entry= 1007463 q=WIN$PAY4
%MAIL-E-OPENIN, error opening DSA45:[USER.WIN.CLIENT.UBS.SCH]PEJPER.DIS;
as input
-RMS-E-FNF, file not found
sys$dequeue returned bad status: 2544
delete_qentry: entry= 1007471 q=WIN$PAY2
%MAIL-E-OPENIN, error opening DSA45:[USER.WIN.CLIENT.UBS.SCH]PEJPER.DIS;
as input
...
delete_qentry: entry= 6151 q=WIN$SLS3
delete_qentry: entry= 6159 q=WIN$SLS4
delete_qentry: entry= 6162 q=WIN$SLS2
delete_qentry: entry= 6167 q=WIN$SLS0
Error creating space for command interpreter symbol table
Process quota exceeded
sys$dequeue returned bad status: 2544
Error creating space for command interpreter symbol table
Process quota exceeded
Error creating space for command interpreter symbol table
Process quota exceeded
delete_qentry: entry= 6168 q=WIN$SLS2
delete_qentry: entry= 6174 q=WIN$SLS4
delete_qentry: entry= 6175 q=WIN$SLS1
sys$dequeue returned bad status: 2544
Error creating space for command interpreter symbol table
Process quota exceeded
Error creating space for command interpreter symbol table
Process quota exceeded
delete_qentry: entry= 6177 q=WIN$SLS2
delete_qentry: entry= 6180 q=WIN$SLS1
delete_qentry: entry= 6183 q=WIN$SLS0
...
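The "bad status: 2544" value in the log can be picked apart if it is a standard OpenVMS condition value. That is an assumption; the log does not say what kind of number it prints. Under that assumption, the low three bits give the severity, bits 3-15 the message number, and bits 16-27 the facility. The helper name decode_condition is made up for this sketch.

```python
def decode_condition(value):
    """Split a 32-bit OpenVMS condition value into its standard fields.

    Assumes the number really is a condition value; the scheduler log
    quoted in this note does not confirm that.
    """
    severity_names = {
        0: "WARNING",
        1: "SUCCESS",
        2: "ERROR",
        3: "INFO",
        4: "SEVERE",
    }
    severity = value & 0x7  # bits 0-2
    return {
        "severity": severity_names.get(severity, "RESERVED"),
        "success": bool(value & 1),  # low bit set means success
        "message_number": (value >> 3) & 0x1FFF,  # bits 3-15
        "facility": (value >> 16) & 0xFFF,  # bits 16-27
        "hex": f"%X{value:08X}",
    }


if __name__ == "__main__":
    for key, val in decode_condition(2544).items():
        print(f"{key}: {val}")
```

Read this way, 2544 is %X000009F0: facility 0 (a system message) with warning severity. The message text itself is not derived here; on a VMS system, WRITE SYS$OUTPUT F$MESSAGE(2544) should display it.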
1094.1. "???" by CSC32::WATERS (The Agony of Delete) Thu May 16 1996 14:12
Does anyone know what the delete_qentry entries are?
1094.2. "" by BACHUS::WILLEMS (Johan Willems @BRO DTN 856-8739) Fri Jul 26 1996 11:54
Any news on this subject?
I have the same problem.
Johan Willems.