
Conference humane::scheduler

Title:SCHEDULER
Notice:Welcome to the Scheduler Conference on node HUMANE
Moderator:RUMOR::FALEK
Created:Sat Mar 20 1993
Last Modified:Tue Jun 03 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1240
Total number of notes:5017

1167.0. "/RESTART and BATCH mode" by COMICS::JUDD (Geoff Judd. UK TSC. Viables, Basingstoke) Mon Oct 14 1996 11:40

Hi,
  I have a problem using the restart-on-crash option with batch mode jobs. I
have two nodes, A and B, in a cluster running V2.1B-7 of the scheduler server.
The cluster has a common queueing system. If I create a scheduler job in batch
mode with the /RESTART qualifier, submitting it to a queue on node A, and node A
then crashes, the job is not resubmitted to the queue from node B. Instead the
job is left in the Scheduled or Waiting state with the final status
NSCHED-F-JOBABORT, e.g.:

Job Name             Entry    User_name    State      Next Run Time
--------             -----    ---------    -----      -------------
GEOFF_TEST           148      JUDD         Waiting    15-OCT-1996 10:00
VMS_Command : wait 2:00:00.00
Group : (none)                             Type : (none)
Last Start Time   : 14-OCT-1996 14:47
Last Finish Time  : 14-OCT-1996 15:07      Last Exit Status : 0C9681EC(ERROR)
Error : %NSCHED-F-JOBABORT, Job was aborted
Schedule Interval : D 10:00                Mode   : Batch
Mail to           : JUDD (No Mail)
Days              : ALL
Output File       : None
Cluster_CPU       : <Ignored>              Notify user upon completion
Submit Queue      : LOOKIN_BATCH
CPULimit (x100ms) : 0                      QPriority : 100
Max_Time Warning  : None                   Job Always retained
Stall Notify      : None                   No Retry on Error
Success Count     : 0                      Failure Count : 5
Owner UIC         : [100,20]               Restart on Crash
No Pre or Post Function for this job
No local jobs depend upon this job.
This job has no Dependencies on other jobs
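
For reference, the Last Exit Status shown above can be broken down using the
standard OpenVMS condition-value layout (severity in bits 0-2, message number
in bits 3-15, facility number in bits 16-27, control bits in bits 28-31). The
following is only a minimal Python sketch; the function name decode_condition
is purely illustrative and is not part of the scheduler. It suggests that the
value carries severity 4 (fatal), which is consistent with the -F- in the
error text, even though the display labels it (ERROR).

```python
# Sketch: split a 32-bit OpenVMS condition value into its standard fields.
# decode_condition is a hypothetical helper name used for illustration only.

SEVERITY_LETTERS = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}


def decode_condition(value):
    """Return the severity, message number, facility and control fields."""
    severity = value & 0x7                 # bits 0-2
    message = (value >> 3) & 0x1FFF        # bits 3-15
    facility = (value >> 16) & 0xFFF       # bits 16-27
    control = (value >> 28) & 0xF          # bits 28-31
    return {
        "severity": severity,
        # Severity codes 5-7 are reserved, so they map to "?" here.
        "severity_letter": SEVERITY_LETTERS.get(severity, "?"),
        "message": message,
        "facility": facility,
        "control": control,
    }


if __name__ == "__main__":
    # Last Exit Status taken from the job listing above.
    fields = decode_condition(0x0C9681EC)
    for name, field_value in fields.items():
        print(f"{name:16}: {field_value}")
```

An odd condition value (low bit set) indicates success, so the even value
here is a failure status, which agrees with the Failure Count in the listing.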


However, if I create a detached scheduler job and it is running on node A
when that node crashes, it is restarted on node B as expected.

Can someone please clarify how the /RESTART option is intended to work in
conjunction with batch mode?

Thanks in advance,

Geoff Judd.