Title: | SCHEDULER |
Notice: | Welcome to the Scheduler Conference on node HUMANE ril |
Moderator: | RUMOR::FALEK |
Created: | Sat Mar 20 1993 |
Last Modified: | Tue Jun 03 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1240 |
Total number of notes: | 5017 |
1106.0. "MAIL not sent & Jobs in Slot Wait" by STKHLM::WIDMAN (Modo liceat vivere, est spes.) Wed May 22 1996 15:12
Hi,
I'm looking at some strange problems in (IMHO) a weird config.
The problems:
1. No mail is sent after job completion.
2. Neither NSCHED$MAILBOX nor NSCHED$TERM_MAILBOX exists.
3. Jobs sit in Slot Wait even though slots appear to be available.
Question: What's going on?
Some Info.:
Cluster with three nodes
BLGV07 VAX 6000-630 VMS V5.5-2
BLGV06 VAX 6000-630 VMS V5.5-2
BLGV04 VAX 7000-630 VMS V5.5-2
BLGV04 has its own system disk with a 'separate' installation of the
scheduler...
SCHEDULE>sh sta
Node    Version  Started               Jobs  Jmax  Log  Pri  Rating
BLGV06  V2.1B-1  12-MAY-1996 22:34:30     0    40    5    4    4659  <-- Default
BLGV07  V2.1B-1  12-MAY-1996 22:44:33     0    40    5    4   10209
BLGV04  V2.1B-1  12-MAY-1996 23:01:22     0    40    5    4   10457
SCHEDULE>sho job 6/fu
Job Name Entry User_name State Next Run Time
-------- ----- --------- ----- -------------
TIME-TEST 6 BOSTROMRO Scheduled 23-MAY-1996 00:00
VMS_Command : show time
Group : (none) Type : (none)
Last Start Time : 22-MAY-1996 17:27
Last Finish Time : 22-MAY-1996 17:27 Last Exit Status : SUCCESS
Schedule Interval : D Mode : Detached
Mail to : BLGV04::FIELD (Always)
Days : ALL
Output File : SHOW-TIME.LOG
Cluster_CPU : BLGV04 Notify user upon completion
Run Priority : Default
Max_Time Warning : None Job Always retained
Stall Notify : None No Retry on Error
Success Count : 78 Failure Count : 14
Owner UIC : [242,21225] Restart on Crash
No Pre or Post Function for this job
No local jobs depend upon this job.
This job has no Dependencies on other jobs
From NSCHED$:BLGV04.LOG
!
Nsched Version V2.1B-1 starting...
Setting Debugging OFF
Setting Job Max to 40
Setting Logging to 5
Setting Default Job Priority to 4
Setting Load Balancing ON
Setting Restart Params to CLEAR on job completition
Setting Remote Jobs ENABLED
Setting brkthru/notify wait to 300 seconds
CPUtype= 23 CPU_count= 3 Total pages= 688879 Meg= 352 VUPS= 99
Check Requested
Setting Debugging ON
timer flag was clear
timer not expired. No earlier event to set.
sleeping
...
...
we woke up!
got mbx msg '>>6 '
06:59 PM processing record # 6 status= S request= N
vss$get_next_start_time: 1 cstat= 211191443 next= 23-MAY-1996 00:00:00.00
Running Job 6 PID=2061BCC9 Count= 1 Priority= 4
timer flag was clear
timer not expired. No earlier event to set.
sleeping
we woke up!
job # 6 finished.... count= 0
exit status of job was 00030001
NSCHED: LIB$SPAWN(MAIL...4 to_list='BLGV04::FIELD'
Sending mail :
MAIL/NOSELF/SUBJ:"Scheduler Job #6 (NAME: TIME-TEST) finished, Status: Success" NL: "BLGV04::FIELD"
Spawn mail failed: 28
0 remote nodes care about job 6
06:59 PM processing record # 6 status= S request=
Now=22-MAY-1996 18:59:47.01 job_sched_time=23-MAY-1996 00:00:00.00
job 6 is scheduled for the future
06:59 PM updated record # 6 status= S request=
Found 0 local jobs depending on :: 6
cluster_broadcast:---node= msg=CWJ
timer flag was clear
timer not expired. No earlier event to set.
sleeping
$ sh log ns*
(LNM$PROCESS_TABLE)
(LNM$JOB_84789EB0)
(LNM$GROUP_000001)
(LNM$SYSTEM_TABLE)
"NSCHED$" = "SYS$COMMON:[NSCHED]"
"NSCHED$CLEAR_RESTART_PARAM" = "TRUE"
"NSCHED$DEFAULT_JOB_MAX" = "40"
"NSCHED$REMOTE_SUPPORT_ENABLED" = "TRUE"
"NSCHED$UID" = "NSCHED$:SCHEDULER$XUI.UID"
"NSCHED_DEFAULT_SD_ACTION" = "SKIP"
(DECW$LOGICAL_NAMES)
\ Håkan Widman / CSC Sweden
BTW - I've sent ECO7 to the customer today ....
T.R | Title | User | Personal Name | Date | Lines |
---|
1106.1 | subprocess quota ? | RUMOR::FALEK | ex-TU58 King | Wed May 22 1996 15:35 | 4 |
| If you type "exit 28" at the VMS $ prompt, you see "exceeded quota".
In this case it is probably the subprocess quota of the account that
the scheduler runs under.
|
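The two status values in the log ("exit status of job was 00030001" and "Spawn mail failed: 28") can also be taken apart by hand. The following is a minimal Python sketch, assuming the usual 32-bit VMS condition value layout (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27). The function name `decode_condition` and the severity labels are illustrative only and are not part of the scheduler or of VMS. The sketch does not look up message text, so the "exceeded quota" meaning of 28 still comes from the reply above.

```python
# Hypothetical helper: splits a VMS condition value into its fields.
# Assumes the standard layout: bits 0-2 severity, 3-15 message number,
# 16-27 facility number. Names and labels here are illustrative.

SEVERITY_NAMES = {
    0: "warning",
    1: "success",
    2: "error",
    3: "informational",
    4: "severe",
}


def decode_condition(value: int) -> dict:
    """Return the severity, success flag, message number and facility."""
    severity = value & 0x7
    return {
        "severity": SEVERITY_NAMES.get(severity, "reserved"),
        "success": bool(value & 0x1),          # low bit set means success
        "message_number": (value >> 3) & 0x1FFF,
        "facility": (value >> 16) & 0xFFF,
    }


if __name__ == "__main__":
    # Job exit status from the log, printed there in hexadecimal.
    print(decode_condition(0x00030001))
    # Status from "Spawn mail failed: 28", taken as decimal.
    print(decode_condition(28))
```

Under these assumptions the job's own exit status has the success bit set, while the status returned by the mail spawn has severity "severe" and facility 0, which fits a system-level failure such as a quota problem rather than a failure of the job itself.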