T.R | Title | User | Personal Name | Date | Lines |
---|
143.1 | Check Quotas, Specify Needed Quotas on RUN/DETACH... | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Wed Feb 05 1997 10:03 | 15 |
|
Since I'd expect the code to hang only on a quota problem, I'd guess
this is quota-related. (I'd not expect the call to *hang* otherwise
-- most other causes should cause a complete failure of the call. The
quota failure could be a result of different settings of PQL-related
SYSGEN parameters, or a result of even a slight change in OpenVMS and/or
application requirements on the new OpenVMS release.)
Examples of the open call and the RUN/DETACH, please?
(If you are not including a list of quotas on the RUN/DETACH command,
the detached process will be entirely dependent on the vagaries of the
local system configuration. I don't recommend depending on SYSGEN
defaults on RUN/DETACH, on $crembx, etc -- this dependence can cause
transient or isolated failures, failures that can be hard to debug.)
|
143.2 | Nothing fancy here... | CHEFS::WILLIAMSA | I wanna be Luke | Wed Feb 05 1997 11:02 | 78 |
| Steve,
The following is the extract of code that is having the problem...
void access_lock_file(void)
{
    FILE *fp2;

    send_request("Attempting to gain lock.",2);
    scratch_int = 0;
    while (1)
    {
        status = system("open/read/write output decfm$exe:process_monitor_lock.file");
        if (status != FILELOCKED)
        {
            fp2 = fopen("decfm$exe:process_monitor_lock.file","a+");
            sprintf(audit_message,"Gained lock. Processing on this node.");
            write_audit_file(audit_message);
            send_request(audit_message,2);
            break;
        }
        if (scratch_int == 0)
        {
            scratch_int = 1;
            sprintf(audit_message,"Unable to gain lock. Process in STANDBY mode.");
            write_audit_file(audit_message);
            send_request(audit_message,2);
        }
        sleep(60);
    }
}
The run/detach command is simply run/detach process_monitor_v1.exe, or
on the 6.2 machine run/detach/input=procmon.com loginout.exe. So I'm
not specifying any quotas at all.
The following are show proc/quotas on the two different machines:
Firstly the VMS 6.0 machine where it works...
DARMAH> sh proc/quota
5-FEB-1997 15:53:45.88 User: DEC_SYSTEM Process ID: 20200150
Node: DARMAH Process name: "Alen"
Process Quotas:
Account name: SYSTEM
CPU limit: Infinite Direct I/O limit: 200
Buffered I/O byte count quota: 65408 Buffered I/O limit: 10000
Timer queue entry quota: 200 Open file quota: 300
Paging file quota: 236201 Subprocess quota: 10
Default page fault cluster: 64 AST quota: 198
Enqueue quota: 1000 Shared file limit: 0
Max detached processes: 0 Max active jobs: 0
And now the VMS 6.2 machine where it doesn't...
VAX01 > sh proc/quota
5-FEB-1997 15:51:49.35 User: DEC_SYSTEM Process ID: 202025E2
Node: VAX01 Process name: "_VTA3409:"
Process Quotas:
Account name: SYSTEM
CPU limit: Infinite Direct I/O limit: 200
Buffered I/O byte count quota: 65406 Buffered I/O limit: 10000
Timer queue entry quota: 200 Open file quota: 300
Paging file quota: 97380 Subprocess quota: 10
Default page fault cluster: 64 AST quota: 198
Enqueue quota: 2048 Shared file limit: 0
Max detached processes: 0 Max active jobs: 0
|
143.3 | | CSC64::BLAYLOCK | If at first you doubt,doubt again. | Wed Feb 05 1997 11:10 | 10 |
| status = system("open/read/write output decfm$exe:process_monitor_lock.file");
This rtl call (system) requires the presence of a CLI. That only
comes from running LOGINOUT as your image and your command procedure
as input. The status returned by this call if you do not run the
LOGINOUT image is LIB$_NOCLI.
|
143.4 | | CSC64::BLAYLOCK | If at first you doubt,doubt again. | Wed Feb 05 1997 16:21 | 6 |
|
Additional information: the 'system' call is blocked in DECC$WAIT,
waiting for a subprocess to complete that never existed (due to
the lack of DCL). I believe that this is fixed for V7 such
that system returns a 0 on this failure. TURRIS::DECC would help
in determining that.
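
One portable way to guard against this is to ask the C runtime whether a command processor exists before calling system() with a real command. Standard C defines system(NULL) as exactly that query. The sketch below is only an illustration: the helper names are hypothetical, and whether the DEC C RTL of that era answers the query correctly in a detached process without a CLI is an assumption that would need checking on the target system.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if the C runtime reports a usable command processor,
 * 0 otherwise. Standard C defines system(NULL) as this query. */
int command_processor_available(void)
{
    return system(NULL) != 0;
}

/* Hypothetical wrapper: runs `command` only when a command processor
 * is reported. Returns -1 without calling system(command) if there is
 * none; otherwise returns whatever system(command) returns. */
int run_if_cli_present(const char *command)
{
    assert(command != NULL);
    if (!command_processor_available())
        return -1;
    return system(command);
}
```

A caller would treat the -1 result as "no CLI, cannot run commands" instead of comparing the status against a single expected failure code.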
|
143.5 | Insunt interdum menda in eo quod est efficax | AUSS::GARSON | DECcharity Program Office | Wed Feb 05 1997 20:53 | 15 |
| re .0
I can't imagine that this *ever* _worked_ - on two counts.
1. I would think that the implementation of system() has not changed in
that it uses LIB$SPAWN and hence requires a CLI and hence requires
LOGINOUT. By testing status != FILELOCKED, you will treat all manner
of failures to create the subprocess as successful "not locked" checks.
2. The locking technique is fairly dubious. The system() call merely
tells you whether the file was locked at the time it was executed. The
file is CLOSEd immediately and then you really open it in C. What
you need to do is forget the system() call altogether and open the file
in C, possibly using some non-standard options to get the right
locking. As it stands, you have a timing window.
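
To illustrate point 2, here is a minimal sketch in which the test and the acquisition are a single operation, so there is no window between them. It uses the POSIX exclusive-create idiom (O_CREAT | O_EXCL), which is a swapped-in technique and not the RMS file-sharing lock the original code relies on; the function names are hypothetical. Unlike an RMS lock, a lock file created this way is not released automatically if the holder dies.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Tries to take the lock by creating the lock file exclusively.
 * Returns an open descriptor (>= 0) if this process now holds the lock,
 * -1 with errno == EEXIST if the lock file already exists,
 * -1 with some other errno for unrelated failures. */
int try_acquire_lock(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}

/* Releases the lock: closes the descriptor and removes the lock file.
 * Returns 0 if both steps succeed, -1 otherwise. */
int release_lock(const char *path, int fd)
{
    int rc = close(fd);
    if (unlink(path) != 0)
        rc = -1;
    return rc;
}
```

The caller distinguishes three outcomes: lock gained (descriptor returned), lock held elsewhere (EEXIST), and genuine failure (any other errno). That also addresses point 1, since only an explicit success counts as holding the lock.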
|
143.6 | There's quotas and there's quotas | RICKS::OPP | | Wed Feb 05 1997 22:18 | 8 |
| RE: .2
The process quotas shown are for the interactive process, not
a detached process. You need to compare the output of SYSGEN> SHOW /PQL
on the two systems to view the detached "environment".
Greg
|
143.7 | Can I go home now please? | CHEFS::WILLIAMSA | I wanna be Luke | Tue Feb 11 1997 05:16 | 66 |
| Thank you all for your replies.
re .5 Yes I realise that I have a potential timing problem here,
however for my needs this code is adequate (I've changed it to look for
success rather than failure though).
I can assure you though, running under VMS 6.0 this worked quite nicely
WITHOUT having to go through loginout.exe first. I have found other
differences between the two versions which have caused me other
headaches... The status codes returned by GETJPI, or lack thereof, but
I've got round that one now. The problem I'm having now with the
differences revolves around SYS$SETIMR...
Under VMS 6.0 the code calls the AST routine I've specified (see
below), and then returns execution to the main program when it's done,
under VMS 6.2 it runs the AST routine, exits it, and then sits there
doing nothing (a further AST call works perfectly well). It's typically
in a sleep statement when the AST routine is triggered.
If anybody has any ideas (I got the format etc from the 6.2 CD) I would
be grateful for the insight.
Alen.
void reset_runtoday(void)
{
    write_audit_file("Resetting run today flags.");
    send_request("Resetting run today flags.",2);
    pointer2 = first_record;
    while (pointer2->next != NULL)
    {
        if ((pointer2->bitmask & RUNNING) != RUNNING)
        {
            pointer2->bitmask = (pointer2->bitmask & (BITMASKMAX-RUNTODAY));
            pointer2->bitmask = (pointer2->bitmask & (BITMASKMAX-WARN1));
            pointer2->bitmask = (pointer2->bitmask & (BITMASKMAX-WARN2));
            pointer2->finish_time = 0;
        }
        pointer2 = pointer2->next;
    }
    sleep(2);
    /* Sleep till we get past midnight. */
    if (!((status = lib$date_time(&current_time_desc)) & 1))
        lib$stop(status);
    strcpy(ast_time,return_extract(current_time,0,11));
    strcat(ast_time,"23:59:59.00\0");
    if (!((status = sys$bintim(&ast_time_desc,ast_tim)) & 1))
        lib$stop(status);
    if (!((status = sys$setimr(0,ast_tim,reset_runtoday,0)) & 1))
        lib$stop(status);
    write_audit_file("Returning to program.");
}
|
143.8 | Check sys$setimr ArgList | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Tue Feb 11 1997 16:16 | 45 |
|
: I can assure you though, running under VMS 6.0 this worked quite nicely
: WITHOUT having to go through loginout.exe first.
Then it silently failed. system() has *always* required a CLI.
This has been a day-one requirement, and has been a question
that has been asked *many* times over the years.
: I have found other
: differences between the two versions which have caused me other
: headaches... The status codes returned by GETJPI, or lack thereof, but
: I've got round that one now. The problem I'm having now with the
: differences revolves around SYS$SETIMR...
Use stsdef.h and `if (!$VMS_STATUS_SUCCESS( status )) do-failure-stuff;`
: Under VMS 6.0 the code calls the AST routine I've specified (see
: below), and then returns execution to the main program when it's done,
: under VMS 6.2 it runs the AST routine, exits it, and then sits there
: doing nothing (a further AST call works perfectly well). It's typically
: in a sleep statement when the AST routine is triggered.
I'd discard the existing synchronization scheme, and move to locks.
($enq/$deq, etc. I would not use these lock file schemes.)
: If anybody has any ideas (I got the format etc from the 6.2 CD) I would
: be grateful for the insight.
There isn't enough code in the example to tell exactly what's going
on... I'm always a little suspicious of C manipulations of string
descriptors, as I've noticed a number of problems introduced in that
area. (I've made a few of these myself, too.)
I'd definitely move to a construct such as:
status = sys$setimr(0,ast_tim,reset_runtoday,0,0);
if (!$VMS_STATUS_SUCCESS( status ))
lib$stop(status);
As it can be easier to debug -- one can get at the values
from within the debugger more easily this way...
Also be aware that sys$setimr picked up a new argument a
while back -- add another null argument to the call.
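
For readers unfamiliar with the status test: an OpenVMS condition value carries its severity in the low three bits, and the low bit set means success. The following is a portable sketch of that check so it can be compiled anywhere. The macro and function names are hypothetical stand-ins; on OpenVMS itself, use stsdef.h and $VMS_STATUS_SUCCESS as suggested above.

```c
#include <assert.h>

/* Hypothetical stand-ins for the stsdef.h definitions. */
#define STATUS_SUCCESS_BIT   0x1u   /* low bit set: success */
#define STATUS_SEVERITY_MASK 0x7u   /* low three bits: severity */

/* Nonzero if the condition value indicates success. */
int status_is_success(unsigned int status)
{
    return (status & STATUS_SUCCESS_BIT) != 0;
}

/* Severity field: 0 warning, 1 success, 2 error, 3 informational,
 * 4 severe; 5 to 7 are reserved. */
unsigned int status_severity(unsigned int status)
{
    return status & STATUS_SEVERITY_MASK;
}

/* Human-readable name for the severity of a condition value. */
const char *severity_name(unsigned int status)
{
    static const char *const names[8] = {
        "warning", "success", "error", "informational",
        "severe", "reserved", "reserved", "reserved"
    };
    unsigned int sev = status_severity(status);
    assert(sev < 8);   /* guaranteed by the three-bit mask */
    return names[sev];
}
```

With this test, a value such as 1 (SS$_NORMAL) counts as success and 12 (SS$_ACCVIO, severity 4) does not, which is a stronger check than comparing against one specific failure code.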
|
143.9 | | AUSS::GARSON | DECcharity Program Office | Tue Feb 11 1997 20:43 | 39 |
| re .7
(
> I can assure you though, running under VMS 6.0 this worked quite nicely
> WITHOUT having to go through loginout.exe first.
The problem is that it didn't work under VMS V6.0. It just wasn't
evident that it wasn't working. That's software for you.
)
> under VMS 6.2 it runs the AST routine, exits it, and then sits there
> doing nothing (a further AST call works perfectly well). It's typically
> in a sleep statement when the AST routine is triggered.
There is insufficient code shown to explain the behaviour. I would not
expect the sleep() to be affected by the AST routine. That is, if you
are sleeping when the AST is delivered, you will still be sleeping once the
AST routine returns.
What does the sleep() call look like?
What terminates the sleep?
To the extent that I could follow the routine that you showed, it
looked OK (visual inspection). I ASSuME that somehow a space ends up
between the date and the time in the date/time that you construct.
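
The space matters because the absolute time string has the form "DD-MMM-YYYY HH:MM:SS.CC". A minimal sketch of building the end-of-day string with the separator made explicit follows. The function name and constants are hypothetical, and it does not reproduce the poster's return_extract helper, whose behaviour is not shown in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DATE_LEN    11   /* "DD-MMM-YYYY" */
#define ABSTIME_LEN 23   /* "DD-MMM-YYYY HH:MM:SS.CC" */

/* Builds "<date> 23:59:59.00" from a "DD-MMM-YYYY HH:MM:SS.CC" string.
 * Returns 0 on success, -1 if the input is too short or the output
 * buffer cannot hold the result plus its terminator. */
int build_end_of_day(const char *now, char *out, size_t out_size)
{
    assert(now != NULL && out != NULL);
    if (strlen(now) < DATE_LEN || out_size < ABSTIME_LEN + 1)
        return -1;
    /* Copy only the date part, then add the space and the fixed time. */
    snprintf(out, out_size, "%.*s 23:59:59.00", DATE_LEN, now);
    return 0;
}
```

Writing the separator in the format string removes any dependence on whether the extracted date happens to carry a trailing space.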
Just a couple of random C programming comments...
I prefer for
> pointer2->bitmask = (pointer2->bitmask & (BITMASKMAX-RUNTODAY));
pointer2->bitmask &= ~RUNTODAY;
and you can clear more than one bit at a time using either approach.
> strcat(ast_time,"23:59:59.00\0");
\0 is redundant here (and nearly always in a string literal)
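
A small sketch comparing the two ways of clearing bits. The flag values and BITMASKMAX are hypothetical, since the thread does not show the real definitions; the subtraction form is assumed to rely on BITMASKMAX having every flag bit set.

```c
#include <assert.h>

/* Hypothetical flag values; the real ones are not shown in the thread. */
#define RUNNING    0x01u
#define RUNTODAY   0x02u
#define WARN1      0x04u
#define WARN2      0x08u
#define BITMASKMAX 0xFFu   /* assumed: all flag bits set */

/* Clears every bit in `flags`, leaving all other bits untouched. */
unsigned int clear_flags(unsigned int bitmask, unsigned int flags)
{
    return bitmask & ~flags;
}

/* The subtraction form from the posted code. It clears `flags` too,
 * but also drops any bit that lies outside BITMASKMAX. */
unsigned int clear_flags_by_subtraction(unsigned int bitmask,
                                        unsigned int flags)
{
    assert((flags & ~BITMASKMAX) == 0);   /* flags must fit in the mask */
    return bitmask & (BITMASKMAX - flags);
}
```

Clearing all three flags then becomes one call, clear_flags(bitmask, RUNTODAY | WARN1 | WARN2). The two forms agree as long as the bitmask has no bits above BITMASKMAX; the complement form does not depend on that.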
|