T.R | Title | User | Personal Name | Date | Lines |
---|
662.1 | reference book | STAR::DICKINSON | Peter | Tue Jan 12 1988 11:39 | 9 |
|
If you haven't already read chapter 7 of 'VAX/VMS Internals and Data
structures' ;Kenah and Bates, you might want to strongly consider doing so.
have fun...
|
662.2 | This is THICK! | CSC32::S_HALL | LST... More Coffee ! | Tue Jan 12 1988 11:45 | 5 |
|
Thanks for the steer....I love heavy wading....
Steve H
|
662.3 | | CHOVAX::YOUNG | Back from the Shadows Again, | Tue Jan 12 1988 23:25 | 1 |
| And of course, the "Guide to Writing Modular Procedures."
|
662.4 | Free Tips | TAV02::NITSAN | set profile/personal_name="set profile/personal_name= | Wed Jan 13 1988 00:43 | 30 |
| Some GENERAL recommenations:
* REMEMBER one user AST is waiting to be delivered AFTER the other.
* Keep them small. Especially don't do long i/o (as read operations) from
within an AST (unless you have some well defined need for it).
* Disabling AST from the main program is very much like being in an AST mode.
You may want to do it in critical sections of the main program, or just to
use an "event-driven" program (main program "hibernates" and ASTs only
execute).
* Declaring another AST from within an AST (by $DCLAST) is a nice trick for
queueing it for later delivery to your own process.
* Beware of timers and AST parameters - see a note in this conference, from
a few days ago (I don't remember the number).
* Beware of deadlocks: Don't "lock" things in AST which may be locked by the
main program.
* Caution with reentrancy. If you call the same piece of code in AST mode as
well as in main mode (especially in a static environment like Fortran), you
should be very careful not to do it 'concurrently' (there is some RTL routine
to verify whether you're in AST or not).
* Be careful in mixing ASTs with other asynchronous environments (as Ada).
* If you manage to find an old "VMS V3.* guide for realtime" - KEEP IT (why
don't we have something like this anymore?).
|
662.5 | | PSW::WINALSKI | Paul S. Winalski | Fri Jan 15 1988 21:13 | 24 |
| Two key points to remember about AST routines:
1) An AST cannot interrupt an AST that is in progress. ASTs are single-
threaded. Thus, you must not write code that depends on one AST being
able to interrupt another. For example:
[in subroutine A, which runs as an AST:]
...
$QIO(..., ASTADR=B, ...) ! start I/O w. B as completion
! AST routine
$HIBER() ! wait for B to go off and
! wake us back up
This works if subroutine A runs at normal level, but it will hang if A is
run at AST level because the I/O completion AST (routine B) cannot preempt
subroutine A--it just gets queued to go off as soon as A exits, which
won't happen because of the $HIBER().
2) A pending AST routine can interrupt normal-level code at any point. If you
are sharing data structures (for example) between normal-level code and
code that runs at AST level, you must use $SETAST to disable AST delivery
during critical sections of the normal-level code.
--PSW
|
662.6 | Not difficult, just detailed | SQM::HALLYB | You have the right to remain silent. | Sat Jan 16 1988 14:11 | 16 |
| Trying to be very succinct, let me rephrase some points made earlier:
[1] Do not "wait" at AST level. No $HIBER(), no $WAITFR() or
equivalent such as $QIOW() or $GETJPIW().
[2] It is OK to do *NO* IO at AST level, or *ALL* IO at AST level
but usually not SOME at each level. Often I use $DCLAST() to
make the first call to start IO at AST level.
[3] Try not to use static memory. Use LIB$GET_VM to create buffers,
data blocks, etc., on the fly. And of course use LIB$FREE_VM to
discard them when done. The standard trick is to put all your
context in a data block and use the address of that data block
as the AST parameter.
John
|
662.7 | Crash a system with CMEXEC? you bet! | DPDMAI::BEATTIE | But, Is BLISS ignorance? | Mon Jan 18 1988 12:27 | 22 |
| Let me add a warning about privileged mode ASTs, since I'm currently
working through such a problem:
Scenario: An EXEC mode AST is delivered to my process, and the
system crashes(!!). Why? SWAPPER trimmed the page containing the
entry mask of my AST routine out of my working set while I was
SYS$HIBERnating; EXE$ASTDEL (kernal mode, IPL 0) tried to invoke
my routine using whatever mechanism it uses, page faulted, and voilla!
Too bad the crash dump on the production system had my name on it
(*sigh*).
I've never seen this problem with USER mode ASTs and it may just
be a fluke, but it seems to me that if EXE$ASTDEL doesn't look before
it leaps (so to speak), that this scenario should be possible at
any mode.
Just to be safe, I think I'll SYS$LKWSET the page with the EXEC
mode AST routine entry mask. Any comments? Thanks.
-- Brian (Why oh why can't I get this
happen in my test periods?
(*Geez!*))
|
662.8 | required reading | MIDEVL::EVANS | Robert N. Evans DTN-291-8341 @DLB5-1/E2 | Mon Jan 18 1988 13:03 | 2 |
| Be sure to read and UNDERSTAND the short chapter on AST services in the
System Services reference manual.
|
662.9 | The fuel-supply seems corrupt to me.. | MDVAX3::COAR | M��se Choreographer | Mon Jan 18 1988 13:15 | 18 |
| I thought an AST could interrupt an active AST at a lower processor
mode - that is, a kernel-mode AST could interrupt a user-mode AST
in progress.
What was the crash in .7? If it was PGFIPLHI, I think your analysis
is incorrect. You are allowed to fault in kernel mode, just not
above IPL$_ASTDEL. Faulting *at* IPL$_ASTDEL is permitted, which
is why I think there's something lurking in the woods besides what
you mentioned.
Even a solicited bugcheck in executive mode (as opposed to an
exception) will not crash the system if SYSGEN parameter BUGCHECKFATAL
is set to zero. The most it will do is kill your process and [possibly]
write an error log entry [I think this only happens if the dying
process has BUGCHK privilege, but I'm not sure]. What was the setting
on your `production' machine?
#ken :-)}
|
662.10 | I disagree with John. Certain things are fine. | VIDEO::OSMAN | type video::user$7:[osman]eric.vt240 | Mon Jan 18 1988 13:46 | 33 |
| I don't agree with what John said:
>=============================================================================
>Note 662.6 Guidelines for AST Routines? 6 o
>SQM::HALLYB "You have the right to remain silent." 16 lines 16-JAN-1988 14
> -< Not difficult, just detailed >-
>-----------------------------------------------------------------------------
>
> Trying to be very succinct, let me rephrase some points made earlier:
>
> [1] Do not "wait" at AST level. No $HIBER(), no $WAITFR() or
> equivalent such as $QIOW() or $GETJPIW().
$HIBER won't work at AST level, since no user-mode AST can
possibly wake up the $HIBER. However, $WAITFR will work just
fine, as will $QIOW or $GETJPIW. These latter two
use $WAITFR internally. In fact, I'd say that if you're implementing
a library routine that needs to do I/O and wait for it to complete,
and you don't know whether your customers will call the library
routine at top-level or AST level, you MUST use $WAITFR, and you
MUSTN'T use AST's and $HIBER.
>
> [2] It is OK to do *NO* IO at AST level, or *ALL* IO at AST level
> but usually not SOME at each level. Often I use $DCLAST() to
> make the first call to start IO at AST level.
This seems unnecessary. Let's suppose your normal flow is to
use $QIO with an AST routine, and the AST routine will do the
next $QIO. There's nothing wrong with doing the initial $QIO
at top-level to get things going.
/Eric
|
662.11 | Expanding .7 (*sorry*) | DPDMAI::BEATTIE | But, Is BLISS ignorance? | Mon Jan 18 1988 17:21 | 116 |
| Pray pardon my continuing this only slightly relevent discussion
here, but as it concerns both AST usage considerations and my
previous note (.7) I'd like to include it. It appears that
the customer system configuration is to blame for the crash,
but I'm still concerned that the bugcheck occurred, regardless
of how it was handled...
Re: .9
I looked at the crash again after having read your reply,
and found that, yes, bugcheck crashes are turned on, but
my crash is SSRVEXCEPT. I've selected the info from the
crash that I used to conclude as noted in .7. Is my
conclusion unwarranted?
I'm still pretty green on crash-dump analysis, and would
appreciate knowing if I've overlooked something significant.
-- Brian
$ anal/crash crash.dmp
VAX/VMS System dump analyzer
Dump taken on 15-JAN-1988 16:08:14.89
SSRVEXCEPT, Unexpected system service exception
SDA> define ast_routine_entry_mask=01cfc
SDA> read sys$system:sys.STB
SDA> show crash
System crash information
------------------------
Time of system crash: 15-JAN-1988 16:08:14.89
Version of system: VAX/VMS VERSION V4.5
VAXcluster node name: MTRS
Reason for BUGCHECK exception: SSRVEXCEPT, Unexpected system service exception
Process currently executing: BEATTIE
Current IPL: 0 (decimal)
General registers:
R0 = 00000001 R1 = 8000FC0D R2 = 800021A4 R3 = 803BB3B0
R4 = 803BB360 R5 = 00000000 R6 = 7FFE34BC R7 = 00000002
R8 = 7FFE39C2 R9 = 7FF9C948 R10 = 7FF9C808 R11 = 7FFE0248
AP = 7FFE9DAC FP = 7FFE9D94 SP = 7FFE9D94 PC = 8000FC13
PSL = 01400000
SDA> show stack
Current operating stack
-----------------------
Current operating stack (EXECUTIVE): ! I seem to be in EXEC
! mode here instead
7FFE9D74 7FFE9DE4 ! of KERNEL, as listed
7FFE9D78 8003F4FC ! in .7. 1000 pardons.
7FFE9D7C 00000100
7FFE9D80 00000000
7FFE9D84 7FF9CB00
7FFE9D88 7FFE3A8E CTL$AG_CLIDATA+832
7FFE9D8C 00000000
7FFE9D90 20100000
SP => 7FFE9D94 00000000
7FFE9D98 00000000
7FFE9D9C 7FFED1E4
7FFE9DA0 7FFED1EC
7FFE9DA4 80000014 SYS$CALL_HANDL+004
7FFE9DA8 80017F16 EXE$CONTSIGNAL+07C
7FFE9DAC 00000002 ! <== AP
7FFE9DB0 7FFE9DD0
7FFE9DB4 7FFE9DB8
7FFE9DB8 00000004 ! <== mechargs @8(AP)
7FFE9DBC 7FFED1EC
7FFE9DC0 FFFFFFFD
7FFE9DC4 80863160
7FFE9DC8 00001CFC AST_ROUTINE_ENTRY_MASK
7FFE9DCC 0000000B
7FFE9DD0 00000005 ! <== sigargs @4(AP)
7FFE9DD4 0000000C ! SS$_ACCVIO
7FFE9DD8 00000001 ! reason 1
7FFE9DDC 00001CFC AST_ROUTINE_ENTRY_MASK
7FFE9DE0 80009E5B EXE$ASTDEL
7FFE9DE4 01400000 ! psl
7FFE9DE8 00000005
7FFE9DEC 00148780
7FFE9DF0 20000000
7FFE9DF4 00000000
7FFE9DF8 7FFEE12C SYS$SETPRV+02C
7FFE9DFC 02800000
SDA> exa/inst ast_routine_entry_mask
%SDA-E-NOTINPHYS, 00001CFC : not in physical memory !WHAT?? This is
!in a normal FORTRAN
!$code PSECT. If I
!am allowed a pagefault
!here, It should
!happen!!
SDA> exit
$
$ mc sysgen
SYSGEN> SHOW BUGCHECKFATAL
Parameter Name Current Default Minimum Maximum Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
BUGCHECKFATAL 1 0 0 1 Boolean D
SYSGEN> SHOW BUGREBOOT
Parameter Name Current Default Minimum Maximum Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
BUGREBOOT 1 1 0 1 Boolean D
SYSGEN> EXIT
|
662.12 | Your AST failed for some other reason | WIBBIN::NOYCE | Bill Noyce, Parallel Processing A/D | Tue Jan 19 1988 09:41 | 8 |
| When you get a crash dump, only the current contents of physical
memory is written to the dump. So, if your AST routine was paged
out at the time, SDA can't show it to you. That's all the "not
in physical memory" message means.
You got an ACCVIO because the address 1Cxx was off the end of your
P0 space (that's what "reason 1" means). This doesn't sound very
reasonable... Do you have a full link map of your program?
|
662.13 | Crash name is misleading, too | MDVAX3::COAR | M��se Choreographer | Tue Jan 19 1988 11:47 | 6 |
| And don't be misled by the `system service exception' message -
this crash is one of the two or three catchalls. It basically means
that the system was executing what it considered `trusted' (i.e.,
elevated mode) code, and something went blooey.
#ken :-)}
|
662.14 | watch out for entry masks | CSC32::S_LEDOUX | Scott LeDoux -- 8-522-4953 -- CXO3/2F2 | Tue Jan 19 1988 13:52 | 16 |
|
Two things:
1. Kernel (special kernel ? it's been a while) AST's
DO NOT want routine entry masks...if your ast routine
is using one, I suggest that you remove it. ASTDEL
delivers most ast's with a CALLG, but in the case
of (I think) kernel mode asts, ASTDEL uses JSB.
2. Also, kernel AST's should be delivered to an address
in S0 (makes things much simpler).
Also, aren't AST's delivered at IPL ASTDEL as opposed
to zero ?? Like I said, it's been a while since I've
had the pleasure of hacking in around with the exec.
|
662.15 | Clarification... | DPDMAI::BEATTIE | But, Is BLISS ignorance? | Tue Jan 19 1988 16:46 | 8 |
| re: .14
My AST was just the EXEC mode variety (results from SYS$ENQing
a lock in EXEC mode, and specifying a completion AST address.
SCH$ASTDEL executes at IPL$_ASTDEL, EXE$ASTDEL executes in at
IPL 0 (more or less)
-- Brian
|
662.16 | Eric gives knives to children, too :-) | SQM::HALLYB | You have the right to remain silent. | Tue Jan 19 1988 22:18 | 27 |
| Re: .10 [Re: .7]
Dammit, Eric, there's a difference between "guidelines to help somebody
who knows very little" and "how a hacker might sneak some code to work".
Suppose you write code of the form:
First Time AST routine (future IOs)
$QIO mumble,AST=AST_ROUTINE $QIO mumble, etc.
BLBC R0,error BLBC R0,error
BISL IO_busy_bit,etc. BISL busy_bit,etc.
Not everybody appreciates it, but this example has $QIOs in two
different places, and is therefore more difficult to maintain.
Furthermore, depending upon specifics of the code, there might
be a race condition whereby the "First Time" IO completes right
away, and the "IO_busy_bit" gets set when no IO is in progress.
This is the sort of error that may not be caught until years in the
future. If all IO is done at AST level then no such race will occur.
It seems logical to suggest that novices take that sort of approach,
since it produces cleaner code, in spite of the name of this file.
This is not to suggest that your comments in .10 were incorrect,
merely that that they were incorrectly given.
John
|
662.17 | | JON::MORONEY | Redundancy example: Criminal lawyer. | Tue Jan 19 1988 22:58 | 15 |
| re .16:
Doing a $QIO from non-AST level to get things going is OK as far as race
conditions, as long as that's the only time a non-AST QIO is performed. A race
condition is impossible since the AST won't fire until the first QIO is
*entirely* done. I do this all the time. If you want neatness, you can call a
common routine or do the SETAST, but often I find the first QIO is different
anyway.
If you have a complex net with several devices, and the ast for the QIO for one
device does a QIO on another device (with its own AST), rather than just a
"chain" where the N+1th AST is caused by the QIO from the nth AST, it may get
complex real fast, so you'd be better to do all your QIOs from the AST level.
-Mike
|
662.18 | ammending a rash statement (*sigh*) | DPDMAI::BEATTIE | But, Is BLISS ignorance? | Wed Jan 20 1988 10:31 | 39 |
| Re .7 et. al.
Please allow me to ammend my somewhat rash statements in .7
Instead of "watch out for priv'd mode ASTs", I SHOULD say,
"When your AST routine is in P0 space AND there is any chance
that the image might run down prior to AST delivery, watch out
for priv'd mode AST's".
Let me post the apparent answer to my problem from the VMSNOTES
conference, giving proper credit, as every good (and somewhat
humbled) hacker ought.
Thanks all for your help!
-- Brian
<<< VAXWRK::NOTES$DEVICE:[NOTES$LIBRARY]VMSNOTES.NOTE;2 >>>
-< VAX/VMS and more... *** DIGITAL INTERNAL USE ONLY *** >-
================================================================================
Note 122.5 AST routine only a pagefault away 5 of 7
OCENIA::BLAYLOCK "Kenneth Blaylock" 17 lines 19-JAN-1988 23:44
--------------------------------------------------------------------------------
The address may indeed be in the image file, but once image rundown
has occured, you no longer have a P0 space (^Y EXIT/STOP).
Because you ENQed your lock in EXEC mode, that lock is still hanging
around because image rundown will NOT DEQ the EXEC mode lock(s). When
the lock is granted, the AST is delivered to your process (or the
blocking AST) and you end up with an ACCVIO error in the ast delivery
code.
You are going to have to insure that all your locks are DEQed before
you allow the image to be run down. Because exit handlers are not
invoked when the user issues a STOP command, you'll have to use
a priviledge shareable image user rundown routine (see Appendix A of
the System Services Reference Manual).
-kgb
|
662.19 | Realtime users guide | MAMTS5::JGALLUN | It's a lesson to me... | Wed Jan 20 1988 15:50 | 9 |
| Reply to .4
I too miss the old V3 Realtime Users Guide, but a similar document
is available. It is called the VAX Realtime User's Guide and the
order number is EK-VAXRT-UG001. It seems to have most of the good
stuff that was in the old V3 book and some new stuff, too, even
a little bit about ELN. I'm not sure who to order it from though.
Joel
|
662.20 | BUGCHECKFATAL, exec mode, and a development system | ERIS::CALLAS | I've lost my faith in nihilism. | Thu Jan 21 1988 13:40 | 21 |
| re .18:
The analysis you got seems good enough, given that we don't have the
dump file to look at. If you still have it, go look at the process to
see if you have a P0 space or not.
One more comment: The *real* reason you crashed is that you ACCVIOed in
exec mode with BUGCHECKFATAL set. Now it's nice to have BUGCHECKFATAL
set a lot of the time, because (assuming the analysis of your problem
is correct) this problem would be real hard to debug if you didn't.
You'd simply see the process going *poof* and a 0000000C as the final
status in the accounting log.
If you are trying to do development of privileged code on a development
system, you should probably set BUGCHECKFATAL off. If you notice
strange things happening, you can get the system standalone, or toddle
off to a friendly MicroVAX and crash that sucker. That way you don't
get people marching into your office trying to make you feel guilty
because you ruined their week.
Jon
|
662.21 | John, children should use sharp knives instead of dull ones | VIDEO::OSMAN | type video::user$7:[osman]eric.vt240 | Thu Jan 21 1988 14:46 | 22 |
| I don't see what the problem is. At non-ast level, you have this:
start_io (io_done);
Your ast routine looks like this:
io_done:
handle_io ();
start_io (io_done);
The start_io routine looks like this:
start_io (astadr) :
$qio (efn,func,chan,iosb,astadr,astprm,p1,p2,p3,p4,p5,p6);
So we're doing I/O at top-level the first time, ast level each time
after, and we're cleanly only doing the $QIO in one place.
Have I missed any of your complaints ?
/Eric
|