
Conference vaxaxp::vmsnotes

Title:VAX and Alpha VMS
Notice:This is a new VMSnotes, please read note 2.1
Moderator:VAXAXP::BERNARDO
Created:Wed Jan 22 1997
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:703
Total number of notes:3722

206.0. "AST delivery" by CECMOW::MODEL () Tue Feb 18 1997 04:33

Hello,

	What is the rate of AST delivery? I mean, how many ASTs can
	be delivered to a user program in a second?

regards,
vadim
206.1. "How long is a piece of string?" by RTOAL2::MAHER (TIER3 simply a better RPC!) Tue Feb 18 1997 06:05
    Hi Vadim,
    
    The answer is it depends on just about everything :-)
    
    1) What sort of machine you're on.
    2) Is the machine loaded, or does your program have the CPU
       to itself?
    3) I suppose most importantly, what does the AST do?
    
    ASTs are queued, so only one AST of the same mode can be active at
    any given time (i.e. a user mode AST can't interrupt another user
    mode AST, but an exec mode AST can interrupt it -- for example, if
    your user mode AST calls RMS, an RMS exec mode AST can interrupt it).
    So I suggest the limiting factor as to how many ASTs per second can
    fire is "What work does the AST do?".  If your AST contains a
    WAIT(1sec) instruction then you'll only get one AST/sec throughput.
    
    Anyway, I put a simple example in a reply of an AST that doesn't
    do anything useful but gettime/quadword arithmetic/increment and
    declare itself again, and it gets around 3.5k to 4.5k per sec on
    a VAX something-or-other with no one else much on it.
    
    Regards Richard Maher.
    
    PS. ASTs are, have always been, and will always be the simplest
        most powerful and cost effective multi-threading mechanism
        available on VMS.
               
206.2. "quick example" by RTOAL2::MAHER (TIER3 simply a better RPC!) Tue Feb 18 1997 06:06
identification division.
program-id.    test_ast.
data division.
working-storage section.
01  ss$_normal		pic s9(9)       comp value external ss$_normal.
01  ast_addr		pic s9(9)       comp value external tight_ast.
01  ast_param.
    03  start_time	pic s9(11)v9(7) comp.
    03  fire_cnt	pic s9(9)       comp.
    03  curr_time	pic s9(11)v9(7) comp.
01  sys_status          pic s9(9)	comp.
procedure division.
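* Record the start time, then declare the first AST.  Each AST re-declares
* itself (see tight_ast below) until one second has elapsed, and user-mode
* ASTs pre-empt the mainline, so the displays below only run once the
* chain has stopped.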
00.
    call "sys$gettim" using start_time giving sys_status.
    if sys_status not = ss$_normal call "lib$stop" using by value sys_status.

    display "Start time  = " start_time with conversion.

    call "sys$dclast"
	using	by value	ast_addr
		by reference	ast_param
		by value	0
	giving  sys_status.
    if sys_status not = ss$_normal call "lib$stop" using by value sys_status.

    display "End time    = " curr_time with conversion.
    display "Total fired = " fire_cnt with conversion.
    
    stop run.
*
end program test_ast.
identification division.
program-id.    tight_ast.
data division.
working-storage section.
01  ss$_normal		pic s9(9)       comp value external ss$_normal.
01  ast_addr		pic s9(9)       comp value external tight_ast.
01  sys_status          pic s9(9)	comp.
linkage section.
01  ast_param.
    03  start_time	pic s9(11)v9(7) comp.
    03  fire_cnt	pic s9(9)       comp.
    03  curr_time	pic s9(11)v9(7) comp.
procedure division using ast_param.
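* Each delivery samples the current time.  The times are quadword system
* time (100ns units) read through a V9(7) picture, so they compare as
* seconds.  While under one second, count the firing and re-declare this
* routine as its own AST.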
00.
    call "sys$gettim" using curr_time giving sys_status.
    if sys_status not = ss$_normal call "lib$stop" using by value sys_status.

    if curr_time - start_time >= 1 go to fini.

    add 1 to fire_cnt.

    call "sys$dclast"
	using	by value	ast_addr
		by reference	ast_param
		by value	0
	giving  sys_status.
    if sys_status not = ss$_normal call "lib$stop" using by value sys_status.
*
fini.
    exit program.
*
end program tight_ast.
206.3. by MOVIES::WIDDOWSON (Rod) Tue Feb 18 1997 08:51
    >PS. ASTs are, have always been, and will always be the simplest
    >    most powerful and cost effective multi-threading mechanism
    >    available on VMS.
    
    Hear, hear !!!
    		/rod
    
    
206.4. "More General Problem Statement?" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Tue Feb 18 1997 09:28
:	What is the rate of AST delivery? I mean, how many ASTs can
:	be delivered to a user program in a second?

   What's the real question?  What problem or situation are you looking
   to resolve, or what situation are you looking to use ASTs to solve?

206.5. by CECMOW::MODEL () Tue Feb 18 1997 12:50
	Nothing special, just a theoretical question.

	Let's imagine that we have a communication device (stream-like)
	with extremely high performance (plus infinity), we have some
	kind of SYS$QIO interface to it, and the amount of data that can be
	transferred in a request is limited to, for example, 64K.

	astp () {
		sys$qio(next request, astp);
	}

	main () {
		// ...
		sys$qio(first request, astp);
	}

	In this program the transfer performance will of course be limited:

		T = K * 65536

	where:
		T is the total transfer rate (bytes/sec)
		K is the number of SYS$QIO requests per second this program
		  can issue

	K depends on several things: number of system calls a second,
	how fast interrupts from hardware are processed, AST delivery
	rate, system load and so on... I'm just wondering which factor
	plays the most significant role in this K.
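
	A slightly fuller C sketch of the astp()/main() pattern above
	(purely illustrative -- the device name, the IO$_READVBLK function
	code and the old-style extern declarations are assumptions here,
	not part of the question):

	#include <stdio.h>
	#include <descrip.h>
	#include <iodef.h>
	#include <lib$routines.h>

	/* old-style declarations of the system services used below */
	extern unsigned int sys$assign(), sys$qio(), sys$hiber();

	#define CHUNK 65536

	static unsigned short chan;          /* channel from $ASSIGN            */
	static char buffer[CHUNK];           /* one request's worth of data     */
	static unsigned short iosb[4];       /* I/O status block                */
	static unsigned long requests;       /* K: requests issued so far       */

	/* Completion AST: check the IOSB, count the transfer, queue the next
	   request.  This plays the role of astp() in the pseudocode above.   */
	static void astp()
	{
	    unsigned int st;

	    if (!(iosb[0] & 1))              /* stop the chain on any I/O error */
	        return;
	    requests++;

	    st = sys$qio(0, chan, IO$_READVBLK, iosb, astp, 0,
	                 buffer, CHUNK, 0, 0, 0, 0);
	    if (!(st & 1)) lib$stop(st);
	}

	int main(void)
	{
	    $DESCRIPTOR(dev, "XYA0:");       /* placeholder device name         */
	    unsigned int st;

	    st = sys$assign(&dev, &chan, 0, 0);
	    if (!(st & 1)) lib$stop(st);

	    /* first request; its completion AST keeps the chain going */
	    st = sys$qio(0, chan, IO$_READVBLK, iosb, astp, 0,
	                 buffer, CHUNK, 0, 0, 0, 0);
	    if (!(st & 1)) lib$stop(st);

	    sys$hiber();                     /* mainline waits; the ASTs do the work */
	    return 1;
	}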

	PS: I absolutely agree about ASTs, it is a very convenient (and
	efficient) way to transfer data asynchronously, but some things
	require very fast threads, not only ASTs; ASTs are what I'd like
	to see in UNIX (signals are not enough).

	PPS: does anybody know why astparam is 'int' and not 'void *'? It
	makes no difference on 32-bit architectures, but on 64 bits...
206.6. "Run A Test For Your Environment" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Tue Feb 18 1997 13:37
:	Let's imagine that we have a communication device (stream-like)
:	with extremely high performance (plus infinity), we have some
:	kind of SYS$QIO interface to it, and the amount of data that can be
:	transferred in a request is limited to, for example, 64K.

   Truly high-throughput devices should be handled by code executing
   in the context of a device driver.  Moderately high-throughput code
   should look to the (OpenVMS Alpha V7.0 and later) Fast I/O routines.

:	In this program the transfer performance will of course be limited:
:
:		T = K * 65536

   Resorting to mathematics when empirical testing would seem a more
   appropriate approach? :-)

:	K depends on several things: number of system calls a second,
:	how fast interrupts from hardware are processed, AST delivery
:	rate, system load and so on... I'm just wondering which factor
:	plays the most significant role in this K.

   System load, processor performance, available pool, process quotas,
   I/O contention, the AST-code-path execution duration of any AST
   currently running, etc.   There's no single answer.

:	PPS: does anybody know why astparam is 'int' and not 'void *'? It
:	makes no difference on 32-bit architectures, but on 64 bits...

   Because sys$qio is effectively a 32 bit call.  Please read the
   Guide to 64 Bit Addressing for details.  (The Fast I/O services
   use 64 bit addressing.)

206.7. "Some more data points - lots of ASTs" by GIDDAY::GILLINGS (a crucible of informative mistakes) Tue Feb 18 1997 17:52
  FWIW, using the program in .2:

  DEC 3000/500 (150MHz) OpenVMS/Alpha V6.1
Total fired =      28022

  AlphaStation 200 4/233 OpenVMS/Alpha V7.1
Total fired =      57479

  VAX 4000-50 OpenVMS/VAX V6.2
Total fired =       3247

  VAXserver 3100 OpenVMS/VAX V6.1
Total fired =        205	(oh dear!)

					John Gillings, Sydney CSC
206.8. "SYS$QIO prototype in starlet.h is 64-bits" by EVMS::NOEL () Wed Feb 19 1997 12:32
:	PPS: does anybody know why astparam is 'int' and not 'void *'? It
:	makes no difference on 32-bit architectures, but on 64 bits...

>   Because sys$qio is effectively a 32 bit call.  Please read the
>   Guide to 64 Bit Addressing for details.  (The Fast I/O services
>   use 64 bit addressing.)

Not exactly. SYS$QIO is fully 64-bit capable as of V7.0. The Fast I/O
services are also 64-bit capable. If you look at starlet.h (V7.0 and V7.1) 
you will notice that most services declare astprm as int. These are the 
32-bit only services. The 64-bit services, such as SYS$QIO, declare astprm 
as __int64. This allows an astprm declared as int to be accepted by 
the prototype without casting. 

STARLET.H is in SYS$LIBRARY:SYS$STARLET_C.TLB

The reason why astprm is "int" not 'void *' is historical. In STARLET.SDL,
whoever first added all the entries for the services used LONGWORD. In C, 
this turns into int. Of course, MACRO-32 and BLISS were the programming 
languages in use at the time.

Actually, there are several examples of parameters to system services that
can't be typed properly for everyone. Do most people store an int in astprm
or a pointer? What about P1 - P6 for SYS$QIO? These parameters are
driver-specific. Since most drivers accept the buffer address in P1, P1 is
void * and the rest are __int64.
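
A minimal sketch of the usual trick (the work_item structure and names below
are made up for illustration): hand the address over as the astprm and treat
it as a pointer again inside the AST routine.  With default 32-bit pointers
the value survives the trip through an int/__int64 parameter unchanged.

#include <stdio.h>
#include <lib$routines.h>

/* old-style declarations; starlet.h would type the astprm as int/__int64 */
extern unsigned int sys$dclast(), sys$hiber(), sys$wake();

typedef struct { int sequence; const char *tag; } work_item;

/* The astprm value arrives as the AST routine's single argument; here it
   is simply used as the address it really is.                            */
static void worker_ast(work_item *w)
{
    printf("AST received item %d (%s)\n", w->sequence, w->tag);
    sys$wake(0, 0);                     /* let the mainline finish          */
}

int main(void)
{
    static work_item item = { 42, "example" };
    unsigned int st;

    /* the pointer travels through the astprm; as far as the service
       prototype is concerned it is just an integer value                  */
    st = sys$dclast(worker_ast, &item, 0);
    if (!(st & 1)) lib$stop(st);

    sys$hiber();                        /* woken by the AST                 */
    return 1;
}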

- Karen
206.9. by CECMOW::MODEL () Fri Feb 21 1997 03:13
> Not exactly. SYS$QIO is fully 64-bit capable as of V7.0. The Fast I/O

Does it mean that any application which passes pointers as parameters for
AST routines requires recompilation only and no code change to be run under
V7.*?

>   in the context of a device driver.  Moderately high-throughput code
>   should look to the (OpenVMS Alpha V7.0 and later) Fast I/O routines.

What is FAST IO? I'm new to VMS, so I do not yet know where to look for
what. Are there on-line books on VMS V7.*?

>   AlphaStation 200 4/233 OpenVMS/Alpha V7.1
> Total fired =      57479

Does DCLAST make a call to the OS kernel? If yes, AXP is an extremely
FAST architecture -- it allows 100K hardware context switches
a second!

vadim
206.10. "Pointers" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Fri Feb 21 1997 08:46
:> Not exactly. SYS$QIO is fully 64-bit capable as of V7.0. The Fast I/O
:
:Does it mean that any application which passes pointers as parameters for
:AST routines requires recompilation only and no code change to be run under
:V7.*?

   No recompilation, no relinking, and no code changes are needed --
   OpenVMS is highly upward-compatible.

   For information on taking advantage of 64-bit addressing, see the
   Guide to 64 Bit Addressing (or whatever the name of that manual
   is) in the OpenVMS documentation set.

>   in the context of a device driver.  Moderately high-throughput code
>   should look to the (OpenVMS Alpha V7.0 and later) Fast I/O routines.

:What is FAST IO? I'm new to VMS, so I do not yet know where to look for
:what. Are there on-line books on VMS V7.*?

   Yes.  See the low-numbered notes in this conference for pointers to
   the OpenVMS documentation.  Also see the low-numbered notes for pointers
   to CANASTA (the e-mail crashdump scanner), to COMET and STARS (some of
   the main support search engines), to the Internal and External
   AltaVista web search engines, and to the distribution kits for various
   OpenVMS releases.

:>   AlphaStation 200 4/233 OpenVMS/Alpha V7.1
:> Total fired =      57479
:
:Does DCLAST make a call to the OS kernel? If yes, AXP is an extremely
:FAST architecture -- it allows 100K hardware context switches
:a second!

   It's `Alpha', not `AXP'.  And yes, most system services -- including
   SYS$DCLAST -- enter kernel mode.  This AlphaStation 200 4/233 system
   is among the slowest of the Alpha systems.

206.11. by EVMS::KUEHNEL (Andy Kühnel) Fri Feb 21 1997 09:08
    re .9
    
> Does it mean that any application which passes pointers as parameters for
> AST routines requires recompilation only and no code change to be run under
> V7.*?

    Just to expand a bit on .10...
    
    When you pass a parameter on OpenVMS Alpha, you always pass a 64-bit
    value.  This has been true since V1.0.  It's defined this way in the
    calling standard.
    
    Since most of your parameters, including pointers, were only 32 bits
    wide prior to V7.0, the value passed was simply sign-extended for the
    call.  It is then up to the called routine to use 32 or 64 bits' worth
    of the parameter.
    
    In V7.0 we changed many system services to accept 64-bit values and/or
    64-bit pointers.  As long as you still pass 32-bit values, nothing
    changes, but you need to change your source, or at least re-compile
    with different switches to take advantage of 64-bit addresses.
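    
    A tiny illustration of that sign extension (DEC C's __int64, as used in
    starlet.h, is the only non-portable piece; the values are made up):
    
        #include <stdio.h>
        
        int main(void)
        {
            int small = -5;            /* a 32-bit argument value           */
            __int64 as_passed = small; /* sign-extended to 64 bits for the call */
        
            printf("low  32 bits: %08x\n", (unsigned int)(as_passed & 0xffffffff));
            printf("high 32 bits: %08x\n", (unsigned int)(as_passed >> 32));
            return 0;
        }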
    
    The V7.0 and V7.1 documentation contains _lots_ of detailed information
    on this stuff.  A good overview can be found in the OpenVMS Guide to
    64-bit Addressing.
206.12. by STAR::CROLL () Mon Feb 24 1997 10:52
FAST-IO refers to four new system services that provide higher performance and
better SMP scaling than $QIO.

There's a very brief description in the V7.0 new features manual (section
4.11.1), and a complete description in the V7.0 (and V7.1) system services
reference manuals.

See BULOVA::DOCD$:[GRYPHON_FINAL.POST]OVMS_V71_SYSSERV_REF1.PS,
OVMS_V71_SYSSERV_REF2.PS for the V7.1 versions.

See BULOVA::DOCD$:[EAGLE_THETA_FINAL.POST]V70_NEW_FEATURES.PS.

John
206.13. by EEMELI::MOSER (Orienteers do it in the bush...) Mon Feb 24 1997 14:11
    hmh, I'm counting and counting and re-counting again. John, can you
    tell me SS No. 4 related to Fast I/O? I'm just counting setup, perform
    and cleanup. Am I missing something here?
    
    /cmos
206.14. "see VMSNOTES_V12 #622" by CUJO::SAMPSON () Mon Feb 24 1997 22:14
	For some Fast I/O programming examples, please see the archived
VMSNOTES_V12 conference, topic #622.
206.15. by STAR::CROLL () Tue Feb 25 1997 10:31
Re 13:

IO$_SETUP
IO$_PERFORM
IO$_PERFORMW
IO$_CLEANUP

Et voila!  (That's French for "I can count better than you!")

(We can argue 'til we're blue about whether IO$_PERFORM and IO$_PERFORMW are
distinct -- but, there are two entries in the system service vector, and that's
how the V7.0 new features manual counted them!)

John
206.16. by EEMELI::MOSER (Orienteers do it in the bush...) Tue Feb 25 1997 15:24
    ok, I see how the Americans are counting. You know, I'm still learning
    lots of new things every day...
    
    /cmos
206.17. by AMCFAC::RABAHY (dtn 471-5160, outside 1-810-347-5160) Wed Feb 26 1997 09:56
re .7, .9, .10:

>... lots of ASTs

>... 100K hardware context switches a second!

>... enter kernel mode.

Typically changing modes is not considered a context switch per se.  The
hardware context switch is just one step in a process context switch.
206.18. by CECMOW::MODEL () Fri Mar 21 1997 06:39
> Typically changing modes is not considered a context switch per se.  The
> hardware context switch is just one step in a process context switch.

I understand, entering kernel mode is not a context switch unless we
reschedule to another process, and that is a hardware context switch; old
processors flush their TLBs and caches (partially or even entirely) when
jumping between user and kernel, which is why I'm wondering about entering
kernel mode (inserting into a list in user mode is a much more time-effective
procedure than switching to kernel, inserting, and getting back); I hope Alpha
does not flush all of its internals, only what is necessary, to minimize the
mode-switch impact.

Interrupt processing also includes such a hardware context switch, so you
cannot increase the interrupt processing rate beyond some boundary; the only
way is to poll the device, but even polling does not help, since you must
spend some time to understand what happened with the device, transmit or
receive data, process it, ...

So, originally, I was wondering about AST delivery triggered by interrupts
from a communication device.

vadim
206.19. "no flushing needed for these cases" by WIBBIN::NOYCE (Pulling weeds, pickin' stones) Fri Mar 21 1997 08:38
I don't think we ever built a VAX that flushed a TLB or cache on switching
between user and kernel mode.  Certainly no Alpha works that way.  The TLB
contains "protection bits" that tell whether read and/or write is permitted
from each mode, and the bits are checked during address translation on each
memory reference.  The current mode just selects which bits are checked.

Interrupt processing needs to save and restore some registers, but it does
not generally flush any caches or TLB's (at least on VAX and Alpha).
206.20. by AMCFAC::RABAHY (dtn 471-5160, outside 1-810-347-5160) Fri Mar 21 1997 09:04
Just changing hardware state is not what we call a hardware context switch. 
Good grief, in a sense, every opcode (except perhaps a NOP) changes the hardware
state.  Just changing mode, as .19 says, does not flush anything.  Heck, there
are probably individual opcodes that are even more work than a CHMK -- SQRT,
for example?