
Conference turris::languages

Title:Languages
Notice:Speaking In Tongues
Moderator:TLE::TOKLAS::FELDMAN
Created:Sat Jan 25 1986
Last Modified:Wed May 21 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:394
Total number of notes:2683

383.0. "keeping functional units busy" by STAR::PRAETORIUS (what does the elephant need?) Thu Sep 08 1994 17:12

     With the recent rise in popularity of multiple issue machines,
which usually have a number of functional units dedicated to floating
point operations, the thought occurred to me that integer code might
go a little faster if some of the operations could be simulated in
the floating units (presumably one would pick things that were less
latency sensitive, since latencies tend to be higher on the FP side).

     My next thought was that, since commercial multiple issue machines
have been around for 30 years (the CDC 6600 is a quad issue box), there
must've already been some interesting research on this.  Does anybody
know where I could find it?
T.R   Title   User (Personal Name)   Date   Lines
383.1. by AUSSIE::GARSON (achtentachtig kacheltjes) Thu Sep 08 1994 20:05 (6 lines)
    re .0
    
    Can't help with research pointers but I guess in a sort of way Alpha
    already does this by not having an integer divide instruction. Perhaps
    the folks who designed the Alpha chip looked at this kind of thing and
    could provide pointers.
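
    For a concrete flavor of what doing it on the FP side could look
like - a hedged sketch only, not how the Alpha software divide actually
works - an integer divide can be routed through the FP divider.  With
IEEE doubles the truncated quotient is exact as long as the dividend
stays below 2^53:

	/* Sketch: integer divide synthesized on the FP divider.
	   Exact for 0 <= a < 2^53 and b > 0: the rounding error of the
	   double divide is smaller than the gap (1/b) between the true
	   quotient and the next integer, so truncation recovers a/b. */
	static long idiv_via_fp(long a, long b)
	{
	    return (long)((double)a / (double)b);
	}
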
383.2. by SMOP::glossop (Kent Glossop) Thu Sep 08 1994 22:28 (39 lines)
This tends to be a fairly classic "phase ordering problem".  You can try
to pick FP operations for some integer code, but since the latencies are
longer, you really only want to do it when you can tell that changing
the type will actually help (which would typically be during scheduling).
Almost all (or all) production compilers in existence do code selection,
then code scheduling.  I can imagine doing scheduling, then code changes,
then re-scheduling, but...

In practice, the only thing I've seen that you might want to do is
memory/memory copies using the FP regs on occasion (and that isn't
for latency or scheduling - it's to avoid integer register pressure).
In general, the benefits seem to be *extremely* marginal.  (Much lower
than most other transforms which will be more generally applicable.)
Part of the issue is that there are very few complete expressions
that can be done in floating point without requiring additional
conversion operations to be inserted (and in most cases, that winds
up being a lose.)
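
As an illustration of the copy case - a minimal sketch, not what any
particular compiler emits - the loop below moves memory through double
temporaries, so a compiler is free to stage the data in FP registers
and leave the integer registers for the surrounding code:

	#include <stddef.h>

	/* Copy nbytes bytes through double-sized temporaries.  The
	   temporaries can live in FP registers, so the copy itself
	   adds almost no integer register pressure beyond the pointers
	   and the count.  Assumes 8-byte-aligned blocks and nbytes a
	   multiple of sizeof(double). */
	static void copy_via_fp(double *dst, const double *src, size_t nbytes)
	{
	    size_t i, n = nbytes / sizeof(double);

	    for (i = 0; i < n; i++)
	        dst[i] = src[i];    /* 8-byte FP load/store pair */
	}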

Note that one thing that GEM will already do in some limited cases is
strength reduction using floating point.  For example, if you have
an integer variable that is always converted to double in a loop,
a version of the value is kept in floating point.

A simple example is:

	double d() {
	    int i;
	    double s = 0.0;
	    for(i=0; i<100; i++)
		s += i;
	    return s;
	}

Where there's effectively a copy of "i" in the floating register set
that gets incremented by 1.0d0.  (Note that this is kind of ironic -
the best transformation in this case would be to determine that in
fact the floating point value only contained integral values, and
instead do the whole loop in integer and only convert to floating
point at the return - which would be about a 5x improvement on EV4...)
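
Written out by hand - a sketch of the strength-reduced form, not actual
GEM output - the transformation effectively turns d() into:

	/* The int->double conversion inside the loop is replaced by a
	   floating "shadow" of the loop counter, bumped by 1.0 in step
	   with i, so no cvt instruction remains in the loop body. */
	double d_reduced(void)
	{
	    double s = 0.0;
	    double i_fp = 0.0;    /* floating copy of i */
	    int i;

	    for (i = 0; i < 100; i++) {
	        s += i_fp;
	        i_fp += 1.0;
	    }
	    return s;
	}
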
383.3. "Gilding a lily" by QUARRY::reeves (Jon Reeves, UNIX compiler group) Fri Sep 09 1994 12:06 (2 lines)
Actually, the best transformation would be to precompute the result
(0 + 1 + ... + 99 = 99*100/2 = 4950) and simply turn the function into
"return 4950.0;".
383.4. by SMOP::glossop (Kent Glossop) Fri Sep 09 1994 14:41 (24 lines)
Yep.  GEM will do that for small numbers when the loop is completely
unrolled, but doesn't try to actually interpret loops.  Just for example:

              1 double d() {
              2     int i;
              3     double s = 0.0;
              4     for(i=0; i<5; i++)
              5         s += i;
              6     return s;
              7 }

d::                                                                 ; 000001
        ldah    gp, d                   ; gp, (r27)
        lda     gp, d                   ; gp, (gp)
        ldq     r28, (gp)               ; r28, (gp)                 ; 000005
        ldt     f0, (r28)               ; f0, (r28)
        ret     r26                     ; r26                       ; 000006

        .section .lita, QUAD, noexe, rd, nowrt
        .address  .lit8

        .section .lit8, QUAD, noexe, rd, nowrt
        .double   10.0000000000000

383.5. "not sure I got your drift" by STAR::PRAETORIUS (what does the elephant need?) Thu Sep 15 1994 15:17 (3 lines)
     Is the intent of .2 that it's not really a good idea, or that it's
not feasible to do with traditional compiler organization (or both or
neither)?
383.6. by SMOP::glossop (Kent Glossop) Thu Sep 15 1994 15:38 (9 lines)
Both:

    - The opportunities with current hardware appear to be very limited
      (given the lack of similar available functions in the "architectural
      functional units")

    - Attempting to exploit those (very few) opportunities might well take
      a different compiler organization in order to have a chance of being
      a net gain.
383.7. by RANGER::BRADLEY (Chuck Bradley) Fri Feb 17 1995 17:55 (8 lines)
re .0
no pointer to research, but some history.

I've heard several times of programs on the CDC 6600 using floating point
operations for indexing counted loops.  I don't remember ever hearing
whether it was a compiler trick or was only done by assembly language
programmers.

383.8. by AUSSIE::BELL (Caritas Patiens est) Sun Feb 26 1995 21:42 (7 lines)
I don't remember seeing any CDC compiler that used floating point for loop
indexing.  But the 6600s did not have an integer multiply instruction, and any
index calculation that required a multiplication used floating point
operations.  This was fixed in the Cyber 70 series, when the DXn Xn*Xn
instruction was made to do an integer multiply when both exponents were zero.

Peter.
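
The trick .8 describes rests on the same exactness argument - here is a
hedged sketch in C with IEEE doubles, where the CDC machines had 48-bit
mantissas instead:

	/* Sketch of the 6600-era workaround: multiply integers on the
	   FP multiplier.  The product is exact while it fits in the
	   mantissa (48 bits on the CDC, 53 bits for IEEE double),
	   which covers typical array-index arithmetic. */
	static long imul_via_fp(long a, long b)
	{
	    return (long)((double)a * (double)b);    /* exact while |a*b| < 2^53 */
	}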