Title: | Digital Extended Math Library |
Notice: | Kit locations: 9.last (UNIX), 10.last (VMS) |
Moderator: | RTL::CHAO FGREN |
|
Created: | Mon Apr 30 1990 |
Last Modified: | Tue Jun 03 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 324 |
Total number of notes: | 1402 |
320.0. "FFT DXML Benchmark" by TAV02::KATZAV () Tue Apr 08 1997 05:44
A customer here in Israel is running a DXML FFT benchmark on a 2100 with 2 CPUs.
The result on 1 CPU is 1.9 sec, while on two it is 3.1 sec!!
Could someone please look at this short piece of code and figure out
why we get these results?
Many Thanks,
Shimon.
==============================================================
Library : DXML
SIZE: 8K array * 1000
==============================================================
Code: try.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <dxmldef.h>
#define SIZE1 8192
void main ()
{
struct dxml_s_fft_structure ST ;
float in[SIZE1], out[SIZE1];
int i,a,status=1, stride=1, sz=SIZE1;
long int t_dxml ;
for (a=0;a<SIZE1;a++)
in[a]=a;
sfft_init_(&sz,&ST,&stride);
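/* record processor time at the start of the timed loop */
t_dxml=clock();
for (i=0;i<1000;i++)
sfft_apply ("r", "c", in, out, &ST, &stride);
/* processor time consumed by the 1000 transforms */
t_dxml=clock()-t_dxml;
sfft_exit_(&ST);
printf("Time DXML : %f sec\n", (double)t_dxml/CLOCKS_PER_SEC) ;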
}
Compiling :
cc -migrate try.c -o try -ldxmlp
Results:
cpu 1 : 1.9 sec
cpu 2 : 3.1 sec
KMP_STACKSIZE 262144
Hardware:
AS2100 DUNIX V4.0 (464)
320.1. by RTL::HANEK () Thu Apr 17 1997 17:20
The problem is that a single 8K FFT is not big enough to make parallel processing
profitable.
In the sample code, you are performing 1000 separate 8K FFTs. To make parallel
processing attractive for this application, you should consider doing all 1000
FFTs at once, i.e. use some form of the grp_fft routine.