Channel: Intel® Software - Intel® C++ Compiler

Testing SIMD on KNL


Hello All,

I hope I am asking in the right forum.

I have a simple, perhaps naive, question. I wrote a small program that runs on a single thread of a KNL (68 cores, Flat-Quadrant mode, MCDRAM used). I ran the code twice, with the following configurations:

1) #pragma simd reduction(...) at the top of the loop, with the compiler option -xMIC-AVX512.

2) #pragma novector at the top of the loop, with -xMIC-AVX512 removed and -no-simd added. The loop is not vectorized and no AVX instructions are used (I checked the assembly file).

The first configuration reaches 1.5 GFLOPS and the second 0.8 GFLOPS, so the speedup is only about 2x. Can anyone please explain why I don't get a better speedup (closer to 8x)?

long count = 10000000;
// The same loop is run once before this point as a cold start
stime = dsecnd();
//1) #pragma simd reduction(+:result)
//2) #pragma novector
for (long i = 0; i < count; i++ )
{
    result += (A[i] * B[i]);
}

etime = dsecnd();

double bestExTime = (etime - stime);
double gflops = (1.e-9 * 2.0 * count) / bestExTime;  // 2 flops (multiply + add) per iteration
printf("%f,%f\n", result, gflops);
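
For anyone who wants to reproduce the measurement, here is a minimal self-contained sketch of the same test. It rests on several assumptions that are not in the snippet above: A and B are taken to be arrays of double, clock_gettime stands in for dsecnd(), the portable #pragma omp simd replaces the Intel-specific #pragma simd, and the helper names (wall_seconds, dot, gflops, implied_gbytes_per_s, run_benchmark) are made up for illustration. Besides GFLOPS, it also reports how much memory traffic the measured time would imply if both arrays were streamed from memory.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock seconds. Assumption: used here in place of dsecnd(). */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1.0e-9 * (double)ts.tv_nsec;
}

/* Dot product. The pragma is the OpenMP spelling of the SIMD reduction;
   compilers without OpenMP SIMD support enabled simply ignore it. */
static double dot(const double *a, const double *b, long n)
{
    double result = 0.0;
    #pragma omp simd reduction(+:result)
    for (long i = 0; i < n; i++)
        result += a[i] * b[i];
    return result;
}

/* Same formula as in the post: one multiply and one add per iteration. */
static double gflops(long n, double seconds)
{
    return (1.0e-9 * 2.0 * (double)n) / seconds;
}

/* Memory traffic in GB/s, assuming each iteration reads two doubles. */
static double implied_gbytes_per_s(long n, double seconds)
{
    return (1.0e-9 * 2.0 * sizeof(double) * (double)n) / seconds;
}

/* Runs one cold pass and one timed pass; returns GFLOPS, or -1.0 if
   the arrays cannot be allocated. */
static double run_benchmark(long n)
{
    assert(n > 0);
    double *a = malloc((size_t)n * sizeof *a);
    double *b = malloc((size_t)n * sizeof *b);
    if (!a || !b) { free(a); free(b); return -1.0; }
    for (long i = 0; i < n; i++) { a[i] = 1.0; b[i] = 2.0; }

    volatile double cold = dot(a, b, n);   /* cold-start pass, not timed */
    (void)cold;

    double stime = wall_seconds();
    double result = dot(a, b, n);
    double etime = wall_seconds();

    double seconds = etime - stime;
    printf("%f,%f,%f\n", result, gflops(n, seconds),
           implied_gbytes_per_s(n, seconds));
    free(a);
    free(b);
    return gflops(n, seconds);
}
```

Calling run_benchmark(10000000) from main mirrors the loop above. Under the double-array assumption, each iteration moves 16 bytes for 2 flops, so 1.5 GFLOPS would correspond to roughly 12 GB/s of reads and 0.8 GFLOPS to roughly 6.4 GB/s. Comparing those figures with what a single core can stream from memory may help show whether the loop is limited by arithmetic or by memory.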

 Thanks,

