Quantcast
Channel: Intel® Software - Intel® C++ Compiler
Viewing all articles
Browse latest Browse all 1175

icpc slower than gcc?

$
0
0

I'm trying to make an optimized parallel version of [opencv SURF][1] and in particular [surf.cpp][2] using Intel C++ compiler.

I'm using Intel Advisor to locate inefficient and unvectorized loops. In particular, it suggests to rebuild the code using the `icpc` compiler (instead of `gcc`) and then to use the `xCORE-AVX2` flag since it's available for my hardware. 

So my original `cmake` for building opencv  using `g++` was:

    cmake -D CMAKE_BUILD_TYPE=RelWithDebInfo -D CMAKE_INSTALL_PREFIX=... -D OPENCV_EXTRA_MODULES_PATH=... -DWITH_TBB=OFF -DWITH_OPENMP=ON

And built the application which uses SURF with `g++ ... -O3 -g -fopenmp`

Using `icpc` instead is:

    cmake -D CMAKE_BUILD_TYPE=RelWithDebInfo -D CMAKE_INSTALL_PREFIX=... -D OPENCV_EXTRA_MODULES_PATH=... -DWITH_TBB=OFF -DWITH_OPENMP=ON -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DCMAKE_CXX_FLAGS="-debug inline-debug-info -parallel-source-info=2 -ipo -parallel -xCORE-AVX2 -Bdynamic"

(in particular notice `-DCMAKE_C_COMPILER -DCMAKE_CXX_COMPILER -DCMAKE_CXX_FLAGS`)

And compiled the SURF application with: `-g -O3 -ipo -parallel -qopenmp -xCORE-AVX2` and `-shared-intel -parallel` for linking

I thought that the `icpc` solution was going to be faster than the `g++` one, but it isn't: `icpc` takes 0.15s while `g++` takes `0.12`s (I ran the experiments several times and these numbers are reliable).

Why this happens? Am I doing something wrong with `icpc`?

 

  [1]: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_surf_in...
  [2]: https://github.com/opencv/opencv_contrib/blob/master/modules/xfeatures2d...


Viewing all articles
Browse latest Browse all 1175

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>