Channel: Intel® Software - Intel® C++ Compiler

/QaxAVX,CORE-AVX2 generates incorrect code for AVX2


I'm currently fighting hard with the newest version of ICL. First, the code from the profile-guided build crashes inside Intel's own code (https://software.intel.com/en-us/forums/intel-c-compiler/topic/760787#co...), and now this:

- If I compile with /fp:fast /QxAVX2, everything is fast, but the executable is huge and I cannot make it smaller with a profile-guided build (see the other post). It also runs only on AVX2 CPUs.

- If I compile with /fp:fast /QxSSE2 /QaxAVX, everything is fast, less but still

- If I compile with /fp:fast /QxSSE2 /QaxAVX,CORE-AVX2, it's superfast, actually faster than /QxAVX2 :), which is weird in itself. And it doesn't work: some calculations just produce nonsense. The code base is superhuge, so I cannot really post a "minimum example" or anything.

- If I compile with /fp:precise /QxSSE2 /QaxAVX,CORE-AVX2, it gets superslow and huge, but works :).
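As background on why the /fp:fast and /fp:precise builds can legitimately differ at all: a fast floating-point model lets the compiler reassociate sums, for example by accumulating in several vector lanes. The sketch below imitates that by hand in plain C++. The function names and the sample data are made up for illustration and are not taken from the real project, and small differences like this would not by themselves explain outright nonsense results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict left-to-right summation, the order a precise FP model has to keep.
float sum_sequential(const std::vector<float>& x) {
    float total = 0.0f;
    for (float v : x) total += v;
    return total;
}

// Hand-written imitation of a vectorized reduction: each "lane" accumulates
// every lanes-th element, and the partial sums are combined at the end.
// A fast FP model permits this kind of reordering; a precise one does not.
float sum_in_lanes(const std::vector<float>& x, std::size_t lanes) {
    assert(lanes > 0);
    std::vector<float> partial(lanes, 0.0f);
    for (std::size_t i = 0; i < x.size(); ++i) partial[i % lanes] += x[i];
    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

With the input {1e8f, 1.0f, -1e8f, 1.0f}, the sequential sum loses the first 1.0f to rounding, while the two-lane sum cancels the large values first and keeps both small ones, so the two orders give different answers under strict IEEE single-precision arithmetic.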

It's pretty obvious that some optimization breaks things. And since adding an alternative code path for AVX2 is faster than compiling the whole thing directly for AVX2 (albeit not working correctly), something is not working as it should. For the record, it's audio processing code and contains lots of vectorizable loops for cross-multiplication of buffers etc.
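When no minimum example can be cut out of a large code base, one option is to check individual kernels against a reference and see which one diverges first. Below is a minimal sketch of that idea, assuming an element-wise buffer multiplication similar to the loops described above. `multiply_buffers` and `first_mismatch` are hypothetical names, not functions from the real project.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical kernel under test: element-wise product of two buffers,
// the kind of loop the auto-vectorizer is expected to pick up.
void multiply_buffers(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] * b[i];
}

// Runs the kernel and compares each sample with a double-precision reference.
// Returns the index of the first sample whose error exceeds the tolerance
// (relative for large values, absolute for small ones), or -1 if all match.
std::ptrdiff_t first_mismatch(const std::vector<float>& a,
                              const std::vector<float>& b,
                              double tol) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    multiply_buffers(a.data(), b.data(), out.data(), a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double ref = static_cast<double>(a[i]) * static_cast<double>(b[i]);
        const double err = std::fabs(static_cast<double>(out[i]) - ref);
        const double scale = std::max(1.0, std::fabs(ref));
        // Written as a negated <= so that a NaN result also counts as a mismatch.
        if (!(err <= tol * scale)) return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}
```

A call such as `first_mismatch(a, b, 1e-6)` would return -1 for a healthy build. For the comparison to mean anything, the checker probably has to live in a separate source file built with /fp:precise and without the /Qax paths, so that only the kernel is subject to the suspect optimizations. That split is a suggestion, not something that was tried here.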

 

