Channel: Intel® Software - Intel® C++ Compiler

ICC 2017 Bug - Bad ordering of instructions


There seems to be a bug in the Intel 2017 compiler (ICC 2017). The following code was compiled with "icc -std=c99 -O3 *.c" on an Ivy Bridge processor:

// File: accumulate.c
void accumulate(int * offsets,
                double const * const restrict input,
                double * const restrict output)
{
    output[0] += input[0]; // first offset is always zero
    for(int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];
}

 

// File: main.c
#include <stdio.h>

void accumulate(int * offsets,
                double const * const restrict input,
                double * const restrict output);


int main(void)
{
    int offsets[4] = {0, 0, 1, 1};
    double input[4] = {1.0, 2.0, 3.0, 4.0};
    double output[4] = {0.0, 0.0, 0.0, 0.0};

    accumulate(offsets, input, output);
    printf("Results: %12.6e %12.6e %12.6e %12.6e\n",
           output[0], output[1], output[2], output[3]);

    return 0;
}

 

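The values printed are "1.000000e+00 7.000000e+00 0.000000e+00 0.000000e+00". The first value is incorrect: it should be 3.000000e+00, because offsets[1] is 0, so both input[0] and input[1] are accumulated into output[0]. The output is correct with Intel 2016, or if I remove the restrict keywords.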
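For reference, the expected values can be reproduced with a plain in-order version of the same loop. This is only a sketch: the name accumulate_reference is hypothetical (it is not part of the original code), and it simply drops the restrict qualifiers, which is the variant reported above to behave correctly.

```c
/* Hypothetical reference version: same logic as accumulate(), but without
 * restrict, so the compiler must assume output[0] and output[offsets[i]]
 * may refer to the same element. */
void accumulate_reference(const int *offsets,
                          const double *input,
                          double *output)
{
    output[0] += input[0];               /* first offset is always zero */
    for (int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];  /* may hit output[0] again */
}
```

With the offsets {0, 0, 1, 1} and inputs {1.0, 2.0, 3.0, 4.0} from main.c, this leaves output[0] at 1.0 + 2.0 = 3.0 and output[1] at 3.0 + 4.0 = 7.0.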

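Looking at the disassembled code for accumulate(), the compiler doesn't seem to realise that output[0] and output[offsets[...]] can be the same memory location. It loads both values before storing either, then stores the output[offsets[1]] result first and the output[0] result second, so the second store overwrites the first instead of adding to it. As far as I can tell, restrict should not permit this, since both accesses go through the same pointer, output.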

movsxd rax,DWORD PTR [rdi+0x4]         <---- rax = 0 in this case
movsxd rcx,DWORD PTR [rdi+0x8]
movsxd r8,DWORD PTR [rdi+0xc]
movsd  xmm1,QWORD PTR [rdx+rax*8]      <----
movsd  xmm0,QWORD PTR [rdx]            <---- [rdx] and [rdx+rax*8] are the same!
addsd  xmm1,QWORD PTR [rsi+0x8]
addsd  xmm0,QWORD PTR [rsi]
movsd  QWORD PTR [rdx+rax*8],xmm1      <---- puts results in output[0]
movsd  xmm2,QWORD PTR [rdx+rcx*8]
movsd  QWORD PTR [rdx],xmm0            <---- overwrites [rdx+rax*8]
addsd  xmm2,QWORD PTR [rsi+0x10]
movsd  QWORD PTR [rdx+rcx*8],xmm2
movsd  xmm3,QWORD PTR [rdx+r8*8]
addsd  xmm3,QWORD PTR [rsi+0x18]
movsd  QWORD PTR [rdx+r8*8],xmm3
ret
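To make the effect of that instruction ordering easier to follow, here is a rough C model of the disassembly above, one statement per load, add, or store. It is a sketch rather than anything the compiler produced: the function name accumulate_as_scheduled is hypothetical, the local variables are simply named after the registers, and it assumes the usual x86-64 System V argument registers (rdi = offsets, rsi = input, rdx = output).

```c
/* Hypothetical C model of the instruction schedule shown in the
 * disassembly. Locals are named after the registers they stand for. */
void accumulate_as_scheduled(const int *offsets,
                             const double *input,
                             double *output)
{
    long rax = offsets[1];        /* movsxd rax,[rdi+0x4] */
    long rcx = offsets[2];        /* movsxd rcx,[rdi+0x8] */
    long r8  = offsets[3];        /* movsxd r8,[rdi+0xc]  */

    double xmm1 = output[rax];    /* load output[offsets[1]] */
    double xmm0 = output[0];      /* load output[0] before any store */
    xmm1 += input[1];
    xmm0 += input[0];
    output[rax] = xmm1;           /* store for i = 1 goes first */

    double xmm2 = output[rcx];
    output[0] = xmm0;             /* stale sum; clobbers the store above
                                     whenever offsets[1] == 0 */
    xmm2 += input[2];
    output[rcx] = xmm2;

    double xmm3 = output[r8];
    xmm3 += input[3];
    output[r8] = xmm3;
}
```

With the data from main.c, offsets[1] is 0, so both xmm0 and xmm1 are loaded from output[0] while it is still 0.0. The store of xmm1 writes 2.0, and the later store of xmm0 replaces it with 1.0, which matches the incorrect first value reported above.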

 

