Channel: Intel® Software - Intel® C++ Compiler

_GFX_offload weird behaviour

Hi,

I'm targeting Intel Graphics Technology and using the API-based offloading so that I can offload asynchronously. As a first step, I am trying to offload this loop:

for (int i = 0; i < size; i++){
  A[i] = i;
}

So I wrote this code:

__declspec(target(gfx_kernel))
void fill(int * A, int size){
  _Cilk_for(int i = 0; i < size; i++){
    A[i] = i;
  }
}

int main() {
  int N = 1024;
  int * A = malloc(sizeof(int) * N);

  _GFX_share(A,N);

  _GFX_offload((void*)fill, A, N);
  _GFX_wait(0,-1);

  _GFX_unshare(A);
  free(A);

  return 0;
}

This code compiles and runs, but only the first 780 elements of A are actually changed. I guess this is related to the maximum number of groups and threads, but the number seems odd to me (_GFX_get_device_hardware_thread_count() returns 336).
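
A minimal sketch of one way to measure how far the kernel got, in plain C: fill the array with a sentinel before the offload, then scan it afterwards. The helper names (poison, first_bad_index, count_untouched) are hypothetical, and fill_partial is only a CPU stand-in that imitates a kernel writing a prefix of the array. It does not call the GFX runtime.

```c
#include <limits.h>
#include <stddef.h>

/* Sentinel value; the kernel writes A[i] = i with i >= 0, so it never produces this. */
#define UNTOUCHED INT_MIN

/* Mark every element as untouched before launching the offload. */
static void poison(int *A, size_t n) {
    for (size_t i = 0; i < n; i++) {
        A[i] = UNTOUCHED;
    }
}

/* CPU stand-in (assumption, not the GFX kernel): writes only the first `written` elements. */
static void fill_partial(int *A, size_t written) {
    for (size_t i = 0; i < written; i++) {
        A[i] = (int)i;
    }
}

/* Return the first index whose value is not i, or n if the whole array is correct. */
static size_t first_bad_index(const int *A, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (A[i] != (int)i) {
            return i;
        }
    }
    return n;
}

/* Count the elements that still hold the sentinel after the offload. */
static size_t count_untouched(const int *A, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (A[i] == UNTOUCHED) {
            count++;
        }
    }
    return count;
}
```

In the real program, poison(A, N) would go before _GFX_offload and the two checks after _GFX_wait. Comparing first_bad_index with N - count_untouched shows whether the written elements form one contiguous prefix or are scattered.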

So I have two questions: why 780? And how can I write a kernel that I can call with

_GFX_offload((void *)fill, A, N);

that does what I want it to do?

Thanks, and have a nice day

Mathieu

Thread Topic: 

Help Me
