Hi,
I need to make scaling graphs for an OpenMP application.
My machine is a dual-Xeon system (14 cores per Xeon) with hyper-threading. I would like to place threads using the OpenMP 4 standard, i.e. with OMP_PLACES, OMP_PROC_BIND and OMP_NUM_THREADS.
One of the benchmarks is the following: use 4 threads, where the first two threads are bound to the first core of the first socket and the other two threads are bound to the first core of the second socket. For that, I use:
export OMP_PLACES='{0}, {14}'
export OMP_PROC_BIND=close
export OMP_NUM_THREADS=4
but I am not sure that it does the right thing. Bear in mind that I don't want the first and the third threads to be on core 0. I want the first and the second threads to be on this core, because I want to rely on the first-touch policy and limit the number of array chunks that get allocated in different NUMA domains.
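To make the mapping I expect concrete, here is a small Python sketch of how I read the close policy when there are more threads than places: consecutive thread numbers are packed into the same place, starting from the place of the parent thread. The function name close_binding is purely illustrative and not part of any OpenMP API. When the thread count is not a multiple of the place count, the exact split is implementation defined as far as I understand, so giving the extra threads to the first places is an assumption of this sketch.

```python
def close_binding(num_threads, num_places, parent_place=0):
    """Return, for each thread number, the index of the place it is expected to use.

    Illustrative model of OMP_PROC_BIND=close, not an OpenMP API.
    """
    if num_threads <= num_places:
        # One thread per place, walking consecutive places from the parent's place.
        return [(parent_place + t) % num_places for t in range(num_threads)]

    # More threads than places: consecutive thread numbers share a place.
    base, extra = divmod(num_threads, num_places)
    mapping = []
    for offset in range(num_places):
        place = (parent_place + offset) % num_places
        # Assumption: the first `extra` places take one additional thread.
        size = base + (1 if offset < extra else 0)
        mapping.extend([place] * size)
    return mapping


# The two places from OMP_PLACES='{0}, {14}' with OMP_NUM_THREADS=4.
places = ["{0}", "{14}"]
for thread, place in enumerate(close_binding(4, len(places))):
    print(f"thread {thread} -> place {place} = {places[place]}")
```

If this reading is correct, threads 0 and 1 share place {0} and threads 2 and 3 share place {14}, which is the layout I am after.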
Could you also confirm that the numbers in OMP_PLACES refer to core numbers, and that the different NUMA domains (including on KNL, with its 2 cores per tile and its quadrants) are grouped in ascending order?
Thanks for your help,
Francois