More OpenCL questions...
It seems one of my kernels is crashing everything (including the display driver). I am assuming the issue is that every thread tries to access the same memory at once (read-only access). The two problem arrays, xGrid and grid_E, are short (length 1024), and each is accessed at random locations a few thousand times by each of 1000 threads. The loop looks like this:
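for (int i = loopStart; i < loopEnd; ++i) {
    // Grid cell containing particle i
    grid_index = convert_int(x[i]/dx);
    // NOTE: this overwrites the index computed on the previous line
    grid_index = ID;
    // Linear interpolation of grid_E between the two neighbouring grid points
    Eparticle[i] = grid_E[grid_index]*( (xGrid[grid_index+1]-x[i])/dx ) + grid_E[grid_index+1]*( (x[i]-xGrid[grid_index])/dx );
}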
I am wondering whether it would solve the issue to make 1000 copies of the arrays (that would only be 16MB) so that each thread has its own. But should this really be necessary? Since the arrays are marked as "const", shouldn't the GPU cache them in local memory by itself instead of hitting global memory so often? (Their memory buffers are, however, marked READ/WRITE rather than READ ONLY, as some other kernel needs to write to them.)
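For reference, here is a minimal sketch of the interpolation from the loop above as a plain C function, with the cell index clamped so that grid_index+1 cannot run past the end of the arrays. The name interp_field and the parameter n (number of grid points) are hypothetical and not part of my kernel. It also assumes a uniform grid starting at x = 0, which is what the x[i]/dx index implies.

```c
#include <assert.h>

/* Hypothetical helper: linearly interpolate the gridded field grid_E at
   position xp. Assumes n >= 2 grid points with uniform spacing dx and
   xGrid[0] == 0, so that xp/dx gives the index of the containing cell. */
static float interp_field(const float *grid_E, const float *xGrid,
                          int n, float dx, float xp)
{
    assert(n >= 2 && dx > 0.0f);

    /* Same index computation as convert_int(x[i]/dx) in the kernel. */
    int g = (int)(xp / dx);

    /* Clamp to [0, n-2] so that both g and g+1 are valid indices. */
    if (g < 0)     g = 0;
    if (g > n - 2) g = n - 2;

    /* Weights of the left and right grid points. */
    float w_left  = (xGrid[g + 1] - xp) / dx;
    float w_right = (xp - xGrid[g]) / dx;

    return grid_E[g] * w_left + grid_E[g + 1] * w_right;
}
```

With a 1024-point grid, an unclamped index of 1023 would make grid_index+1 read element 1024, one past the end, so this is one thing that could be checked before duplicating the arrays.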
Specs: Intel i5-3570K @ 3.4GHz;
16GB RAM; Radeon HD 7900;
Win8 64-bit
