CUDA: using data in the shared cache to access local variables?
I have a kernel with a shared array and a few local ints:
__global__ void myKern()
{
    int globalID = ....; //initialize global thread ID
    __shared__ int TMS[3]; //populate shared array in a simple way
    if (globalID == 0)
    {
        TMS[0] = 0;
        TMS[1] = 1;
        TMS[2] = 2;
    }
    __syncthreads();
    int val0 = 69;
    int val1 = 36;
    int val2 = 92;
    int random_number = ....; //use cuRand to get a random number between 0 and 3
    int output = TMS[random_number];
    //at this point, I want the variable "output" to be used to access one of my local ints.
    //For example, if "output" = 2, I want to be able to print val2 to screen.
    //In a fantasy computer language this might look something like:
    //std::cout << "val" + "output";
    //I just want 92 to be printed to the screen.
    ???
}
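This may look like a strange algorithm, but if I can do this, it would let me combine the speed of registers with the large size of the shared cache in my CUDA project. Please, no brute-force binary solutions, because I will be using a shared array of size 2698 and 33 local variables.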
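Could you please clarify what you really need? You say: _if "output" = 2, I want to be able to print val2 to screen_. Then you say: _std::cout << "val" + "output"_. It seems you want to treat a bunch of register variables as if they were a single array and exploit array pointer arithmetic? – JackOLantern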
Sorry if it's unclear, it's hard to explain. Maybe this will clarify: – Jordan
If output = 0, I want 69 printed. – Jordan
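For what it's worth, here is a minimal sketch of the "treat the locals as one array" idea raised in the comments. It is only one possible reading of the question, not a confirmed solution. The helper `pickLocal` and the `random_number` kernel argument are hypothetical names introduced for illustration; the argument stands in for the cuRand call that is elided in the question. One caveat: registers are not addressable, so an array indexed with a runtime value will typically be placed in local memory rather than in registers, which may cancel the speed benefit the question is after.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): keeps the "local ints"
// in a small per-thread array so that a runtime index can select one.
__host__ __device__ inline int pickLocal(int output)
{
    // Same values as val0, val1, val2 in the question.
    int vals[3] = {69, 36, 92};
    // Runtime indexing: the compiler will typically keep vals in local
    // memory, not registers, because registers cannot be indexed.
    return vals[output];
}

// random_number is an assumed parameter replacing the elided cuRand call.
__global__ void myKern(int random_number)
{
    __shared__ int TMS[3];
    // The first thread of each block fills that block's shared table.
    if (threadIdx.x == 0)
    {
        TMS[0] = 0;
        TMS[1] = 1;
        TMS[2] = 2;
    }
    __syncthreads();

    int output = TMS[random_number];
    // Print from a single thread only, to keep the output readable.
    if (threadIdx.x == 0 && blockIdx.x == 0)
        printf("%d\n", pickLocal(output));
}

int main()
{
    // Host-side sanity check of the same helper.
    assert(pickLocal(0) == 69);

    // Fixed index 2 instead of a random one, so the kernel selects val2.
    myKern<<<1, 32>>>(2);
    cudaDeviceSynchronize();
    return 0;
}
```

With index 2 the kernel is meant to print 92, matching the example in the question. Whether this is faster than reading the values from shared memory directly would have to be measured, since the indexed array probably does not stay in registers.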