2015-11-04
1

OpenCL long overflow

Before I start: I am a C beginner and I am trying to do some OpenCL work, so I may well be doing something wrong. Here is my kernel code:

__kernel void collatz(__global int* in, __global int* out)
{
    uint id = get_global_id(0);
    unsigned long n = (unsigned long)id;
    uint count = 0;

    while (n > 1) {
        if (n % 2 == 0) {
            n = n/2;
        } else {
            if (n == 1572066143) {
                unsigned long test = n;
                printf("BEFORE - %lu\n", n);
                test = (3 * test) + 1;
                printf("AFTER - %lu\n", test);

                n = (3 * n) + 1;
            } else {
                n = (3 * n) + 1;
            }
        }

        count = count + 1;
    }

    out[id] = count;
}

And the output:

BEFORE - 1572066143 
AFTER - 421231134 

To me it looks like n is overflowing, but I cannot work out why that is happening.

Interestingly, if I create a new variable that holds the same value as n, it seems to work correctly.

unsigned long test = 1572066143; 
printf("BEFORE - %lu\n", test); 
test = (3 * test) + 1; 
printf("AFTER - %lu\n", test); 

Output:

BEFORE - 1572066143 
AFTER - 4716198430 

As I said, I am a C beginner, so I could be doing something very silly! Any help would be appreciated, as I have been pulling my hair out for hours!

Thanks, Stephen

Update:

Here is my host code, in case I am doing something silly on that end:

int _tmain(int argc, _TCHAR* argv[]) 
{ 
    /*Step1: Getting platforms and choose an available one.*/ 
    cl_uint numPlatforms; //the NO. of platforms 
    cl_platform_id platform = NULL; //the chosen platform 
    cl_int status = clGetPlatformIDs(0, NULL, &numPlatforms); 

    cl_platform_id* platforms = (cl_platform_id*)malloc(numPlatforms* sizeof(cl_platform_id)); 
    status = clGetPlatformIDs(numPlatforms, platforms, NULL); 
    platform = platforms[0]; 
    free(platforms); 

    /*Step 2:Query the platform and choose the first GPU device if has one.*/ 
    cl_device_id  *devices; 
    devices = (cl_device_id*)malloc(1 * sizeof(cl_device_id)); 
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, devices, NULL); 

    /*Step 3: Create context.*/ 
    cl_context context = clCreateContext(NULL, 1, devices, NULL, NULL, NULL); 

    /*Step 4: Creating command queue associate with the context.*/ 
    cl_command_queue commandQueue = clCreateCommandQueue(context, devices[0], 0, NULL); 

    /*Step 5: Create program object */ 
    const char *filename = "HelloWorld_Kernel.cl"; 
    std::string sourceStr; 
    status = convertToString(filename, sourceStr); 
    const char *source = sourceStr.c_str(); 
    size_t sourceSize[] = { strlen(source) }; 
    cl_program program = clCreateProgramWithSource(context, 1, &source, sourceSize, NULL); 

    status = clBuildProgram(program, 1, devices, NULL, NULL, NULL); 

    /*Step 7: Initial input,output for the host and create memory objects for the kernel*/ 
    cl_ulong max = 2000000; 
    cl_ulong *numbers = NULL; 
    numbers = new cl_ulong[max]; 
    for (cl_ulong i = 0; i < max; i++) { // valid indices are 0 .. max-1
        numbers[i] = i;
    }

    int *output = (int*)malloc(sizeof(cl_ulong) * max); 

    cl_mem inputBuffer = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, max * sizeof(cl_ulong), (void *)numbers, NULL); 
    cl_mem outputBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY, max * sizeof(cl_ulong), NULL, NULL); 

    /*Step 8: Create kernel object */ 
    cl_kernel kernel = clCreateKernel(program, "collatz", NULL); 

    /*Step 9: Sets Kernel arguments.*/ 
    status = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&inputBuffer); 


    // Determine the size of the log 
    size_t log_size; 
    clGetProgramBuildInfo(program, devices[0], CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size); 

    // Allocate memory for the log 
    char *log = (char *)malloc(log_size); 

    // Get the log 
    clGetProgramBuildInfo(program, devices[0], CL_PROGRAM_BUILD_LOG, log_size, log, NULL); 

    // Print the log 
    printf("%s\n", log); 


    status = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&outputBuffer); 

    /*Step 10: Running the kernel.*/ 
    size_t global_work_size[] = { max }; 
    status = clEnqueueNDRangeKernel(commandQueue, kernel, 1, NULL, global_work_size, NULL, 0, NULL, NULL); 

    /*Step 11: Read the data put back to host memory.*/ 
    status = clEnqueueReadBuffer(commandQueue, outputBuffer, CL_TRUE, 0, max * sizeof(cl_ulong), output, 0, NULL, NULL); 


    return SUCCESS;

}

+1

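This might just be a compiler bug; when I run the same kernel on my own machine I get the correct result. Which OpenCL platform and device are you using? Have you tried a different one? – jprice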

+0

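Thanks for the response and for trying it on your end. I am using the AMD SDK. I have tried it on my Intel integrated graphics as well as on my AMD 280x card, and both produce the wrong result. I may try asking the same question on the AMD developer forums. – stephenheron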

Answers

0

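I finally got to the bottom of the problem.

I was running the code on an Intel HD Graphics 4600 chip, which produced the strange behaviour shown in the original question. I switched to my AMD card and it started working as expected!

Very strange. Thanks for the help, everyone!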
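One way to check which device is misbehaving is to compare its output with a host-side reference. The following is only a minimal sketch in plain C, assuming the kernel output has already been read back into an int array; collatz_steps and first_mismatch are hypothetical helper names, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side reference: counts steps the same way the kernel's loop does
   (0 and 1 both yield 0 steps). */
uint32_t collatz_steps(uint64_t n)
{
    uint32_t count = 0;
    while (n > 1) {
        if (n % 2 == 0) {
            n /= 2;
        } else {
            /* Guard against a genuine 64-bit overflow of 3n + 1. */
            assert(n <= (UINT64_MAX - 1) / 3);
            n = 3 * n + 1;
        }
        count++;
    }
    return count;
}

/* Returns the first id whose device result differs from the reference,
   or count if every entry matches. */
size_t first_mismatch(const int *device_out, size_t count)
{
    for (size_t id = 0; id < count; id++) {
        if ((uint32_t)device_out[id] != collatz_steps((uint64_t)id)) {
            return id;
        }
    }
    return count;
}
```

With the host code from the question, this could be called as first_mismatch(output, max) once clEnqueueReadBuffer has returned; a return value equal to max would mean the device agrees with the host for every id.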

0

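Host-side and device-side values have different sizes. On the host, long can be 32 or 64 bits depending on the platform. On the device, long always means 64 bits.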
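On the host side, fixed-width types and the matching format macros sidestep that ambiguity. A minimal sketch, assuming a C99 host compiler; host_long_bits and format_u64 are illustrative names, not part of the original code.

```c
#include <inttypes.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Width of the host's `long` in bits; this varies by platform. */
int host_long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Formats a 64-bit unsigned value portably, whatever the width of `long`.
   Returns the number of characters snprintf would write. */
int format_u64(char *buf, size_t buf_size, uint64_t value)
{
    return snprintf(buf, buf_size, "%" PRIu64, value);
}
```

PRIu64 expands to whichever length modifier the host needs for a 64-bit unsigned value, so the same source works whether long is 32 or 64 bits wide.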

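The printf() function, as defined in C, uses %ld to print a long (a host-side long). You are using printf inside the kernel, so... maybe a C-like format parser is being used and the variable is therefore printed as a 32-bit long.

Could you try printing it as %lld, or as a float?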
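The two numbers in the question fit a 32-bit wraparound: 3 × 1572066143 + 1 = 4716198430, and 4716198430 − 2^32 = 421231134. So only the low 32 bits seem to survive somewhere, either in the arithmetic or in the formatting. A minimal host-side C sketch of that arithmetic, using fixed-width types as stand-ins for the device types (the function names are illustrative only):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One odd Collatz step in 64-bit arithmetic, as expected from a 64-bit ulong. */
uint64_t step_u64(uint64_t n)
{
    return 3 * n + 1;
}

/* The same step with only the low 32 bits of the result kept. */
uint32_t step_u32(uint32_t n)
{
    return (uint32_t)(3u * n + 1u);
}

/* Prints both results for comparison. */
void print_steps(uint64_t n)
{
    printf("64-bit: %" PRIu64 "\n", step_u64(n));
    printf("32-bit: %" PRIu32 "\n", step_u32((uint32_t)n));
}
```

Calling print_steps(1572066143) prints 4716198430 for the 64-bit step and 421231134 for the 32-bit one, which are the "expected" and "observed" AFTER values from the question.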

+0

Hi, thanks for the reply. I tried using %lld but got the same result. – stephenheron

+0

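Could you be misreading the printf output? Work-items do not run in order, so the prints can come from any thread. Maybe you are just looking at the BEFORE from one thread and the AFTER from another. Does the overflow happen in all cases, or only in that one? Can you add an ID? 'printf("%u BEFORE - %lu\n", id, test);' – DarkZeros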