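Usage of << in C or CUDA

What does << mean in the following code?

// create arrays of 1M elements
const int num_elements = 1<<20;

Is it specific to CUDA, or can it also be used in standard C?

When I printf the value of num_elements, I get num_elements == 1048576.

That turns out to be 2^20. So is the << operator shorthand for exponentiation in C?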

// This example demonstrates parallel floating point vector 
// addition with a simple __global__ function. 

#include <stdlib.h> 
#include <stdio.h> 


// this kernel computes the vector sum c = a + b 
// each thread performs one pair-wise addition 
__global__ void vector_add(const float *a, 
          const float *b, 
          float *c, 
          const size_t n) 
{ 
    // compute the global element index this thread should process 
    unsigned int i = threadIdx.x + blockDim.x * blockIdx.x; 

    // avoid accessing out of bounds elements 
    if(i < n)
    {
        // sum elements
        c[i] = a[i] + b[i];
    }
} 


int main(void) 
{ 
    // create arrays of 1M elements 
    const int num_elements = 1<<20; 

    // compute the size of the arrays in bytes 
    const int num_bytes = num_elements * sizeof(float); 

    // points to host & device arrays 
    float *device_array_a = 0; 
    float *device_array_b = 0; 
    float *device_array_c = 0; 
    float *host_array_a = 0; 
    float *host_array_b = 0; 
    float *host_array_c = 0; 

    // malloc the host arrays 
    host_array_a = (float*)malloc(num_bytes); 
    host_array_b = (float*)malloc(num_bytes); 
    host_array_c = (float*)malloc(num_bytes); 

    // cudaMalloc the device arrays 
    cudaMalloc((void**)&device_array_a, num_bytes); 
    cudaMalloc((void**)&device_array_b, num_bytes); 
    cudaMalloc((void**)&device_array_c, num_bytes); 

    // if any memory allocation failed, report an error message 
    if(host_array_a == 0 || host_array_b == 0 || host_array_c == 0 ||
       device_array_a == 0 || device_array_b == 0 || device_array_c == 0)
    {
        printf("couldn't allocate memory\n");
        return 1;
    }

    // initialize host_array_a & host_array_b 
    for(int i = 0; i < num_elements; ++i)
    {
        // make array a a linear ramp
        host_array_a[i] = (float)i;

        // make array b random
        host_array_b[i] = (float)rand()/RAND_MAX;
    }

    // copy arrays a & b to the device memory space 
    cudaMemcpy(device_array_a, host_array_a, num_bytes, cudaMemcpyHostToDevice); 
    cudaMemcpy(device_array_b, host_array_b, num_bytes, cudaMemcpyHostToDevice); 

    // compute c = a + b on the device 
    const size_t block_size = 256; 
    size_t grid_size = num_elements/block_size; 

    // deal with a possible partial final block 
    if(num_elements % block_size) ++grid_size; 

    // launch the kernel 
    vector_add<<<grid_size, block_size>>>(device_array_a, device_array_b, device_array_c, num_elements); 

    // copy the result back to the host memory space 
    cudaMemcpy(host_array_c, device_array_c, num_bytes, cudaMemcpyDeviceToHost); 

    // print out the first 10 results 
    for(int i = 0; i < 10; ++i)
    {
        printf("result %d: %1.1f + %7.1f = %7.1f\n", i, host_array_a[i], host_array_b[i], host_array_c[i]);
    }


    // deallocate memory 
    free(host_array_a); 
    free(host_array_b); 
    free(host_array_c); 

    cudaFree(device_array_a); 
    cudaFree(device_array_b); 
    cudaFree(device_array_c); 
}

Source

2011-11-04 smilingbuddha

<< is the left shift operator... see http://en.wikipedia.org/wiki/Logical_shift – Aziz

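No, the << operator is the bit shift operator. It takes the bits of a number, such as 00101, and shifts them n places to the left, which has the effect of multiplying the number by a power of 2. So x << y is x * 2^y, as long as the result still fits in the type. This is a consequence of the way numbers are stored inside a computer, which is in binary.

For example, take the number 1 stored as a 32-bit integer in 2's complement (which it is):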

00000000000000000000000000000001

When you do

1 << 20

you are taking every 1 in that binary representation and moving it 20 places to the left:

00000000000100000000000000000000

This is 2^20. The same holds for sign-magnitude representation, 1's complement, and so on.
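To see those bit patterns for yourself, here is a minimal sketch in standard C. The helper names to_binary32 and shift_pow2 are made up for this illustration, and it uses an unsigned 32-bit type so that every shift below 32 is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write the 32 bits of value into out as '0'/'1'
   characters, most significant bit first. out needs room for 33 chars. */
void to_binary32(uint32_t value, char out[33])
{
    for (int bit = 31; bit >= 0; --bit) {
        out[31 - bit] = ((value >> bit) & 1u) ? '1' : '0';
    }
    out[32] = '\0';
}

/* Hypothetical helper: return 2^exponent computed with a left shift. */
uint32_t shift_pow2(unsigned exponent)
{
    /* Shifting by the width of the type or more is undefined in C. */
    assert(exponent < 32);
    return UINT32_C(1) << exponent;
}
```

Calling to_binary32(shift_pow2(20), buf) on a char buf[33] and printing buf with printf("%s\n", buf) should show the same pattern as above: a single 1 followed by 20 zeros.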

As another example, if you take the representation of 5:

00000000000000000000000000000101

and do 5 << 1, you get

00000000000000000000000000001010

which is 10, or 5 * 2^1.

Conversely, >> divides by a power of 2 by shifting the bits n places to the right.
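The right shift can be sketched the same way. The function name div_pow2 is hypothetical, and the example sticks to unsigned values because right-shifting a negative signed value gives an implementation-defined result in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: divide value by 2^n using a right shift.
   For unsigned values this matches integer division, which discards
   the remainder. */
uint32_t div_pow2(uint32_t value, unsigned n)
{
    /* Shifting by the width of the type or more is undefined in C. */
    assert(n < 32);
    return value >> n;
}
```

For instance, div_pow2(10, 1) undoes the 5 << 1 example, while div_pow2(5, 1) gives 2 because the low bit that falls off the right end is simply dropped.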

Source

2011-11-04 16:32:27

But C does not require 2's complement. –

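Actually, 2's complement only applies to signed integers, and C does not require it. Also, it is for _binary_ numbers that each shift to the left multiplies the value by 2 (the base). (Likewise, if you shift a decimal number to the left, each shift multiplies it by 10.) – Arkku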

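Numbers are not stored in 2's complement; 2's complement is an interpretation of a series of bits. You can interpret a float as an int and use the shift operator on it, and it will work. The result just won't be what you expect. – Femaref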
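Femaref's point about floats can be illustrated with a rough sketch. It assumes float is a 32-bit IEEE 754 value (true on most platforms, but not guaranteed by C), copies the bits with memcpy to avoid aliasing problems, and uses function names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bit pattern of a float into a uint32_t. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);  /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: copy a 32-bit pattern back into a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof bits == sizeof f);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Shift the raw bits of f left by n places and reinterpret them as a float. */
float shift_float_bits(float f, unsigned n)
{
    assert(n < 32);
    return bits_to_float(float_bits(f) << n);
}
```

Under the IEEE 754 assumption, 1.0f has the bit pattern 0x3F800000, and shifting it left by one place gives 0x7F000000. Read back as a float, that is 2^127 rather than 2.0f, because the shift moved bits out of the mantissa and exponent fields instead of doubling the value.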

It is a shift. In binary, taking a 1 and shifting it 20 places to the left is equivalent to multiplying by 2^20.

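Edit: Yes, it is standard C, and a very nice way of making it clear to the reader that the value is a single 1 at bit 20, more so than writing int a = 1048576;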
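As a small sketch of that readability argument, sizes can be written as shifts so the power of two is visible at a glance. The constant names KIBI and MEBI and the function float_array_bytes are hypothetical; the byte computation mirrors num_bytes in the question's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical named constants: the shift count shows the exponent. */
enum {
    KIBI = 1 << 10,  /* 2^10 = 1024 */
    MEBI = 1 << 20   /* 2^20 = 1048576 */
};

/* Number of bytes needed to hold count floats. */
size_t float_array_bytes(size_t count)
{
    /* Guard against overflow in the multiplication below. */
    assert(count <= (size_t)-1 / sizeof(float));
    return count * sizeof(float);
}
```

With these, float_array_bytes(MEBI) gives the same size the question computes from num_elements = 1<<20.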

Source

2011-11-04 16:34:18

...and it is standard C. –

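The (standard) C left shift operator << shifts the bits (binary digits) of the value on its left to the left by the number of places given by the value on its right, filling in zeros on the right. That is, 1 << 20 produces the binary number consisting of a 1 followed by 20 zeros. Since binary is base 2, each shift to the left doubles the value (multiplies it by the base), so the operation is equal to multiplying by a power of two.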

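This property of binary numbers can be exploited to multiply positive integers by powers of two faster than with more general mathematical functions. (Likewise, in elementary school math you can exploit the analogous property of decimal numbers to deal with powers of 10...)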
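A minimal sketch of that multiplication trick, with the preconditions spelled out, might look like this. The name mul_pow2 is made up for the example, and the checks reflect the fact that a shift only equals multiplication while the result still fits in the type. Optimizing compilers typically perform this transformation on their own, so the sketch is mainly illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply value by 2^n using a left shift. */
uint32_t mul_pow2(uint32_t value, unsigned n)
{
    /* Shifting by the width of the type or more is undefined in C. */
    assert(n < 32);
    /* The product must fit in 32 bits, otherwise high bits are lost. */
    assert(value <= (UINT32_MAX >> n));
    return value << n;
}
```

For example, mul_pow2(5, 1) is 10 and mul_pow2(1, 20) is 1048576, matching the values discussed in the answers above.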

Source

2011-11-04 16:44:23 Arkku

Yes, thanks. =) – Arkku
