2011-10-01

Struct alignment with PyOpenCL

Update: the int4 in my kernel was wrong.

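I am using pyopencl but cannot get struct alignment to work correctly. In the code below, which calls the kernel twice, the b value is returned correctly (as 1), but the c value is some "random" value.

In other words: I am trying to read two members of a struct. I can read the first, but not the second. Why?

The same problem occurs whether I use numpy structured arrays or pack the data with struct. The __attribute__ settings in the commented-out lines don't help either.

I suspect I am doing something stupid elsewhere in the code, but I can't see it. Any help appreciated.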

import struct as s
import pyopencl as cl
import numpy as n

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

for use_struct in (True, False):

    if use_struct:
        a = s.pack('=ii',1,2)
        print(a, len(a))
        a_dev = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, len(a))
    else:
        # a = n.array([(1,2)], dtype=n.dtype('2i4', align=True))
        a = n.array([(1,2)], dtype=n.dtype('2i4'))
        print(a, a.itemsize, a.nbytes)
        a_dev = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, a.nbytes)

    b = n.array([0], dtype='i4')
    print(b, b.itemsize, b.nbytes)
    b_dev = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, b.nbytes)

    c = n.array([0], dtype='i4')
    print(c, c.itemsize, c.nbytes)
    c_dev = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, c.nbytes)

    prg = cl.Program(ctx, """
        typedef struct s {
            int4 f0;
            int4 f1 __attribute__ ((packed));
            // int4 f1 __attribute__ ((aligned (4)));
            // int4 f1;
        } s;
        __kernel void test(__global const s *a, __global int4 *b, __global int4 *c) {
            *b = a->f0;
            *c = a->f1;
        }
        """).build()

    cl.enqueue_copy(queue, a_dev, a)
    event = prg.test(queue, (1,), None, a_dev, b_dev, c_dev)
    event.wait()
    cl.enqueue_copy(queue, b, b_dev)
    print(b)
    cl.enqueue_copy(queue, c, c_dev)
    print(c)

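Output (I had to reformat this while cutting and pasting, so the line breaks may be slightly off; I also added comments indicating what the various printed values are):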

# first using struct 
/home/andrew/projects/personal/kultrung/env/bin/python3.2 /home/andrew/projects/personal/kultrung/src/kultrung/test6.py 
b'\x01\x00\x00\x00\x02\x00\x00\x00' 8 # the struct packed values 
[0] 4 4        # output buffer 1 
[0] 4 4        # output buffer 2 
/home/andrew/projects/personal/kultrung/env/lib/python3.2/site-packages/pyopencl/cache.py:343: UserWarning: Build succeeded, but resulted in non-empty logs: Build on <pyopencl.Device 'Intel(R) Core(TM)2 CPU   T5600 @ 1.83GHz' at 0x1385a20> succeeded, but said: 

Build started Kernel <test> was successfully vectorized Done. warn("Build succeeded, but resulted in non-empty logs:\n"+message) 
[1]   # the first value (correct) 
[240]  # the second value (wrong) 

# next using numpy 
[[1 2]] 4 8 # the numpy struct 
[0] 4 4  # output buffer 
[0] 4 4  # output buffer 
/home/andrew/projects/personal/kultrung/env/lib/python3.2/site-packages/pyopencl/__init__.py:174: UserWarning: Build succeeded, but resulted in non-empty logs: Build on <pyopencl.Device 'Intel(R) Core(TM)2 CPU   T5600 @ 1.83GHz' at 0x1385a20> succeeded, but said: 

Build started Kernel <test> was successfully vectorized Done. warn("Build succeeded, but resulted in non-empty logs:\n"+message) 
[1]  # first value (ok) 
[67447488] # second value (wrong) 

Process finished with exit code 0 

Answers

0

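OK, I don't know where I got int4 from - I think it must be an Intel extension. Switching to AMD and using int as the kernel type works as expected. I'll post more at http://acooke.org/cute/Somesimple0.html once I have cleaned things up.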
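
For what it's worth, int4 is listed in the OpenCL C specification as a built-in vector type of four 32-bit ints, which would likely explain the symptom on any vendor: the kernel's struct spans 32 bytes and each output store writes 16 bytes, while the host script uploads 8 bytes and allocates 4-byte output buffers. The sketch below only checks those sizes on the host with Python's struct module. The name INT4_SIZE and the '=4i4i' format string are my own illustration of how the int4 layout could be mirrored, not something from the original script.

```python
import struct

# Host data exactly as packed in the question: two 32-bit ints, no padding.
host_bytes = struct.pack('=ii', 1, 2)

# One OpenCL int4 holds four 32-bit ints (hypothetical helper constant).
INT4_SIZE = struct.calcsize('=4i')

# The kernel-side struct has two int4 members, f0 and f1.
kernel_struct_size = 2 * INT4_SIZE

print(len(host_bytes))      # 8: bytes the host actually uploads
print(INT4_SIZE)            # 16: bytes each int4 store writes to b or c
print(kernel_struct_size)   # 32: bytes the kernel expects behind a

# Assumed host layout that would match two int4 members (values illustrative).
matching = struct.pack('=4i4i', 1, 0, 0, 0, 2, 0, 0, 0)
print(len(matching))        # 32
```

With int members instead, the struct is 8 bytes and each output is 4 bytes, which matches what the host side already allocates.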

0

In the OpenCL program, try putting the packed attribute on the struct itself rather than on one of its members:

typedef struct s { 
     int4 f0; 
     int4 f1; 
} __attribute__((packed)) s; 

It may be that, because you only have the packed attribute on one member of the struct, it isn't packing the whole structure.
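
As a rough host-side analogy for what packing changes, Python's struct module can compare a natively aligned layout ('@') with an unpadded one ('='). This is only a sketch of the general idea, not a statement about how any particular OpenCL compiler lays out structs, and the variable names are my own. It suggests that packing matters when fields of different sizes are mixed, and makes no difference for two fields of the same 32-bit type.

```python
import struct

# '@' applies the platform's native alignment (like an unpacked C struct);
# '=' uses standard sizes with no padding (like a packed struct).
mixed_native = struct.calcsize('@ci')   # char followed by int, with padding
mixed_packed = struct.calcsize('=ci')   # same fields, no padding

# Two 32-bit ints: both fields are already naturally aligned.
ints_native = struct.calcsize('@ii')
ints_packed = struct.calcsize('=ii')

print(mixed_native, mixed_packed)   # sizes differ when padding is inserted
print(ints_native, ints_packed)     # sizes agree for two equal-sized ints
```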

+0

Thanks, I just tried that, but it didn't fix the problem (also, the first example under "packed" at http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/attributes-variables.html suggests it should be where I had it, I think)