How do I free GPU memory and reuse the same buffer for different arrays in PyOpenCL?

Below is my working code for reference:

import numpy
import pyopencl as cl
import pyopencl.array  # makes cl.array.vec.float4 available

vector = numpy.array([1, 2, 4, 8], numpy.float32) #cl.array.vec.float4 
matrix = numpy.zeros((1, 4), cl.array.vec.float4) 
matrix[0, 0] = (1, 2, 4, 8) 
matrix[0, 1] = (16, 32, 64, 128) 
matrix[0, 2] = (3, 6, 9, 12) 
matrix[0, 3] = (5, 10, 15, 25) 
# vector[0] = (1, 2, 4, 8) 


platform=cl.get_platforms() #gets all platforms that exist on this machine 
device=platform[0].get_devices(device_type=cl.device_type.GPU) #gets all GPU's that exist on first platform from platform list 
context=cl.Context(devices=[device[0]]) #Creates a context for the first GPU only. context.num_devices gives the number of devices in this context 
print("everything good so far") 
program=cl.Program(context,""" 
__kernel void matrix_dot_vector(__global const float4 * matrix,__global const float4 *vector,__global float *result) 
{ 
int gid = get_global_id(0); 

result[gid]=dot(matrix[gid],vector[0]); 
} 

""").build() 
queue=cl.CommandQueue(context) 
# queue=cl.CommandQueue(context,cl_device_id device) #Context specific to a device if we plan on using multiple GPUs for parallel processing 

mem_flags = cl.mem_flags 
matrix_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=matrix) 
vector_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=vector) 
matrix_dot_vector = numpy.zeros(4, numpy.float32) 
global_size_of_GPU= 0 
destination_buf = cl.Buffer(context, mem_flags.WRITE_ONLY, matrix_dot_vector.nbytes) 
# threads_size_buf = cl.Buffer(context, mem_flags.WRITE_ONLY, global_size_of_GPU.nbytes) 
program.matrix_dot_vector(queue, matrix_dot_vector.shape, None, matrix_buf, vector_buf, destination_buf) 

## Step #11. Move the kernel’s output data to host memory. 
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf) 
# cl.enqueue_copy(queue, global_size_of_GPU, threads_size_buf) 
print(matrix_dot_vector) 
# print(global_size_of_GPU) 

# COPY SAME ARRAY FROM GPU AGAIN 
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf) 
print(matrix_dot_vector) 
print('copied same array twice')

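1. How can I free matrix_buf and destination_buf in GPU memory? One is read-only and the other is write-only.
2. How can I load a different matrix array into the same matrix_buf without having to create a new buffer in PyOpenCL? I read that loading new data into the same buffer is much faster than recreating a buffer of the same size every time.
3. Is it OK if the new array I load into the old buffer is smaller than the array previously stored in it, or does the new array have to match the buffer size exactly?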

Source

2017-05-26 Aseem Hegshetye

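1. matrix_buf.release() and destination_buf.release(): these free the GPU memory allocated to the respective buffers. It is better to release memory that is no longer needed, to avoid running into out-of-memory errors. PyOpenCL also frees the GPU memory automatically once the function that uses the GPU exits.
2. cl.enqueue_copy(queue, matrix_buf, matrix_2): this loads a new array, matrix_2, into matrix_buf without creating a new buffer.
3. It is fine to reuse an existing buffer and use only part of it. On the kernel side we can control which part is accessed. (credit: doqtor)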
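
As a rough illustration of points 1 and 2, the sketch below allocates a buffer once, overwrites its contents with a second array of the same size via cl.enqueue_copy, reads the data back, and then releases the buffer explicitly. The names first_device_queue and reuse_buffer_demo are hypothetical helpers introduced only for this example, and the sketch assumes that PyOpenCL and at least one OpenCL device are available (it returns None otherwise).

```python
import numpy as np

try:
    import pyopencl as cl
except ImportError:  # lets the sketch load on machines without PyOpenCL
    cl = None


def first_device_queue():
    """Hypothetical helper: (context, queue) for the first device found, else None."""
    if cl is None:
        return None
    try:
        for platform in cl.get_platforms():
            devices = platform.get_devices()
            if devices:
                context = cl.Context(devices=[devices[0]])
                return context, cl.CommandQueue(context)
    except cl.Error:
        pass
    return None


def reuse_buffer_demo():
    env = first_device_queue()
    if env is None:
        return None
    context, queue = env
    mf = cl.mem_flags

    matrix_1 = np.arange(16, dtype=np.float32)
    matrix_2 = matrix_1 * 10.0  # same dtype and size as matrix_1

    # Allocate the device buffer once, initialised from matrix_1.
    # READ_ONLY restricts kernel access; the host can still write to it.
    matrix_buf = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR,
                           hostbuf=matrix_1)

    # Overwrite the buffer contents in place instead of creating a new buffer.
    cl.enqueue_copy(queue, matrix_buf, matrix_2)

    # Read the contents back to check that the buffer now holds matrix_2.
    readback = np.empty_like(matrix_2)
    cl.enqueue_copy(queue, readback, matrix_buf)

    # Explicitly free the device memory; matrix_buf must not be used afterwards.
    matrix_buf.release()
    return readback


if __name__ == "__main__":
    print(reuse_buffer_demo())
```

Calling release() is optional in short scripts, since the memory is also freed when the buffer object is garbage-collected, but it makes the point of deallocation explicit in long-running code.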

Source

2017-06-05 23:07:46

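Re 1. I believe the buffer is released when its variable goes out of scope, or you can call release() explicitly. Whether the buffer is read-only or write-only does not matter here.
Re 2. Try pyopencl.enqueue_map_buffer(); it returns an array that can be modified from the host side. More here.
Re 3. Reusing an existing buffer and using only part of it is fine. On the kernel side you can control which part is accessed.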
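
A minimal sketch of Re 2 and Re 3, under the assumption that PyOpenCL and an OpenCL device are available: a buffer with room for 8 floats is created, only its first 4 elements are mapped into host memory and filled, and a kernel is launched over just those 4 elements. The names first_device_queue, map_and_partial_demo and the kernel double_values are hypothetical and exist only for this illustration. Note that pyopencl.enqueue_map_buffer returns a tuple of the mapped array and an event.

```python
import numpy as np

try:
    import pyopencl as cl
except ImportError:  # lets the sketch load on machines without PyOpenCL
    cl = None


def first_device_queue():
    """Hypothetical helper: (context, queue) for the first device found, else None."""
    if cl is None:
        return None
    try:
        for platform in cl.get_platforms():
            devices = platform.get_devices()
            if devices:
                context = cl.Context(devices=[devices[0]])
                return context, cl.CommandQueue(context)
    except cl.Error:
        pass
    return None


def map_and_partial_demo():
    env = first_device_queue()
    if env is None:
        return None
    context, queue = env

    small = np.array([1, 2, 3, 4], dtype=np.float32)

    # The buffer is twice as large as the data we are going to use.
    buf = cl.Buffer(context, cl.mem_flags.READ_WRITE, size=2 * small.nbytes)

    # Map the first len(small) floats of the buffer into host memory.
    mapped, _event = cl.enqueue_map_buffer(
        queue, buf, cl.map_flags.WRITE, 0, small.shape, small.dtype)
    mapped[:] = small            # modify the buffer from the host side
    mapped.base.release(queue)   # unmap so the device sees the new data

    program = cl.Program(context, """
    __kernel void double_values(__global float *data)
    {
        int gid = get_global_id(0);
        data[gid] = 2.0f * data[gid];
    }
    """).build()

    # The global size limits the kernel to the part of the buffer in use.
    program.double_values(queue, small.shape, None, buf)

    # Copy back only as many bytes as the host array holds.
    result = np.empty_like(small)
    cl.enqueue_copy(queue, result, buf)
    buf.release()
    return result


if __name__ == "__main__":
    print(map_and_partial_demo())
```

The second half of the buffer is never touched here; its contents are simply whatever was in that memory before, so a kernel must not read past the elements that were actually loaded.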

Source

2017-05-29 09:43:35 doqtor

Could you please explain release() and pyopencl.enqueue_map_buffer() with an example? I tried reading the link you provided, but it is hard to follow.

Have a look at the examples here: [pyopencl.buffer.release](http://nullege.com/codes/search?cq=pyopencl.buffer.release) and [pyopencl.enqueue_map_buffer](http://nullege.com/codes/search?cq=pyopencl.enqueue_map_buffer) - doqtor
