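Sleeping/waiting in a CUDA thread
I am trying to write some CUDA code to compute the longest common subsequence. I can't work out how to make a thread sleep until the dependencies needed to compute its cell are satisfied.
For example: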
// Ignore the spurious maths here; the data structures are very messy. I am planning ahead for strings that are bigger than GPU blocks. i & j are correct though.
int real_i = blockDim.x * blockIdx.x + threadIdx.x;
int real_j = blockDim.y * (max_offset - blockIdx.x) + threadIdx.y;
char i_char = seq1[real_i];
char j_char = seq2[real_j];
// For i & j = 1 to length
if ((real_i > 0 && real_j > 0) && (real_i < sequence_length && real_j < sequence_length)) {
printf("i: %d, j: %d\n", real_i, real_j);
printf("I need to wait for dependancy at i: %d j: %d and i: %d j: %d\n", real_i, (real_j - 1), real_i - 1, real_j);
printf("Is this true? %d\n", (depend[sequence_length * real_i + (real_j - 1)] && depend[sequence_length * (real_i - 1) + real_j]));
//WAIT FOR DEPENDENCY TO BE SATISFIED
//THIS IS WHERE I NEED THE CODE TO HANG
while((depend[sequence_length * real_i + (real_j - 1)] == false) && (depend[sequence_length * (real_i - 1) + real_j] == false)) {
}
if (i_char == j_char)
c[sequence_length * real_i + real_j] = (c[sequence_length * (real_i - 1) + (real_j - 1)]) + 1;
else
c[sequence_length * real_i + real_j] = max(c[sequence_length * real_i + (real_j - 1)], c[sequence_length * (real_i - 1) + real_j]);
// SETTING THESE TO TRUE SHOULD ALLOW OTHER THREADS TO BREAK PAST THE WHILE BLOCK
depend[sequence_length * real_i + (real_j - 1)] = true;
depend[sequence_length * (real_i - 1) + real_j] = true;
}
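So basically each thread should hang in the while loop until its dependencies have been satisfied by other threads, and only then move on to the computation code.
I know the "first" thread has its dependencies satisfied, because it prints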
real i 1, real j 1
I need to wait for dependancy at i: 1 j: 0 and i: 0 j: 1
Is this true? 1
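Once it has finished its computation, it sets some cells in the dependency matrix to true, which should allow 2 more threads to get past the while loop, and the kernel should progress from there.
However, if I uncomment the while loop, my whole system hangs for ~10 seconds and I get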
the launch timed out and was terminated
Any suggestions?
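One way to avoid the in-kernel wait altogether, sketched here under assumptions rather than as a drop-in fix: each cell (i, j) depends only on cells with a smaller i + j, so all cells on the same anti-diagonal can be computed in parallel, with one kernel launch per anti-diagonal. Launches issued to the same stream run in order, so no thread has to spin on a flag. The names lcs_diagonal and lcs_length, the (n+1) x (m+1) table layout and the block size of 128 are hypothetical choices for this sketch and are not taken from the code above.

```cuda
#include <algorithm>
#include <cstring>
#include <cuda_runtime.h>

// Computes every cell (i, j) with i + j == diag of the (n+1) x (m+1) LCS table.
// Row 0 and column 0 are assumed to be zero (the empty-prefix base case).
__global__ void lcs_diagonal(const char* a, const char* b, int* c,
                             int m, int diag, int i_start, int count)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= count) return;              // surplus threads in the last block

    int i = i_start + t;                 // row index, 1-based
    int j = diag - i;                    // column index, 1-based
    int w = m + 1;                       // row stride of the table

    if (a[i - 1] == b[j - 1])
        c[i * w + j] = c[(i - 1) * w + (j - 1)] + 1;
    else
        c[i * w + j] = max(c[i * w + (j - 1)], c[(i - 1) * w + j]);
}

// Host wrapper: one launch per anti-diagonal. Kernels on the default stream
// execute in issue order, so diagonal d only starts after d - 1 has finished.
// Error checking of the CUDA calls is omitted to keep the sketch short.
int lcs_length(const char* a, const char* b)
{
    int n = (int)strlen(a);
    int m = (int)strlen(b);
    if (n == 0 || m == 0) return 0;

    size_t cells = (size_t)(n + 1) * (m + 1);
    char* d_a = nullptr;
    char* d_b = nullptr;
    int*  d_c = nullptr;
    cudaMalloc(&d_a, n);
    cudaMalloc(&d_b, m);
    cudaMalloc(&d_c, cells * sizeof(int));
    cudaMemcpy(d_a, a, n, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b, m, cudaMemcpyHostToDevice);
    cudaMemset(d_c, 0, cells * sizeof(int));   // zeroes row 0 and column 0

    const int threads = 128;                   // assumed block size
    for (int diag = 2; diag <= n + m; ++diag) {
        int i_start = std::max(1, diag - m);   // keeps j <= m
        int i_end   = std::min(n, diag - 1);   // keeps j >= 1
        int count   = i_end - i_start + 1;
        int blocks  = (count + threads - 1) / threads;
        lcs_diagonal<<<blocks, threads>>>(d_a, d_b, d_c, m, diag, i_start, count);
    }

    int result = 0;
    // This blocking copy also waits for all previously launched kernels.
    cudaMemcpy(&result, d_c + (cells - 1), sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return result;
}
```

For instance, lcs_length("ABCBDAB", "BDCABA") should return 4. The trade-off is many small launches for long strings; the benefit is that correctness no longer depends on one thread observing another thread's writes while both are still running, which CUDA does not guarantee between blocks without extra synchronisation.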