OpenGL GLPaint threaded rendering

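I'm currently using a library based on Apple's GLPaint sample to draw on the screen with OpenGL. At the moment, when the canvas saves and restores a session, the lines are redrawn one by one (you can watch the progress), and it takes quite a long time if there are many points to render. Is there any way to render this in parallel, or simply faster?

This is the drawing code I use: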

CGPoint start = step.start; 
CGPoint end = step.end; 

// Convert touch point from UIView referential to OpenGL one (upside-down flip) 
CGRect bounds = [self bounds]; 
start.y = bounds.size.height - start.y; 
end.y = bounds.size.height - end.y; 

static GLfloat*  vertexBuffer = NULL; 
static NSUInteger vertexMax = 64; 
NSUInteger   vertexCount = 0, 
count, 
i; 

[EAGLContext setCurrentContext:context]; 
glBindFramebufferOES(GL_FRAMEBUFFER_OES, viewFramebuffer); 

// Convert locations from Points to Pixels 
CGFloat scale = self.contentScaleFactor; 
start.x *= scale; 
start.y *= scale; 
end.x *= scale; 
end.y *= scale; 

// Allocate vertex array buffer 
if(vertexBuffer == NULL) 
    vertexBuffer = malloc(vertexMax * 2 * sizeof(GLfloat)); 

// Add points to the buffer so there are drawing points every X pixels 
count = MAX(ceilf(sqrtf((end.x - start.x) * (end.x - start.x) + (end.y - start.y) * (end.y - start.y))/kBrushPixelStep), 1); 
for(i = 0; i < count; ++i) { 
    if(vertexCount == vertexMax) { 
     vertexMax = 2 * vertexMax; 
     vertexBuffer = realloc(vertexBuffer, vertexMax * 2 * sizeof(GLfloat)); 
    } 

    vertexBuffer[2 * vertexCount + 0] = start.x + (end.x - start.x) * ((GLfloat)i/(GLfloat)count); 
    vertexBuffer[2 * vertexCount + 1] = start.y + (end.y - start.y) * ((GLfloat)i/(GLfloat)count); 
    vertexCount += 1; 
} 

// Render the vertex array 
glVertexPointer(2, GL_FLOAT, 0, vertexBuffer); 
glDrawArrays(GL_POINTS, 0, (int)vertexCount); 

// Display the buffer 
glBindRenderbufferOES(GL_RENDERBUFFER_OES, viewRenderbuffer); 
[context presentRenderbuffer:GL_RENDERBUFFER_OES];

Source

2014-04-28 merrick_s

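How many points are we talking about? The way I read the code, it takes two screen positions (probably from touch input) and draws a point every `kBrushPixelStep` pixels between them. That shouldn't be that many points, should it? Or do you call the code we see here repeatedly, with different `start` and `end` values?

@ReetoKoradi The code is called repeatedly: there is an array of steps (each with start and end coordinates), and the function is called once for each step.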

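OpenGL is not multi-threaded. You must submit OpenGL commands from a single thread.

You have two options:

You can refactor your code to use concurrency to build the data you send to OpenGL, and then submit it to the OpenGL API once all of it is available.
You can refactor it to use shaders to do the calculations. That pushes the computation from the CPU to the GPU, which is highly optimized for parallel operations.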

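The code above uses realloc to repeatedly reallocate a buffer inside the for loop. That is very inefficient, because memory allocation is one of the slowest RAM-based operations in a modern OS. You should refactor the code to calculate the final size of the memory buffer up front and allocate the buffer once at that size, rather than using realloc. That should give you a many-fold speedup for very little effort.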
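As a minimal sketch of sizing the buffer before the loop: in the question's code the point count is already known before the loop starts, so the reusable buffer can be grown at most once per call. The names `VertexBuffer`, `vertex_buffer_reserve`, and `fill_segment` are hypothetical and not part of GLPaint, and the point count is passed in rather than derived from `kBrushPixelStep`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reusable vertex storage: 2 floats (x, y) per point. */
typedef struct {
    float *data;
    size_t capacity;   /* capacity in points, not floats */
} VertexBuffer;

/* Ensures room for `count` points, growing the buffer at most once.
   Returns 0 on success, -1 if the allocation fails. */
int vertex_buffer_reserve(VertexBuffer *buf, size_t count)
{
    if (count <= buf->capacity)
        return 0;                       /* already large enough */
    float *grown = realloc(buf->data, count * 2 * sizeof *grown);
    if (grown == NULL)
        return -1;                      /* the old block is still valid */
    buf->data = grown;
    buf->capacity = count;
    return 0;
}

/* Writes `count` evenly spaced points from start towards end
   (end itself excluded, matching the loop in the question). */
int fill_segment(VertexBuffer *buf, float startX, float startY,
                 float endX, float endY, size_t count)
{
    assert(count > 0);
    if (vertex_buffer_reserve(buf, count) != 0)
        return -1;
    for (size_t i = 0; i < count; ++i) {
        float t = (float)i / (float)count;
        buf->data[2 * i + 0] = startX + (endX - startX) * t;
        buf->data[2 * i + 1] = startY + (endY - startY) * t;
    }
    return 0;
}
```

A caller would keep one `VertexBuffer` (initialized to `{ NULL, 0 }`) alive across calls, so a later, shorter segment reuses the same memory without touching the allocator.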

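Looking at your code, it shouldn't be hard to refactor your for loop to break the vertex calculations into chunks and submit those chunks to GCD for concurrent processing. The trick is to break the task into units of work that are big enough to benefit from parallel processing. (There is a certain amount of overhead in setting up a task to run on a background queue, so you want to do enough work in each unit to make that overhead worthwhile.)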
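One way the chunking could look, as a sketch under stated assumptions: every step carries its own point count (computed as in the question), a prefix sum gives each step a disjoint slice of one shared vertex array, and each work unit fills the slices for a fixed number of steps. All names here (`Step`, `FillContext`, `fill_chunk`, `build_vertices`) are hypothetical. The chunks run in a plain sequential loop so the sketch stays portable. `fill_chunk` has the `void (*)(void *, size_t)` shape that GCD's `dispatch_apply_f` expects, so on Apple platforms that loop could presumably be replaced with `dispatch_apply_f(chunkCount, queue, &ctx, fill_chunk)`, which is untested here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { float x, y; } Point;

/* One stroke segment; count is the number of brush points it needs. */
typedef struct { Point start, end; size_t count; } Step;

/* State shared by all work units. */
typedef struct {
    const Step   *steps;
    const size_t *offsets;       /* offsets[i]: index of step i's first point */
    size_t        stepCount;
    size_t        stepsPerChunk;
    float        *vertices;      /* 2 floats per point, allocated up front */
} FillContext;

/* Work unit: fills the vertices of one chunk of steps. Each chunk writes
   a disjoint slice of the array, so chunks are independent of each other. */
void fill_chunk(void *context, size_t chunkIndex)
{
    FillContext *ctx = context;
    size_t first = chunkIndex * ctx->stepsPerChunk;
    size_t last = first + ctx->stepsPerChunk;
    if (last > ctx->stepCount)
        last = ctx->stepCount;

    for (size_t s = first; s < last; ++s) {
        const Step *step = &ctx->steps[s];
        float *out = ctx->vertices + 2 * ctx->offsets[s];
        for (size_t i = 0; i < step->count; ++i) {
            float t = (float)i / (float)step->count;
            out[2 * i + 0] = step->start.x + (step->end.x - step->start.x) * t;
            out[2 * i + 1] = step->start.y + (step->end.y - step->start.y) * t;
        }
    }
}

/* Builds one vertex array covering all steps. Returns NULL on allocation
   failure; the caller frees the result. */
float *build_vertices(const Step *steps, size_t stepCount,
                      size_t stepsPerChunk, size_t *outPointCount)
{
    assert(stepsPerChunk > 0);
    size_t *offsets = malloc((stepCount + 1) * sizeof *offsets);
    if (offsets == NULL)
        return NULL;

    size_t total = 0;                     /* prefix sum of point counts */
    for (size_t i = 0; i < stepCount; ++i) {
        offsets[i] = total;
        total += steps[i].count;
    }

    float *vertices = malloc((total > 0 ? total : 1) * 2 * sizeof *vertices);
    if (vertices == NULL) {
        free(offsets);
        return NULL;
    }

    FillContext ctx = { steps, offsets, stepCount, stepsPerChunk, vertices };
    size_t chunkCount = (stepCount + stepsPerChunk - 1) / stepsPerChunk;
    for (size_t c = 0; c < chunkCount; ++c)   /* sequential stand-in for GCD */
        fill_chunk(&ctx, c);

    free(offsets);
    *outPointCount = total;
    return vertices;
}
```

Because the array is complete before any OpenGL call is made, it can be handed to the rendering thread in one piece. `stepsPerChunk` is the knob for making each work unit large enough to outweigh the scheduling overhead.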

Source

2014-04-28 19:05:34

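The posted code does not reallocate on every pass through the loop. Each time the buffer fills up to its current size, it is reallocated at twice that size. That is a very efficient way to size a buffer dynamically. Of course, it would be even better if the size can be calculated ahead of time.

I realized that after posting, but left it as is, because the reallocation is slow and unnecessary. As I said, memory allocation is one of the slowest memory-based operations.

I believe the conversation in the comments above reveals the main part of your performance problem. Unless I completely misunderstand, the high-level structure of your code currently looks like this: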

loop over steps 
    calculate list of points from start/end points 
    render list of points 
    present the renderbuffer 
end loop

It should be massively faster to present the renderbuffer only once, after all of the steps have been rendered:

loop over steps 
    generate list of points from start/end points 
    draw list of points 
end loop 
present the renderbuffer

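Even better, generate a vertex buffer object (a.k.a. VBO) for each step as part of creating the step, and store the coordinates of that step's points in the buffer. Then your draw logic becomes: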

loop over steps 
    bind VBO for step 
    draw content of VBO 
end loop 
present the renderbuffer

Source

2014-04-29 15:46:36
