Structure of arrays vs. array of structures performance

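I'm using a 2011 Mac mini with an AMD Radeon HD 6630M. I draw a mesh using a structure of arrays and everything is fine: 60 fps (using CVDisplayLink). I use a shader with built-in attributes. Life is good. I'm now switching to an array of structures (interleaved), since I understand this is preferred by "modern" GPUs. The attributes are defined in the shader. The mesh draws beautifully, but the frame rate drops by about 33% (to 40 fps). There are multiple copies of these calls. Using Instruments (Time Profiler), I get the following comparison: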

Using structure of arrays (60 fps) 
Running Time Self Symbol Name 
3.0ms 0.0% 3.0 0x21b76c4   ATIRadeonX3000GLDriver 
2.0ms 0.0% 0.0 gldUpdateDispatch ATIRadeonX3000GLDriver 
2.0ms 0.0% 0.0 gleDoDrawDispatchCore  GLEngine 
2.0ms 0.0% 0.0  glDrawElements_ACC_Exec GLEngine 
2.0ms 0.0% 0.0  glDrawElements  libGL.dylib 
2.0ms 0.0% 0.0  -[Mesh draw]  me 

Using array of structures (40 fps) 
Running Time Self  Symbol Name 
393.0ms 7.4% 393.0 0x86f6695    ? 
393.0ms 7.4% 0.0 gleDrawArraysOrElements_ExecCore GLEngine 
393.0ms 7.4% 0.0 glDrawElements_IMM_Exec  GLEngine 
393.0ms 7.4% 0.0  glDrawElements   libGL.dylib 
393.0ms 7.4% 0.0  -[Mesh draw]   me

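It looks like libGL decides to take a different path, and with the array of structures it looks like the X3000 driver never receives the call. Is it being executed in Apple's software renderer? Should I stay with the structure of arrays? Has anyone seen something similar?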
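For readers unfamiliar with the two layouts being compared, here is a minimal C sketch of a structure of arrays, an interleaved array of structures, and a conversion between them. The type and function names (`MeshSoA`, `InterleavedVertex`, `interleave_mesh`) are hypothetical and do not come from the code in the question; the 3/3/2 component counts are also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Structure of arrays: one tightly packed array per attribute.
   (Hypothetical type; component counts 3/3/2 are assumed.) */
typedef struct {
    const float *positions;  /* 3 floats per vertex */
    const float *normals;    /* 3 floats per vertex */
    const float *texCoords;  /* 2 floats per vertex */
    size_t       count;      /* number of vertices  */
} MeshSoA;

/* Array of structures: all attributes of one vertex stored together. */
typedef struct {
    float position[3];
    float normal[3];
    float texCoord[2];
} InterleavedVertex;

/* Copy a structure-of-arrays mesh into an interleaved array.
   `out` must have room for soa->count vertices. */
void interleave_mesh(const MeshSoA *soa, InterleavedVertex *out)
{
    assert(soa != NULL && out != NULL);
    for (size_t i = 0; i < soa->count; ++i) {
        for (size_t k = 0; k < 3; ++k) {
            out[i].position[k] = soa->positions[3 * i + k];
            out[i].normal[k]   = soa->normals[3 * i + k];
        }
        for (size_t k = 0; k < 2; ++k) {
            out[i].texCoord[k] = soa->texCoords[2 * i + k];
        }
    }
}
```

In the interleaved layout every attribute shares one stride, `sizeof(InterleavedVertex)`, while in the structure of arrays each attribute array is tightly packed on its own.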

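The attribute code comes from an Apple example and is used in many places in my app (at least 10 other areas) without any performance hit. What follows is from the slow version. As I mentioned, the fast version uses built-in attributes because its data is not interleaved. The scene renders correctly, just slowly.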

I hope this is what you're looking for:

// Step 5 - Bind each of the vertex shader's attributes to the programs 
[self.meshShader addAttribute:@"inPosition"]; 
[self.meshShader addAttribute:@"inNormal"]; 
[self.meshShader addAttribute:@"inTexCoord"]; 

// Step 6 - Link the program 
if([[self meshShader] linkShader] == 0){ 
    self.posAttribute = [meshShader attributeIndex:@"inPosition"]; 
    self.normAttribute = [meshShader attributeIndex:@"inNormal"]; 
    self.texCoordAttribute = [meshShader attributeIndex:@"inTexCoord"]; 

... 


- (void) addAttribute:(NSString *)attributeName 
{ 
    if ([attributes containsObject:attributeName] == NO){ 
     [attributes addObject:attributeName]; 
     glBindAttribLocation(program, [attributes indexOfObject:attributeName],  
     [attributeName UTF8String]); 
    } 
}

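UPDATE: After further investigation:

1) I'm using dhpoWare's modelObj loader (modified). Since it uses an interleaved array of structures, it performs like my array-of-structures code, just with a smaller hit. I may be misinterpreting Instruments. The modelObj code calls glDrawElements_IMM_Exec, which also calls gleDoDrawDispatchCore in a roundabout way. I'm not sure whether it simply accumulates a batch of calls in glDrawElements_IMM_Exec and then bursts them through gleDoDrawDispatchCore. I don't know.

2) I think something is wrong with Instruments, because it shows GLEngine calling one of my unused internal 3ds object methods, which has no external hooks. I checked by setting an Xcode breakpoint there, and it was never hit. I no longer do 3DS.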

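I guess I'll keep looking around and maybe stumble on an answer. If anyone could give me an opinion on whether an array of structures is the way to go, it would be much appreciated.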

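SOLUTION: I added a VBO to the front end of this and all is well. The original code came from the OpenGL ES 2.0 guide; adding the VBO fixed my problem. The frame rate is 60 fps, with 1 ms driver calls. Here is the code: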

glGenVertexArrays(1, &vaoName); 
glBindVertexArray(vaoName); 

// new - create VBO 
glGenBuffers(1, &vboName); 
glBindBuffer(GL_ARRAY_BUFFER, vboName); 

// Allocate the VBO and load the interleaved vertex data into it
glBufferData(GL_ARRAY_BUFFER, sizeof(struct vertexAttribs) * self.numVertices,            
        vertexAttribData, GL_STATIC_DRAW); 
// end of new 

NSUInteger vtxStride = sizeof(struct vertexAttribs); 
//GLfloat *vtxBuf = (GLfloat *)vertexAttribData; // no longer use this 
GLfloat *vtxBuf = (GLfloat *)NULL;    // use this instead 

glEnableVertexAttribArray(self.posAttribute); 
glVertexAttribPointer(self.posAttribute, VERTEX_POS_SIZE, GL_FLOAT, GL_FALSE, 
         vtxStride, vtxBuf); 
vtxBuf += VERTEX_POS_SIZE; 

glEnableVertexAttribArray(self.normAttribute); 
glVertexAttribPointer(self.normAttribute, VERTEX_NORM_SIZE, GL_FLOAT, GL_FALSE, 
         vtxStride, vtxBuf); 
vtxBuf += VERTEX_NORM_SIZE; 

glEnableVertexAttribArray(self.texCoordAttribute); 
glVertexAttribPointer(self.texCoordAttribute, VERTEX_TEX_SIZE, GL_FLOAT, GL_FALSE, 
         vtxStride, vtxBuf); 
...
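When a buffer is bound to GL_ARRAY_BUFFER, the last argument of glVertexAttribPointer is interpreted as a byte offset into that buffer, which is why the code above starts from a NULL pointer. As a hedged alternative to incrementing a NULL-based pointer, the same offsets can be computed with `offsetof`. In the sketch below, the field names of `struct vertexAttribs`, the `VertexLayout` type, the `vertex_layout` function, and the 3/3/2 values of the size macros are all assumptions, since the source does not show them.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed component counts; the source does not show these values. */
#define VERTEX_POS_SIZE  3
#define VERTEX_NORM_SIZE 3
#define VERTEX_TEX_SIZE  2

/* Hypothetical stand-in for the source's struct vertexAttribs. */
struct vertexAttribs {
    float position[VERTEX_POS_SIZE];
    float normal[VERTEX_NORM_SIZE];
    float texCoord[VERTEX_TEX_SIZE];
};

/* Byte stride and per-attribute byte offsets of one interleaved vertex. */
typedef struct {
    size_t stride;
    size_t posOffset;
    size_t normOffset;
    size_t texOffset;
} VertexLayout;

VertexLayout vertex_layout(void)
{
    VertexLayout layout;
    layout.stride     = sizeof(struct vertexAttribs);
    layout.posOffset  = offsetof(struct vertexAttribs, position);
    layout.normOffset = offsetof(struct vertexAttribs, normal);
    layout.texOffset  = offsetof(struct vertexAttribs, texCoord);
    /* Sanity check: the last attribute must fit inside one vertex. */
    assert(layout.texOffset + VERTEX_TEX_SIZE * sizeof(float) <= layout.stride);
    return layout;
}
```

Each offset would then be cast to a pointer for the final argument, for example `(const GLvoid *)layout.normOffset`, with `layout.stride` passed as the stride.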

Source

2012-03-10 user1261484

Can you post the code that sets up the attribute arrays, for both the fast and the slow version? – 2012-03-10 20:50:49

Could you post your solution as an answer and accept it, so that others know this problem is solved? – 2012-03-27 15:34:33

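As a rule of thumb, use a structure of arrays so that memory is accessed with unit stride. This applies not only to GPUs but also to CPUs and to coprocessors such as the Intel Xeon Phi.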

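In your case, I don't believe this part of the code is being sent to the GPU; rather, the performance loss is due to non-unit-stride memory access (between the CPU and memory).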
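To make "unit stride" concrete, here is a small C sketch that reads one float per element with a caller-supplied byte stride. It only illustrates the access pattern and makes no claim about what the driver actually does in the question's case; the function name `sum_strided` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum `count` floats starting at `base`, advancing `strideBytes` bytes
   between consecutive reads. `strideBytes` must keep each read aligned
   for float (any multiple of sizeof(float) works for a float array). */
float sum_strided(const void *base, size_t count, size_t strideBytes)
{
    assert(base != NULL || count == 0);
    const unsigned char *p = (const unsigned char *)base;
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        total += *(const float *)(p + i * strideBytes);
    }
    return total;
}
```

With a tightly packed array of x values, the stride is `sizeof(float)` (unit stride) and every byte fetched is used. With interleaved vertices, the stride is the size of a whole vertex, so reading a single field skips over the other attributes in between.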

Source

2014-03-14 13:09:08
