2014-04-17 124 views
0

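Can anyone tell me why the max work items and compute units reported for my CPU are higher than those for my GPU? Does that mean the CPU performs better than the GPU? The clinfo device information for the CPU and GPU follows.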

CPU: Intel Core i7 2.2 GHz; GPU: AMD Radeon HD 6700M



Number of platforms:        2 
    Platform Profile:        FULL_PROFILE 
    Platform Version:        OpenCL 1.2 AMD-APP (1084.2) 
    Platform Name:         AMD Accelerated Parallel Processing
    Platform Vendor:        Advanced Micro Devices, Inc. 
    Platform Extensions:       cl_khr_icd cl_amd_event_callback cl_amd_offline_devices cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing
    Platform Profile:        FULL_PROFILE 
    Platform Version:        OpenCL 1.2 
    Platform Name:         Intel(R) OpenCL 
    Platform Vendor:        Intel(R) Corporation 
    Platform Extensions:       cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing


    Platform Name:         AMD Accelerated Parallel Processing
Number of devices:        2 
    Device Type:         CL_DEVICE_TYPE_GPU 
    Device ID:          4098 
    Max compute units:        6 
    Max work items dimensions:      3 
    Max work items[0]:       256 
    Max work items[1]:       256 
    Max work items[2]:       256 
    Max work group size:       256 
    Preferred vector width char:     16 
    Preferred vector width short:     8 
    Preferred vector width int:     4 
    Preferred vector width long:     2 
    Preferred vector width float:     4 
    Preferred vector width double:     0 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      4 
    Native vector width double:     0 
    Max clock frequency:       725Mhz 
    Address bits:         32 
    Max memory allocation:       536870912 
    Image support:         Yes 
    Max number of images read arguments:   128 
    Max number of images write arguments:   8 
    Max image 2D width:       16384 
    Max image 2D height:       16384 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     16 
    Max size of kernel argument:     1024 
    Alignment (bits) of base address:    2048 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          No 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        Yes 
    Round to +ve and infinity:     Yes 
    IEEE754-2008 fused multiply-add:    Yes 
    Cache type:         None 
    Cache line size:        0 
    Cache size:         0 
    Global memory size:       2147483648 
    Constant buffer size:       65536 
    Max number of constant args:     8 
    Local memory type:        Scratchpad 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  64 
    Error correction support:      0 
    Unified memory for Host and Device:   0 
    Profiling timer resolution:     1 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      No 
    Queue properties: 
    Out-of-Order:        No 
    Profiling :         Yes 
    Platform ID:         02843864 
    Name:           Turks 
    Vendor:          Advanced Micro Devices, Inc. 
    Driver version:        1084.2 (VM) 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 AMD-APP (1084.2) 
    Extensions:         cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_dx9_media_sharing


    Device Type:         CL_DEVICE_TYPE_CPU 
    Device ID:          4098 
    Max compute units:        8 
    Max work items dimensions:      3 
    Max work items[0]:       1024 
    Max work items[1]:       1024 
    Max work items[2]:       1024 
    Max work group size:       1024 
    Preferred vector width char:     16 
    Preferred vector width short:     8 
    Preferred vector width int:     4 
    Preferred vector width long:     2 
    Preferred vector width float:     8 
    Preferred vector width double:     4 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      8 
    Native vector width double:     4 
    Max clock frequency:       2195Mhz 
    Address bits:         32 
    Max memory allocation:       1073741824 
    Image support:         Yes 
    Max number of images read arguments:   128 
    Max number of images write arguments:   8 
    Max image 2D width:       8192 
    Max image 2D height:       8192 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     16 
    Max size of kernel argument:     4096 
    Alignment (bits) of base address:    1024 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          Yes 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        Yes 
    Round to +ve and infinity:     Yes 
    IEEE754-2008 fused multiply-add:    Yes 
    Cache type:         Read/Write 
    Cache line size:        64 
    Cache size:         32768 
    Global memory size:       2147483648 
    Constant buffer size:       65536 
    Max number of constant args:     8 
    Local memory type:        Global 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  1 
    Error correction support:      0 
    Unified memory for Host and Device:   1 
    Profiling timer resolution:     466 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      Yes 
    Queue properties: 
    Out-of-Order:        No 
    Profiling :         Yes 
    Platform ID:         02843864 
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Vendor:          GenuineIntel 
    Driver version:        1084.2 (sse2,avx) 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 AMD-APP (1084.2) 
    Extensions:         cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing


    Platform Name:         Intel(R) OpenCL 
Number of devices:        1 
    Device Type:         CL_DEVICE_TYPE_CPU 
    Device ID:          32902 
    Max compute units:        8 
    Max work items dimensions:      3 
    Max work items[0]:       1024 
    Max work items[1]:       1024 
    Max work items[2]:       1024 
    Max work group size:       1024 
    Preferred vector width char:     1 
    Preferred vector width short:     1 
    Preferred vector width int:     1 
    Preferred vector width long:     1 
    Preferred vector width float:     1 
    Preferred vector width double:     1 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      8 
    Native vector width double:     4 
    Max clock frequency:       2200Mhz 
    Address bits:         32 
    Max memory allocation:       536838144 
    Image support:         Yes 
    Max number of images read arguments:   480 
    Max number of images write arguments:   480 
    Max image 2D width:       16384 
    Max image 2D height:       16384 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     480 
    Max size of kernel argument:     3840 
    Alignment (bits) of base address:    1024 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          Yes 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        No 
    Round to +ve and infinity:     No 
    IEEE754-2008 fused multiply-add:    No 
    Cache type:         Read/Write 
    Cache line size:        64 
    Cache size:         262144 
    Global memory size:       2147352576 
    Constant buffer size:       131072 
    Max number of constant args:     480 
    Local memory type:        Global 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  128 
    Error correction support:      0 
    Unified memory for Host and Device:   1 
    Profiling timer resolution:     466 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      Yes 
    Queue properties: 
    Out-of-Order:        Yes 
    Profiling :         Yes 
    Platform ID:         00401218 
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Vendor:          Intel(R) Corporation 
    Driver version:        3.0.1.15216 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 (Build 80752) 
    Extensions:         cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing

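Why do I see three OpenCL devices, two of type CPU and one of type GPU? Is the second CPU entry the Intel CPU or its integrated GPU? I have two display adapters: AMD Radeon HD 6700M Series and Intel HD Graphics Family.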

+0

Flagging this as a thread that belongs on [su](http://superuser.com/). –

+0

What does "Max compute units: 6" or "8" mean? Is it the number of cores my Intel Core i7 has? And the GPU has only 6? – user1848223

+0

Any help, please? – user1848223

Answers

2

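"How many cores/processing elements/hardware threads does my GPU have?" is a very common question from new GPGPU users. My usual answer is "why do you care?". There is no way to query the number of processing elements a device has through the OpenCL API, and what exactly counts as a processing element, as opposed to a compute unit, differs greatly from one architecture to the next.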
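The limits that clinfo does report, such as "Max work group size", are launch constraints rather than a performance score. As a minimal sketch of what they are actually good for, here is one way to pick a local work size that fits a device. It assumes the OpenCL 1.x rule that the global size must be divisible by the local size. The helper name `pick_local_size` and the global size of 4096 are made up for illustration; the limits are copied from the listing in the question.

```python
def pick_local_size(global_size, max_work_group_size, preferred_multiple):
    """Largest local size that divides global_size and fits the device limit.

    A multiple of preferred_multiple is chosen when one exists; otherwise
    the largest plain divisor is returned. (Hypothetical helper.)
    """
    upper = min(global_size, max_work_group_size)
    candidates = [n for n in range(1, upper + 1) if global_size % n == 0]
    preferred = [n for n in candidates if n % preferred_multiple == 0]
    return max(preferred or candidates)


# Limits taken from the clinfo output above; 4096 is an arbitrary example size.
gpu_local = pick_local_size(4096, max_work_group_size=256, preferred_multiple=64)
cpu_local = pick_local_size(4096, max_work_group_size=1024, preferred_multiple=1)
print(gpu_local, cpu_local)
```

A larger limit on the CPU only means bigger work-groups are allowed there; it says nothing about which device finishes the job sooner.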

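In practice, how many processing elements a device has does not matter, because that number is a very poor way to estimate its performance. If you really need to know how fast a device is for a particular application, benchmark it, either with the application itself or with a micro-benchmark that has similar characteristics.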
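A benchmark of that kind can be as simple as timing the same workload on each device and keeping the best of several runs. The sketch below only shows the timing harness: `run_on_cpu` and `run_on_gpu` are hypothetical stand-ins that burn some host time, and in a real test each would enqueue your kernel on the corresponding device and wait for it to finish (for example with `clFinish`) before returning.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time, in seconds, over `repeats` calls of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def run_on_cpu():
    # Hypothetical stand-in: replace with "launch kernel on the CPU device and wait".
    sum(i * i for i in range(100_000))


def run_on_gpu():
    # Hypothetical stand-in: replace with "launch kernel on the GPU device and wait".
    sum(i * i for i in range(200_000))


results = {"cpu": best_time(run_on_cpu), "gpu": best_time(run_on_gpu)}
# Print the devices from fastest to slowest.
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Taking the minimum rather than the mean reduces the influence of one-off delays such as first-run kernel compilation.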

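To answer your other question: you have two OpenCL implementations installed that can use the CPU, Intel's and AMD's. Each platform therefore reports the CPU as an available OpenCL device, which is why it appears twice.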
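You can see this in the listing itself by grouping device names by the platform that reports them. The following is a rough sketch, not a general clinfo parser: `CLINFO_EXCERPT` is a hand-trimmed copy of the relevant lines from the question, and `platforms_by_device` is a name invented for this example.

```python
from collections import defaultdict

# Hand-trimmed excerpt of the clinfo output in the question (device section only).
CLINFO_EXCERPT = """\
Platform Name: AMD Accelerated Parallel Processing
Device Type: CL_DEVICE_TYPE_GPU
Name: Turks
Device Type: CL_DEVICE_TYPE_CPU
Name: Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
Platform Name: Intel(R) OpenCL
Device Type: CL_DEVICE_TYPE_CPU
Name: Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
"""


def platforms_by_device(text):
    """Map each device name to the list of platforms that report it."""
    seen = defaultdict(list)
    platform = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Platform Name":
            platform = value          # every following device belongs to this platform
        elif key == "Name":
            seen[value].append(platform)
    return dict(seen)


for device, platforms in platforms_by_device(CLINFO_EXCERPT).items():
    print(f"{device}: reported by {len(platforms)} platform(s): {', '.join(platforms)}")
```

The one physical CPU shows up under both platforms, while the Turks GPU is listed only by the AMD platform.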

+0

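I'm tired of this question, but I think we'll be dealing with it for a long time... In fact, it is a logical thing to ask. People will keep arriving from the CPU world, trying to control every "thread" by hand and wanting to know exactly how many there are, even when GPUs have millions of parallel cores... – DarkZeros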