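Detecting the OpenCL device vendor in kernel code
I am writing some platform-specific optimizations. I am aware that I could parse the vendor string in host code and pass it to the kernel with a -D option, but it might be more convenient to detect the vendor directly in the kernel, without involving the host (that way kernels can be optimized even without access to the host source code, ...).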
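For reference, the host-side approach mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: vendor_define and make_build_options are hypothetical helpers (not part of OpenCL), the macro names NVIDIA, AMD and INTEL are my own choice, and the substrings matched against the vendor string are guesses that should be checked against what the target drivers actually report. The vendor string itself would come from clGetDeviceInfo with CL_DEVICE_VENDOR.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of OpenCL): maps a vendor string, as
 * reported for CL_DEVICE_VENDOR, to a -D build option.
 * The matched substrings are assumptions; real vendor strings vary
 * between drivers and should be verified on the target platform. */
static const char *vendor_define(const char *vendor)
{
    if (strstr(vendor, "NVIDIA") != NULL)
        return "-DNVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") != NULL ||
        strstr(vendor, "AMD") != NULL)
        return "-DAMD";
    if (strstr(vendor, "Intel") != NULL)
        return "-DINTEL";
    return ""; /* unknown vendor: define nothing */
}

/* Hypothetical helper: appends the vendor define to the caller's build
 * options. Returns 0 on success, -1 if the result does not fit in out. */
static int make_build_options(char *out, size_t out_size,
                              const char *base_options, const char *vendor)
{
    int n = snprintf(out, out_size, "%s %s",
                     base_options, vendor_define(vendor));
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string would then be passed as the options argument of clBuildProgram, which is exactly the host involvement I would like to avoid.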
For in-kernel detection, I have come up with the following so far:
#ifdef __NV_CL_C_VERSION
/**
* @def NVIDIA
* @brief defined when compiling on NVIDIA GPUs
*/
#define NVIDIA
#endif // __NV_CL_C_VERSION
#if defined(__WinterPark__) || defined(__BeaverCreek__) || defined(__Turks__) || \
defined(__Caicos__) || defined(__Tahiti__) || defined(__Pitcairn__) || \
defined(__Capeverde__) || defined(__Cayman__) || defined(__Barts__) || \
defined(__Cypress__) || defined(__Juniper__) || defined(__Redwood__) || \
defined(__Cedar__) || defined(__ATI_RV770__) || defined(__ATI_RV730__) || \
defined(__ATI_RV710__) || defined(__Loveland__) || defined(__GPU__) || \
defined(__Hawaii__)
/**
* @def AMD
* @brief defined when compiling on AMD GPUs
* @note This list was originally found at https://github.com/magnumripper/JohnTheRipper/wiki/Predefined-macros-in-OpenCL-(standard-and-proprietary) and copied shamelessly. It is most definitely incomplete and contains the troubling __GPU__.
* @note AMD also defines __CPU__ when compiling for CL_DEVICE_TYPE_CPU.
*/
#define AMD
#endif // ...
Any additions or corrections? Does anyone know the corresponding defines for Intel?