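I am trying to write an ARM benchmark that runs the following instructions in loops (in assembly), both individually and in combination:
- Integer addition
- Integer multiplication
- Floating-point addition
- Floating-point multiplication
This is my code for the integer operations: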
int additions_int(int n) {
int i, dummyValue = n;
__asm (
"MOV R0, #2\n"
"MOV R1, #6\n"
);
for (i = 0; i < n/LOOP_STEP; i++) {
__asm (
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
"ADD R0, R0, R1\n"
);
}
return dummyValue;
}
int multiplications_int(int n) {
int i, dummyValue=n;
__asm (
"MOV R0, #2\n"
"MOV R1, #6\n"
);
for (i = 0; i < n/LOOP_STEP; i++) {
__asm (
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
"MUL R0, R0, R1\n"
);
}
return dummyValue;
}
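The problem is with the floating-point operations. I checked this documentation and tried the floating-point functions shown further down, compiling with:
arm-linux-gnueabi-gcc -static -march=armv7-a microbenchmark_arm.c -o microbenchmark_arm
I get this error: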
Error: selected processor does not support ARM mode `vmul.f32 R0,R0,R1'
Error: selected processor does not support ARM mode `vadd.f32 R0,R0,R1'
This is the floating-point code:
float multiplications_fp(int n) {
int i;
float fn=n, dummyValue = fn;
for (i = 0; i < n/LOOP_STEP; i++) {
__asm (
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
"VMUL.F32 R0, R0, R1\n"
);
}
return dummyValue;
}
float additions_fp(int n) {
int i;
float fn=n, dummyValue = fn;
for (i = 0; i < n/LOOP_STEP; i++) {
__asm (
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
"VADD.F32 R0, R0, R1\n"
);
}
return dummyValue;
}
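Can anyone tell me what I am doing wrong when compiling?
Can anyone show me an example of floating-point addition or multiplication for the ARM Cortex-A architecture?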
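How about first reading the ARMv7-A architecture reference manual, the datasheet of your target CPU, and the gcc manual? FYI: the loops are problematic because they are not deterministic. Please read up on how to benchmark correctly first. – Olaf
I see an example with a different spelling [here](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0491c/BABDEAGJ.html): 'VMUL.F32 d0, d0, d0'. I have no experience with ARM FP, so I can't tell you how to fix your syntax. – anatolyg
By the way, regarding "an example of floating-point addition or multiplication": you can see one by disassembling compiled code. – anatolyg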
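As a rough sketch of the kind of example asked for in the question: R0 and R1 are integer core registers, while scalar VFP instructions such as VMUL.F32 and VADD.F32 operate on the floating-point registers (s0-s31 for single precision), which is probably one reason the assembler rejects the code. The snippet below lets GCC choose the registers through the "t" operand constraint instead of hard-coding them. The helper names fp_mul_chain and fp_add_chain are hypothetical, and the portable fallback branch exists only so the code also builds on non-ARM hosts.

```c
/* Sketch: scalar single-precision VFP operations via GCC extended inline asm.
 * fp_mul_chain and fp_add_chain are hypothetical helper names. */

/* Multiply acc by factor, `steps` times in a row. */
float fp_mul_chain(float acc, float factor, int steps) {
    for (int i = 0; i < steps; i++) {
#if defined(__arm__) && defined(__ARM_FP)
        /* "t" asks GCC for a single-precision VFP register (s0-s31);
         * "+t" tells the compiler that acc is both read and written. */
        __asm__ volatile ("vmul.f32 %0, %0, %1" : "+t"(acc) : "t"(factor));
#else
        acc *= factor; /* fallback when not built for 32-bit ARM with an FPU */
#endif
    }
    return acc;
}

/* Add term to acc, `steps` times in a row. */
float fp_add_chain(float acc, float term, int steps) {
    for (int i = 0; i < steps; i++) {
#if defined(__arm__) && defined(__ARM_FP)
        __asm__ volatile ("vadd.f32 %0, %0, %1" : "+t"(acc) : "t"(term));
#else
        acc += term; /* fallback when not built for 32-bit ARM with an FPU */
#endif
    }
    return acc;
}
```

For instance, fp_mul_chain(1.0f, 2.0f, 10) multiplies by 2 ten times. The assembler also needs to be told that an FPU is present. Assuming the target core has at least VFPv3, adding something like -mfpu=vfpv3-d16 -mfloat-abi=softfp to the arm-linux-gnueabi-gcc command line from the question should enable these instructions; the right values depend on the actual CPU and toolchain.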