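How do I handle byte order differences when reading/writing floating-point types in C?

I'm designing a file format for my application, and I'd obviously like it to work on both big-endian and little-endian systems. I've already found working solutions for managing integral types using htonl and ntohl, but I'm a bit stuck when trying to do the same with float and double values.

Given the nature of how floating-point representations work, I would assume that the standard byte-order functions won't work on these values. Likewise, I'm not even entirely sure whether endianness in the traditional sense is what governs the byte order of these types.

All I need is consistency: a way to write a double out and be sure I get the same value back when I read it in. How can I do this in C?

Source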

2013-02-19 Alexis King

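Is saving as text not an option? – qPCR4vir 2013-02-19 09:33:40

@qPCR4vir That could cause a significant performance hit. – fuz 2013-02-19 09:45:49

Libraries such as HDF5 (http://www.hdfgroup.org/HDF5/) take care of all of this for you. I suspect HDF5 may be a bit heavyweight for your needs, though. – 2013-02-19 09:51:02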

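Another option is to use double frexp(double value, int *exp); from <math.h> (C99) to break the floating-point value down into a normalized fraction (in the range [0.5, 1)) and an integral power of 2. You can then multiply the fraction by FLT_RADIX^DBL_MANT_DIG to get an integer in the range [FLT_RADIX^DBL_MANT_DIG/2, FLT_RADIX^DBL_MANT_DIG). Then you save both integers big- or little-endian, whichever your format specifies.

When you load a saved number, you do the reverse operation and use double ldexp(double x, int exp); to multiply the reconstructed fraction by the power of 2.

This works best when FLT_RADIX = 2 (virtually all systems, I suppose?) and DBL_MANT_DIG <= 64.

Care must be taken to avoid overflows.

Sample code for doubles: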

#include <limits.h> 
#include <float.h> 
#include <math.h> 
#include <string.h> 
#include <stdio.h> 

#if CHAR_BIT != 8 
#error currently supported only CHAR_BIT = 8 
#endif 

#if FLT_RADIX != 2 
#error currently supported only FLT_RADIX = 2 
#endif 

#ifndef M_PI 
#define M_PI 3.14159265358979324 
#endif 

typedef unsigned char uint8; 

/* 
    10-byte little-endian serialized format for double: 
    - normalized mantissa stored as 64-bit (8-byte) signed integer: 
     negative range: (-2^53, -2^52] 
     zero: 0 
     positive range: [+2^52, +2^53) 
    - 16-bit (2-byte) signed exponent: 
     range: [-0x7FFE, +0x7FFE] 

    Represented value = mantissa * 2^(exponent - 53) 

    Special cases: 
    - +infinity: mantissa = 0x7FFFFFFFFFFFFFFF, exp = 0x7FFF 
    - -infinity: mantissa = 0x8000000000000000, exp = 0x7FFF 
    - NaN:  mantissa = 0x0000000000000000, exp = 0x7FFF 
    - +/-0:  only one zero supported 
*/ 

void Double2Bytes(uint8 buf[10], double x) 
{ 
    double m; 
    long long im; // at least 64 bits 
    int ie; 
    int i; 

    if (isnan(x))
    {
        // NaN
        memcpy(buf, "\x00\x00\x00\x00\x00\x00\x00\x00" "\xFF\x7F", 10);
        return;
    }
    else if (isinf(x))
    {
        if (signbit(x))
            // -inf
            memcpy(buf, "\x00\x00\x00\x00\x00\x00\x00\x80" "\xFF\x7F", 10);
        else
            // +inf
            memcpy(buf, "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x7F" "\xFF\x7F", 10);
        return;
    }

    // Split double into normalized mantissa (range: (-1, -0.5], 0, [+0.5, +1)) 
    // and base-2 exponent 
    m = frexp(x, &ie); // x = m * 2^ie exactly for FLT_RADIX=2 
        // frexp() can't fail 
    // Extract most significant 53 bits of mantissa as integer 
    m = ldexp(m, 53); // can't overflow because 
        // DBL_MAX_10_EXP >= 37 equivalent to DBL_MAX_2_EXP >= 122 
    im = trunc(m); // exact unless DBL_MANT_DIG > 53 

    // If the exponent is too small or too big, reduce the number to 0 or
    // +/- infinity
    if (ie > 0x7FFE)
    {
        if (im < 0)
            // -inf
            memcpy(buf, "\x00\x00\x00\x00\x00\x00\x00\x80" "\xFF\x7F", 10);
        else
            // +inf
            memcpy(buf, "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x7F" "\xFF\x7F", 10);
        return;
    }
    else if (ie < -0x7FFE)
    {
        // 0
        memcpy(buf, "\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00", 10);
        return;
    }

    // Store im as signed 64-bit little-endian integer
    for (i = 0; i < 8; i++, im >>= 8)
        buf[i] = (uint8)im;

    // Store ie as signed 16-bit little-endian integer
    for (i = 8; i < 10; i++, ie >>= 8)
        buf[i] = (uint8)ie;
} 

void Bytes2Double(double* x, const uint8 buf[10]) 
{ 
    unsigned long long uim; // at least 64 bits 
    long long im; // ditto 
    unsigned uie; 
    int ie; 
    double m; 
    int i; 
    int negative = 0; 
    int maxe; 

    if (!memcmp(buf, "\x00\x00\x00\x00\x00\x00\x00\x00" "\xFF\x7F", 10))
    {
#ifdef NAN
        *x = NAN;
#else
        *x = 0; // NaN is not supported, use 0 instead (we could return an error)
#endif
        return;
    }

    if (!memcmp(buf, "\x00\x00\x00\x00\x00\x00\x00\x80" "\xFF\x7F", 10))
    {
        *x = -INFINITY;
        return;
    }
    else if (!memcmp(buf, "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x7F" "\xFF\x7F", 10))
    {
        *x = INFINITY;
        return;
    }

    // Load im as signed 64-bit little-endian integer
    uim = 0;
    for (i = 0; i < 8; i++)
    {
        uim >>= 8;
        uim |= (unsigned long long)buf[i] << (64 - 8);
    }
    if (uim <= 0x7FFFFFFFFFFFFFFFLL)
        im = uim;
    else
        im = (long long)(uim - 0x7FFFFFFFFFFFFFFFLL - 1) - 0x7FFFFFFFFFFFFFFFLL - 1;

    // Obtain the absolute value of the mantissa, make sure it's
    // normalized and fits into 53 bits, else the input is invalid
    if (im > 0)
    {
        if (im < (1LL << 52) || im >= (1LL << 53))
        {
#ifdef NAN
            *x = NAN;
#else
            *x = 0; // NaN is not supported, use 0 instead (we could return an error)
#endif
            return;
        }
    }
    else if (im < 0)
    {
        if (im > -(1LL << 52) || im <= -(1LL << 53))
        {
#ifdef NAN
            *x = NAN;
#else
            *x = 0; // NaN is not supported, use 0 instead (we could return an error)
#endif
            return;
        }
        negative = 1;
        im = -im;
    }

    // Load ie as signed 16-bit little-endian integer
    uie = 0;
    for (i = 8; i < 10; i++)
    {
        uie >>= 8;
        uie |= (unsigned)buf[i] << (16 - 8);
    }
    if (uie <= 0x7FFF)
        ie = uie;
    else
        ie = (int)(uie - 0x7FFF - 1) - 0x7FFF - 1;

    // If DBL_MANT_DIG < 53, truncate the mantissa 
    im >>= (53 > DBL_MANT_DIG) ? (53 - DBL_MANT_DIG) : 0; 

    m = im; 
    m = ldexp(m, (53 > DBL_MANT_DIG) ? -DBL_MANT_DIG : -53); // can't overflow 
      // because DBL_MAX_10_EXP >= 37 equivalent to DBL_MAX_2_EXP >= 122 

    // Find out the maximum base-2 exponent and
    // if ours is greater, return +/- infinity
    frexp(DBL_MAX, &maxe);
    if (ie > maxe)
        m = INFINITY;
    else
        m = ldexp(m, ie); // underflow may cause a floating-point exception

    *x = negative ? -m : m; 
} 

int test(double x, const char* name) 
{ 
    uint8 buf[10], buf2[10]; 
    double x2; 
    int error1, error2; 

    Double2Bytes(buf, x); 
    Bytes2Double(&x2, buf); 
    Double2Bytes(buf2, x2); 

    printf("%+.15E '%s' -> %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X\n", 
     x, 
     name, 
     buf[0],buf[1],buf[2],buf[3],buf[4],buf[5],buf[6],buf[7],buf[8],buf[9]); 

    if ((error1 = memcmp(&x, &x2, sizeof(x))) != 0)
        puts("Bytes2Double(Double2Bytes(x)) != x");

    if ((error2 = memcmp(buf, buf2, sizeof(buf))) != 0)
        puts("Double2Bytes(Bytes2Double(Double2Bytes(x))) != Double2Bytes(x)");

    puts(""); 

    return error1 || error2; 
} 

int testInf(void) 
{ 
    uint8 buf[10]; 
    double x, x2; 
    int error; 

    x = DBL_MAX; 
    Double2Bytes(buf, x); 
    if (!++buf[8])
        ++buf[9]; // increment the exponent beyond the maximum
    Bytes2Double(&x2, buf); 

    printf("%02X %02X %02X %02X %02X %02X %02X %02X %02X %02X -> %+.15E\n", 
     buf[0],buf[1],buf[2],buf[3],buf[4],buf[5],buf[6],buf[7],buf[8],buf[9], 
     x2); 

    if ((error = !isinf(x2)) != 0)
        puts("Bytes2Double(Double2Bytes(DBL_MAX) * 2) != INF");

    puts(""); 

    return error; 
} 

#define VALUE_AND_NAME(V) { V, #V } 

const struct 
{ 
    double value; 
    const char* name; 
} testData[] = 
{ 
#ifdef NAN 
    VALUE_AND_NAME(NAN), 
#endif 
    VALUE_AND_NAME(0.0), 
    VALUE_AND_NAME(+DBL_MIN), 
    VALUE_AND_NAME(-DBL_MIN), 
    VALUE_AND_NAME(+1.0), 
    VALUE_AND_NAME(-1.0), 
    VALUE_AND_NAME(+M_PI), 
    VALUE_AND_NAME(-M_PI), 
    VALUE_AND_NAME(+DBL_MAX), 
    VALUE_AND_NAME(-DBL_MAX), 
    VALUE_AND_NAME(+INFINITY), 
    VALUE_AND_NAME(-INFINITY), 
}; 

int main(void) 
{ 
    unsigned i; 
    int errors = 0; 

    for (i = 0; i < sizeof(testData)/sizeof(testData[0]); i++)
        errors += test(testData[i].value, testData[i].name);

    errors += testInf(); 

    // Test subnormal values. A floating-point exception may be raised. 
    errors += test(+DBL_MIN/2, "+DBL_MIN/2"); 
    errors += test(-DBL_MIN/2, "-DBL_MIN/2"); 

    printf("%d error(s)\n", errors); 

    return 0; 
}

Output (ideone):

+NAN 'NAN' -> 00 00 00 00 00 00 00 00 FF 7F 

+0.000000000000000E+00 '0.0' -> 00 00 00 00 00 00 00 00 00 00 

+2.225073858507201E-308 '+DBL_MIN' -> 00 00 00 00 00 00 10 00 03 FC 

-2.225073858507201E-308 '-DBL_MIN' -> 00 00 00 00 00 00 F0 FF 03 FC 

+1.000000000000000E+00 '+1.0' -> 00 00 00 00 00 00 10 00 01 00 

-1.000000000000000E+00 '-1.0' -> 00 00 00 00 00 00 F0 FF 01 00 

+3.141592653589793E+00 '+M_PI' -> 18 2D 44 54 FB 21 19 00 02 00 

-3.141592653589793E+00 '-M_PI' -> E8 D2 BB AB 04 DE E6 FF 02 00 

+1.797693134862316E+308 '+DBL_MAX' -> FF FF FF FF FF FF 1F 00 00 04 

-1.797693134862316E+308 '-DBL_MAX' -> 01 00 00 00 00 00 E0 FF 00 04 

+INF '+INFINITY' -> FF FF FF FF FF FF FF 7F FF 7F 

-INF '-INFINITY' -> 00 00 00 00 00 00 00 80 FF 7F 

FF FF FF FF FF FF 1F 00 01 04 -> +INF 

+1.112536929253601E-308 '+DBL_MIN/2' -> 00 00 00 00 00 00 10 00 02 FC 

-1.112536929253601E-308 '-DBL_MIN/2' -> 00 00 00 00 00 00 F0 FF 02 FC 

0 error(s)

Source

2013-02-19 10:20:09

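Great, thanks. I guess my only concern is the overflows. How would you suggest handling them? Also, is this method exact? – 2013-02-19 21:06:40

It must be exact if the floating-point values are base-2. The only overflow you can run into is when the CPU format has a wider exponent range than the file format, or the other way around. You may need to check for that and use the special infinity value, or the maximum value when infinities aren't supported. The mantissa can have a similar problem with its number of bits/digits: if it doesn't fit into the CPU format or the file slot, you need to truncate or round it. – 2013-02-19 21:31:36

You can optimize the code for IEEE-754, so that if the CPU supports IEEE-754 you skip the special handling and the checks entirely. – 2013-02-19 21:32:44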
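
The IEEE-754 shortcut from the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: double is an IEEE-754 binary64 value occupying 8 bytes, the machine uses the same byte order for integers and doubles, and the file stores the least significant byte first. The names put_double_le and get_double_le are made up for this example.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Sketch only: refuse to compile where double does not look like binary64. */
#if FLT_RADIX != 2 || DBL_MANT_DIG != 53 || DBL_MAX_EXP != 1024
#error this shortcut assumes IEEE-754 binary64 doubles
#endif
typedef char double_must_be_8_bytes[sizeof(double) == 8 ? 1 : -1];

/* Hypothetical helper: store x as 8 bytes, least significant byte first. */
void put_double_le(unsigned char buf[8], double x)
{
    uint64_t bits;
    int i;

    memcpy(&bits, &x, sizeof bits);  /* copy the bit pattern, no conversion */
    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char)(bits >> (8 * i));
}

/* Hypothetical helper: inverse of put_double_le(). */
double get_double_le(const unsigned char buf[8])
{
    uint64_t bits = 0;
    double x;
    int i;

    for (i = 0; i < 8; i++)
        bits |= (uint64_t)buf[i] << (8 * i);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because the bytes are produced with shifts rather than by dumping memory, the file layout does not depend on the host's integer byte order.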

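Depending on the application, it could be a good idea to use a plain-text data format (XML being one possibility). If you don't want to waste disk space, you can compress it.

Source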

2013-02-19 09:34:52

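When writing floating-point values as text, '%a' may be a better choice than '%f'/'%e'/'%g'. It is less readable, but it avoids both truncating decimal digits and printing too many of them. – 2013-02-19 09:54:09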
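
To illustrate the '%a' suggestion, here is a minimal sketch of a text round trip, assuming a C99 library whose printf supports '%a' and whose strtod() accepts hexadecimal floating-point input. The helper names are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format x as a C99 hexadecimal floating-point string.
   Returns the number of characters written, or -1 if buf is too small. */
int double_to_hex_text(char *buf, size_t size, double x)
{
    int n = snprintf(buf, size, "%a", x);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical helper: parse the text back into a double. */
double hex_text_to_double(const char *text)
{
    return strtod(text, NULL);
}
```

With FLT_RADIX = 2 the hexadecimal digits represent the stored value exactly, so the value read back should compare equal to the one written. Note that strtod() honours the current locale's decimal-point character.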

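Floating-point values use the same byte order as integer values, IMHO. Use a union to overlay them with the corresponding integer type, and use the usual hton functions: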

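#include <stdint.h>    /* uint32_t */
#include <arpa/inet.h> /* htonl (POSIX) */

/* Assumes float is 32 bits wide and shares the integer byte order. */
float htonf(float x) {
    union foo {
        float f;
        uint32_t i;
    } foo = { .f = x };

    foo.i = htonl(foo.i);
    return foo.f;
}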
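
A possible variant, sketched here under the same assumptions (32-bit float, same byte order for floats and integers), keeps the swapped bytes in an integer instead of a float. A byte-swapped float may happen to be a NaN pattern, and on some platforms such a value may not survive being passed around as a float unchanged. The names pack_float and unpack_float are invented for this example.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h> /* htonl, ntohl (POSIX) */

typedef char float_must_be_4_bytes[sizeof(float) == 4 ? 1 : -1];

/* Hypothetical helper: float -> 32-bit pattern in network byte order. */
uint32_t pack_float(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);  /* copy the bit pattern, no conversion */
    return htonl(bits);
}

/* Hypothetical helper: inverse of pack_float(). */
float unpack_float(uint32_t wire)
{
    uint32_t bits = ntohl(wire);
    float x;

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

The uint32_t returned by pack_float() is what gets written to the file; it is never interpreted as a float while in swapped form.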

Source

2013-02-19 09:45:23 fuz

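There have been platforms where integers and floating-point numbers had different byte orders. But we probably don't care about supporting those. – 2013-02-19 18:06:24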

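XML is probably the most portable way to do this.

However, it looks like you already have most of the parser built and are just stuck on the float/double issue. I would suggest writing the value out as a string (to whatever precision you want) and then reading it back in.

Unless all of your target platforms use IEEE-754 floats (and doubles), no byte-swapping trick is going to work for you.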
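
As a rough sketch of the string approach: if the goal is to read back exactly the value that was written, the precision has to be high enough. The code below assumes DBL_DECIMAL_DIG from C11 where available and otherwise falls back to 17 significant digits, which is enough for IEEE-754 binary64 doubles. The helper names are hypothetical.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef DBL_DECIMAL_DIG
#define ROUNDTRIP_DIGITS DBL_DECIMAL_DIG  /* C11 */
#else
#define ROUNDTRIP_DIGITS 17               /* assumption: binary64 double */
#endif

/* Hypothetical helper: write x as decimal text that can be read back exactly.
   Returns the number of characters written, or -1 if buf is too small. */
int double_to_text(char *buf, size_t size, double x)
{
    int n = snprintf(buf, size, "%.*g", ROUNDTRIP_DIGITS, x);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical helper: parse the decimal text back into a double. */
double text_to_double(const char *text)
{
    return strtod(text, NULL);
}
```

The exact round trip relies on the C library converting correctly in both directions, which is an assumption about the implementation rather than something this code checks.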

Source

2013-02-19 09:45:43

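Wait... there are platforms that don't use IEEE 754 floating point these days? – fuz 2013-02-19 09:46:32

I don't think the ordering of the bits of an IEEE-754 float/double in RAM is guaranteed. It could be anything, and you shouldn't manipulate its contents directly. – 2013-02-19 09:48:11

Here's an interesting post about how to make sure your platform's double implementation conforms to IEEE-754: http://stackoverflow.com/a/753018/1384030 – 2013-02-19 09:49:04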
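
Independently of the linked post, a simple check in plain C might look like the sketch below. It only inspects the format parameters from <float.h> and the __STDC_IEC_559__ macro; it says nothing about byte order, and the function names are made up for this example.

```c
#include <float.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if double has the size, radix, precision
   and exponent range of IEEE-754 binary64, else 0. */
int double_looks_like_binary64(void)
{
    return FLT_RADIX == 2
        && DBL_MANT_DIG == 53
        && DBL_MAX_EXP == 1024
        && DBL_MIN_EXP == -1021
        && sizeof(double) * CHAR_BIT == 64;
}

/* Hypothetical helper: returns 1 if the implementation itself claims
   IEC 60559 (IEEE-754) conformance via the optional C99 macro. */
int implementation_claims_iec559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}
```

Some compilers do not define __STDC_IEC_559__ even on IEEE-754 hardware, so the parameter check is the more practical of the two.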

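If you guarantee that your implementations always treat serialized floating-point representations as being in one specified format, you will be fine (IEEE 754 is the common choice).

Yes, architectures may order floating-point numbers differently (e.g. big- or little-endian). Therefore you will want to specify the endianness somehow. It could be fixed by the format's specification, or it could be variable and recorded in the file's data.

The last major pitfall is that alignment for built-in types may vary. How your hardware/processor handles misaligned data is implementation-defined. So you may need to swap the data/bytes first and then move them to the destination float/double.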
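
The "record the byte order in the file" option, together with the alignment caveat, could be sketched as follows. This assumes producer and consumer share the same floating-point format and that each machine uses one byte order for both integers and doubles. The marker value and all function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical marker the producer writes in its native byte order. */
#define ORDER_MARK 0x01020304u

/* Reverse n bytes in place. */
static void reverse_bytes(unsigned char *p, size_t n)
{
    size_t i;

    for (i = 0; i < n / 2; i++) {
        unsigned char t = p[i];
        p[i] = p[n - 1 - i];
        p[n - 1 - i] = t;
    }
}

/* Returns 0 if the file matches the host byte order, 1 if values must be
   byte-swapped, -1 if the marker is not recognized. */
int needs_swap(const unsigned char header[4])
{
    uint32_t mark;

    memcpy(&mark, header, sizeof mark);
    if (mark == ORDER_MARK)
        return 0;
    if (mark == 0x04030201u)
        return 1;
    return -1;
}

/* Read a double from a possibly misaligned position in a byte buffer:
   copy to a temporary, swap if needed, then copy into the double. */
double read_double(const unsigned char *src, int swap)
{
    unsigned char tmp[sizeof(double)];
    double x;

    memcpy(tmp, src, sizeof tmp);
    if (swap)
        reverse_bytes(tmp, sizeof tmp);
    memcpy(&x, tmp, sizeof x);
    return x;
}
```

Going through memcpy() instead of casting the buffer pointer to double * avoids any dependence on how the hardware treats misaligned accesses.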

Source

2013-02-19 09:47:04 justin

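A library like HDF5 or even NetCDF is probably, as High Performance Mark said, a bit heavyweight for this, unless you also need the other features those libraries offer.

A lighter-weight alternative that deals only with serialization would be, for example, XDR (see also the Wikipedia description). Many operating systems supply XDR routines out of the box; if that is not sufficient, free-standing XDR libraries exist as well.

Source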

2013-02-19 10:31:20 janneb
