Force evaluation of function input before benchmarking in Criterion

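How do you force evaluation of the input to a function before benchmarking the function in Criterion? I am trying to benchmark some functions, but would like to exclude the time spent evaluating the input thunk. The code in question uses unboxed vectors for input, which cannot be deepseq'd for Int vectors. Example code snippet below: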

-- V is Data.Vector.Unboxed 
shortv = V.fromList [1..10] :: V.Vector GHC.Int.Int16 
intv = V.fromList [1..10] :: V.Vector GHC.Int.Int32 

main :: IO() 
main = defaultMain [ 
      bench "encode ShortV" $ whnf encodeInt16V shortv 
      ,bench "encode IntV" $ whnf encodeInt32V intv 
     ]

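The Criterion benchmark time seems to include the time to build the shortv and intv inputs when benchmarking the functions above. The Criterion measurements are below. It reports roughly 400ns for each function, which seems to include the build time for the input as well: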

benchmarking encode ShortV 
mean: 379.6917 ns, lb 378.0229 ns, ub 382.4529 ns, ci 0.950 
std dev: 10.79084 ns, lb 7.360444 ns, ub 15.89614 ns, ci 0.950 

benchmarking encode IntV 
mean: 392.2736 ns, lb 391.2816 ns, ub 393.4853 ns, ci 0.950 
std dev: 5.565134 ns, lb 4.694539 ns, ub 6.689224 ns, ci 0.950

Now, if the main part of the benchmark code is changed to the following (by removing the second bench call):

main = defaultMain [ 
      bench "encode ShortV" $ whnf encodeInt16V shortv 
     ]

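the shortv input seems to be evaluated before the encodeInt16V function is benchmarked. This is the output I actually want, because the benchmark then measures the time to execute the function, excluding the time to build the input. Criterion output below: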

benchmarking encode ShortV 
mean: 148.8488 ns, lb 148.4714 ns, ub 149.6279 ns, ci 0.950 
std dev: 2.658834 ns, lb 1.621119 ns, ub 5.184792 ns, ci 0.950

Similarly, if I benchmark only "encode IntV", I get ~150ns for that one too.

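I know from the Criterion documentation that it tries to avoid lazy evaluation in order to get more accurate benchmarks. That makes sense, and it isn't really the issue here. My question is how to build the shortv and intv inputs so that they are already evaluated before being passed to the bench function. Right now I can achieve this by restricting defaultMain to benchmark only one function at a time (as shown above), but that is not an ideal solution.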

EDIT1

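There is something else going on with the Criterion benchmark, and it seems to happen only with Vector arrays, not lists. If I force full evaluation by printing shortv and intv, the benchmark still measures ~400ns, not ~150ns. Updated code below: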

main = do 
    V.forM_ shortv $ \x -> do print x 
    V.forM_ intv $ \x -> do print x 
    defaultMain [ 
      bench "encode ShortV" $ whnf encodeInt16V shortv 
      ,bench "encode IntV" $ whnf encodeInt32V intv 
     ]

Criterion output (it also reports 158.4% outliers, which doesn't seem right):

estimating clock resolution... 
mean is 5.121819 us (160001 iterations) 
found 253488 outliers among 159999 samples (158.4%) 
    126544 (79.1%) low severe 
    126944 (79.3%) high severe 
estimating cost of a clock call... 
mean is 47.45021 ns (35 iterations) 
found 5 outliers among 35 samples (14.3%) 
    2 (5.7%) high mild 
    3 (8.6%) high severe 

benchmarking encode ShortV 
mean: 382.1599 ns, lb 381.3501 ns, ub 383.0841 ns, ci 0.950 
std dev: 4.409181 ns, lb 3.828800 ns, ub 5.216401 ns, ci 0.950 

benchmarking encode IntV 
mean: 394.0517 ns, lb 392.4718 ns, ub 396.7014 ns, ci 0.950 
std dev: 10.20773 ns, lb 7.101707 ns, ub 17.53715 ns, ci 0.950

Source

2011-12-04 Sal

You can use evaluate before calling defaultMain to run the benchmarks. Not sure whether it's the cleanest solution, but it would look like this:

main = do 
    evaluate shortv 
    evaluate intv 
    defaultMain [..]
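For completeness, here is a fuller sketch of the same idea with the imports spelled out. The encoders are not shown in the question, so encodeInt16V and encodeInt32V below are hypothetical stand-ins (a plain V.sum), and the use of force from Control.DeepSeq is my own addition, not something the question uses. For unboxed vectors force is probably redundant, since evaluating them to WHNF should already evaluate the elements. Binding the vectors inside main instead of at top level is also just an assumption about what might help. I have not measured whether it changes the timings.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Criterion.Main (bench, defaultMain, whnf)
import Data.Int (Int16, Int32)
import qualified Data.Vector.Unboxed as V

-- Hypothetical stand-ins: the question does not show the real encoders.
encodeInt16V :: V.Vector Int16 -> Int16
encodeInt16V = V.sum

encodeInt32V :: V.Vector Int32 -> Int32
encodeInt32V = V.sum

main :: IO ()
main = do
  -- Build the inputs and evaluate them fully before any benchmark runs.
  shortv <- evaluate (force (V.fromList [1 .. 10] :: V.Vector Int16))
  intv   <- evaluate (force (V.fromList [1 .. 10] :: V.Vector Int32))
  defaultMain
    [ bench "encode ShortV" $ whnf encodeInt16V shortv
    , bench "encode IntV"   $ whnf encodeInt32V intv
    ]
```

Swap the two stand-in functions for the real encoders to try it on the actual code.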

Source

2011-12-04 23:17:12

Tested with evaluate as you suggested, but it didn't change the results. I guess that's because it evaluates the expression to WHNF, not NF. – Sal

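I don't think that can be the issue: for example, the unboxed 'Vector Int32' is a newtype for 'Data.Vector.Primitive.Vector Int32', which contains only strict fields of 'Int' and 'ByteArray'. The last one is a data type around the raw 'ByteArray#', which I believe is strict. But it should be easy to test: just change the 'evaluate' calls to print the sum instead. –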

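Yes, you are right. I tried forM_ with a print statement on the vectors and got the same result. Something else is going on, and it seems to be very specific to Vector arrays. I didn't see this issue when I originally wrote the functions using lists. – Sal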
