I've started experimenting with creating the following: optimizing batch size.
public static IEnumerable<List<T>> OptimizedBatches<T>(this IEnumerable<T> items)
A client of this extension method would then use it like this:
foreach (var list in extracter.EnumerateAll().OptimizedBatches())
{
    // at some unknown batch size, process time starts to
    // increase at an exponential rate
}
Here is an example:
batch length    time
1               100ms
2               102ms
4               110ms
8               111ms
16              118ms
32              119ms
64              134ms
128             500ms   <-- length doubled, but the time more than doubled
256             1100ms  <-- oh no!!
Based on the above, the best batch length is 64, because 64/134 is the best length/time ratio.
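To make the arithmetic explicit, here is a small sketch that computes length/time for every row of the table and picks the largest. The class and method names (RatioExample, BestLength) are made up for illustration; only the numbers come from the table.

```csharp
using System;
using System.Linq;

static class RatioExample
{
    // (batch length, elapsed milliseconds) pairs copied from the table above.
    static readonly (int Length, long Ms)[] Samples =
    {
        (1, 100), (2, 102), (4, 110), (8, 111), (16, 118),
        (32, 119), (64, 134), (128, 500), (256, 1100),
    };

    // Returns the sampled length with the highest items-per-millisecond ratio.
    public static int BestLength() =>
        Samples.OrderByDescending(s => (double)s.Length / s.Ms).First().Length;
}
```

For these samples the ratio rises to about 0.48 items/ms at length 64 and falls to about 0.26 at length 128, so BestLength() returns 64.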
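So the question is: what algorithm should be used to automatically choose the best batch length, based on the successive timings between iterator steps?
Here is what I have so far - it is not yet ...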
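using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class LengthOptimizer
{
    private Stopwatch sw;
    private int length = 1;
    private List<RateRecord> rateRecords = new List<RateRecord>();

    // Each read stops the timer for the previous batch, records its rate,
    // and returns the length to use for the next batch.
    public int Length
    {
        get
        {
            if (sw == null)
            {
                // First call: nothing to measure yet.
                length = 1;
                sw = new Stopwatch();
            }
            else
            {
                sw.Stop();
                rateRecords.Add(new RateRecord { Length = length, ElapsedMilliseconds = sw.ElapsedMilliseconds });
                // Reuses the best length recorded so far; no larger length is tried yet.
                length = rateRecords.OrderByDescending(c => c.Rate).First().Length;
            }
            // Restart resets the elapsed time, so each record times a single batch.
            sw.Restart();
            return length;
        }
    }
}

struct RateRecord
{
    public int Length { get; set; }
    public long ElapsedMilliseconds { get; set; }
    // Items per millisecond; higher is better.
    public float Rate { get { return ((float)Length) / ElapsedMilliseconds; } }
}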
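One possible direction, sketched under assumptions rather than offered as a definitive answer: keep doubling the batch length while the length/time ratio improves, and fall back to the best length seen as soon as it gets worse. DoublingLengthOptimizer, its Report method, and the maxLength cap are hypothetical names introduced here. The sketch times only the caller's processing of each batch (the gap between yield return and the next iteration), and it trusts a single measurement per length, so noisy timings could make it settle too early.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: doubles the length while throughput improves,
// then locks in the best length seen.
class DoublingLengthOptimizer
{
    private readonly int maxLength;
    private int bestLength = 1;
    private double bestRate;
    private bool settled;

    public DoublingLengthOptimizer(int maxLength = 1 << 20) { this.maxLength = maxLength; }

    // Length to use for the next batch.
    public int Current { get; private set; } = 1;

    // Report how long a batch of Current items took to process.
    public void Report(double elapsedMs)
    {
        if (settled) return;
        double rate = Current / Math.Max(elapsedMs, 0.001); // avoid dividing by zero
        if (rate > bestRate)
        {
            bestRate = rate;
            bestLength = Current;
            if (Current <= maxLength / 2) Current *= 2; // keep exploring
            else settled = true;                         // hit the cap
        }
        else
        {
            Current = bestLength; // throughput dropped: go back and stay there
            settled = true;
        }
    }
}

static class BatchExtensions
{
    public static IEnumerable<List<T>> OptimizedBatches<T>(this IEnumerable<T> items)
    {
        var optimizer = new DoublingLengthOptimizer();
        var sw = new Stopwatch();
        var batch = new List<T>(optimizer.Current);
        foreach (var item in items)
        {
            batch.Add(item);
            if (batch.Count < optimizer.Current) continue;
            sw.Restart();
            yield return batch; // the caller processes the batch here
            sw.Stop();
            optimizer.Report(sw.Elapsed.TotalMilliseconds);
            batch = new List<T>(optimizer.Current);
        }
        if (batch.Count > 0) yield return batch; // final partial batch
    }
}
```

Feeding the timings from the table above into Report (100, 102, 110, 111, 118, 119, 134, 500) leaves Current at 64: the ratio improves through length 64, drops at 128, and the optimizer falls back.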
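Can you elaborate in your question on what you mean by "best batch length"? – Romoku
I'm trying to get the best length/time ratio –
Are you optimizing for length or for time? – Romoku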