Select distinct count is really slow
Author: Internet
I have a loop over roughly 7,000 objects, and inside the loop I need to get a distinct count from a list of structs. Currently I am using:
foreach (var product in productsToSearch)
{
    Console.WriteLine("Time elapsed: {0} start", stopwatch.Elapsed);
    var cumulativeCount = 0;
    productStore.Add(product);
    var orderLinesList = totalOrderLines
        .Where(myRows => productStore.Contains(myRows.Sku))
        .Select(myRows => new OrderLineStruct
        {
            OrderId = myRows.OrderId,
            Sku = myRows.Sku
        });
    var differences = totalOrderLines.Except(orderLinesList);
    cumulativeCount = totalOrderLinesCount - differences.Select(x => x.OrderId).Distinct().Count();
    cumulativeStoreTable.Rows.Add(product, cumulativeCount);
    Console.WriteLine("Time elapsed: {0} end", stopwatch.Elapsed);
}

public struct OrderLineStruct
{
    public string OrderId { get; set; }
    public string Sku { get; set; }
}
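Getting the distinct count is very slow. Does anyone know a more efficient way to do this? I tried MoreLinq, which adds a DistinctBy method to LINQ, but it is not noticeably more efficient. I have also played with PLINQ, but I am not sure where this query could be parallelized.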
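So each iteration of the loop takes:
Time elapsed: 00:00:37.1142047 start
Time elapsed: 00:00:37.8310148 end
= 0.7168101 seconds
* 7000 = 5017.6707 seconds (83.627845 minutes)
The Distinct().Count() line takes the most time (about 0.5 seconds). The differences variable holds several hundred thousand OrderLineStruct values, so any LINQ query over it is slow.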
Update
I modified the loop a little, and it now runs in about 10 minutes instead of more than an hour:
foreach (var product in productsToSearch)
{
    var cumulativeCount = 0;
    productStore.Add(product);
    var orderLinesList = totalOrderLines
        .Join(productStore, myRows => myRows.Sku, p => p, (myRows, p) => myRows)
        .Select(myRows => new OrderLineStruct
        {
            OrderId = myRows.OrderId,
            Sku = myRows.Sku
        });
    totalOrderLines = totalOrderLines.Except(orderLinesList).ToList();
    cumulativeCount = totalOrderLinesCount - totalOrderLines.Select(x => x.OrderId).Distinct().Count();
    cumulativeStoreTable.Rows.Add(product, cumulativeCount);
}
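Calling .ToList() on the Except result seems to make a difference. I also now remove the already processed order lines after each iteration, so every iteration works on a smaller list.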
Answer:
You are looking for the problem in the wrong place.
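orderLinesList, differences and differences.Select(x => x.OrderId).Distinct() are just chained LINQ to Objects query methods with deferred execution. The Count() call is what actually executes all of them, which is why the time appears to be spent on that line.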
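To illustrate the deferred execution described above, here is a minimal sketch. The class name DeferredExecutionDemo and the sample order IDs are made up for the illustration and are not part of the original code. It counts how often the Select lambda runs before and after Count() is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the Select lambda had run
    // before and after the Count() call.
    public static (int BeforeCount, int AfterCount) Run()
    {
        int evaluations = 0;
        var orderIds = new List<string> { "O1", "O2", "O1", "O3" };

        // Building the chain only describes the query; nothing is read yet.
        var distinctIds = orderIds
            .Select(id => { evaluations++; return id; })
            .Distinct();
        int before = evaluations;

        // Count() enumerates the whole chain, so all the work happens here.
        int count = distinctIds.Count();
        int after = evaluations;

        Console.WriteLine($"distinct={count}, before={before}, after={after}");
        return (before, after);
    }
}
```

Calling DeferredExecutionDemo.Run() prints distinct=3, before=0, after=4. A stopwatch around the Count() line would therefore be charged for the Select and Distinct work as well.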
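Your processing algorithm is highly inefficient. The bottleneck is the orderLinesList query, which iterates the whole totalOrderLines list for each product, and that query is in turn chained into Except, Distinct and so on. All of this happens inside the loop, i.e. 7,000 times.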
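As a rough illustration of that cost, the sketch below counts how many elements are examined when every product rescans the full list, compared with grouping the lines by SKU once. ScanCostDemo and its method names are hypothetical, and the lines are reduced to their SKU strings to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScanCostDemo
{
    // Every product filters the full line list: products * lines checks.
    public static long ChecksWithRescan(IReadOnlyList<string> products, IReadOnlyList<string> lineSkus)
    {
        long checks = 0;
        foreach (var product in products)
        {
            // The predicate runs once per line, for every product.
            lineSkus.Count(sku => { checks++; return sku == product; });
        }
        return checks;
    }

    // Lines are grouped once; each product then reads only its own group.
    public static long ChecksWithLookup(IReadOnlyList<string> products, IReadOnlyList<string> lineSkus)
    {
        long checks = 0;
        // One pass over the lines to build the lookup.
        var bySku = lineSkus.ToLookup(sku => { checks++; return sku; });
        foreach (var product in products)
            checks += bySku[product].Count(); // only the matching lines
        return checks;
    }
}
```

With 3 products and 6 lines the difference is small (18 checks versus 12), but the first number grows with products times lines, while the second grows with roughly twice the number of lines.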
Here is a sample of an efficient algorithm that, IMO, does the same thing:
Console.WriteLine("Time elapsed: {0} start", stopwatch.Elapsed);
// Group the order lines by product once, keeping the product order.
var productInfo =
(
    from product in productsToSearch
    join line in totalOrderLines on product equals line.Sku into orderLines
    select new { Product = product, OrderLines = orderLines }
).ToList();
// For each order, remember the index of the last product that appears in it.
var lastIndexByOrderId = new Dictionary<string, int>();
for (int i = 0; i < productInfo.Count; i++)
{
    foreach (var line in productInfo[i].OrderLines)
        lastIndexByOrderId[line.OrderId] = i; // Last wins
}
int cumulativeCount = 0;
for (int i = 0; i < productInfo.Count; i++)
{
    var product = productInfo[i].Product;
    foreach (var line in productInfo[i].OrderLines)
    {
        int lastIndex;
        // An order is complete once its last product has been reached.
        if (lastIndexByOrderId.TryGetValue(line.OrderId, out lastIndex) && lastIndex == i)
        {
            cumulativeCount++;
            lastIndexByOrderId.Remove(line.OrderId); // count each order only once
        }
    }
    cumulativeStoreTable.Rows.Add(product, cumulativeCount);
    // Remove the next if it was just to support your processing
    productStore.Add(product);
}
Console.WriteLine("Time elapsed: {0} end", stopwatch.Elapsed);
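The same two-pass idea can be tried in isolation with a small self-contained sketch. CumulativeCoverage, Compute and the sample data are hypothetical names chosen for the illustration, and ToLookup is used here in place of the group join above. It assumes every SKU on an order also appears in the product list. Orders containing other SKUs would be counted as complete once their last listed product is reached, which may differ from the original loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct OrderLineStruct
{
    public string OrderId { get; set; }
    public string Sku { get; set; }
}

public static class CumulativeCoverage
{
    // For each product, in order, returns how many orders are fully
    // covered by the products seen so far.
    public static List<int> Compute(IReadOnlyList<string> products, IReadOnlyList<OrderLineStruct> lines)
    {
        // Group the lines by SKU once instead of rescanning them per product.
        var linesBySku = lines.ToLookup(l => l.Sku);

        // Pass 1: remember the last product index that touches each order.
        var lastIndexByOrderId = new Dictionary<string, int>();
        for (int i = 0; i < products.Count; i++)
            foreach (var line in linesBySku[products[i]])
                lastIndexByOrderId[line.OrderId] = i;

        // Pass 2: an order becomes complete when its last product is reached.
        var result = new List<int>(products.Count);
        int cumulativeCount = 0;
        for (int i = 0; i < products.Count; i++)
        {
            foreach (var line in linesBySku[products[i]])
            {
                if (lastIndexByOrderId.TryGetValue(line.OrderId, out int last) && last == i)
                {
                    cumulativeCount++;
                    lastIndexByOrderId.Remove(line.OrderId); // count each order once
                }
            }
            result.Add(cumulativeCount);
        }
        return result;
    }
}
```

For products A, B, C and the order lines (O1,A), (O1,B), (O2,A), (O3,A), (O3,C), (O4,B), (O4,C), Compute returns 1, 2, 4: O2 is complete after A, O1 after B, and O3 and O4 after C.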
Tags: plinq, morelinq, linq, c# Source: https://codeday.me/bug/20191027/1943482.html