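Refactoring a T-SQL view that uses row_number() to return rows with a unique column value

I have a SQL view that I use to retrieve data. Let's say it returns a large list of products, which are linked to the customers who bought them. The view should return only one row per product, no matter how many customers the product is linked to. I'm using the row_number function to achieve this. (This example is simplified. The general case is a query that should return exactly one row for each unique value of some column X; which row is returned does not matter.)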

CREATE VIEW productView AS 
SELECT * FROM 
    (SELECT 
     Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering, 
     customer.Id 
     -- various other columns 
    FROM products 
    LEFT OUTER JOIN customer ON customer.productId = products.Id 
    -- various other joins 
    ) AS temp 
WHERE temp.product_numbering = 1

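Now let's say the total number of rows in this view is ~1 million, and running select * from productView takes 10 seconds. Running a query such as select * from productView where productID = 10 takes the same amount of time. I believe this is because the query gets evaluated as this: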

SELECT * FROM 
    (SELECT 
     Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering, 
     customer.Id 
     -- various other columns, including the product ID (assumed here to be exposed as productId) 
    FROM products 
    LEFT OUTER JOIN customer ON customer.productId = products.Id 
    -- various other joins 
    ) AS temp 
WHERE temp.product_numbering = 1 AND temp.productId = 10

I think this causes the inner subquery to be evaluated in full every time. Ideally I'd like to use something along the following lines:

SELECT 
    Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering, 
    customer.Id 
    -- various other columns 
FROM products 
    LEFT OUTER JOIN customer ON customer.productId = products.Id 
    -- various other joins 
WHERE product_numbering = 1

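But this doesn't seem to be allowed. Is there any way to do something similar?

EDIT -

After much experimentation, I believe the actual problem I'm having is how to force a join to return exactly one row. I tried using outer apply, as shown below. Some sample code: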

CREATE TABLE Products (id int not null PRIMARY KEY) 
CREATE TABLE Customers (
     id int not null PRIMARY KEY, 
     productId int not null, 
     value varchar(20) NOT NULL) 

declare @count int = 1 
while @count <= 150000 
begin 
     insert into Customers (id, productID, value) 
     values (@count,@count/2, 'Value ' + cast(@count/2 as varchar))  
     insert into Products (id) 
     values (@count) 
     SET @count = @count + 1 
end 

CREATE NONCLUSTERED INDEX productId ON Customers (productID ASC)

With the above sample set, the 'get everything' query below

select * from Products 
outer apply (select top 1 * 
      from Customers 
      where Products.id = Customers.productID) Customers

takes ~1000 ms to run. Adding an explicit condition:

select * from Products 
outer apply (select top 1 * 
      from Customers 
      where Products.id = Customers.productID) Customers 
where Customers.value = 'Value 45872'

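takes the same amount of time. 1000 ms is already too much for a fairly simple query, and it scales the wrong way (upwards) when additional similar joins are added.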

2011-10-18 John

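Do you need the actual customer details, or just existence, or just one customer ID? The subquery is evaluated because the "10" isn't known beforehand. You are asking for the 10th row. Hence my first question about the desired output – gbn

Really good observation: SQL can't push the view's filter down into the subquery. Do you really need the flexibility of a view? If you use a SPROC or a table-valued function with a 'defined' filter (ProductID in your example), you can build the filter into the subquery. And when the PARTITION BY column and the filter column are the same (ProductId), there is no need for PARTITION at all, so SELECT TOP 1 should suffice. – StuartLC

I do need the actual customer details (or nulls if none exist), not just the existence of one. I also have to use a view; refactoring the application that retrieves the data is not possible. – John

If you do something like: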

SELECT ... 
FROM products 
OUTER APPLY (SELECT TOP 1 * from customer where customerid = products.buyerid) as customer 
...

then a filter on productId should help. It might be worse, though.
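
As a rough sketch of that idea against the Products and Customers sample tables from the question's edit: the view name ProductFirstCustomer and the ORDER BY inside the apply are assumptions added for illustration, not part of the original answer. Because productId here comes from the outer Products table, a filter on it may be resolved as a seek on the Products primary key before the apply runs, but that should be confirmed in the actual execution plan.

```sql
-- Hypothetical view name; uses the sample tables from the question.
CREATE VIEW dbo.ProductFirstCustomer AS
SELECT p.id     AS productId,
       fc.id    AS customerId,   -- NULL when the product has no customer
       fc.value AS customerValue
FROM Products AS p
OUTER APPLY (
    SELECT TOP (1) c.id, c.value
    FROM Customers AS c
    WHERE c.productId = p.id
    ORDER BY c.id                -- assumption: makes the chosen row deterministic
) AS fc;
GO

-- The filter is on a column of the outer table, not on the applied row.
SELECT * FROM dbo.ProductFirstCustomer WHERE productId = 25;
```

Filtering on a column of the applied row instead (such as Customers.value in the question) still requires the apply to run for every product first, which matches the timing reported there.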

2011-10-18 21:27:45 GilM

Try the following approach, which uses a common table expression (CTE). With the test data you provided, it returns specific ProductIds in less than a second.

create view ProductTest as 

with cte as (
select 
    row_number() over (partition by p.id order by p.id) as RN, 
    c.* 
from 
    Products p 
    inner join Customers c 
     on p.id = c.productid 
) 

select * 
from cte 
where RN = 1 
go 

select * from ProductTest where ProductId = 25

2011-10-24 18:06:22

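This does seem to run faster than the other approaches, but it still causes the whole subquery to be evaluated. Running 'select * from ProductTest' on its own takes roughly the same time, and has the same execution plan, as running it with the where clause. – John

I think this is the best you'll get, given the nature of views themselves. Another option would be to create a stored procedure or a table-valued function that takes the productid as a parameter, so the filter is applied directly to the part of the query you want to restrict. –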
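
A minimal sketch of the table-valued function option, assuming the Products and Customers sample tables from the question. The function name ProductRowFor, the parameter name, and the ORDER BY c.id are hypothetical choices made for illustration. The point is only that the filter sits inside the subquery, before the rows are numbered.

```sql
-- Hypothetical inline table-valued function; names are illustrative only.
CREATE FUNCTION dbo.ProductRowFor (@productId int)
RETURNS TABLE
AS
RETURN
(
    SELECT t.productId, t.customerId, t.value
    FROM (
        SELECT p.id  AS productId,
               c.id  AS customerId,
               c.value,
               ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY c.id) AS rn
        FROM Products AS p
        LEFT OUTER JOIN Customers AS c
            ON c.productId = p.id
        WHERE p.id = @productId   -- filter applied before numbering
    ) AS t
    WHERE t.rn = 1
);
GO

SELECT * FROM dbo.ProductRowFor(25);
```

This does not help if the calling application can only query a view, which is the constraint stated in the question.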

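The problem is that your data model is flawed. You should have three tables:

Customers (customerId, ...)
Products (productId, ...)
ProductSales (customerId, productId)

In addition, the sales table should probably be split into a one-to-many pair (Sales and SalesDetails). Unless you fix your data model, you will just keep chasing your tail after red-herring problems. If the system is not your design, fix it. If the boss won't let you fix it, then fix it. If you can't fix it, then fix it. There is no easy way out with the bad data model you are proposing.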
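
For illustration, the three tables might be declared roughly as follows. Only the table and key column names come from the list above; the data types, the foreign keys, and the composite primary key are assumptions. The names Customers and Products also clash with the question's sample tables, so this would need a separate database or schema to try out.

```sql
-- Sketch only: types and constraints are assumed, not taken from the answer.
CREATE TABLE Customers (
    customerId int NOT NULL PRIMARY KEY
    -- other customer columns would go here
);

CREATE TABLE Products (
    productId int NOT NULL PRIMARY KEY
    -- other product columns would go here
);

-- One row per (customer, product) pair links the two tables.
CREATE TABLE ProductSales (
    customerId int NOT NULL REFERENCES Customers (customerId),
    productId  int NOT NULL REFERENCES Products (productId),
    PRIMARY KEY (customerId, productId)
);
```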

2011-10-29 19:36:04

This will probably be fast enough if you really don't care which customer you bring back:

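select p1.*, c1.* 
FROM products p1 
-- one row per product, carrying the highest customer id linked to it 
Left Join ( 
     select p2.id, max(c2.id) max_customer_id 
     From products p2 
     Join customer c2 on 
     c2.productID = p2.id 
     group by p2.id  -- T-SQL does not accept ordinal positions in GROUP BY 
) product_max_customer on 
product_max_customer.id = p1.id  -- join condition missing from the original 
Left join customer c1 on 
c1.id = product_max_customer.max_customer_id 
;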

2014-06-04 17:09:42 spioter
