SQL Server best-match query with update (T-SQL)

I am trying to find the most efficient SQL query to achieve the following.

I have a table containing zip/postal codes; let's assume the following structure:

table_codes：

ID | ZipCode
---------------
1  | 1234
2  | 1235
3  | 456

etc.

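Users of my application fill in a profile in which they have to enter their zip (postal) code. Sometimes a user will enter a zip code that is not defined in my table, and in that case I want to suggest the best match based on what the user entered.

I am using the following query: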

Declare @entered_zipcode varchar(10) 
set @entered_zipcode = '23456' 


SELECT TOP 1 table_codes.ZipCode 
FROM table_codes 
where @entered_zipcode LIKE table_codes.ZipCode + '%' 
or table_codes.ZipCode + '%' like @entered_zipcode + '%' 
ORDER BY table_codes.ZipCode, LEN(table_codes.ZipCode) DESC

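Basically, I want the following:

If @entered_zipcode is longer than any zip code in the table, I want to get the best prefix in the zip table that matches @entered_zipcode.

If @entered_zipcode is shorter than any existing code in the table, I want to use it as a prefix and get the best match in the table.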
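The two cases above can be tried out in isolation with a small sketch. It uses a table variable (`@table_codes`, a hypothetical stand-in for the real table) holding the three sample rows, and an extra `MatchKind` column, added here only for illustration, that reports which case produced each match.

```sql
-- Hypothetical sample data mirroring the table shown in the question.
DECLARE @table_codes TABLE (ID int, ZipCode varchar(10));
INSERT INTO @table_codes (ID, ZipCode)
VALUES (1, '1234'), (2, '1235'), (3, '456');

DECLARE @entered_zipcode varchar(10) = '12';   -- try '4567' as well

SELECT c.ZipCode,
       CASE
           WHEN @entered_zipcode LIKE c.ZipCode + '%'
               THEN 'stored code is a prefix of the input'
           ELSE 'input is a prefix of the stored code'
       END AS MatchKind
FROM @table_codes AS c
WHERE @entered_zipcode LIKE c.ZipCode + '%'   -- case 1: the input is at least as long as the stored code
   OR c.ZipCode LIKE @entered_zipcode + '%'   -- case 2: the input is shorter than the stored code
ORDER BY c.ZipCode;
```

With '12' the second predicate matches 1234 and 1235; with '4567' the first predicate matches 456. The second predicate is written without the extra `+ '%'` on the left-hand side, which should behave the same as the form used in the question.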

In addition, I am building a temporary table with the following structure:

#tmpTable 
------------------------------------------------------------------------------------ 
ID | user1_enteredzip | user1_bestmatchzip | user2_enteredzip | user2_bestmatchzip | 
------------------------------------------------------------------------------------ 
1 | 12    |  *1234*   |  4567  |  *456*  | 
2 | 
3 | 
4 |

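The "enteredzip" columns hold what the user typed, and the codes between *..* are the best-matching codes from my lookup table, which I am trying to obtain with the query below.

The query seems to take rather long, which is why I am asking for help optimizing it: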

 update #tmpTable 
     set  user1_bestmatchzip = (SELECT TOP 1 
              zipcode 
            FROM table_codes 
            where #tmpTable.user1_enteredzip LIKE table_codes.zipcode + '%' 
              or table_codes.zipcode + '%' like #tmpTable.user1_enteredzip + '%' 
            ORDER BY table_codes.zipcode, LEN(table_codes.zipcode) DESC 
           ), 
       user2_bestmatchzip = (SELECT TOP 1 
              zipcode 
            FROM table_codes 
            where #tmpTable.user2_enteredzip LIKE table_codes.zipcode + '%' 
              or table_codes.zipcode + '%' like #tmpTable.user2_enteredzip + '%' 
            ORDER BY table_codes.zipcode, LEN(table_codes.zipcode) DESC 
           ) 
     from #tmpTable

Source

2012-02-29 mmmmmm

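Why are you using a temporary table? – vulkanino 2012-02-29 12:57:19

I am using the temp table for some calculations. What I wanted to show is that I need to get the best-match zip for two columns in a single update operation. It seems to me that my query is not the optimal way to do that. – mmmmmm 2012-02-29 13:07:25

If you change your temp table to look like this: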

id | user | enteredzip | bestmatchzip 
10 | 1 | 12345  | 12345 
20 | 2 | 12   | 12345

i.e. use a column to hold the user number (1 or 2). That way you update one row at a time.
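A minimal sketch of that layout, assuming the column types below (the thread does not give them) and using `#table_codes` as a stand-in for the real lookup table so the script runs on its own. `user` is a reserved word in T-SQL, hence the brackets.

```sql
-- Stand-in for table_codes from the question.
CREATE TABLE #table_codes (ID int NOT NULL, ZipCode varchar(10) NOT NULL);
INSERT INTO #table_codes (ID, ZipCode)
VALUES (1, '1234'), (2, '1235'), (3, '456');

-- Reshaped temp table: one row per (id, user) pair instead of one column pair per user.
CREATE TABLE #tmpTable (
    id           int         NOT NULL,
    [user]       tinyint     NOT NULL,   -- 1 or 2
    enteredzip   varchar(10) NOT NULL,
    bestmatchzip varchar(10) NULL
);
INSERT INTO #tmpTable (id, [user], enteredzip)
VALUES (1, 1, '12'), (1, 2, '4567');

-- One correlated subquery now covers both users.
UPDATE t
SET bestmatchzip = (SELECT TOP 1 c.ZipCode
                    FROM #table_codes AS c
                    WHERE t.enteredzip LIKE c.ZipCode + '%'
                       OR c.ZipCode LIKE t.enteredzip + '%'
                    ORDER BY c.ZipCode)
FROM #tmpTable AS t;

SELECT id, [user], enteredzip, bestmatchzip
FROM #tmpTable
ORDER BY id, [user];
```

For the sample row from the question this yields 1234 for the entered value 12 and 456 for 4567. Whether the narrower layout is actually faster would have to be measured against the real data.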

Also, ORDER BY takes time; do you have an index on the zip code column? And couldn't you add a "length" field to the zip codes table to precompute the zip code length?
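One possible way to set this up is sketched below. The names `ZipLen` and `IX_table_codes_ZipCode` are hypothetical, and the script assumes `table_codes(ID, ZipCode)` exists as described in the question. `GO` is the batch separator used by SSMS and sqlcmd; it is there so that the new column is visible when the index is created.

```sql
-- Precompute the length as a persisted computed column (LEN is deterministic,
-- so the column can be persisted and included in an index).
ALTER TABLE table_codes
    ADD ZipLen AS LEN(ZipCode) PERSISTED;
GO

-- Index on the zip code, carrying the precomputed length along with it.
CREATE INDEX IX_table_codes_ZipCode
    ON table_codes (ZipCode)
    INCLUDE (ZipLen);
GO
```

A computed column stays in sync with ZipCode automatically, which a manually maintained "length" field would not.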

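Edit: Thinking about it, ordering by LEN makes no sense, so you can remove it! If zip codes cannot have duplicates, ordering by zip code alone is enough. If they can, the LEN of the duplicates will always be equal anyway!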

Source

2012-02-29 13:10:07 vulkanino

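Thanks, vulkanino. I do have an index on the zip column, but the length field sounds like a great idea. I will try it and report the results. Thank you very much. – mmmmmm 2012-02-29 13:12:29

The precomputed "zip len" helped reduce the execution time, but not by much. I am considering the other approach you suggested. – mmmmmm 2012-02-29 13:20:31

Wow. Vulkanino, you are really good. I hadn't thought about it, but you are absolutely right: ordering by LEN makes no sense. I just removed that clause and the execution time is now 3-4 times shorter, from 700 ms down to ~200. You rock! – mmmmmm 2012-02-29 13:24:27

What if you compare the two strings only up to the length of the shorter one, i.e. compare just the first few characters of both?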

select top 1 zipcode 
from table_zipcodes 
where substring(zipcode, 1, case when len(zipcode) > len (@entered_zipcode) then len(@entered_zipcode) else len (zipcode) end) 
    = substring (@entered_zipcode, 1, case when len(zipcode) > len (@entered_zipcode) then len(@entered_zipcode) else len (zipcode) end) 
order by len (zipcode) desc

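This would remove the OR and allow an index to be used for *@entered_zipcode LIKE table_codes.ZipCode + '%'*. Also, it seems to me that the ordering of your result is wrong: shorter zip codes come first.

Source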

2012-02-29 13:39:02

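Thanks for the answer, Nikola. I tried that too, but the execution time doubled compared with my current query. (That is my original query without the ORDER BY LEN clause, as suggested by Vulkanino.) – mmmmmm 2012-02-29 14:24:16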
