2013-01-14
SELECT nl.legal_name, nl.city, c.description AS Country, it.lei, sec.sym
FROM name_loc nl
INNER JOIN ident_tbl_tmp it ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c ON nl.fk_cnty_id = c.id
WHERE nl.legal_name REGEXP '^For'
LIMIT 100;

MySQL query to find duplicates using REGEXP to match on the first 7 characters

The query above returns 500+ rows of data. Part of the output:

+------------------------------------------+--------------+----------------+----------------------+--------------+
| legal_name                               | city         | Country        | lei                  | sym          |
+------------------------------------------+--------------+----------------+----------------------+--------------+
| FOREFRONT GROUP LTD HKD0.01(SUB          | PENDING      | HONG KONG      | NA                   | 2903.HK      |
| FOREFRONT HOLDINGS                       | PENDING      | UNITED STATES  | NA                   | FFHN         |
| FOREIGN & COL INV TR                     | PENDING      | UNITED STATES  | NA                   | FLIVF        |
| Foreign & Colonial Investment Trust      | PENDING      | NEW ZEALAND    | NA                   | FCT.NZ       |
| Foreign & Colonial Investment Trust      | PENDING      | UNITED KINGDOM | NA                   | FRCL.L       |
| Foreign & Colonial Investment Trust PLC  | London       | UNITED KINGDOM | 8VHDVYVI7W11JH2PAC61 | NA           |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E1:B0I.SI    |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E2:B0I.SI    |
+------------------------------------------+--------------+----------------+----------------------+--------------+

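I need a query that returns a row only when the first n characters of legal_name match those of another row and the country is the same.

This would be the correct result for matching on the first 7 characters: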

+------------------------------------------+--------------+----------------+----------------------+--------------+
| legal_name                               | city         | Country        | lei                  | sym          |
+------------------------------------------+--------------+----------------+----------------------+--------------+
| Foreign & Colonial Investment Trust      | PENDING      | UNITED KINGDOM | NA                   | FRCL.L       |
| Foreign & Colonial Investment Trust PLC  | London       | UNITED KINGDOM | 8VHDVYVI7W11JH2PAC61 | NA           |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E1:B0I.SI    |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E2:B0I.SI    |
+------------------------------------------+--------------+----------------+----------------------+--------------+

This would be the correct result for matching on the first 14 characters:

+------------------------------------------+--------------+----------------+----------------------+--------------+
| legal_name                               | city         | Country        | lei                  | sym          |
+------------------------------------------+--------------+----------------+----------------------+--------------+
| Foreign & Colonial Investment Trust      | PENDING      | UNITED KINGDOM | NA                   | FRCL.L       |
| Foreign & Colonial Investment Trust PLC  | London       | UNITED KINGDOM | 8VHDVYVI7W11JH2PAC61 | NA           |
+------------------------------------------+--------------+----------------+----------------------+--------------+

I have tried various subqueries, with no luck. I think I may need a function or a procedure, but I'm not sure.

I'm confused about how "Foreign & Colonial ..." matches "Foreland" on the first 7 characters? –

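Sorry, my mistake. The first 7 characters of "Foreign & Colonial ..." are "Foreign", so the two rows count as duplicates because their names share the first 7 characters of legal_name. – John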

Answer

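You can simply GROUP BY Country, LEFT(legal_name, 7). This ensures that you get only one row of output for each combination of country and name prefix. You have no control over which row that will be. If you want to keep track of the original number of rows, you can even add a column COUNT(*) AS number_of_duplicates.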
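
The grouping described above could look roughly like the sketch below. It reuses the table and column names from the question's query; the alias name_prefix is introduced here only for illustration. Selecting just the grouped expressions and the count (rather than nl.city, it.lei and sec.sym) is a deliberate choice: MySQL 5.7.5 and later enable ONLY_FULL_GROUP_BY by default, which rejects non-aggregated columns that are not part of the GROUP BY.

```sql
-- Sketch: one output row per (country, 7-character name prefix),
-- together with the number of source rows that fell into that group.
-- Table and column names come from the question; name_prefix is a made-up alias.
SELECT c.description          AS Country,
       LEFT(nl.legal_name, 7) AS name_prefix,
       COUNT(*)               AS number_of_duplicates
FROM name_loc nl
INNER JOIN ident_tbl_tmp it  ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c       ON nl.fk_cnty_id = c.id
WHERE nl.legal_name REGEXP '^For'
GROUP BY c.description, LEFT(nl.legal_name, 7);
```

Changing both occurrences of 7 to 14 groups on the first 14 characters instead. Whether "FOREIGN" and "Foreign" land in the same group depends on the collation of legal_name.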

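Thanks, this isn't quite what I was looking for, but it will work. One question: how can I limit the select (the rows printed) to only those where "COUNT(*) AS number_of_duplicates" is greater than 1? I tried "SELECT if(count(*) >= 2, nl.legal_name, nl.city, c.description 'Country', it.lei, sec.sym, COUNT(*) AS number_of_duplicates, '')" and got an error message. Thanks. – John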

@John: Restrictions that use aggregate functions go in a 'HAVING' clause. So you would write either 'HAVING number_of_duplicates > 1' or 'HAVING COUNT(*) > 1'. – MvG
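
As a rough sketch of that suggestion, the HAVING clause goes after GROUP BY and filters whole groups, whereas WHERE filters individual rows before grouping. The names are again those from the question, with name_prefix as an illustrative alias.

```sql
-- Sketch: keep only the (country, prefix) groups that contain more than one row.
SELECT c.description          AS Country,
       LEFT(nl.legal_name, 7) AS name_prefix,          -- illustrative alias
       COUNT(*)               AS number_of_duplicates
FROM name_loc nl
INNER JOIN ident_tbl_tmp it  ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c       ON nl.fk_cnty_id = c.id
WHERE nl.legal_name REGEXP '^For'                      -- row filter, applied before grouping
GROUP BY c.description, LEFT(nl.legal_name, 7)
HAVING COUNT(*) > 1;                                   -- group filter, applied after grouping
```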

Cool, thanks for your help. – John
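
The question's expected output lists every duplicated row rather than one row per group. One possible way to get that, not taken from the answer above, is to join the original query back to a grouped subquery. The sketch below assumes that countries.id identifies exactly one country and that the joins in the question neither drop nor multiply name_loc rows, so counting in name_loc alone gives the same groups; dup and name_prefix are names made up for this example.

```sql
-- Sketch: list all rows whose (country, 7-character prefix) occurs more than once.
SELECT nl.legal_name, nl.city, c.description AS Country, it.lei, sec.sym
FROM name_loc nl
INNER JOIN ident_tbl_tmp it  ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c       ON nl.fk_cnty_id = c.id
INNER JOIN (
    -- Duplicate groups, computed on name_loc only (see assumptions above).
    SELECT fk_cnty_id, LEFT(legal_name, 7) AS name_prefix
    FROM name_loc
    GROUP BY fk_cnty_id, LEFT(legal_name, 7)
    HAVING COUNT(*) > 1
) dup ON dup.fk_cnty_id  = nl.fk_cnty_id
     AND dup.name_prefix = LEFT(nl.legal_name, 7)
WHERE nl.legal_name REGEXP '^For'
ORDER BY Country, nl.legal_name;
```

If the assumptions do not hold for the real schema, the subquery would need the same joins as the outer query.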