How to get the count of matching strings in a big table
I have a table with the following structure:
+-----+-------------------+
| ID | Name |
+-----+-------------------+
| 1 | abc |
+-----+-------------------+
| 2 | abc (duplicate) |
+-----+-------------------+
| 3 | bcd |
+-----+-------------------+
| 4 | bcd (duplicate) |
+-----+-------------------+
| 5 | bcd (duplicate) |
+-----+-------------------+
| 6 | efg |
+-----+-------------------+
| 7 | hij |
+-----+-------------------+
I want to count the number of occurrences of each Name, with the (duplicate) entries included, i.e.:
+-------------------+--------+
| Name | Count |
+-------------------+--------+
| abc | 2 |
+-------------------+--------+
| bcd | 3 |
+-------------------+--------+
| efg | 1 |
+-------------------+--------+
| hij | 1 |
+-------------------+--------+
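I should mention that the Name column actually has type TINYTEXT, and that there will be a lot of rows, even in test mode already. I tried joining the table to itself on TRIM(REPLACE(Name, '(duplicate)', '')) with grouping: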
SELECT
DISTINCT TRIM(REPLACE(`t`.`Name`, '(duplicate)', '')) as `name`,
COUNT(`s`.`ID`) as `count`
FROM
`Table` as `t` INNER JOIN `Table` as `s` ON
TRIM(REPLACE(`t`.`Name`, '(duplicate)', '')) LIKE TRIM(REPLACE(`s`.`Name`, '(duplicate)', ''))
GROUP BY 1;
And... well, it took 122.62 sec on my dev machine for a 4846-row result (?!).
Q1: Is this the right approach?
Q2: Is there any way to make it faster?
Any help would be appreciated.
Oh, do you actually add **(duplicate)**? – devnull
There is no such tag. Should there be? – BlitZ
Are all names 3 characters long? – Mr47
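
Not an authoritative answer, just a minimal sketch of a single-pass alternative to the self-join above. The self-join pairs every row with every row that has the same cleaned name, so a group of n rows produces n × n pairs; that is probably where the time goes, and it would also inflate the counts. Grouping directly on the cleaned name avoids the join entirely. Assumptions: SQLite (via Python's sqlite3 module) is used only so the sketch runs anywhere, the table name `names` and the aliases `base_name` and `cnt` are made up for illustration, and the sample rows are the seven rows from the question. The SELECT itself uses only TRIM, REPLACE, COUNT and GROUP BY, which MySQL also provides, but it has not been timed against the real TINYTEXT table.

```python
import sqlite3

# Sample rows copied from the question; "names" is a hypothetical table name.
rows = [
    (1, "abc"),
    (2, "abc (duplicate)"),
    (3, "bcd"),
    (4, "bcd (duplicate)"),
    (5, "bcd (duplicate)"),
    (6, "efg"),
    (7, "hij"),
]

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO names (ID, Name) VALUES (?, ?)", rows)

# One pass over the table: strip the marker, trim the leftover space,
# then group on the cleaned value. The expression is repeated in GROUP BY
# so it cannot be confused with the raw Name column.
query = """
SELECT TRIM(REPLACE(Name, '(duplicate)', '')) AS base_name,
       COUNT(*) AS cnt
FROM names
GROUP BY TRIM(REPLACE(Name, '(duplicate)', ''))
ORDER BY base_name
"""

counts = conn.execute(query).fetchall()
for base_name, cnt in counts:
    print(base_name, cnt)
```

On the sample rows this prints abc 2, bcd 3, efg 1, hij 1, which matches the expected table in the question. Each row is read once and no row is compared against another, so the work should grow roughly linearly with the number of rows rather than quadratically.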