2012-10-28 133 views
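I have 27 tables in my database: one word table (a Scrabble word list) and 26 association tables, one per letter. How do I find word matches in a simple MySQL/PHP words-within-a-string application?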

Table Fields 
================ 
word [id,word] 
a  [word_id] 
b  [word_id] 
... 
z  [word_id] 

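I want to find the words that can be made from a given string.

For example, if the given string is pant, I want to find: pant, apt, pat, tap, ant, tan, nap, pan, at, ta, pa, an, na

My current strategy is to explode the string into its individual letters and find the associated words that match all of the letters.

For example: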

SELECT word.word 
FROM word, p, a, n, t 
WHERE 
    word.id = p.word_id OR 
    word.id = a.word_id OR 
    word.id = n.word_id OR 
    word.id = t.word_id 

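But this ends up printing every word that contains a p, a, n, or t.

If I switch all of the operators to AND, I'm stuck with only one match: pant

Can you help me solve this puzzle?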

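I'm also concerned about how to handle duplicate letters in the string. For example, PPANT should find a match for app, whereas plain PANT should not.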
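The duplicate-letter requirement amounts to a multiset check: a word matches if it needs each letter no more often than the string supplies it. A minimal sketch of that rule, written in Python rather than PHP only to keep it short and self-contained (the function name `can_form` is hypothetical):

```python
from collections import Counter

def can_form(word, rack):
    """Return True if `word` can be spelled from the letters in `rack`,
    using each letter of `rack` at most once."""
    need = Counter(word.lower())
    have = Counter(rack.lower())
    # Every letter must be available at least as often as the word needs it.
    return all(have[ch] >= n for ch, n in need.items())

print(can_form("app", "ppant"))  # True
print(can_form("app", "pant"))   # False
```

The same comparison can be expressed in SQL once each word's per-letter counts are stored somewhere, instead of only recording whether a letter is present.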

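Am I on the right track with the association tables, or is there a better approach?

I'm trying to handle this reasonably efficiently in PHP/MySQL. I know others have already solved this puzzle in C, Perl, Java, and so on.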

1 Answer

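I'm not familiar with MySQL's advanced features, so I can't say whether there is a way to enforce this constraint procedurally, which might save you a lot of storage space. Nonetheless, I'll offer this possibility.

Say this is your word table: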

+==========+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+ 
| word | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | 
+==========+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+ 
| pant  | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 
+==========+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+ 
| ppant | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 
+==========+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+ 
| app  | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
+==========+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+ 
| kick  | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
+==========+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+ 
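Each row in the table above is just the word's per-letter count. As a rough sketch of how those 26 values might be computed when loading the word list (Python is used here for brevity; `letter_counts` is a hypothetical helper, and only the letters a to z are assumed to occur):

```python
import string

def letter_counts(word):
    """Map each letter a-z to how many times it occurs in `word`."""
    counts = {ch: 0 for ch in string.ascii_lowercase}
    for ch in word.lower():
        if ch in counts:  # ignore anything outside a-z
            counts[ch] += 1
    return counts

row = letter_counts("kick")
# Show only the non-zero columns of the row.
print({ch: n for ch, n in row.items() if n})  # {'c': 1, 'i': 1, 'k': 2}
```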

Then your query might look like this:

SELECT word.word FROM word 
JOIN 
(
    -- Letter counts of the input string; it must exist as a row in word.
    SELECT * FROM word WHERE word.word = 'pant'
) AS root 
ON 
    word.a <= root.a 
AND word.b <= root.b 
AND word.c <= root.c 
AND word.d <= root.d 
AND word.e <= root.e 
AND word.f <= root.f 
AND word.g <= root.g 
AND word.h <= root.h 
AND word.i <= root.i 
AND word.j <= root.j 
AND word.k <= root.k 
AND word.l <= root.l 
AND word.m <= root.m 
AND word.n <= root.n 
AND word.o <= root.o 
AND word.p <= root.p 
AND word.q <= root.q 
AND word.r <= root.r 
AND word.s <= root.s 
AND word.t <= root.t 
AND word.u <= root.u 
AND word.v <= root.v 
AND word.w <= root.w 
AND word.x <= root.x 
AND word.y <= root.y 
AND word.z <= root.z 
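The query above reads the input's letter counts from a row in `word`, so an input that is not itself a stored word has no counts to compare against. One possible variation, sketched here under stated assumptions, is to compute the counts in application code and bind them as parameters. The sketch uses Python with an in-memory SQLite database in place of PHP and MySQL so that it runs on its own; `build_db` and `find_words` are hypothetical names, and the 26-column layout follows the table shown earlier.

```python
import sqlite3
import string

LETTERS = string.ascii_lowercase

def letter_counts(text):
    """Return the 26 per-letter counts of `text`, in a-z order."""
    text = text.lower()
    return [text.count(ch) for ch in LETTERS]

def build_db(words):
    """Create the wide table: one row per word, one count column per letter."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{ch}" INTEGER NOT NULL' for ch in LETTERS)
    conn.execute(f"CREATE TABLE word (word TEXT PRIMARY KEY, {cols})")
    marks = ", ".join("?" * 27)  # the word plus 26 counts
    conn.executemany(
        f"INSERT INTO word VALUES ({marks})",
        [(w, *letter_counts(w)) for w in words],
    )
    return conn

def find_words(conn, rack):
    """Return stored words whose every letter count fits within `rack`."""
    where = " AND ".join(f'"{ch}" <= ?' for ch in LETTERS)
    rows = conn.execute(
        f"SELECT word FROM word WHERE {where} ORDER BY word",
        letter_counts(rack),
    )
    return [r[0] for r in rows]

words = ["pant", "apt", "pat", "tap", "ant", "tan", "nap", "pan", "at", "app", "kick"]
conn = build_db(words)
print(find_words(conn, "pant"))
# ['ant', 'apt', 'at', 'nap', 'pan', 'pant', 'pat', 'tan', 'tap']
print("app" in find_words(conn, "ppant"))  # True
```

Binding the counts keeps the `<=` comparisons from the original query unchanged while allowing any string, such as `anppt`, as input.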

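Now, of course there are ways to normalize the table and multiple ways to write the query. You should try out whatever makes the most sense for your situation.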
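As one illustration of what a normalized layout could look like (an assumption, not something the answer specifies), the counts can be stored as one row per word and letter, and the input's counts loaded into a small temporary table. The table names `word_letter` and `rack` and both function names are hypothetical, and Python with SQLite again stands in for PHP and MySQL.

```python
import sqlite3
from collections import Counter

def build_normalized_db(words):
    """Store each word once, plus one (word_id, letter, cnt) row per distinct letter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word (id INTEGER PRIMARY KEY, word TEXT UNIQUE)")
    conn.execute(
        "CREATE TABLE word_letter ("
        " word_id INTEGER, letter TEXT, cnt INTEGER,"
        " PRIMARY KEY (word_id, letter))"
    )
    for w in words:
        word_id = conn.execute("INSERT INTO word (word) VALUES (?)", (w,)).lastrowid
        conn.executemany(
            "INSERT INTO word_letter VALUES (?, ?, ?)",
            [(word_id, ch, n) for ch, n in Counter(w.lower()).items()],
        )
    return conn

def find_words_normalized(conn, rack):
    """Return words for which no letter is needed more often than `rack` has it."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS rack (letter TEXT PRIMARY KEY, cnt INTEGER)"
    )
    conn.execute("DELETE FROM rack")
    conn.executemany("INSERT INTO rack VALUES (?, ?)", Counter(rack.lower()).items())
    rows = conn.execute(
        "SELECT w.word FROM word AS w "
        "WHERE NOT EXISTS ("
        "  SELECT 1 FROM word_letter AS wl "
        "  LEFT JOIN rack AS r ON r.letter = wl.letter "
        "  WHERE wl.word_id = w.id AND wl.cnt > COALESCE(r.cnt, 0)"
        ") ORDER BY w.word"
    )
    return [r[0] for r in rows]

conn = build_normalized_db(["pant", "ant", "at", "app", "kick"])
print(find_words_normalized(conn, "pant"))   # ['ant', 'at', 'pant']
print(find_words_normalized(conn, "ppant"))  # ['ant', 'app', 'at', 'pant']
```

This layout stores only the letters a word actually contains, at the cost of a subquery per candidate word; which layout performs better would need to be measured on the real word list.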

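I don't see how this would match words that don't use exactly the required letters. For example, given the string 'anppt', how would this return the word 'app'? Or even with the string you mentioned, "anpt", how would you get "ant" or "at" from it? – Ryan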

You're right, Ryan. My solution only found anagrams. I've updated my answer. – erisco