假设我们有涉及TopicID和StoryID, 与一对形成化合物主键列的SQL表TopicStory。
的目标是找到某种一套TopicID的组成。发布的问题最多需要五个 。深度优先搜索 这些概述如下。
随着搜索的深度限定在五,指出问题不能 比多项式复杂性更差。然而,要求为最大的话题集合,可以与像 约束(选择每个主题不相关的 任何其他选择的主题至少两层)可以发现一个普遍问题 可能是NP完全问题。
一词的使用“搜索”中提出了一种回溯算法。在 以下,我们通过嵌套循环实现了回溯,其中每个循环是 ,由其外部的循环定义和参数化。
在我们给出详细的细节之前,可能会激发描述 “蛮力”的方法,接下来可以更容易地理解更复杂的方法 。
BRUTE_FORCE:
Generate all possible subsets of five topics.
Test each of these sets for feasibility (each topic has
at least two stories unrelated to any of the other topics).
我们的深度优先搜索的草图假设议题一共有排序, 例如按照TopicID的整数值排序。这允许生成主题 而不重复(由于主题的排列)。
NESTED_LOOPS:
(Loop_1) Select into List_1 all topics with at least two stories.
Iterate through List_1, choosing the first topic %T1%.
PASS control into Loop_2.
CONTINUE in Loop_1.
If the end of List_1 is reached, EXIT with failure.
(Loop_2) Select into List_2 all topics > %T1% with at least two
stories unrelated to %T1%.
Iterate through List_2, choosing the second topic %T2%.
If topic %T1% still has at least two stories unrelated
to %T2%, PASS control into Loop_3.
CONTINUE in Loop_2.
If the end of List_2 is reached, go BACK to Loop_1.
(Loop 3) Select into List_3 all topics > %T2% with at least two
stories unrelated to %T1% or %T2%.
Iterate through List_3, choosing the third topic %T3%.
If topic %T1% still has at least two stories unrelated
to %T2% or %T3%,
and topic %T2% still has at least two stories unrelated
to %T1% or %T3%, PASS control into Loop_4.
CONTINUE in Loop_3.
If the end of List_3 is reached, go BACK to Loop_2.
(Loop 4) Select into List_4 all topics > %T3% with at least two
stories unrelated to %T1%, %T2%, or %T3%.
Iterate through List_4, choosing the fourth topic %T4%.
If topic %T1% still has at least two stories unrelated
to %T2%, %T3%, or %T4%,
and topic %T2% still has at least two stories unrelated
to %T1%, %T3%, or %T4%,
and topic %T3% still has at least two stories unrelated
to %T1%, %T2%, or %T4%, PASS control into Loop_5.
CONTINUE in Loop_4.
If the end of List_4 is reached, go BACK to Loop_3.
(Loop 5) Select into List_5 all topics > %T4% with at least two
stories unrelated to %T1%, %T2%, %T3%, or %T4%.
Iterate through List_5, choosing the fifh topic %T5%.
If topic %T1% still has at least two stories unrelated
to %T2%, %T3%, %T4%, or %T5%,
and topic %T2% still has at least two stories unrelated
to %T1%, %T3%, %T4%, or %T5%,
and topic %T3% still has at least two stories unrelated
to %T1%, %T2%, %T4%, or %T5%,
and topic %T4% still has at least two stories unrelated
to %T1%, %T2%, %T3%, or %T5%, EXIT with success
returning five topics %T1%, %T2%, %T3%, %T4%, and %T5%.
CONTINUE in Loop_5.
If the end of List_5 is reached, go BACK to Loop_4.
使用“选择”在每个嵌套循环的开口是为了唤起 SQL查询的可能性,以实现许多逻辑的。对于 例如最外层循环基本上是刚开结果集中 此查询:
SELECT TS1.TopicID, Count(*)
From TopicStory TS1
Group By TS1.TopicID
Having Count(*) > 1
内部循环的相应的列表可以类似 通过SQL查询取决于在 所选主题的参数值来构造外环。为了说明没有不必要的重复让我们跳 权内部循环,并给出List_5适当的查询:
SELECT TS5.TopicID, Count(*)
From TopicStory TS5
Where TS5.TopicID > %T4%
and NOT EXISTS (SELECT *
From TopicStory TSX
Where TSX.TopicID in (%T1%,%T2%,%T3%,%T4%)
and TSX.StoryID = TS5.StoryID
)
Group By TS5.TopicID
Having Count(*) > 1
这之后将在List_5检查%T5%产生 计数至少两个故事的左对于主题%T1%:
SELECT Count(*)
From TopicStory TZ1
Where TZ1.TopicID = %T1%
and NOT EXISTS (SELECT *
From TopicStory TX1
Where TX1.StoryID = TZ1.StoryID
and TX1.TopicID in (%T2%,%T3%,%T4%,TS5.TopicID)
)
并且对于其他先前话题选择进行了必要的修改。
虽然它可能会降低性能不可接受,限制相关%T5%主题额外的逻辑 (使前面的话题选择 仍然保留至少两层的选择)可以推入一个查询。 它应该是这样的:
/*
Given %T1%, %T2%, %T3$, and %T4% from queries above, find all topics %T5% > %T4%
with at least 2 stories not related to %T1%, %T2%, %T3%, or %T4% and such that
%T1% still has at least 2 stories not related to %T2%, %T3%, %T4%, or %T5% and
%T2% still has at least 2 stories not related to %T1%, %T3%, %T4%, or %T5% and
%T3% still has at least 2 stories not related to %T1%, %T2%, %T4%, or %T5% and
%T4% still has at least 2 stories not related to %T1%, %T2%, %T3%, or %T5%
*/
SELECT TS5.TopicID, Count(*)
From TopicStory TS5
Where TS5.TopicID > %T4%
and NOT EXISTS (SELECT *
From TopicStory TSX
Where TSX.TopicID in (%T1%,%T2%,%T3%,%T4%)
and TSX.StoryID = TS5.StoryID
)
and (SELECT Count(*)
From TopicStory TZ1
Where TZ1.TopicID = %T1%
and NOT EXISTS (SELECT *
From TopicStory TX1
Where TX1.StoryID = TZ1.StoryID
and TX1.TopicID in (%T2%,%T3%,%T4%,TS5.TopicID)
)
) > 1
and (SELECT Count(*)
From TopicStory TZ2
Where TZ2.TopicID = %T2%
and NOT EXISTS (SELECT *
From TopicStory TX2
Where TX2.StoryID = TZ2.StoryID
and TX2.TopicID in (%T1%,%T3%,%T4%,TS5.TopicID)
)
) > 1
and (SELECT Count(*)
From TopicStory TZ3
Where TZ3.TopicID = %T3%
and NOT EXISTS (SELECT *
From TopicStory TX3
Where TX3.StoryID = TZ3.StoryID
and TX3.TopicID in (%T1%,%T2%,%T4%,TS5.TopicID)
)
) > 1
and (SELECT Count(*)
From TopicStory TZ4
Where TZ4.TopicID = %T4%
and NOT EXISTS (SELECT *
From TopicStory TX3
Where TX3.StoryID = TZ3.StoryID
and TX3.TopicID in (%T1%,%T2%,%T3%,TS5.TopicID)
)
) > 1
Group By TS5.TopicID
Having Count(*) > 1
的功能集的MySQL是一个进展中的工作,所以可以想象在存储过程中的高效 实现是可能的,其中光标可以采取主题列出的 作用。然而,如果 “游标”是外部管理列表(例如PHP),并且数据库查询 保持尽可能简单,我会对自己的良好业绩充满信心。
你想用什么语言来完成? – CraigW 2011-06-15 21:13:40
这似乎是一个主题和故事之间的二分图上的一种匹配问题。目标是找到一个解决方案还是列出所有可能的解决方案? – hardmath 2011-06-15 21:17:53
@CraigW:语言是PHP和MySQL。 – akanevsky 2011-06-15 21:18:28