2017-06-14 48 views
2

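I have an original tree and a "final" / result tree. I want to compare the two trees and "reproduce" the steps that would have to be carried out to get the same result, i.e. the steps to merge the original tree into the result tree.

Real-life example: the original tree is stored in a database. A worker has prepared changes (generating a new result tree in the app), and now we need to update the database. We cannot drop the database and re-upload it, because there may be data that has not been generated yet.

Class/table definition: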

class TreeNode 
{ 
    public string Text { get; set; } 
    public TreeNode Parent { get; set; } 

    /* some other properties */ 
} 

Example trees:

Origin       Result 
|A        |A 
| -1       | -2 
| -2       |C 
|B        | -3 
| -5       |D 
| --£       | -1 
|C        | --£ 
|F        | -5 
| -7       |E 
|H        | -6 
           |G 
           | -4 
           |H 

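What I am hoping for is an algorithm that lets me handle the cases where an object is added, removed or moved.

Important: objects that end up under a different parent must not be removed and added back; instead they should only be moved under the other parent! Removing causes data loss.

Example: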

Mark B as removed 
Mark F as removed 
Add D 
Add E 
Add G 
Move 1 under D 
Move 5 under D 
Mark 7 as removed 
Add 3 under C 
Add 6 under E 
Add 4 under G 
Move £ under 1 
Removed 7 
Removed F 
Removed B 

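Own solution

I created a sample with a WinForms TreeView. My algorithm only works on a per-level basis (e.g. moving 1 from A to D), but not across levels. Elements are first marked as removed and are actually removed at the end.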

Application screenshot

Code:

//Recursive loop to find all nodes in Nth level
private IEnumerable<TreeNode> getNodesOnLevel(TreeNodeCollection aCollection, int aLevel)
{
    var lResultTreeNodeCol = new List<TreeNode>();

    if (aLevel == 1)
        return aCollection.Cast<TreeNode>();

    foreach (TreeNode nNode in aCollection)
    {
        lResultTreeNodeCol.AddRange(getNodesOnLevel(nNode.Nodes, aLevel - 1));
    }

    return lResultTreeNodeCol;
}

//Called once
public void UpdateTrees(TreeNodeCollection aCollectionA, TreeNodeCollection aCollectionB)
{
    List<TreeNode> lRemoved = new List<TreeNode>();
    for (int i = 1; UpdateWithLevel(aCollectionA, aCollectionB, i, ref lRemoved) > 0; i++)
    {
    }

    //Report removals last, in reverse order of marking (deepest first).
    //Looping by index also avoids a NullReferenceException when nothing was removed.
    for (int i = lRemoved.Count - 1; i >= 0; i--)
    {
        W($"Removed {lRemoved[i].Text}");
    }
}

//Called per level
private int UpdateWithLevel(TreeNodeCollection aCollectionA, TreeNodeCollection aCollectionB, int level, ref List<TreeNode> aRemoved)
{
    int lNumOfUpdates = 0;
    var colA = getNodesOnLevel(aCollectionA, level);
    var colB = getNodesOnLevel(aCollectionB, level);

    //Search Original collection, compare to Result collection
    foreach (TreeNode nodeA in colA)
    {
        //Find nodeA in Result collection
        var lNodeAinColB = colB.FirstOrDefault((a) => a.Text == nodeA.Text);

        if (lNodeAinColB == null) //NodeA not found in result collection - delete
        {
            aRemoved.Add(nodeA);
            W($"Mark {nodeA.Text} as removed");
            lNumOfUpdates++;
        }
        else if ((lNodeAinColB.Parent?.Text ?? "") != (nodeA.Parent?.Text ?? "")) //NodeA exists in Result collection, different parent -> must be moved
        {
            W($"Move {nodeA.Text} under {lNodeAinColB.Parent.Text}");
            lNumOfUpdates++;
        }
    }

    //Search Result collection, if Original collection does not have nodeB, we must create it (add)
    foreach (TreeNode nodeB in colB)
    {
        if (!colA.Contains(nodeB, new TestNodeEquality()))
        {
            W($"Add {nodeB.Text}" + ((nodeB.Parent != null) ? $" under {nodeB.Parent.Text}" : ""));
            lNumOfUpdates++;
        }
    }

    return lNumOfUpdates;
}

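I have not found a topic that fits my problem, nor any valuable resource, and I really want to avoid reinventing the wheel.

Question(s):

  • Is there an existing, working algorithm (name/reference)? What is this kind of algorithm/action called (Tree Diff/Merge/Lookup/..)?

  • Can I optimize the algorithm in any way?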

+0

@jdweng Can you point me to an article that would guide me? – Tatranskymedved

+1

If every node has a unique identity, then you can easily compare their state changes node by node, ignoring levels, and then apply the changes, I think? – AKX

+0

https://en.wikipedia.org/wiki/Tree_sort https://en.wikipedia.org/wiki/Binary_search_tree https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree https://en.wikipedia.org/wiki/Binary_tree https://en.wikipedia.org/wiki/Heapsort – jdweng

Answers

3

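I don't think you need some complex recursive algorithm here. Simply put your result nodes into a name-to-parent dictionary and check:

  • whether the original node is in the dictionary
  • whether the parent of the original node has changed
  • whether there are result nodes that do not exist among the original nodes

A dictionary also gives you O(1) lookup of a node, so it is an optimization as well. The same goes for the Except operation, which is a fast set operation.

Code: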

// Requires: using System.Collections.Generic; using System.Linq;
var originalNodes = new List<TreeNode>(); // all nodes of the original tree, flattened (TreeNodeCollection)
var nodes = new List<TreeNode>();         // all nodes of the result tree, flattened (TreeNodeCollection)

// result node name -> its parent in the result tree (ToDictionary throws on duplicate names)
var parentByName = nodes.ToDictionary(n => n.Text, n => n.Parent);

foreach (var originalNode in originalNodes)
{
    TreeNode parent;
    if (!parentByName.TryGetValue(originalNode.Text, out parent))
    {
        // removed - there is no key for original node name
        continue;
    }

    if (originalNode.Parent?.Text != parent?.Text)
    {
        // moved from originalNode.Parent to parent
        continue;
    }
}

// these guys are added
var added = parentByName.Keys.Except(originalNodes.Select(n => n.Text));
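
The checks above classify nodes but say nothing about the order in which to apply the changes. Since the question asks that nothing be deleted while it still holds nodes that are only moving, one possible ordering is: moves first, then removals, with removed children before removed parents. Here is a minimal sketch in Python, assuming each tree has already been flattened into a child-to-parent map and that the added nodes (such as D) were created beforehand. The helper names `depth` and `moves_then_removals` are made up for this illustration.

```python
def depth(node_id, parents):
    """Count the ancestors of node_id in a child -> parent map."""
    d = 0
    while parents.get(node_id) is not None:
        node_id = parents[node_id]
        d += 1
    return d


def moves_then_removals(orig_parents, result_parents):
    """Return ('move' | 'remove', node, new_parent) steps in a safe order."""
    moved = [n for n in orig_parents
             if n in result_parents and orig_parents[n] != result_parents[n]]
    removed = [n for n in orig_parents if n not in result_parents]

    # Moves run first, while the old parents still exist.
    steps = [('move', n, result_parents[n]) for n in moved]
    # Removals run last, deepest node first, so no parent goes before its child.
    removed.sort(key=lambda n: depth(n, orig_parents), reverse=True)
    steps += [('remove', n, None) for n in removed]
    return steps


# Child -> parent maps for the example trees in the question.
orig = {'A': None, '1': 'A', '2': 'A', 'B': None, '5': 'B', '£': '5',
        'C': None, 'F': None, '7': 'F', 'H': None}
result = {'A': None, '2': 'A', 'C': None, '3': 'C', 'D': None, '1': 'D',
          '£': '1', '5': 'D', 'E': None, '6': 'E', 'G': None, '4': 'G',
          'H': None}

steps = moves_then_removals(orig, result)
for step in steps:
    print(step)
```

With these maps, 1, 5 and £ are moved before 7, B and F are removed, which matches the order of the example in the question.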
+1

Simple and powerful. Thanks! – Tatranskymedved

1

I don't have a C# environment around, so I figured I could implement this in Python; they call it executable pseudocode, right? ;)

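def node(id, children=None):
    # Avoid a mutable default argument: every node gets its own list.
    children = [] if children is None else children
    assert all(isinstance(child, dict) for child in children)
    return {'id': id, 'children': children}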

tree1 = [
    node('a', [
        node('1'),
        node('2'),
    ]),
    node('b', [
        node('5', [
            node('*'),
        ]),
    ]),
    node('c'),
    node('f', [
        node('7'),
    ]),
    node('h'),
]


tree2 = [
    node('a', [
        node('2'),
    ]),
    node('c', [
        node('3'),
    ]),
    node('d', [
        node('1', [
            node('*'),
        ]),
        node('5'),
    ]),
    node('e', [
        node('6'),
    ]),
    node('g', [
        node('4'),
    ]),
    node('h'),
]

def walk(tree, fn, parent=None):
    # Pre-order traversal: fn sees every node before its children.
    for node in tree:
        fn(node, parent)
        walk(node.get('children', ()), fn, parent=node)


def get_all_nodes_and_parents(tree):
    nodes = {}
    parents = {}

    def add_node(node, parent):
        nodes[node['id']] = node
        parents[node['id']] = (parent['id'] if parent else None)

    walk(tree, add_node)
    return (nodes, parents)


def treediff(t1, t2):
    n1, p1 = get_all_nodes_and_parents(t1)
    n2, p2 = get_all_nodes_and_parents(t2)
    new_nodes = set(n2.keys()) - set(n1.keys())
    del_nodes = set(n1.keys()) - set(n2.keys())

    for node_id in sorted(new_nodes):
        yield 'create node %s' % node_id

    for node_id in sorted(del_nodes):
        yield 'delete node %s' % node_id

    # New nodes have no parent in t1, so they also show up here as moves.
    for node_id in n2:
        if p1.get(node_id) != p2.get(node_id):
            yield 'move node %s from %s to %s' % (node_id, p1.get(node_id), p2.get(node_id))


for op in treediff(tree1, tree2):
    print(op)

This outputs:

create node 3 
create node 4 
create node 6 
create node d 
create node e 
create node g 
delete node 7 
delete node b 
delete node f 
move node 3 from None to c 
move node 1 from a to d 
move node * from 5 to 1 
move node 5 from b to d 
move node 6 from None to e 
move node 4 from None to g 

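A further improvement would be to create new nodes directly under their new parents, but that would require the added complexity of keeping track of creation order, so that parents are created before their new children.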
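
One way that improvement might look, as a rough sketch rather than a tested extension of the code above: if the parent map of the result tree is filled in pre-order, and dictionaries keep insertion order (guaranteed from Python 3.7 on), then iterating over that map already visits parents before their children. The names `parent_map` and `creates_with_parents`, and the nodes `x` and `y`, are hypothetical and exist only for this example.

```python
def parent_map(tree, parent=None, parents=None):
    """Map node id -> parent id, filled in pre-order (parent before child)."""
    if parents is None:
        parents = {}
    for n in tree:
        parents[n['id']] = parent
        parent_map(n.get('children', ()), n['id'], parents)
    return parents


def creates_with_parents(t1, t2):
    """Yield one create operation per new node, parents before children."""
    p1, p2 = parent_map(t1), parent_map(t2)
    # p2 was filled in pre-order, so its insertion order is parent-first.
    for node_id, parent_id in p2.items():
        if node_id in p1:
            continue  # node already exists in the original tree
        if parent_id is None:
            yield 'create node %s' % node_id
        else:
            yield 'create node %s under %s' % (node_id, parent_id)


# Small hypothetical trees: 'd', 'x' and 'y' are new, nested three deep.
old = [{'id': 'c', 'children': []}]
new = [
    {'id': 'c', 'children': [{'id': '3', 'children': []}]},
    {'id': 'd', 'children': [
        {'id': 'x', 'children': [{'id': 'y', 'children': []}]},
    ]},
]

ops = list(creates_with_parents(old, new))
for op in ops:
    print(op)
```

With creation handled this way, the move loop would presumably need to skip ids that are missing from the original tree, otherwise the new nodes would be reported a second time as moves from None.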

+1

Sergey's answer makes it simpler, without recursion etc., but I am glad to have a solution in another language, much appreciated! – Tatranskymedved

+0

Of course, if the tree has an API to get all nodes regardless of their depth (and if they provide a parent property), recursion is not needed :) – AKX
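
If no such API is available, the flattening step can still be written without recursion by using an explicit stack. Here is a minimal sketch, assuming the same `{'id': ..., 'children': [...]}` node shape as the Python answer; the function name `flatten` is made up for this example.

```python
def flatten(tree):
    """Return {node id: parent id} for every node, without recursion."""
    parents = {}
    # Push in reverse so nodes are popped in their original order.
    stack = [(n, None) for n in reversed(tree)]
    while stack:
        n, parent_id = stack.pop()
        parents[n['id']] = parent_id
        for child in reversed(n.get('children', ())):
            stack.append((child, n['id']))
    return parents


sample = [
    {'id': 'a', 'children': [{'id': '1'}, {'id': '2'}]},
    {'id': 'h'},
]
flat = flatten(sample)
print(flat)
```

The resulting map is the same kind of id-to-parent dictionary the diff works on, so it could stand in for the recursive walk.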