2014-01-08 27 views
3

在我们的应用中,我们允许用户编写特定的条件,我们允许他们使用这样的符号表示的条件:从缀转换条件公式前缀符号

(1 and 2 and 3 or 4) 

如果每个数字编号对应一个具体的规则/条件。现在的问题是,我应该怎么把它转换,使得最后的结果是这样的:

{ 
    "$or": [ 
     "$and": [1, 2, 3], 
     4 
    ] 
} 

再举一个例子:

(1 or 2 or 3 and 4) 

要:

{ 
    "$or": [ 
     1, 
     2, 
     "$and": [3, 4] 
    ] 
} 



我h AVE写50在标记生成器的线,成功地符号化语句为标记和使用堆栈/ PEEK算法验证,令牌看起来是这样的:

["(", "1", "and", "2", "and", "3", "or", "4", ")"] 

现在我应该如何转换这种“中间符号”的成“前缀记法”与and优先于or的规则?

部分指针或关键字非常感谢!现在我所拥有的并不是真正将我引向目前我需要的东西。

一些研究至今:

编辑

此外,用户可以指定任意数量的括号的能力,如果他们坚持,比如像:

((1 or 3) and (2 or 4) or 5) 

所以得到翻译为:

{ 
    "$or": [{ 
     $and": [ 
      "$or": [1, 3], 
      "$or": [2, 4] 
     }, 
     5 
    ] 
} 

编辑2

我想出了算法。 Posted as an answer below。感谢您的帮助!

+1

http://blog.reverberate.org/2013/07/ll-and-lr-parsing -demystified.html – zerkms

+0

谢谢@zerkms!我肯定在研究中需要这个:) –

+0

PS:我认为你的第二种情况下的AST缺少第二个'$或'标记。它应该是这样的:http://sketchia.com/draw_G2HXgRv.html PPS:疯狂的绘画技巧,我知道) – zerkms

回答

0

感谢指导家伙们,至少我拿出了我自己的解决方案。由于这是我第一次做数学方程解析,请原谅我,如果我这样做是错误的或无效的,或者帮我找到这个错误:

基本上,这里是我做的步骤,这一点:

  1. 之前解析,总是验证模式。出现错误时抛出错误。
  2. 一旦经过验证,我们会为前缀符号转换做一个中缀表示法。此步骤要求“和”优先于“或”。
    1. 反转给定的模式
    2. 是否将中缀转换为后缀表示法转换。I dumb, I learn from this
    3. 做反向再次
    4. 中缀到前缀应该在这个阶段
  3. 构建从前缀符号,使得
    1. 节点总是有一棵树来完成,并且最大,有两个分支机构
    2. 遍历,直到它达到全叶
  4. 优化树,使得合并同类运营商一起(如多发性$and运营子$and可以合并形成一个较短的树)
  5. 混合与设定的给定的标准,并且全部完成!

工作实例可以在这里找到:http://jsfiddle.net/chaoszcat/uGKYj/3/

工作如下代码:

(function() { 

    /** 
    * This is a source example of my original question on 
    * http://stackoverflow.com/questions/20986255/converting-conditional-equation-from-infix-to-prefix-notation 
    * 
    * This is my solution and use it at your own risk 
    * @author Lionel Chan <chaoszcat[at]gmail.com> 
    */ 

    /** 
    * isNumeric, from jQuery. Duplicated here to make this js code pure 
    * @param {mix} n Test subject 
    * @returns {boolean} true if it's numeric 
    */ 
    function isNumeric(n) { 
     return !isNaN(parseFloat(n))&&isFinite(n); 
    } 

    /** 
    * Node class - represent a operator or numeric node 
    * @param {string} token The token string, operator "and", "or", or numeric value 
    */ 
    function Node(token) { 
     this.parent = null; 
     this.children = []; //one node has two children at most 
     this.token = token; 
     this.is_operator = token === 'and' || token === 'or'; 
     this.is_numeric = !this.is_operator; 
     this.destroyed = false; 
    } 

    Node.prototype = { 

     isOperator: function() { return this.is_operator;}, 
     isNumeric: function() { return this.is_numeric;}, 

     //While building tree, a node is full if there are two children 
     isFull: function() { 
      return this.children.length >= 2; 
     }, 

     addChild: function(node) { 
      node.parent = this; 
      this.children.push(node); 
     }, 

     hasParent: function() { 
      return this.parent !== null; 
     }, 

     indexOfChild: function(node) { 
      for (var i = 0 ; i < this.children.length ; ++i) { 
       if (this.children[i] === node) { 
        return i; 
       } 
      } 
      return -1; 
     }, 

     removeChild: function(node) { 
      var idx = this.indexOfChild(node); 
      if (idx >= 0) { 
       this.children[idx].parent = null; //remove parent relationship 
       this.children.splice(idx, 1); //splice it out 
      } 
     }, 

     /** 
     * Pass my children to the target node, and destroy myself 
     * 
     * @param {Node} node A target node 
     */ 
     passChildrenTo: function(node) { 
      for (var i = 0 ; i < this.children.length ; ++i) { 
       node.addChild(this.children[i]); 
      } 
      this.destroy(); 
     }, 

     //Destroy this node 
     destroy: function() { 
      this.parent.removeChild(this); 
      this.children = null; 
      this.destroyed = true; 
     } 
    }; 

    /** 
    * Tree class - node manipulation 
    * @param {array} prefixTokens The converted, prefix-notated tokens 
    */ 
    function Tree(prefixTokens) { 
     this.buildTree(prefixTokens); 
     //Optimize tree - so that the tree will merge multiple similar operators together 
     this.optimize(this.root); 
    } 

    Tree.prototype = { 
     root: null, 

     //Reference to the deepest operator node in the tree for next attachment point 
     deepestNode: null, 

     /** 
     * Render this tree with given criteria array 
     * @param {array} crits 
     * @returns {object} The built criteria 
     */ 
     render: function(crits) { 
      //After optimization, we build the criteria and that's all! 
      return this.buildCriteria(this.root, crits); 
     }, 

     /** 
     * Build criteria from root node. Recursive 
     * 
     * @param {Node} node 
     * @param {array} crits 
     * @returns {object} of criteria 
     */ 
     buildCriteria: function(node, crits) { 

      var output = {}, 
       label = '$'+node.token; 

      output[label] = []; //cpnditions array 

      for (var i = 0 ; i < node.children.length ; ++i) { 
       if (node.children[i].isOperator()) { 
        output[label].push(this.buildCriteria(node.children[i], crits)); 
       }else{ 
        output[label].push(crits[node.children[i].token-1]); 
       } 
      } 
      return output; 
     }, 

     /** 
     * Optimize the tree, we can simplify nodes with same operator. Recursive 
     * 
     * @param {Node} node 
     * @void 
     */ 
     optimize: function(node) { 

      //note that node.children.length will keep changing since the swapping children will occur midway. Rescan is required 
      for (var i = 0 ; i < node.children.length ; ++i) { 
       if (node.children[i].isOperator()) { 
        this.optimize(node.children[i]); 
        if (node.children[i].token === node.token) { 
         node.children[i].passChildrenTo(node); 
         i = 0; //rescan this level whenever a swap occured 
        } 
       } 
      } 
     }, 

     /** 
     * Build tree from raw tokens 
     * @param {array} tokens 
     */ 
     buildTree: function(tokens) { 
      for (var i = 0 ; i < tokens.length ; ++i) { 
       this.addNode(new Node(tokens[i])); 
      } 
     }, 

     /** 
     * Add node into tree 
     * 
     * @param {Node} node 
     */ 
     addNode: function(node) { 

      //If no root? The first node is root 
      if (this.root === null) { 
       this.root = node; 
       this.deepestNode = node; 
       return; 
      } 

      //if deepestNode is full, traverse up until we find a node with capacity 
      while(this.deepestNode && this.deepestNode.isFull()) { 
       this.deepestNode = this.deepestNode.parent; 
      } 

      if (this.deepestNode) { 
       this.deepestNode.addChild(node); 
      } 

      //If the current node is an operator, we move the deepestNode cursor to it 
      if (node.isOperator()) { 
       this.deepestNode = node; 
      } 
     } 
    }; 

    /** 
    * Main criteria parser 
    */ 
    var CriteriaParser = { 

     /** 
     * Convert raw string of pattern (1 and 2 or 3) into the object of criteria pattern 
     * 
     * @param {string} str The raw pattern 
     * @param {array} crits The raw list of criteria 
     * @returns {String|Boolean} 
     */ 
     parse: function(str, crits) { 
      var tokens = this.tokenize(str), 
       validationResult = this.validate(tokens, crits), 
       prefixNotation = ''; 

      //Once succeded, we proceed to convert it to prefix notation 
      if (validationResult === true) { 
       prefixNotation = this.infixToPrefix(tokens); 
       return (new Tree(prefixNotation)).render(crits); 
      }else{ 
       return validationResult; 
      } 
     }, 

     /** 
     * Convert the infix notation of the pattern (1 and 2 or 3) into prefix notation "or and 1 2 3" 
     * 
     * Note: 
     * - and has higher precedence than or 
     * 
     * Steps: 
     * 1. Reverse the tokens array 
     * 2. Do infix -> postfix conversion (http://www.cs.arizona.edu/classes/cs227/spring12/infix.pdf, http://scriptasylum.com/tutorials/infix_postfix/algorithms/infix-postfix/index.htm) 
     * 3. Reverse the result 
     * 
     * @param {array} tokens The tokenized tokens 
     * @returns {array} prefix notation of pattern 
     */ 
     infixToPrefix: function(tokens) { 

      var reversedTokens = tokens.slice(0).reverse(), //slice to clone, so not to disturb the original array 
       stack = [], 
       output = []; 

      //And since it's reversed, please regard "(" as closing bracket, and ")" as opening bracket 
      do { 
       var stackTop = stack.length > 0 ? stack[stack.length-1] : null, 
        token = reversedTokens.shift(); 

       if (token === 'and') { 
        while(stackTop === 'and') { 
         output.push(stack.pop()); 
         stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
        } 
        stack.push(token); 
        stackTop = token; 
       }else if (token === 'or') { 
        while(stackTop === 'and' || stackTop === 'or') { //and has higher precedence, so it will be popped out 
         output.push(stack.pop()); 
         stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
        } 
        stack.push(token); 
        stackTop = token; 
       }else if (token === '(') { //'(' is closing bracket in reversed tokens 
        while(stackTop !== ')' && stackTop !== undefined) { //keep looping until found a "open -)" bracket 
         output.push(stack.pop()); 
         stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
        } 
        stack.pop(); //remove the open ")" bracket 
        stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
       }else if (token === ')') { //')' is opening bracket in reversed tokens 
        stack.push(token); 
       }else if (isNumeric(token)) { 
        output.push(token); 
       }else if (token === undefined) { 
        // no more tokens. Just shift everything out from stack 
        while(stack.length) { 
         stackTop = stack.pop(); 

         if (stackTop !== undefined && stackTop !== ')') { 
          output.push(stackTop); 
         } 
        } 
       } 

      }while(stack.length || reversedTokens.length); 

      //Reverse output and we are done 
      return output.reverse(); 
     }, 

     /** 
     * Tokenized the provided pattern 
     * @param {string} str The raw pattern from user 
     * @returns {array} A tokenized array 
     */ 
     tokenize: function(str) { 
      var pattern = str.replace(/\s/g, ''), //remove all the spaces :) not needed 
       tokens = pattern.split(''), 
       tokenized = []; 

      //Tokenize it and verify 
      var token = null, 
       next = null; 

      //attempts to concatenate the "and" and "or" and numerics 
      while (tokens.length > 0) { 
       token = tokens.shift(); 
       next = tokens.length > 0 ? tokens[0] : null; 

       if (token === '(' || token === ')') { 
        tokenized.push(token); 
       }else if (token === 'a' && tokens.length >= 2 && tokens[0] === 'n' && tokens[1] === 'd') { //and 
        tokenized.push(token + tokens.shift() + tokens.shift()); 
       }else if (token === 'o' && tokens.length >= 1 && next === 'r') { //or 
        tokenized.push(token + tokens.shift()); 
       }else if (isNumeric(token)) { 
        while(isNumeric(next)) { 
         token += next; 
         tokens.shift(); //exhaust it 
         next = tokens.length > 0 ? tokens[0] : null; 
        } 
        tokenized.push(token); 
       }else{ 
        tokenized.push(token); 
       } 
      } 

      return tokenized; 
     }, 

     /** 
     * Attempt to validate tokenized tokens 
     * 
     * @param {array} tokens The tokenized tokens 
     * @param {array} crits The user provided criteria set 
     * @returns {Boolean|String} Returns boolean true if succeeded, string if error occured 
     */ 
     validate: function(tokens, crits) { 

      var valid = true, 
       token = null, 
       stack = [], 
       nextToken = null, 
       criteria_count = crits.length; 

      for (var i = 0 ; i < tokens.length ; ++i) { 

       token = tokens[i]; 
       nextToken = i < tokens.length - 1 ? tokens[i+1] : null; 

       if (token === '(') { 
        stack.push('('); 
        if (!isNumeric(nextToken) && nextToken !== '(' && nextToken !== ')') { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else if (token === ')') { 
        if (stack.length > 0) { 
         stack.pop(); 
        }else{ 
         throw 'Unexpected closing bracket'; 
        } 
        if (nextToken !== ')' && nextToken !== 'and' && nextToken !== 'or' && nextToken !== null) { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else if (token === 'and' || token === 'or') { 
        if (!isNumeric(nextToken) && nextToken !== '(') { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else if (isNumeric(token) && token <= criteria_count) { 
        if (nextToken !== ')' && nextToken !== 'and' && nextToken !== 'or') { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else{ 
        //anything not recognized, die. 
        throw 'Unexpected token "'+token+'"'; 
       } 
      } 

      //Last step - check if we have all brackets closed 
      if (valid && stack.length > 0) { 
       throw 'Missing '+stack.length+' closing bracket'; 
      } 

      return valid; 
     } 
    }; 

    //This is an example pattern and criteria set. Note that pattern numbers must match criteria numbers. 
    var pattern = '((1 or 3) and (2 or 4) or 5)', 
     crits = [ 
      1, 2, 3, 4, 5 
     ]; 

    //lazy on the document on load. Just delay 
    setTimeout(function() { 

     var result; 
     try { 
      result = JSON.stringify(CriteriaParser.parse(pattern, crits), undefined, 4); 
     }catch(e) { 
      result = e; 
     } 

     var pre = document.createElement('pre'); 
     pre.innerHTML = result; 
     document.body.appendChild(pre); 
    }, 10); 

})(); 
1

这是最容易使用两步过程完成的。 1)转换为语法树。 2)将语法树转换为前缀表示法。

语法树基本上与您的前缀表示法相同,只是使用您的编程语言的数据结构构建而成。

创建语法树的标准方法是使用LALR分析器生成器,该生成器可用于大多数语言。 LALR解析器速度快,功能强大,表现力强。 LALR解析器生成器将.y文件作为输入,并以您选择的编程语言输出解析器的源代码文件。所以你运行一次LALR解析器生成器来生成解析器。

(所有程序员都应该学会使用解析器生成器:)。使用标准标记器也很明智,但我猜测你已经编写了自己的:))

以下是为您的迷你语言生成LALR解析器的.y文件。通过LALR解析器生成器运行该.y文件将输出LALR解析器的源代码,该解析器将令牌作为输入并输出分析树(在变量$ root_tree中)。您需要在别处手动定义parsetree_binaryop数据结构。

%left AND. 
%left OR. 
start ::= expr(e). { $root_tree = e; } 
expr(r) ::= expr(e1) AND expr(e2). { r = new parsetree_binaryop(e1, OP_AND, e2); } 
expr(r) ::= expr(e1) OR expr(e2). { r = new parsetree_binaryop(e1, OP_OR, e2); } 
expr(r) ::= LPAR expr(e) RPAR. { r = e; } 

的“%左,”意思是,是左结合(我们可以选择合适的也不要紧,AND和OR)。在“%left OR”之前提到“%AND AND”意味着AND绑定比OR更紧密,因此生成的解析器会做正确的事情。

当语法树解析器给你时,生成文本表示很容易。

编辑:这似乎是一个LALR解析器生成器,其在JavaScript输出解析器:http://sourceforge.net/projects/jscc/

+0

“这也是聪明的使用标准分词器,而我猜你已经编写了自己” ---事情是,js的认真分析区域吸收。当我在寻找时 - 我没有找到任何像样的东西。 – zerkms

+0

那么必须编写我自己的解析器吗? :(让我检查.. :) –

+0

你的意思是“必须编写我自己的解析器生成器”?可能你也没有,http://sourceforge.net/projects/jscc/看起来很有前途。 – Thue

0

首先定义的语义。在你的第一个例子,你给(1 and 2 and 3) or 4解释,但它也可以是1 and 2 and (3 or 4)这样:

{ 
    "$and": [ 
     {"$or": [3,4] }, 
     [1,2] 
    ] 
} 

让我们假设and具有更高的优先级。然后只需通过列表加入与and的所有条款。接下来,所有其他与or加入。

+0

谢谢,但我在这里不评价方程我已经有现成的代码在服务器端做评价。我需要的是一个转换,我想我不知何故得到infix->前缀的想法。好像前缀可以解决问题。一旦完成,我会在这里发布解决方案。 –

+0

感谢您的努力。我已经想出了解决方案! :)干杯 –