Eliminating the hyphen from AppleScript's text item delimiters

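You can write a calculator in AppleScript if you like, but you need to go about it the same way as in any other language: 1. use a tokenizer to split the input text into a list of tokens; 2. feed those tokens to a parser that assembles them into an abstract syntax tree; and 3. evaluate that tree to produce a result.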

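For what you're doing, you could write your tokenizer as a regular expression (assuming you don't mind dropping down to NSRegularExpression via the AppleScript-ObjC bridge). For parsing, I'd recommend reading up on Pratt parsers, which are easy to implement yet powerful enough to support prefix, infix, and postfix operators as well as operator precedence. For evaluation, a simple recursive AST-walking algorithm will probably be enough, but take it one step at a time.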
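As a rough illustration of the regular-expression approach, here is a minimal sketch that uses NSRegularExpression through the AppleScript-ObjC bridge. The handler name `splitIntoTokens` and the pattern are assumptions made up for this example, and the pattern only covers parentheses, number-like runs, and lowercase words. Note that, as written, it silently skips any character the pattern does not match instead of reporting an error.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- splitIntoTokens is a hypothetical helper, not part of any library
to splitIntoTokens(theCode)
    -- a parenthesis, OR a run of number characters, OR a run of lowercase letters
    set thePattern to "[()]|[-+.0-9]+|[a-z]+"
    set theString to current application's NSString's stringWithString:theCode
    set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
    set theMatches to theRegex's matchesInString:theString options:0 range:{0, theString's |length|()}
    set tokenTexts to {}
    repeat with aMatch in theMatches
        -- copy each matched substring back out as AppleScript text
        set end of tokenTexts to (theString's substringWithRange:(aMatch's range())) as text
    end repeat
    return tokenTexts
end splitIntoTokens

splitIntoTokens("(add 3 (multiply 2.5 -2))")
--> {"(", "add", "3", "(", "multiply", "2.5", "-2", ")", ")"}
```

This returns only the token text; working out each token's type (number, word, and so on) would still need a second pass or one capture group per type.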

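These are all well-solved problems, so you won't have any trouble finding tutorials and other information online on how to do it. (Plenty of rubbish too, of course, so be prepared to spend some time working out how to tell the good from the bad.)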

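Your one problem is that none of it will be written specifically for AppleScript, so be prepared to wade through material written around other languages (Python, Java, and so on) and translate from there to AS. That will take some effort and patience to get through all the programmer-speak, but it's eminently doable (I originally cut my teeth on AppleScript and now write my own automation scripting languages), and it's a great learning exercise for developing your skills.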

2016-02-21 18:12:30 foo

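First of all, thanks for your answer. It does sound very complicated, though. Isn't it possible to just "block" one character (the hyphen, if it occurs) from the list of AppleScript's text item delimiters?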

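Getting 'word' elements has nothing to do with TIDs. The rules AS uses to decide word boundaries are opaque, though probably based on the Unicode standard; either way, you can't change them. As for complexity, that's all relative. The fact is, writing a calculator is not a _trivial_ exercise; that said, it's one that's done so often that you'll easily find plenty of help to guide you through it step by step. If you want an easier exercise, consider a toy "Lisp" interpreter. That only needs a very simple scanner to detect parentheses, whitespace, and everything else. – foo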

To give you an idea, here's a simple tokenizer for a very simple Lisp-like language:

-- token types 
property StartList : "START" 
property EndList : "END" 
property ANumber : "NUMBER" 
property AWord : "WORD" 

-- recognized token chars 
property _startlist : "(" 
property _endlist : ")" 
property _number : "+-.1234567890" 
property _word : "abcdefghijklmnopqrstuvwxyz" 
property _whitespace : space & tab & linefeed & return 


to tokenizeCode(theCode) 
    considering diacriticals, hyphens, punctuation and white space but ignoring case and numeric strings 
     set i to 1 
     set l to theCode's length 
     set tokensList to {} 
     repeat while i ≤ l 
      set c to character i of theCode 
      if c is _startlist then 
       set end of tokensList to {tokenType:StartList, tokenText:c} 
       set i to i + 1 
      else if c is _endlist then 
       set end of tokensList to {tokenType:EndList, tokenText:c} 
       set i to i + 1 
      else if c is in _number then 
       set tokenText to "" 
       repeat while i ≤ l and character i of theCode is in _number -- check bounds first so the last character doesn't raise an error
        set tokenText to tokenText & character i of theCode 
        set i to i + 1 
       end repeat 
       set end of tokensList to {tokenType:ANumber, tokenText:tokenText} 
      else if c is in _word then 
       set tokenText to "" 
       repeat while i ≤ l and character i of theCode is in _word -- check bounds first so the last character doesn't raise an error
        set tokenText to tokenText & character i of theCode 
        set i to i + 1 
       end repeat 
       set end of tokensList to {tokenType:AWord, tokenText:tokenText} 
      else if c is in _whitespace then -- skip over white space 
       repeat while i ≤ l and character i of theCode is in _whitespace -- check bounds first so trailing white space doesn't raise an error
        set i to i + 1 
       end repeat 
      else 
       error "Unknown character: '" & c & "'" 
      end if 
     end repeat 
     return tokensList 
    end considering 
end tokenizeCode

The syntax rules for this language are as follows:

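A number expression contains one or more digits, "+" or "-" signs, and/or decimal points. (The code above doesn't currently check that the token is a valid number, e.g. it will happily accept nonsense input like "0.1.2-3+", but that's easy to add.)
A word expression contains one or more characters (a-z).
A list expression starts with "(" and ends with ")". The first token in a list expression must be the name of the operator to apply; this may be followed by zero or more additional expressions representing its operands.
Any unrecognized characters are treated as an error.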
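The missing number check mentioned in the first rule could be sketched as follows. The handler name `isValidNumberText` is an assumption made up for this example; it simply defers to AppleScript's own text-to-number coercion, so "valid" here means "whatever `as number` accepts". Bear in mind that this coercion follows the user's locale settings, so text such as "2.5" may be rejected on systems that use a comma as the decimal separator.

```applescript
-- isValidNumberText is a hypothetical helper, not part of the tokenizer above
to isValidNumberText(tokenText)
    try
        tokenText as number -- raises a coercion error if the text isn't a number
        return true
    on error
        return false
    end try
end isValidNumberText

isValidNumberText("-2") --> true
isValidNumberText("0.1.2-3+") --> false
```

The tokenizer could call this just before it appends a NUMBER token and raise an error when the check fails.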

For example, let's use it to tokenize the math expression "3 + (2.5 * -2)", which in prefix notation is written like this:

set programText to "(add 3 (multiply 2.5 -2))" 

set programTokens to tokenizeCode(programText) 

--> {{tokenType:"START", tokenText:"("}, 
    {tokenType:"WORD", tokenText:"add"}, 
    {tokenType:"NUMBER", tokenText:"3"}, 
    {tokenType:"START", tokenText:"("}, 
    {tokenType:"WORD", tokenText:"multiply"}, 
    {tokenType:"NUMBER", tokenText:"2.5"}, 
    {tokenType:"NUMBER", tokenText:"-2"}, 
    {tokenType:"END", tokenText:")"}, 
    {tokenType:"END", tokenText:")"}}

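Once the text has been split into a list of tokens, the next step is to feed that list into a parser, which assembles it into an abstract syntax tree that fully describes the program's structure.

Like I say, there's a bit of a learning curve to this stuff, but you can write it in your sleep once you've grasped the basic principles. Ask, and I'll add an example of how to parse these tokens into a usable form.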

2016-02-22 13:03:47 foo

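Continuing from before, here's a parser that converts the tokenizer's output into a tree-based data structure that describes the program's logic.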

-- token types 
property StartList : "START" 
property EndList : "END" 
property ANumber : "NUMBER" 
property AWord : "WORD" 


------- 
-- handlers called by Parser to construct Abstract Syntax Tree nodes, 
-- simplified here for demonstration purposes 

to makeOperation(operatorName, operandsList) 
    return {operatorName:operatorName, operandsList:operandsList} 
end makeOperation 

to makeWord(wordText) 
    return wordText 
end makeWord 

to makeNumber(numberText) 
    return numberText as number 
end makeNumber 


------- 
-- Parser 

to makeParser(programTokens) 
    script ProgramParser 

     property currentToken : missing value 

     to advanceToNextToken() 
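      if programTokens is {} then 
       -- currentToken is a record (or missing value before the first token), so build the message from its text 
       if currentToken is missing value then error "Found unexpected end of program: there are no tokens to parse." 
       error "Found unexpected end of program after '" & currentToken's tokenText & "'." 
      end if 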
      set currentToken to first item of programTokens 
      set programTokens to rest of programTokens 
      return 
     end advanceToNextToken 

     -- 

     to parseOperation() -- parses an '(OPERATOR [OPERANDS ...])' list expression 
      advanceToNextToken() 
      if currentToken's tokenType is AWord then -- parse 'OPERATOR' 
       set operatorName to currentToken's tokenText 
       set operandsList to {} 
       advanceToNextToken() 
       repeat while currentToken's tokenType is not EndList -- parse 'OPERAND(S)' 
        if currentToken's tokenType is StartList then 
         set end of operandsList to parseOperation() 
        else if currentToken's tokenType is AWord then 
         set end of operandsList to makeWord(currentToken's tokenText) 
        else if currentToken's tokenType is ANumber then 
         set end of operandsList to makeNumber(currentToken's tokenText) 
        else 
         error "Expected word, number, or list expression but found '" & currentToken's tokenText & "' instead." 
        end if 
        advanceToNextToken() 
       end repeat 
       return makeOperation(operatorName, operandsList) 
      else 
       error "Expected operator name but found '" & currentToken's tokenText & "' instead." 
      end if 
     end parseOperation 

     to parseProgram() -- parses the entire program 
      advanceToNextToken() 
      if currentToken's tokenType is StartList then 
       return parseOperation() 
      else 
       error "Found unexpected '" & currentToken's tokenText & "' at start of program." 
      end if 
     end parseProgram 

    end script 
end makeParser 


------- 
-- parse the tokens list produced by the tokenizer into an Abstract Syntax Tree 

set programTokens to {{tokenType:"START", tokenText:"("}, ¬ 
    {tokenType:"WORD", tokenText:"add"}, ¬ 
    {tokenType:"NUMBER", tokenText:"3"}, ¬ 
    {tokenType:"START", tokenText:"("}, ¬ 
    {tokenType:"WORD", tokenText:"multiply"}, ¬ 
    {tokenType:"NUMBER", tokenText:"2.5"}, ¬ 
    {tokenType:"NUMBER", tokenText:"-2"}, ¬ 
    {tokenType:"END", tokenText:")"}, ¬ 
    {tokenType:"END", tokenText:")"}} 


set parserObject to makeParser(programTokens) 

set abstractSyntaxTree to parserObject's parseProgram() 
--> {operatorName:"add", operandsList:{3, {operatorName:"multiply", operandsList:{2.5, -2}}}}

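The ProgramParser object is a very, very simple recursive descent parser: a collection of handlers, each of which knows how to turn a sequence of tokens into a particular data structure. In fact, the Lisp-y syntax used here is so simple that it really only needs two handlers: parseProgram, which gets everything going; and parseOperation, which knows how to read the tokens that make up a (OPERATOR_NAME [OPERAND1 OPERAND2 ...]) list and turn them into a record describing a single operation (add, multiply, etc.) to perform.

The nice thing about an AST, especially a very simple, regular one like this, is that you can manipulate it as data in its own right. For example, given the program (multiply x y) and the definition y = (add x 1), you could walk the AST and replace any mention of y with its definition, in this case giving (multiply x (add x 1)). That is, you can do not only arithmetic calculation (algorithmic programming) but also algebraic manipulation (symbolic programming). That's a bit heady for here, but I'll see about adding a simple arithmetic evaluator later.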
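To make the substitution idea concrete, here is a minimal sketch that walks the record-based AST produced by the parser above and swaps a named word for another subtree. The handler name `substituteWord` is an assumption made up for this example. It relies on the parser's simplified node shapes: operations are records, words are text, and numbers are numbers.

```applescript
-- substituteWord is a hypothetical helper that returns a new tree; the original is left untouched
to substituteWord(theNode, wordName, replacementNode)
    if (class of theNode) is record then
        -- an operation node: rebuild it with each operand processed recursively
        set newOperands to {}
        repeat with operandRef in theNode's operandsList
            set end of newOperands to substituteWord(operandRef's contents, wordName, replacementNode)
        end repeat
        return {operatorName:theNode's operatorName, operandsList:newOperands}
    else if ((class of theNode) is text) and (theNode is wordName) then
        -- a word node that matches: replace it with the definition
        return replacementNode
    else
        -- a number, or a word with a different name: keep as-is
        return theNode
    end if
end substituteWord

set programTree to {operatorName:"multiply", operandsList:{"x", "y"}} -- (multiply x y)
set yDefinition to {operatorName:"add", operandsList:{"x", 1}} -- (add x 1)
substituteWord(programTree, "y", yDefinition)
--> {operatorName:"multiply", operandsList:{"x", {operatorName:"add", operandsList:{"x", 1}}}}
```

Building a fresh record at each level avoids modifying the original tree, which keeps the definition reusable for further substitutions.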

2016-02-23 13:38:27 foo

To finish, here's a simple evaluator for the parser's output:

to makeOperation(operatorName, operandsList) 
    if operatorName is "add" then 
     script AddOperationNode 
      to eval(env) 
       if operandsList's length ≠ 2 then error "Wrong number of operands." 
       return ((operandsList's item 1)'s eval(env)) + ((operandsList's item 2)'s eval(env)) 
      end eval 
     end script 
    else if operatorName is "multiply" then 
     script MultiplyOperationNode 
      to eval(env) 
       if operandsList's length ≠ 2 then error "Wrong number of operands." 
       return ((operandsList's item 1)'s eval(env)) * ((operandsList's item 2)'s eval(env)) 
      end eval 
     end script 
    -- define more operations here as needed... 
    else 
     error "Unknown operator: '" & operatorName & "'" 
    end if 
end makeOperation 


to makeWord(wordText) 
    script WordNode 
     to eval(env) 
      return env's getValue(wordText)'s eval(env) 
     end eval 
    end script 
end makeWord 


to makeNumber(numberText) 
    script NumberNode 
     to eval(env) 
      return numberText as number 
     end eval 
    end script 
end makeNumber 


to makeEnvironment() 
    script EnvironmentObject 
     property _storedValues : {} 
     -- 
     to setValue(theKey, theValue) 
      -- theKey : text 
      -- theValue : script 
      repeat with aRef in _storedValues 
       if aRef's k is theKey then 
        set aRef's v to theValue 
        return 
       end if 
      end repeat 
      set end of _storedValues to {k:theKey, v:theValue} 
      return 
     end setValue 
     -- 
     to getValue(theKey) 
      repeat with aRef in _storedValues 
       if aRef's k is theKey then return aRef's v 
      end repeat 
      error "'" & theKey & "' is undefined." number -1728 
     end getValue 
     -- 
    end script 
end makeEnvironment 


to runProgram(programText, theEnvironment) 
    set programTokens to tokenizeCode(programText) 
    set abstractSyntaxTree to makeParser(programTokens)'s parseProgram() 
    return abstractSyntaxTree's eval(theEnvironment) 
end runProgram

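These replace the make... handlers that were used to test the parser with new handlers that construct objects representing each type of structure that can make up an abstract syntax tree: numbers, words, and operations. Each object defines an eval handler that knows how to evaluate that particular structure: in NumberNode it simply returns the number, in WordNode it retrieves and evaluates the structure stored under that name, in AddOperationNode it evaluates each operand and then sums them, and so on.

For example, to evaluate our original 3 + 2.5 * -2 program: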

set theEnvironment to makeEnvironment() 
runProgram("(add 3 (multiply 2.5 -2))", theEnvironment) 
--> -2.0

In addition, the EnvironmentObject is used to store named values. For example, to store a value named "x" for use by a program:

set theEnvironment to makeEnvironment() 
theEnvironment's setValue("x", makeNumber(5)) 
runProgram("(add 3 x)", theEnvironment) 
--> 8

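Obviously, this will need a bit more work to turn it into a proper calculator: a full set of operator definitions, better error reporting, and so on. Also, you'll probably want to replace the parenthesized prefix syntax with a more familiar infix syntax, for which you'll need something like a Pratt parser that can handle precedence, associativity, etc. But once you've got the basics down, it's just a matter of reading up on the various techniques and making changes and improvements one at a time until you arrive at the solution you want. HTH.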
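To give a flavor of what infix parsing involves, here is a heavily stripped-down, Pratt-style (precedence climbing) sketch. It is not a full Pratt parser: it only handles left-associative binary "+" and "*" over a plain list of token texts, with no parentheses or prefix operators. The names `infixOperators`, `lookupOperator`, `makeInfixParser`, and `parseExpression` are assumptions made up for this example; the output uses the same record shape as the simplified parser shown earlier.

```applescript
-- hypothetical operator table: a higher opPower binds more tightly
property infixOperators : {{opSymbol:"+", opPower:10, opName:"add"}, {opSymbol:"*", opPower:20, opName:"multiply"}}

to lookupOperator(tokenText)
    repeat with opRef in infixOperators
        if opRef's opSymbol is tokenText then return opRef's contents
    end repeat
    return missing value -- not an operator, or no tokens left
end lookupOperator

to makeInfixParser(tokenList)
    script InfixParser
        to peekToken()
            if tokenList is {} then return missing value
            return first item of tokenList
        end peekToken

        to nextToken()
            if tokenList is {} then error "Unexpected end of expression."
            set theToken to first item of tokenList
            set tokenList to rest of tokenList
            return theToken
        end nextToken

        to parseExpression(minPower)
            set leftNode to (nextToken()) as number -- every operand is a number here
            repeat
                set opInfo to lookupOperator(peekToken())
                if opInfo is missing value then exit repeat
                -- stop if the next operator doesn't bind more tightly than our caller's
                if opInfo's opPower ≤ minPower then exit repeat
                nextToken() -- consume the operator
                set rightNode to parseExpression(opInfo's opPower)
                set leftNode to {operatorName:opInfo's opName, operandsList:{leftNode, rightNode}}
            end repeat
            return leftNode
        end parseExpression
    end script
end makeInfixParser

makeInfixParser({"3", "+", "2.5", "*", "-2"})'s parseExpression(0)
--> {operatorName:"add", operandsList:{3, {operatorName:"multiply", operandsList:{2.5, -2}}}}
```

Because "*" has a higher binding power than "+", the multiplication ends up nested inside the addition, giving the same tree as the prefix program (add 3 (multiply 2.5 -2)).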

2016-02-25 22:34:09 foo

Jeeez, you're awesome, thank you so much!
