2012-01-30 189 views
0

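Replacing multiple words in a string with PHP

I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.

So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.

Is there a proper way to do this?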

+0

Are you sure you want to check/replace every single word in the string? If the string itself is long... this script could take a while to finish. :) – summea 2012-01-30 17:25:52

+0

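I have some ideas, but they are quite complicated. It all boils down to obtaining a (word, position, length) tuple for each word and then looping over that list. I am wondering how to implement such an algorithm efficiently. – x74x61 2012-01-30 17:29:02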

+0

@summea Actually, I only need nouns, verbs and adjectives, but that is not really the issue. – x74x61 2012-01-30 17:30:20

Answers

2

Given some text:

$subject = <<<TEXT
I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line. 

So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed. 

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing. 

Is there a proper way to do this? 
TEXT;

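You first tokenize the string into word tokens and "everything else" tokens (call them fill, for example). A regular expression is helpful for this: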

$pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/'; 
$r = preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER); 

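Now the job is to convert the return value into a more useful data structure, such as an array of tokens and an index of all the words used: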

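$tokens = array(); # token stream, in document order
$tokenIndex = 0;
$words = array(); # index of distinct words, keyed by the word itself
foreach ($matches as $matched)
{
    foreach ($matched as $type => $match)
    {
        # skip the numeric duplicates of the named groups
        if (is_numeric($type)) continue;
        list($string, $offset) = $match;
        # an optional group that did not match is reported with offset -1
        if ($offset < 0) continue;

        $token = new stdClass;
        $token->type = $type;
        $token->offset = $offset;
        $token->length = strlen($string);

        if ($token->type === 'word')
        {
            if (!isset($words[$string]))
            {
                $words[$string] = array('string' => $string, 'tokens' => array());
            }
            # every occurrence shares one string by reference, so changing
            # $words[...]['string'] changes all of its tokens at once
            $words[$string]['tokens'][] = &$token;
            $token->string = &$words[$string]['string'];
        } else {
            $token->string = $string;
        }

        $tokens[$tokenIndex] = &$token;
        $tokenIndex++;
        unset($token); # break the reference before the next iteration
    }
}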

As a demonstration, you can output all the words:

# list all words 

foreach($words as $word) 
{ 
    printf("Word '%s' used %d time(s)\n", $word['string'], count($word['tokens'])); 
} 

With the example text, this gives you:

Word 'I' used 3 time(s) 
Word 'need' used 1 time(s) 
Word 'a' used 4 time(s) 
Word 'systematic' used 1 time(s) 
Word 'way' used 2 time(s) 
Word 'of' used 1 time(s) 
Word 'replacing' used 1 time(s) 
Word 'each' used 2 time(s) 
Word 'word' used 5 time(s) 
Word 'in' used 3 time(s) 
Word 'string' used 3 time(s) 
Word 'separately' used 1 time(s) 
Word 'by' used 1 time(s) 
Word 'providing' used 1 time(s) 
Word 'my' used 1 time(s) 
Word 'own' used 1 time(s) 
Word 'input' used 1 time(s) 
Word 'for' used 1 time(s) 
Word 'want' used 2 time(s) 
Word 'to' used 5 time(s) 
Word 'do' used 2 time(s) 
Word 'this' used 2 time(s) 
Word 'on' used 2 time(s) 
Word 'the' used 7 time(s) 
Word 'command' used 1 time(s) 
Word 'line' used 1 time(s) 
Word 'So' used 1 time(s) 
Word 'program' used 1 time(s) 
Word 'reads' used 1 time(s) 
Word 'and' used 5 time(s) 
... (and so on) 

You then do the work on the word tokens only. For example, replacing one string with another:

# change one word (and to AND) 

$words['and']['string'] = 'AND'; 
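
That single assignment affects every occurrence because each word token's string property is a reference to the shared entry in $words. A minimal, self-contained sketch of that mechanism follows; the variable names $shared, $first and $second are made up for illustration and are not part of the code above:

```php
<?php
// One shared entry, standing in for $words['and'].
$shared = array('string' => 'and');

// Two tokens that both point at the same string by reference.
$first = new stdClass;
$first->string = &$shared['string'];
$second = new stdClass;
$second->string = &$shared['string'];

// Changing the shared entry once is visible through both tokens.
$shared['string'] = 'AND';

echo $first->string, ' ', $second->string, "\n"; // prints "AND AND"
```

If you want to replace each occurrence separately instead, you would assign to the individual token rather than to the shared entry, at the cost of breaking the shared reference for that token.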

Finally, you concatenate the tokens back into a string:

# output the whole text 

foreach($tokens as $token) echo $token->string; 

With the example text again, this gives:

I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to 
do this on the command line. 

So the program reads in a string, AND asks me what I want to replace the first word with, AND then the second word, AND 
then the third word, AND so on, until all words have been processed. 

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation AND 
spacing. 

Is there a proper way to do this? 

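Job done. Make sure that word tokens are only replaced with valid word tokens: tokenize the user input as well, and raise an error if it is not a single word token (that is, if it does not match the word pattern).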
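
The input check described above could look roughly like this. It is only a sketch: the function name isSingleWord is hypothetical, and it assumes that "valid word token" means a string matching the same \w+ pattern the tokenizer uses.

```php
<?php
// Hypothetical helper: true only if $input is exactly one word token.
function isSingleWord(string $input): bool
{
    // \w+ is the word pattern from the tokenizer; the D modifier keeps
    // $ from matching before a trailing newline.
    return preg_match('/^\w+$/D', $input) === 1;
}

var_dump(isSingleWord('AND'));       // bool(true)
var_dump(isSingleWord('and so on')); // bool(false), contains spaces
var_dump(isSingleWord('end.'));      // bool(false), contains punctuation
```

Rejecting anything that is not a single word keeps the token stream consistent, so punctuation and spacing in the fill tokens stay untouched.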

Code/Demo

+1

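Your answer is clearly more complete, and you took the time to write out a solution. That was never my intention, as I was only trying to push him in the right direction. I have deleted my answer. +1 – 2012-01-30 18:15:45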

+0

Thank you very much! – x74x61 2012-01-30 21:47:09

0

This looks fairly simple once you know the basics of command-line programming with PHP, for which there are plenty of tutorials.

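In general, a continuous loop that keeps asking you for words should be the basis. On each iteration you then just call str_replace(), which covers the basics of what you need.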

Don't forget to implement a way to break out of the loop, such as typing exit or some other special command, as needed.
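
A rough sketch of such a loop, under a few assumptions: the function name replaceLoop and the prompt texts are made up for illustration, and the input and output streams are passed in as parameters so that the loop can be driven by STDIN/STDOUT on the command line or by any other stream.

```php
<?php
// Hypothetical helper: repeatedly asks for a word and its replacement,
// applies str_replace(), and stops when the user types "exit".
function replaceLoop(string $text, $in, $out): string
{
    while (true) {
        fwrite($out, "Word to replace (or 'exit'): ");
        $search = fgets($in);
        if ($search === false) break;       // end of input
        $search = trim($search);
        if ($search === 'exit') break;      // special command to leave the loop
        if ($search === '') continue;       // ignore empty lines

        fwrite($out, "Replace '$search' with: ");
        $replacement = fgets($in);
        if ($replacement === false) break;  // end of input
        $text = str_replace($search, trim($replacement), $text);
    }
    return $text;
}
```

On the command line this could be called as `echo replaceLoop($subject, STDIN, STDOUT);`. Note that str_replace() matches substrings and is case-sensitive, so replacing "and" would also change the "and" inside "command"; matching whole words requires a word-based approach.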

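I assume you are not looking for a complete code example as an answer here? That would answer the question completely, but it would also make this something of a script request.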
