Removing C/C++ style comments using boost::regex

I am trying to remove C and C++ style comments from a string using a regular expression. I found a Perl one that appears to handle both:

s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;

But I am not sure how to use it with boost::regex, or what I would need to change to turn it into an expression that boost::regex accepts.

FYI: I found the regex in perlfaq6, and it seems to cover every case I need.

I would rather not use boost::spirit::qi for this, because it would add a lot of compile time to the project.

Edit:

std::string input = "hello /* world */ world"; 

boost::regex reg("(/\\*([^*]|(\\*+[^*/]))*\\*+/)|(//.*)"); 

input = boost::regex_replace(input, reg, "");

So the shorter regex does work, but the longer one does not.
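A minimal sketch of what the shorter pattern does and where it stops being safe. The alias `rx` and the function `strip_simple` are illustrative names that do not appear in the original post. `rx` points at `std` here so the block builds without Boost; the assumption is that pointing it at `boost` (and including `<boost/regex.hpp>`) leaves the calls unchanged.

```cpp
#include <regex>
#include <string>

// Hypothetical alias: assumed swappable for `namespace rx = boost;`
// together with #include <boost/regex.hpp>.
namespace rx = std;

// Strips comments using the short pattern from the question.
// The pattern has no notion of string literals, so a "//" inside a
// literal is treated as the start of a comment.
std::string strip_simple(const std::string& input) {
    static const rx::regex reg(R"re((/\*([^*]|(\*+[^*/]))*\*+/)|(//.*))re");
    return rx::regex_replace(input, reg, "");
}
```

With this sketch, `strip_simple("hello /* world */ world")` gives `hello  world` (both surrounding spaces survive), while `strip_simple("const char* url = \"http://example.com\";")` is cut off after `"http:`. String literals are the case the longer perlfaq6 pattern handles with its third capture group.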

Source

2012-02-26 nerozehl

Wouldn't this be easier to understand and maintain with two separate regexes? First get rid of /* ... */, then get rid of // ... eol. – jmucchiello 2012-02-26 02:13:09

@jmucchiello You can't; they interact with each other. – smparkes 2012-02-26 02:19:25
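The interaction can be sketched with two deliberately simple patterns. The patterns, the alias `rx` (set to `std` so the block builds without Boost) and all function names are illustrative assumptions, not taken from the thread. Each two-pass order mangles one of the sample inputs, while a single alternation scanned left to right handles both.

```cpp
#include <regex>
#include <string>

// Hypothetical alias: assumed swappable for `namespace rx = boost;`.
namespace rx = std;

// Illustrative patterns only; neither one understands string literals.
static const char* const kBlockPattern = R"(/\*[\s\S]*?\*/)";
static const char* const kLinePattern  = R"(//[^\n]*)";

// Pass 1 removes // comments, pass 2 removes /* */ comments.
std::string strip_line_then_block(const std::string& in) {
    std::string tmp = rx::regex_replace(in, rx::regex(kLinePattern), "");
    return rx::regex_replace(tmp, rx::regex(kBlockPattern), "");
}

// Pass 1 removes /* */ comments, pass 2 removes // comments.
std::string strip_block_then_line(const std::string& in) {
    std::string tmp = rx::regex_replace(in, rx::regex(kBlockPattern), "");
    return rx::regex_replace(tmp, rx::regex(kLinePattern), "");
}

// One pass: whichever comment opener appears first in the text wins.
std::string strip_single_pass(const std::string& in) {
    const rx::regex both(std::string(kBlockPattern) + "|" + kLinePattern);
    return rx::regex_replace(in, both, "");
}
```

For `"a /* b // c */ d\ne\n"`, stripping line comments first deletes the `*/` and leaves `"a /* b \ne\n"`. For `"x // see /* here\ny */ z\n"`, stripping block comments first swallows the second line and leaves `"x \n"`. The single-pass version returns `"a  d\ne\n"` and `"x \ny */ z\n"` respectively.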

The regex should work with Boost's regex almost exactly as it does in Perl. Just create a boost::regex object and apply it to a std::string with boost::regex_match() or boost::regex_search() to get a boost::smatch. What exactly are you struggling with? – 2012-02-26 02:25:16
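A minimal sketch of the search-and-inspect flow this comment describes, using the short pattern from the question. The function name `first_comment` and the alias `rx` are illustrative; `rx` is set to `std` so the block builds without Boost, on the assumption that `boost::regex`, `boost::smatch` and `boost::regex_search` can be substituted by changing the alias and the include.

```cpp
#include <regex>
#include <string>

// Hypothetical alias: assumed swappable for `namespace rx = boost;`
// together with #include <boost/regex.hpp>.
namespace rx = std;

// Returns the text of the first comment found in `input`,
// or an empty string when the pattern does not match anywhere.
std::string first_comment(const std::string& input) {
    static const rx::regex reg(R"re((/\*([^*]|(\*+[^*/]))*\*+/)|(//.*))re");
    rx::smatch m;
    if (rx::regex_search(input, m, reg)) {
        return m.str(0);  // sub-match 0 is the whole match
    }
    return std::string();
}
```

`regex_search` is used rather than `regex_match` because `regex_match` only succeeds when the pattern covers the entire string. For actually deleting the comments, `regex_replace` is the closer fit.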

If

\*

becomes

\\*

then why wouldn't

[^\\]

become

[^\\\\]
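The escaping rule above can be sidestepped with a C++11 raw string literal, where `R"(\*)"` is the same string as `"\\*"` and `R"([^\\])"` the same as `"[^\\\\]"`. Below is a hedged sketch of the perlfaq6 pattern written that way. The name `strip_comments_regex` and the alias `rx` are illustrative, and `rx` is set to `std` so the block builds without Boost; switching it to `boost` is assumed to work, not verified here. Each `.` from the Perl original is written as `[\s\S]` so the result does not depend on whether the engine lets `.` match a newline (the Perl version relies on the `s` flag for that). The Perl replacement `defined $3 ? $3 : ""` becomes the format string `"$3"`, since an unmatched group expands to nothing.

```cpp
#include <regex>
#include <string>

// Hypothetical alias: assumed swappable for `namespace rx = boost;`
// together with #include <boost/regex.hpp>.
namespace rx = std;

// Port of the perlfaq6 comment-stripping pattern (a sketch, not a
// verified drop-in). Comments match alternatives 1 or 2 and are dropped;
// string/char literals and ordinary code match group 3 and are kept.
std::string strip_comments_regex(const std::string& input) {
    static const rx::regex reg(
        // 1: /* ... */ block comment
        R"re(/\*[^*]*\*+([^/*][^*]*\*+)*/)re"
        // 2: // line comment, including its terminating newline
        R"re(|//([^\\]|[^\n][\n]?)*?\n)re"
        // 3: "string", 'char', or any other run of code (capture group 3)
        R"re(|("(\\[\s\S]|[^"\\])*"|'(\\[\s\S]|[^'\\])*'|[\s\S][^/"'\\]*))re");
    return rx::regex_replace(input, reg, "$3");
}
```

Note that the line-comment alternative consumes the trailing newline, so the line following a `//` comment is joined onto the current one. Comment markers inside string literals are left alone because the literal is matched as a whole by group 3.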

Source

2012-02-26 03:28:45 user1227804

It seems a bit odd to use a regex for this when Boost already has a C++ preprocessor library (Boost.Wave) that can be used to strip comments.

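#include <string>
#include <boost/wave.hpp>
#include <boost/wave/cpplexer/cpp_lex_token.hpp>
#include <boost/wave/cpplexer/cpp_lex_iterator.hpp>

// Lexes the input with Boost.Wave and copies every token that is not a comment.
std::string strip_comments(std::string const& input) {
    std::string output;
    typedef boost::wave::cpplexer::lex_token<> token_type;
    typedef boost::wave::cpplexer::lex_iterator<token_type> lexer_type;
    typedef token_type::position_type position_type;

    position_type pos;

    // Lex as C++ with long long support enabled.
    lexer_type it = lexer_type(input.begin(), input.end(), pos,
        boost::wave::language_support(
            boost::wave::support_cpp | boost::wave::support_option_long_long));
    lexer_type end = lexer_type();

    for (; it != end; ++it) {
        // Skip /* ... */ (T_CCOMMENT) and // ... (T_CPPCOMMENT) tokens.
        if (*it != boost::wave::T_CCOMMENT
            && *it != boost::wave::T_CPPCOMMENT) {
            output += std::string(it->get_value().begin(), it->get_value().end());
        }
    }
    return output;
}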

Source

2012-02-26 04:02:48 Mankarse

This looks like a good solution too. Thanks! – nerozehl 2012-02-26 14:14:53
