2012-01-13 53 views

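I have a few questions about boost::regex. I tried the example below, and the regular expression does not return any useful results.

1) What is the fourth parameter of sregex_token_iterator? It sounds like a "default match", but why would you want that instead of simply returning nothing? I tried omitting the fourth parameter, but then it does not compile.

2) I get the output: (1,0) (0,0) (3,0) (0,0) (5,0)

Can anyone explain what is going wrong?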

#include <iostream> 
#include <sstream> 
#include <vector> 
#include <boost/regex.hpp> 

// This example extracts X and Y from (X , Y), (X,Y), (X, Y), etc. 


struct Point 
{ 
    int X; 
    int Y; 
    Point(int x, int y): X(x), Y(y){} 
}; 

typedef std::vector<Point> Polygon; 

int main() 
{ 
    Polygon poly; 
    std::string s = "Polygon: (1.1,2.2), (3, 4), (5,6)"; 

    std::string floatRegEx = "[0-9]*\\.?[0-9]*"; // zero or more numerical characters as you want, then an optional '.', then zero or more numerical characters. 
    // The \\. is for \. because the first \ is the C++ escape character and the second \ is the regex escape character 
    //const boost::regex r("(\\d+),(\\d+)"); 
    const boost::regex r("(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)"); 
    // \s is white space. We want this to allow (2,3) as well as (2, 3) or (2 , 3) etc. 

    const boost::sregex_token_iterator end; 
    std::vector<int> v; // This type has nothing to do with the type of objects you will be extracting 
    v.push_back(1); 
    v.push_back(2); 

    for (boost::sregex_token_iterator i(s.begin(), s.end(), r, v); i != end;) 
    { 
        std::stringstream ssX; 
        ssX << (*i).str(); 
        float x; 
        ssX >> x; 
        ++i; 

        std::stringstream ssY; 
        ssY << (*i).str(); 
        float y; 
        ssY >> y; 
        ++i; 

        poly.push_back(Point(x, y)); 
    } 

    for(size_t i = 0; i < poly.size(); ++i) 
    { 
        std::cout << "(" << poly[i].X << ", " << poly[i].Y << ")" << std::endl; 
    } 
    std::cout << std::endl; 

    return 0; 
} 

Have you tried libpcre? – 2012-01-13 17:37:53


I don't want to introduce more dependencies. – 2012-01-13 20:50:07

Answer


Every part of your number pattern is optional:

"[0-9]*\\.?[0-9]*" 

It also matches the empty string, so "(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)" also matches a lone comma.
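A quick way to see this is to test the loose pattern against a few strings. The sketch below assumes std::regex (C++11) in place of boost::regex, on the assumption that the two behave the same for this pattern; the names looseFloat and matchesLoosePair are made up for illustration.

```cpp
#include <regex>
#include <string>

// The fully optional number pattern from the question.
const std::string looseFloat = "[0-9]*\\.?[0-9]*";

// True if the whole of `text` matches "number , number" built from the
// loose pattern (the same shape as the question's regex, without the
// outer capture group).
bool matchesLoosePair(const std::string& text)
{
    const std::regex pairRx("\\s*" + looseFloat + "\\s*,\\s*" + looseFloat + "\\s*");
    return std::regex_match(text, pairRx);
}

// True if the loose number pattern accepts the whole of `text`.
bool matchesLooseFloat(const std::string& text)
{
    return std::regex_match(text, std::regex(looseFloat));
}
```

Calling matchesLoosePair(", ") returns true, which is why the separator between two points in the input is reported as a match of its own.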

You should make at least one part mandatory:

"(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)" 

This allows 1, 1.1, 1. and .1, but not a lone dot (.).

(?:   # Either match... 
[0-9]+  # one or more digits, then 
(?:   # try to match... 
    \.   # a dot 
    [0-9]*  # and optional digits 
)?   # optionally. 
|   # Or match... 
\.[0-9]+ # a dot and one or more digits. 
)   # End of alternation 
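To check which inputs the stricter pattern accepts, something like the following can be used. It is only a sketch: it uses std::regex (C++11) rather than boost::regex, assuming the ECMAScript-style syntax of this pattern is handled the same way, and isNumber is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// True if the whole of `text` is a number according to the stricter
// pattern: digits with an optional fractional part, or a dot followed
// by at least one digit.
bool isNumber(const std::string& text)
{
    static const std::regex numberRx("(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)");
    return std::regex_match(text, numberRx);
}
```

With this helper, isNumber("1."), isNumber(".1") and isNumber("1.1") are all true, while isNumber(".") and isNumber("") are false.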

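Tim, I thought the ? only made the dot optional? I do want to allow 1. as well as .1, which is why I used [0-9]* on both sides of the decimal point. How do I make only the . optional? Also, I updated the question with a more appropriate parsing problem. However, I get 5 outputs instead of the 3 I expected? – 2012-01-13 17:25:07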


OK, I edited my regex and will add an explanation later. The '*' makes the digits optional too. – 2012-01-13 17:29:05


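Thanks, Tim. That is certainly better than making the whole thing optional. But for the hard-coded input, shouldn't my non-robust expression still work? I suspect I am now doing something wrong on the C++ side, given the output I show in #2 of my original post? – 2012-01-13 17:33:38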
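On the two open points, as far as I can tell: the fourth argument of sregex_token_iterator lists which sub-expressions to report for each match (0 is the whole match, 1 the first capture group, and so on). In the question's pattern the outer parentheses are not escaped, so they form the only capture group instead of matching the literal "(" and ")". Group 2 therefore does not exist and appears to come back empty, which would explain every Y being 0, and the loose number pattern also matches the ", " between points, which would explain five results instead of three. A possible corrected version is sketched below. It assumes std::regex (C++11) instead of boost::regex, and parsePoints and Points are illustrative names, not part of either library.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<double, double> > Points;

// Extracts every "(X, Y)" pair found in `s`.
Points parsePoints(const std::string& s)
{
    // At least one digit is required, so a bare "," or "." cannot match.
    const std::string num = "(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)";
    // \\( and \\) match the literal parentheses; the unescaped pairs
    // around each number are capture groups 1 (X) and 2 (Y).
    const std::regex r("\\(\\s*(" + num + ")\\s*,\\s*(" + num + ")\\s*\\)");

    // Sub-expressions to report for each match, in this order.
    std::vector<int> groups;
    groups.push_back(1);
    groups.push_back(2);

    Points pts;
    const std::sregex_token_iterator end;
    std::sregex_token_iterator i(s.begin(), s.end(), r, groups);
    while (i != end)
    {
        // Each match yields exactly two tokens: X, then Y.
        const double x = std::stod(i->str());
        ++i;
        const double y = std::stod(i->str());
        ++i;
        pts.push_back(std::make_pair(x, y));
    }
    return pts;
}
```

Called with the string from the question, "Polygon: (1.1,2.2), (3, 4), (5,6)", this returns three points. Note that the question's Point struct stores int members, so the fractional parts would still be truncated unless X and Y are changed to a floating-point type.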
