Combining thread results with OpenMP

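I am having some trouble combining the processing results of several threads, and I am not sure whether I am using OpenMP correctly. The code excerpt below shows the OpenMP part of my code.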

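Parameters:

Thread-private:

it: map iterator (timestamp, userkey)

ite: map iterator ((timestamp, userkey) / int count)

thread_result_map: typedef map<userkey (str), timestamp (str)>

when, who: regex match results (timestamp, userkey)

Shared between threads:

log: char array
size: log.size()
identifier, timestamp, userkey: boost::regex
combined_result_map: typedef map<thread_result_map, hits (int)>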

#pragma omp parallel shared(log, size, identifier, timestamp, userkey) private(it, ite, str_time, str_key, vec_str_result, i, id, str_current, when, who, thread_result_map) 
    { 
#pragma omp for 
    for (i = 0 ; i < size ; i++){ 
      str_current.push_back(log[i]); 
     if (log[i] == '\n') { 
      if (boost::regex_search(str_current, identifier)){ 
       boost::regex_search(str_current, when, timestamp); 
       str_time = when[0]; 
       boost::regex_search(str_current, who, userkey); 
       str_key = who[0]; 
       thread_result_map.insert(make_pair(str_time, str_key)); 
       } 
       str_current = ""; //reset temp string 
      } 
     } 
#pragma omp critical 
     { 
     for (it=thread_result_map.begin(); it!=thread_result_map.end(); it++) { 
       id = omp_get_thread_num(); 
      cout << thread_result_map[it->first] << 
          thread_result_map[it->second]; 
      cout << "tID_" << id << " reducing" << endl; 
      } 
     } 
    }

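As you can see, every thread gets its own partition of the char array and parses it line by line. If the current line is matched by "identifier", the timestamp and user key are added to the thread's private result map (string/string).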
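One detail the description above leaves open is what happens when a thread's partition starts or ends in the middle of a line. A minimal sketch of one way to sidestep that, assuming the log is available as a std::string: compute the line boundaries first, then let the parallel loop run over whole lines instead of single characters. The name line_ranges is hypothetical and not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns the [begin, end) offsets of every line in `log`; `end` excludes the '\n'.
// A final line without a trailing newline is included as well.
std::vector<std::pair<std::size_t, std::size_t> > line_ranges(const std::string& log) {
    std::vector<std::pair<std::size_t, std::size_t> > ranges;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < log.size(); ++i) {
        if (log[i] == '\n') {
            ranges.push_back(std::make_pair(begin, i));
            begin = i + 1;
        }
    }
    if (begin < log.size()) {
        ranges.push_back(std::make_pair(begin, log.size()));
    }
    return ranges;
}
```

A "#pragma omp for" over the indices of the returned vector would then hand each thread complete lines, at the cost of one serial pass over the buffer.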

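After the loop I now have several thread-private result maps. combined_result_map is a map within a map: the key is the key/value combination from a thread result, and the value is the number of occurrences of that combination.

I only parse part of the timestamp, so the counter is incremented whenever the same user key appears more than once within the same hour.

The result should look like this: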

TIME(MMM/DD/HH/);USERKEY;HITS 
May/25/13;SOMEKEY124345;3
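To make the intended counting concrete, here is a small sketch that produces rows in the format shown above. It assumes the timestamps have already been cut down to the hour (for example "May/25/13") and uses a flat map keyed by the (hour, userkey) pair; the names HourKey and hits_per_hour are made up for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> HourKey;  // (hour bucket, userkey)

// Counts identical (hour bucket, userkey) pairs and renders one
// "TIME;USERKEY;HITS" row per distinct pair, in the map's key order.
std::vector<std::string> hits_per_hour(const std::vector<HourKey>& parsed) {
    std::map<HourKey, int> hits;
    for (std::size_t i = 0; i < parsed.size(); ++i) {
        ++hits[parsed[i]];  // operator[] starts a missing entry at 0
    }
    std::vector<std::string> rows;
    for (std::map<HourKey, int>::const_iterator it = hits.begin(); it != hits.end(); ++it) {
        std::ostringstream row;
        row << it->first.first << ';' << it->first.second << ';' << it->second;
        rows.push_back(row.str());
    }
    return rows;
}
```

Feeding the pair ("May/25/13", "SOMEKEY124345") three times yields the single row from the example.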

Combining the hit counts was no problem: in the critical section (which I have removed here) I simply did combined += result.
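For a plain counter like this, OpenMP's reduction clause is an alternative to a hand-written critical section. The sketch below is not the code from the question: it counts newline characters purely as a stand-in for "hits", and the function name count_newlines is hypothetical.

```cpp
#include <string>

// Sums a per-thread counter without a critical section: each thread gets a
// private copy of `hits`, and OpenMP adds the copies together at the end.
// Without OpenMP enabled the pragma is ignored and the loop runs serially.
long count_newlines(const std::string& log) {
    long hits = 0;
    const long size = static_cast<long>(log.size());
    #pragma omp parallel for reduction(+:hits)
    for (long i = 0; i < size; ++i) {
        if (log[i] == '\n') {
            ++hits;
        }
    }
    return hits;
}
```

The built-in reduction operators only cover arithmetic and logical types, which is why a map of results needs a different approach.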

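But how can I combine my result maps in the same way? I know I have to iterate over the thread maps, but when I put a "cout" inside the loop to test it, each thread only executes it once.

A test run on my local syslog gives the following output when I set all regular expressions to "error" (to make sure every identified line has a userkey and a timestamp with the same name):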

Pattern for parsing Accessstring:
error
Pattern for parsing Timestamp:
error
Pattern for parsing Userkey:
error

*** Parsing File /var/log/syslog

errortID_0 reducing
errortID_1 reducing
errortID_2 reducing
errortID_3 reducing

*** Ok! ________________ hits : 418 worktime: 0.0253871s

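(The hit count comes from a thread-private counter that I removed from the code above.)

So each of my threads does a single cout and leaves the loop, even though all together there should be 418 hits. What am I doing wrong? How do I iterate over my results from inside my OpenMP region?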

2011-05-16 S3dr1ck

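Found the problem myself, sorry for asking a stupid question.

I was trying to add the same key multiple times, which is why the map size did not grow and each thread only looped once.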
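The effect is easy to reproduce in isolation. The following sketch (both helper names are hypothetical) inserts the same pair repeatedly, once into a std::map and once, for contrast, into a vector of pairs:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Inserts the same ("error", "error") pair `times` times into a std::map
// and reports how many entries the map ends up with.
std::size_t map_size_after_repeated_insert(int times) {
    std::map<std::string, std::string> result;
    for (int n = 0; n < times; ++n) {
        // insert() leaves the map unchanged when the key is already present.
        result.insert(std::make_pair(std::string("error"), std::string("error")));
    }
    return result.size();
}

// Same input stored in a vector of pairs: every occurrence is kept.
std::size_t vector_size_after_repeated_push(int times) {
    std::vector<std::pair<std::string, std::string> > result;
    for (int n = 0; n < times; ++n) {
        result.push_back(std::make_pair(std::string("error"), std::string("error")));
    }
    return result.size();
}
```

With every regex set to "error", each thread's map therefore held exactly one entry no matter how many lines matched.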

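Edit:

If anyone is interested in how to combine the thread results, this is how I did it. Maybe you will see something that could be improved.

I simply changed the thread-local result map to a vector of pairs (str, str).

Here is the complete working OpenMP code section. Maybe it is useful to someone: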

#pragma omp parallel shared(log, size, identifier, timestamp, userkey) private(it, ite, str_time, str_key, i, id, str_current, when, who, local_res) 
    { 
#pragma omp for 
     for (i = 0 ; i < size ; i++){ 

      str_current.push_back(log[i]); 

      if (log[i] == '\n') { // if char is newline character 
       if (boost::regex_search(str_current, identifier)){ // if current line is access string 
        boost::regex_search(str_current, when, timestamp); // get timestamp from string 
        str_time = when[0]; 
        boost::regex_search(str_current, who, userkey); // get userkey from string 
        str_key = who[0]; 
        local_res.push_back((make_pair(str_time, str_key))); // append key-value-pair(timestamp/userkey) 
        id = omp_get_thread_num(); 
        //cout << "tID_" << id << " - adding pair - my local result map size is now: " << local_res.size() << endl; 
       } 
       str_current = ""; 
      } 
     } 

#pragma omp critical 
     { 
      id = omp_get_thread_num(); 
      hits += local_res.size(); 
      cout << "tID_" << id << " had HITS: " << local_res.size() << endl; 
      for (i = 0; i < local_res.size(); i++) { 
       acc_key = local_res[i].second; 
       acc_time = local_res[i].first; 
       if(m_KeyDatesHits.count(acc_key) == 0) { // if there are no items for this key yet, make a new entry 
        m_KeyDatesHits.insert(make_pair(acc_key, str_int_MapType())); 
       } 
       if (m_KeyDatesHits[acc_key].count(acc_time) == 0) { // "acc_time" is a key value, if it doesn't exist yet, add it and set "1" as value 
        m_KeyDatesHits[acc_key].insert(make_pair(acc_time, 1)); 
        it = m_KeyDatesHits.begin(); // iterator for userkeys/maps 
        ite = m_KeyDatesHits[acc_key].begin(); // iterator for times/clicks 
       } else m_KeyDatesHits[acc_key][acc_time]++; // if userkey already exist and timestamp already exists, count hits +1 for it 

      } 
     } 
    }
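The merge step can be written more compactly, because std::map::operator[] creates a missing entry with a value-initialised (zero) count. The sketch below shows the same collect-locally, merge-in-critical pattern under simplifying assumptions: the (timestamp, userkey) pairs are taken as already extracted, so the regex step is replaced by a plain copy, and count_hits, TimeKey and KeyDatesHits are illustrative names rather than the original declarations.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> TimeKey;  // (timestamp, userkey)
// userkey -> timestamp -> hits
typedef std::map<std::string, std::map<std::string, int> > KeyDatesHits;

KeyDatesHits count_hits(const std::vector<TimeKey>& extracted) {
    KeyDatesHits combined;  // declared outside the region, so shared
    const long n = static_cast<long>(extracted.size());

    #pragma omp parallel
    {
        // Declared inside the region, so each thread has its own copy.
        std::vector<TimeKey> local_res;

        #pragma omp for nowait
        for (long i = 0; i < n; ++i) {
            local_res.push_back(extracted[i]);  // stand-in for the regex parsing
        }

        // One thread at a time folds its local results into the shared map.
        #pragma omp critical
        {
            for (std::size_t j = 0; j < local_res.size(); ++j) {
                // No count()/insert() checks needed: missing entries start at 0.
                ++combined[local_res[j].second][local_res[j].first];
            }
        }
    }
    return combined;
}
```

Declaring the per-thread variables inside the parallel block also removes the need for the long private(...) list. Compiled without OpenMP, the pragmas are ignored and the function still returns the same counts.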

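I did some tests and it really runs fast.

Using 4 threads, it searches a 150 MB log file for access events, parses each event's custom user key and date, and combines the results in about 4 seconds.

At the end it creates an export list. This is the program output: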

HELLO, welcome to LogMap 0.1!

C++/OpenMP memory-mapped parsing engine
__________________
available processors = 4
number of threads = 4

Pattern for parsing Accessstring:
GET /_openbooknow/key/
Pattern for parsing Timestamp:
\d{2}/\w{3}/\d{4}
Pattern for parsing Userkey:
[a-zA-Z0-9]{20,32}

*** Parsing File /home/c0d31n/Desktop/access_log-test.txt

HITS: 169147
HITS: 169146
HITS: 169146
HITS: 169147

*** Ok! ________________ hits : 676586 worktime: 4.03816s

*** New export file created: "./test.csv"

root@c0d3b0x:~/workspace/OpenBookMap/Release# cat test.csv
"1nDh0gV6eE3MzK0517aE6VIU0";"28/Mar/2011";"18813"
"215VIU1wBN2O2Fmd63MVmv6QTZy";"28/Mar/2011";"6272"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"18/Mar/2011";"18816"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"21/Mar/2011";"12544"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"22/Mar/2011";"12544"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"23/Mar/2011";"18816"
"9E1608JFGk2GZQ4ppe1Grtv2";"28/Mar/2011";"12544"
"pachCsiog05bpK0kDA3K2lhEY";"17/Mar/2011";"18029"
"pachCsiog05bpK0kDA3K2lhEY";"18/Mar/2011";"12544"
"pachCsiog05bpK0kDA3K2lhEY";"21/Mar/2011";"18816"
"pachCsiog05bpK0kDA3K2lhEY";"22/Mar/2011";"6272"
"pachCsiog05bpK0kDA3K2lhEY";"23/Mar/2011";"18816"
"pachCsiog05bpK0kDA3K2lhEY";"28/Mar/2011";"501760"
"1nDh0gV6eE3MzK0517aE6VIU0";"28/Mar/2011";"18813"
"215VIU1wBN2O2Fmd63MVmv6QTZy";"28/Mar/2011";"6272"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"18/Mar/2011";"18816"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"21/Mar/2011";"12544"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"22/Mar/2011";"12544"
"36Pu0A2Wly3uYeIPZ4YPAuBy";"23/Mar/2011";"18816"
"9E1608JFGk2GZQ4ppe1Grtv2";"28/Mar/2011";"12544"
"pachCsiog05bpK0kDA3K2lhEY";"17/Mar/2011";"18029"
"pachCsiog05bpK0kDA3K2lhEY";"18/Mar/2011";"12544"
"pachCsiog05bpK0kDA3K2lhEY";"21/Mar/2011";"18816"
"pachCsiog05bpK0kDA3K2lhEY";"22/Mar/2011";"6272"
"pachCsiog05bpK0kDA3K2lhEY";"23/Mar/2011";"18816"
"pachCsiog05bpK0kDA3K2lhEY";"28/Mar/2011";"501760"

2011-05-17 13:54:00 S3dr1ck
