Streaming several files consecutively in C++

My question is similar to this one, but I have not found any C++ references for this problem.

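There is a list of large files to be read and processed. What is the best way to create an input stream that takes data from the files one by one, automatically opening the next file when the previous one ends? This stream will be handed to a processing function that sequentially reads variable-size blocks, which may span file boundaries.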

2016-07-29 xivaxy

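Well, the "Unixy" way would be to write your program as a filter (i.e. read from stdin and write to stdout), then use existing building blocks, as in 'cat input_file*.dat | myprogram'. But without more details (e.g. are the files all in one directory with names that can be globbed, or are they spread across different places, or does the order need to be different), it's hard to say much more than that... – twalberg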

You could create a new class derived from std::istream that holds a std::vector of std::ifstream and automatically switches to the next one on EOF or on a read failure – KABoissonneault

Collect them into one buffer file and read from that afterwards? So a two-part operation – Charlie

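What you want to do is provide a type that inherits from std::basic_streambuf. It has a number of cryptic virtual member functions, the relevant ones here being showmanyc(), underflow(), uflow() and xsgetn(). You will want to override them so that, on underflow, the next file in the list (if any) is opened automatically.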

Here is a sample implementation. We act as a std::filebuf and just keep a deque<string> of the next files we need to read:

#include <deque>
#include <fstream>
#include <initializer_list>
#include <string>

class multifilebuf : public std::filebuf
{
public:
    multifilebuf(std::initializer_list<std::string> filenames)
    : next_filenames(filenames)
    {
        // open the first file, if any; the rest stay queued
        if (!next_filenames.empty()) {
            open(next_filenames.front(), std::ios::in);
            next_filenames.pop_front();
        }
    }

protected:
    int_type underflow() override
    {
        for (;;) {
            auto res = std::filebuf::underflow();
            if (res != traits_type::eof()) {
                return res;
            }
            // done with this file, move onto the next one
            if (next_filenames.empty()) {
                // super done
                return res;
            }
            // onto the next file
            close();
            open(next_filenames.front(), std::ios::in);
            next_filenames.pop_front();
        }
    }

private:
    std::deque<std::string> next_filenames;
};

This way, everything is transparent to the end user:

multifilebuf mfb{"file1", "file2", "file3"};

std::istream is(&mfb);
std::string word;
while (is >> word) {
    // transparently read words from all the files
}
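The question also asks for variable-size block reads that cross file boundaries. Bulk reads go through xsgetn(), and a std::filebuf implementation may override xsgetn() in a way that does not always pass through underflow(), so a read() that straddles two files might stop short with a class like the one above. The following is a minimal sketch of one way around that, assuming a variant that derives from std::streambuf directly and owns its buffer, so that the default xsgetn() refills through underflow(). The name `chained_filebuf`, its members and the `buffer_size` parameter are hypothetical and not part of the answer.

```cpp
#include <cstddef>
#include <deque>
#include <fstream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant: a read-only stream buffer that serves the
// contents of several files back to back.
class chained_filebuf : public std::streambuf
{
public:
    explicit chained_filebuf(std::deque<std::string> filenames,
                             std::size_t buffer_size = 4096)
        : pending_(std::move(filenames)),
          buffer_(buffer_size ? buffer_size : 1)
    {}

protected:
    int_type underflow() override
    {
        if (gptr() < egptr()) {
            return traits_type::to_int_type(*gptr());
        }
        for (;;) {
            if (current_.is_open()) {
                // refill the get area from the current file
                current_.read(buffer_.data(),
                              static_cast<std::streamsize>(buffer_.size()));
                const std::streamsize got = current_.gcount();
                if (got > 0) {
                    setg(buffer_.data(), buffer_.data(), buffer_.data() + got);
                    return traits_type::to_int_type(*gptr());
                }
                current_.close();  // current file is exhausted
            }
            if (pending_.empty()) {
                return traits_type::eof();  // no files left
            }
            // move on to the next file; a file that fails to open is skipped
            current_.clear();
            current_.open(pending_.front(), std::ios::in | std::ios::binary);
            pending_.pop_front();
        }
    }

private:
    std::deque<std::string> pending_;  // files not yet opened
    std::vector<char> buffer_;         // backing storage for the get area
    std::ifstream current_;            // file currently being read
};
```

Wrapped in a `std::istream`, a single `read()` call on this buffer can return bytes from the end of one file followed by bytes from the start of the next. The trade-off is an extra copy through `buffer_` compared with inheriting from std::filebuf.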


2016-07-29 18:04:11 Barry

These are the kinds of things that will feature in the next questions I ask people who claim to know everything about C++. Nice find! – KABoissonneault

@KABoissonneault I even went ahead and figured out how to make a working example. I guess it isn't that bad after all, you only need underflow(). – Barry

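For a simple solution, use Boost's join on ranges of istream iterators over the files. I am not aware of a similar function in the current C++ standard library, but one probably exists in the Ranges TS or range-v3.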

You can also write it yourself: writing join by hand is entirely feasible.

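I would write it as a "flattening" input-only iterator: an iterator over a range of ranges that visits the contents of each range in turn. The iterator keeps track of the remaining ranges and of an iterator to the current element.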

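Here is a very simple zip iterator to give you an idea of the amount of code you would have to write (a zip iterator is a different concept, and that one is a simple version suitable only for for(:) loops).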

Here is a sketch of how to do it using C++14:

#include <deque>
#include <utility>

// a pair of iterators usable in for(:) loops
template<class It>
struct range_t {
    It b{};
    It e{};
    It begin() const { return b; }
    It end() const { return e; }
    bool empty() const { return begin()==end(); }
};

template<class It>
struct range_of_range_t {
    std::deque<range_t<It>> ranges; // current range at the front, then the remaining ones
    It cur;                         // current element within ranges.front()
    friend bool operator==(range_of_range_t const& lhs, range_of_range_t const& rhs) {
        return lhs.cur==rhs.cur;
    }
    friend bool operator!=(range_of_range_t const& lhs, range_of_range_t const& rhs) {
        return !(lhs==rhs);
    }
    void operator++(){
        ++cur;
        if (ranges.front().end() == cur) {
            next_range();
        }
    }
    // drop the current range and move to the next non-empty one, if any
    void next_range() {
        while(ranges.size() > 1) {
            ranges.pop_front();
            if (ranges.front().empty()) continue;
            cur = ranges.front().begin();
            break;
        }
    }
    decltype(auto) operator*() const {
        return *cur;
    }
    range_of_range_t(std::deque<range_t<It>> in):
        ranges(std::move(in)),
        cur{}
    {
        // easy way to find the starting cur:
        ranges.push_front({});
        next_range();
    }
};

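The iterator needs work, in that it should support all of the iterator axioms. Getting the end iterator right also takes a bit of work.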

This is not a stream, though, but an iterator.
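If laziness is not required, the idea of joining istream iterator ranges can be approximated by an eager loop using only the standard library. This is a minimal sketch and a deliberately simpler technique than the flattening iterator above: it copies every value out instead of iterating on demand. The function name `copy_values_from_files` is hypothetical.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated values of type T from
// each file in order and copies them to `out`.
template <class T, class OutputIt>
OutputIt copy_values_from_files(const std::vector<std::string>& filenames,
                                OutputIt out)
{
    for (const auto& name : filenames) {
        std::ifstream file(name);
        // A file that fails to open yields an empty range and is skipped.
        out = std::copy(std::istream_iterator<T>(file),
                        std::istream_iterator<T>(),
                        out);
    }
    return out;
}
```

Called as `copy_values_from_files<std::string>(names, std::back_inserter(words))`, it appends the words of all files to `words` in file order. Only one file is open at a time, but the results are materialized in memory, which may not suit very large inputs.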


2016-07-29 17:58:21 Yakk
