Be careful with directory_iterator

Author:Wojciech Muła
Added on:2018-04-28

C++17 finally introduced filesystem library, which is pretty nice.

This text shows a caveat my colleague bumped into recently. He wanted to perform a set of different operations on files from a directory; it was something like:

#include <filesystem>

using fs = std::filesystem;

class Foo {
public:
    void perfrom(const fs::path& dir) {
        perform_impl(fs::directory_iterator{dir}, fs::directory_iterator{});
    }

private:
    void perform_impl(fs::directory_iterator first, fs::directory_iterator end) {

        for (fs::directory_iterator it = first; it != end; ++it)
            do_foo(*it);

        for (fs::directory_iterator it = first; it != end; ++it)
            do_bar(*it);
    }

    void do_foo(const fs::directory_entry& de);
    void do_bar(const fs::directory_entry& de);
};

In method perform_impl we iterate twice over the range defined by two directory_iterators. Well, we suppose so. Although iterator first is copied, the copy operation is peculiar. It doesn't copy the state of iterator, we get merely a new "handle" to an existing, single state. Standard libraries from GCC and Clang keep a std::shared_ptr, which holds an instance of internal class responsible for iterating.

What it means? When the first loop executes, then first == end. Thus, the second loop never runs.

In my opinion this behaviour is counter-intuitive. If the copy operator doesn't really make a copy, it should be disabled in API (it can be done with = delete put next to the operator declaration). People will be forced to pass the iterator by reference and, thanks to that, will be aware of the iterator traits.

A funny side-effect of the current language feature is that even iterators passed by const reference change their visible state.