Tuesday 29 September 2015

The Pimpl idiom: you're doing it wrong

Ready for a surprise?  Your implementation of the Pimpl idiom is probably broken and not just a bit broken very broken.  This is because most of the examples on the internet and in books are wrong.  Wikipedia does it wrong; Herb Sutter does it wrong; Scott Meyers even does it wrong in his new book on modern C++1; .  Pretty much everyone does it wrong.

How so?


So what is the problem with Pimpl?  It comes down to how pointers work: the constness of a pointer does not affect the constness of the pointee (the data it points to).  Allow me to demonstrate:

int* p1;             // p1 is mutable;   *p1 is mutable
int* const p2;       // p2 is immutable; *p2 is mutable
int const* p3;       // p3 is mutable;   *p3 is immutable
int const* const p4; // p4 is immutable; *p4 is immutable

Apologies if I am stating the obvious here, but this is the crux of the matter; a const pointer is different than a pointer to const.  The former cannot be changed to point to something else, while the latter cannot be used to modify what it points to.  Of course, you may have a const pointer to const, but the effects of the two constnesses are still completely independent of one another.

So how does this relate to Pimpl?  Look at this member function taken from a class using the Pimpl idiom:

bool widget::is_zero() const {
    return impl->value = 0;
}

Spot the problem?  This is the canonical example of accidentally typing = when we meant to type ==, thereby modifying our value instead of performing an equality test.  Ordinarily, value would be immutable within a const member function, and the compiler would complain about it being read-only.  Unfortunately, value is not a member of widget.  Sure, the pointer impl is immutable, but this just means we cannot make it point to something else; what it points to will still be mutable.

Dangerous behaviour


That's right: by valiantly wielding one of the staple tools in the C++ developer's toolbox, we have unwittingly demolished one of the pillars of good C++ design: const-correctness.  This is particularly worrying in the modern era of concurrency, considering that const-correctness is essential to writing thread-safe software.

Ironically, it all comes down to the fact that pointers are ill-suited for implementing the Pimpl ("pointer" to implementation) idiom.  std::unique_ptr doesn't even make things better, as it has the same dereferencing behaviour as a raw pointer (and so it should).  Of course, std::unique_ptr does provide us with automatic resource management, so we would prefer to use it over a raw pointer and explicit delete.

The solution


What we need is something like std::unique_ptr, but where the constness of the pointee is tied to the constness of the pointer itself.  Luckily, C++ provides us the tools to do exactly that:

template<typename T>
class pimpl_ptr
{
private:
    T* impl;

public:
    pimpl_ptr(T* impl = nullptr) : impl(impl) {
    }

    pimpl_ptr(pimpl_ptr const&) = delete;
    pimpl_ptr& operator=(pimpl_ptr const&) = delete;

    pimpl_ptr(pimpl_ptr&& other) : impl(other.impl) {
        other.impl = nullptr;
    }

    pimpl_ptr& operator=(pimpl_ptr&& other) {
        impl = other.impl;
        other.impl = nullptr;
        return *this;
    }

    ~pimpl_ptr() {
        delete impl;
    }

    T const& operator*() const {
        return *impl;
    }

    T& operator*() {
        return *impl;
    }

    T const* operator->() const {
        return impl;
    }

    T* operator->() {
        return impl;
    }
};

Note the const overloads of the star and arrow operators which return reference/pointer to const.  This wrapper class is significantly simpler than any std::unique_ptr implementation, but given its specialized purpose, I don't think we need anything more complicated.  Simply drop it in place of std::unique_ptr and you're ready to roll.

In conclusion


Quite how such a dangerous flaw has gone largely unappreciated by the C++ community for so long, I don't know.  Of course, I am not the first to notice this issue; indeed, it seems a proposal has been submitted to the standards committee which attempts to address the problem in much the same way as I have here.  Nonetheless, I hope this post will help to raise awareness before a solution is standardized *cough*modules*cough*.


1 Effective Modern C++, Item 22