Edit: This question dates from before C++17. These days std::launder or equivalent should be added to the line noise. I don't have time to update the code to match right now.
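As a rough illustration of what the note above means, here is a minimal sketch of an accessor that passes the cast through std::launder, assuming a C++17 compiler. The names `impl_sketch`, `holder` and `launder_demo` are hypothetical and are not part of the code in this question; the point is only where the std::launder call would sit.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new, std::launder (C++17)
#include <string>

// Hypothetical stand-in for a hidden implementation type.
struct impl_sketch
{
  std::string name;
};

class holder
{
 public:
  explicit holder(const char* n)
  {
    // Create the implementation object inside the raw byte buffer.
    ::new (static_cast<void*>(state)) impl_sketch{n};
  }
  ~holder() { get().~impl_sketch(); }
  // Copy and move are omitted to keep the sketch short.
  holder(const holder&) = delete;
  holder& operator=(const holder&) = delete;

  std::size_t length() const { return get().name.size(); }

 private:
  // std::launder yields a pointer to the object that lives in the buffer,
  // rather than a pointer that merely holds the buffer's address.
  impl_sketch& get()
  {
    return *std::launder(reinterpret_cast<impl_sketch*>(state));
  }
  const impl_sketch& get() const
  {
    return *std::launder(reinterpret_cast<const impl_sketch*>(state));
  }

  alignas(impl_sketch) unsigned char state[sizeof(impl_sketch)];
};

// Usage: the string constructed in the buffer is read back via the accessor.
std::size_t launder_demo()
{
  holder h{"abc"};
  assert(h.length() == 3);
  return h.length();
}
```

In the question's code the same change would apply to every reinterpret_cast of `state`, which is one reason to wrap the cast in a single helper.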
I am aspiring to separate interface from implementation. This is primarily to protect code using a library from changes in the implementation of said library, though reduced compilation times are certainly welcome.
The standard solution to this is the pointer to implementation idiom, most likely to be implemented by using a unique_ptr and carefully defining the class destructor out of line, with the implementation.
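For reference, a minimal sketch of that heap-allocating baseline, assuming C++14 for std::make_unique. The class name `widget` and the header/source split shown in the comments are hypothetical; the detail worth noticing is that the destructor and move operations are declared in the header but defined only where the implementation type is complete.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// --- would live in the header (hypothetical widget.hpp) ---
class widget
{
 public:
  explicit widget(int x);
  ~widget();  // declared here, defined where impl is a complete type
  widget(widget&&) noexcept;
  widget& operator=(widget&&) noexcept;
  int value() const;

 private:
  struct impl;              // incomplete in the header
  std::unique_ptr<impl> p;  // the one heap allocation per object
};

// --- would live in the source file (hypothetical widget.cpp) ---
struct widget::impl
{
  std::vector<int> data;
};

widget::widget(int x) : p(std::make_unique<impl>()) { p->data.push_back(3 * x); }
widget::~widget() = default;
widget::widget(widget&&) noexcept = default;
widget& widget::operator=(widget&&) noexcept = default;
int widget::value() const { return p->data.back(); }

// Usage: moving transfers ownership of the implementation object.
int widget_demo()
{
  widget a{2};
  widget b{std::move(a)};
  assert(b.value() == 6);
  return b.value();
}
```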
Inevitably this raises concerns about heap allocation. I am familiar with "make it work, then make it fast", "profile then optimise" and such wisdom. There are also articles online, e.g. GotW, which declare the obvious workaround to be brittle and non-portable. I have a library which currently contains no heap allocations whatsoever - and I'd like to keep it that way - so let's have some code anyway.
#ifndef PIMPL_HPP
#define PIMPL_HPP
#include <cstddef>
namespace detail
{
// Keeping these up to date is unfortunate
// More hassle when supporting various platforms
// with different ideas about these values.
const std::size_t capacity = 24;
const std::size_t alignment = 8;
}
class example final
{
 public:
  // Constructors
  example();
  example(int);
  // Some methods
  void first_method(int);
  int second_method();
  // Set of standard operations
  ~example();
  example(const example &);
  example &operator=(const example &);
  example(example &&);
  example &operator=(example &&);
  // No public state available (it's all in the implementation)
 private:
  // No private functions (they're also in the implementation)
  unsigned char state alignas(detail::alignment)[detail::capacity];
};
#endif
This doesn't look too bad to me. Alignment and size can be statically asserted in the implementation. I can choose between overestimating both (inefficient) or recompiling everything if they change (tedious) - but neither option is terrible.
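A minimal sketch of the "overestimate both" option, where the check becomes "fits" rather than "matches exactly". The namespace `sketch`, the helper `fits_in_buffer` and the value 64 are assumptions made for illustration, not measurements from any platform.

```cpp
#include <cstddef>
#include <vector>

namespace sketch
{
// True when a buffer of Capacity bytes aligned to Alignment can hold an Impl.
// Alignments are powers of two, so divisibility means "at least as strict".
template <class Impl, std::size_t Capacity, std::size_t Alignment>
constexpr bool fits_in_buffer()
{
  return sizeof(Impl) <= Capacity && Alignment % alignof(Impl) == 0;
}

// Deliberately generous values a header might publish (assumed, not measured).
constexpr std::size_t capacity = 64;
constexpr std::size_t alignment = alignof(std::max_align_t);

// Stand-in for the hidden implementation type.
struct impl
{
  std::vector<int> data;
};

// Fails to compile only when the implementation outgrows the buffer.
static_assert(fits_in_buffer<impl, capacity, alignment>(),
              "buffer is too small or too weakly aligned for impl");
}
```

With this form the header only needs updating when the implementation outgrows the buffer, at the cost of some wasted bytes per object.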
I'm not certain this sort of hackery will work in the presence of inheritance, but as I don't much like inheritance in interfaces I don't mind too much.
If we boldly assume that I've written the implementation correctly (I'll append it to this post, but it's an untested proof of concept at this point so that's not a given), and both size and alignment are greater than or equal to that of the implementation, then does the code exhibit implementation defined, or undefined, behaviour?
#include "pimpl.hpp"
#include <cassert>
#include <new>     // placement new
#include <utility> // std::move
#include <vector>
// Usually a class that has behaviour we care about
// In this example, it's arbitrary
class example_impl
{
 public:
  example_impl(int x = 0) { insert(x); }
  void insert(int x) { local_state.push_back(3 * x); }
  int retrieve() { return local_state.back(); }
 private:
  // Potentially exotic local state
  // For example, maybe we don't want std::vector in the header
  std::vector<int> local_state;
};
static_assert(sizeof(example_impl) == detail::capacity,
              "example capacity has diverged");
static_assert(alignof(example_impl) == detail::alignment,
              "example alignment has diverged");
// Forwarding methods - free to vary the names relative to the api
void example::first_method(int x)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(this->state)));
  impl.insert(x);
}
int example::second_method()
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(this->state)));
  return impl.retrieve();
}
// A whole lot of boilerplate forwarding the standard operations
// This is (believe it or not...) written for clarity, so none call each other
example::example() { new (&state) example_impl{}; }
example::example(int x) { new (&state) example_impl{x}; }
example::~example()
{
  (reinterpret_cast<example_impl*>(&state))->~example_impl();
}
example::example(const example& other)
{
  const example_impl& impl =
      *(reinterpret_cast<const example_impl*>(&(other.state)));
  new (&state) example_impl(impl);
}
example& example::operator=(const example& other)
{
  const example_impl& impl =
      *(reinterpret_cast<const example_impl*>(&(other.state)));
  if (&other != this)
    {
      (reinterpret_cast<example_impl*>(&(this->state)))->~example_impl();
      new (&state) example_impl(impl);
    }
  return *this;
}
example::example(example&& other)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(other.state)));
  new (&state) example_impl(std::move(impl));
}
example& example::operator=(example&& other)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(other.state)));
  assert(this != &other); // could be persuaded to use an if() here
  (reinterpret_cast<example_impl*>(&(this->state)))->~example_impl();
  new (&state) example_impl(std::move(impl));
  return *this;
}
#if 0 // Clearer assignment functions due to MikeMB
example &example::operator=(const example &other) 
{
  *(reinterpret_cast<example_impl *>(&(this->state))) =
      *(reinterpret_cast<const example_impl *>(&(other.state)));
  return *this;
}   
example &example::operator=(example &&other) 
{
  *(reinterpret_cast<example_impl *>(&(this->state))) =
          std::move(*(reinterpret_cast<example_impl *>(&(other.state))));
  return *this;
}
#endif
int main()
{
  example an_example;
  example another_example{3};
  example copied(an_example);
  example moved(std::move(another_example));
  return 0;
}
I know that's pretty horrible. I don't mind using code generators though, so it's not something I'll have to type out repeatedly.
To state the crux of this over-long question explicitly, are the following conditions sufficient to avoid UB|IDB?
If they are, I'll write enough tests for Valgrind to flush out the several bugs in the demo. Thank you to any who get this far!
Yes, this is perfectly safe and portable code.
However, there is no need to use placement new and explicit destruction in your assignment operators. Simply using the assignment operator of example_impl is exception safe and more efficient, and I'd argue it's also much cleaner:
//wrapping the casts
const example_impl& castToImpl(const unsigned char* mem) { return *reinterpret_cast<const example_impl*>(mem);  }
      example_impl& castToImpl(      unsigned char* mem) { return *reinterpret_cast<      example_impl*>(mem);  }
example& example::operator=(const example& other)
{
    castToImpl(this->state) = castToImpl(other.state);
    return *this;
}
example& example::operator=(example&& other)
{
    castToImpl(this->state) = std::move(castToImpl(other.state));
    return *this;
}
Personally, I would also use std::aligned_storage instead of a manually aligned char array, but I guess that's a matter of taste.
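A minimal sketch of what that alternative could look like, reusing the question's size and alignment values. The namespace `storage_sketch` and the function `round_trip` are hypothetical. Note that std::aligned_storage is deprecated as of C++23, so this is aimed at earlier standards.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>

namespace storage_sketch
{
constexpr std::size_t capacity = 24;
constexpr std::size_t alignment = 8;

// Uninitialized storage of at least `capacity` bytes, aligned to `alignment`.
using buffer = std::aligned_storage<capacity, alignment>::type;

static_assert(sizeof(buffer) >= capacity, "storage too small");
static_assert(alignof(buffer) % alignment == 0, "storage under-aligned");

// Construct a value in the buffer and read it back through the pointer
// that placement new returns.
inline long round_trip(long v)
{
  buffer b;
  long* p = ::new (static_cast<void*>(&b)) long(v);
  // long is trivially destructible, so no explicit destructor call is needed.
  return *p;
}
}
```

In the header from the question, the member would then be declared with this storage type in place of the alignas char array; the placement new and cast logic stays the same.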