Let’s start with a familiar example of runtime polymorphism. Suppose we have an abstract base class (an explicit interface) describing an arbitrary shape and a way to calculate its area:

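struct Shape
{
    // Virtual destructor so that derived shapes can be destroyed safely through a Shape pointer
    virtual ~Shape() = default;

    [[nodiscard]] virtual double area() const = 0;
};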

Side note: the [[nodiscard]] attribute tells the compiler to issue a warning if the return value of area() is discarded.

A Rectangle and a Square are two concrete shapes that implement the Shape interface:

struct Rectangle : public Shape
{
    Rectangle(int x, int y) : x_(x), y_(y)
    {
    }

    [[nodiscard]] double area() const override
    {
        return x_ * y_;
    }

   protected:
    int x_;
    int y_;
};
struct Square : public Shape
{
    Square(int x) : x_(x)
    {
    }

    [[nodiscard]] double area() const override
    {
        return x_ * x_;
    }

   protected:
    int x_;
};

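Side note: we use the override keyword so that the compiler rejects a function that does not actually override a base-class virtual function (for example, because of a typo or a missing const) instead of silently declaring a new, unrelated function.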
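To make the side note concrete, here is a minimal sketch of the mistake that override protects against. The names Base, Good, Silent and area_of are made up for illustration and are not part of the Shape example.

```cpp
// Hypothetical types for illustration only.
struct Base
{
    virtual ~Base() = default;
    [[nodiscard]] virtual double area() const { return 0.0; }
};

struct Good : public Base
{
    // The signature matches Base::area() exactly, so this is a real override.
    [[nodiscard]] double area() const override { return 1.0; }
};

struct Silent : public Base
{
    // Missing const: this declares a new function that hides Base::area()
    // instead of overriding it. Some compilers can warn about this
    // (e.g. GCC's -Woverloaded-virtual); adding override turns it into an error.
    [[nodiscard]] double area() { return 2.0; }
};

// Calls area() through the base-class interface.
[[nodiscard]] inline double area_of(const Base& b)
{
    return b.area();
}
```

Calling area_of() on a Silent object runs Base::area(), not the function the author presumably meant to override. With override on Silent::area(), the code would not compile at all.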

Suppose now that we want a function that returns true if the area of an arbitrary shape (be it a rectangle or a square) is greater than 10, and false otherwise. Using polymorphism, we can write a single function bool is_area_greater_than_10(const Shape& s) that works for any shape implementing the Shape interface:

bool is_area_greater_than_10(const Shape& s)
{
    return s.area() > 10;
}

Rectangle r{1,2};
Square s{4};
is_area_greater_than_10(r); // false
is_area_greater_than_10(s); // true

Side note: throughout this post, we pass shapes by reference to avoid the slicing problem. We could also pass pointers, but we prefer references for brevity.
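The slicing problem is easiest to see with a base class that is not abstract (with the abstract Shape above, a by-value parameter would not even compile). The following sketch uses the hypothetical names BasicShape, Box, area_by_value and area_by_reference:

```cpp
// Hypothetical, non-abstract base so that pass-by-value compiles at all.
struct BasicShape
{
    virtual ~BasicShape() = default;
    [[nodiscard]] virtual double area() const { return 0.0; }
};

struct Box : public BasicShape
{
    Box(int x, int y) : x_(x), y_(y) {}
    [[nodiscard]] double area() const override { return x_ * y_; }

   private:
    int x_;
    int y_;
};

// Takes a copy: only the BasicShape part of the argument is copied (slicing),
// so the call below always reaches BasicShape::area().
inline double area_by_value(BasicShape s) { return s.area(); }

// Takes a reference: the dynamic type is preserved, so Box::area() is called.
inline double area_by_reference(const BasicShape& s) { return s.area(); }
```

For a Box{2, 3}, the by-value version reports the base-class area, while the by-reference version reports the area of the box.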

So far so good. I’m sure most readers have seen and read about this many times.

Why is polymorphism useful in practice?

We successfully decoupled the “what” from the “how”: the function is_area_greater_than_10() only needs to know what the shape does, not how it does it. It doesn’t care whether s is a Rectangle or a Square as long as there is a way to calculate the area.

Polymorphism underpins techniques such as composition over inheritance and dependency injection (for examples, see Appendix). Both can lead to more decoupled, modular and hence testable code, and both help us adhere to the SOLID principles. For example, when testing the function is_area_greater_than_10(), we can mock the Shape object to inject arbitrary behaviour that is independent of the actual shapes we might use in practice:

struct MockShape : public Shape
{
    [[nodiscard]] double area() const override
    {
        return 5.0;  // Always return 5.0
    }
};
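A slightly more flexible variant of the mock lets each test choose the area it needs. This is only a sketch: FixedAreaShape is a hypothetical name, and Shape and is_area_greater_than_10() are repeated so that the snippet is self-contained.

```cpp
struct Shape
{
    virtual ~Shape() = default;
    [[nodiscard]] virtual double area() const = 0;
};

bool is_area_greater_than_10(const Shape& s)
{
    return s.area() > 10;
}

// Hypothetical test double: reports whatever area the test asks for.
struct FixedAreaShape : public Shape
{
    explicit FixedAreaShape(double area) : area_(area) {}
    [[nodiscard]] double area() const override { return area_; }

   private:
    double area_;
};
```

A test can then probe values below, at and above the threshold without constructing any real shape.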

Performance considerations: virtual functions

The defining feature of the explicit interfaces covered above is the use of virtual functions. Virtual functions provide dynamic binding, that is, the correct implementation of the overridden method is chosen at runtime, but this comes at an extra cost.

I can’t say for certain whether all compilers implement virtual functions the same way, since the C++ standard does not prescribe an implementation. From what I know, however, they are usually implemented using virtual tables, or vtables for short (see this for an explanation if you are not familiar with them).
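To give a rough idea of what a vtable does, here is a hand-written imitation of the mechanism. It is a conceptual sketch only: real compilers generate this machinery themselves, usually store the table pointer inside the object, and the exact layout is implementation-defined. All names (ShapeVTable, ShapeRef, RectangleData and so on) are invented for this illustration.

```cpp
// A table of function pointers, one entry per "virtual" function.
struct ShapeVTable
{
    double (*area)(const void* self);
};

// Plain data, no virtual functions.
struct RectangleData { int x; int y; };
struct SquareData { int x; };

inline double rectangle_area(const void* self)
{
    const auto* r = static_cast<const RectangleData*>(self);
    return static_cast<double>(r->x) * r->y;
}

inline double square_area(const void* self)
{
    const auto* s = static_cast<const SquareData*>(self);
    return static_cast<double>(s->x) * s->x;
}

// One table per concrete type, shared by all objects of that type.
static const ShapeVTable rectangle_vtable{&rectangle_area};
static const ShapeVTable square_vtable{&square_area};

// A handle that pairs an object with the table for its type.
struct ShapeRef
{
    const void* object;
    const ShapeVTable* vtable;
};

// The "virtual call": look up the function pointer at runtime, then call it.
inline double area(ShapeRef s)
{
    return s.vtable->area(s.object);
}
```

The extra cost mentioned above corresponds to the pointer lookup in area(): the target of the call is only known at runtime, which also makes it harder for the compiler to inline.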

Compile-time polymorphism using implicit interfaces

But what if you don’t want to use virtual functions due to their runtime overhead? What if you want to achieve polymorphism at compile time? Good news: this can be done using C++ templates.

Using our previous example, we no longer have the Shape abstract base class. Hence, the Rectangle and Square classes don’t inherit from any other class:

struct Rectangle
{
    // ... same as before, but no virtual functions and hence no overriding
};

struct Square
{
    // ... same as before, but no virtual functions and hence no overriding
};

Instead, the correct class is figured out at compile time using a template type parameter:

template <typename Shape>
bool is_area_greater_than_10(const Shape& s)
{
    return s.area() > 10;
}

Rectangle r{1,2};
Square s{4};
is_area_greater_than_10(r); // false
is_area_greater_than_10(s); // true

Side note: in C++20, we could have used the auto keyword (an abbreviated function template) instead of explicitly specifying the template type parameter, as follows: bool is_area_greater_than_10(const auto& s)

This is an example of an implicit interface. Inside the template, we don’t know what the concrete type of the Shape parameter is. All we know is that it must have a method area() that returns a value for which the expression s.area() > 10 is valid. During compilation, the template is instantiated once per argument type, which gives us two functions:

  1. bool is_area_greater_than_10(const Rectangle&)
  2. bool is_area_greater_than_10(const Square&)

Side note: an instantiation is generated only for the types the function is actually used with

Voilà. No more virtual functions. It’s also just as easy to unit test this function with a mock as it was with virtual functions: the mock only needs to provide a suitable area() method and does not have to inherit from anything.
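Putting the pieces together, a self-contained sketch of the template version could look as follows. The class bodies are filled in from the earlier Rectangle and Square, and MockShape here is a hypothetical stand-alone mock that carries the area it should report.

```cpp
struct Rectangle
{
    Rectangle(int x, int y) : x_(x), y_(y) {}
    [[nodiscard]] double area() const { return x_ * y_; }

   private:
    int x_;
    int y_;
};

struct Square
{
    explicit Square(int x) : x_(x) {}
    [[nodiscard]] double area() const { return x_ * x_; }

   private:
    int x_;
};

// Hypothetical mock: no base class, it only has to provide area().
struct MockShape
{
    double value;
    [[nodiscard]] double area() const { return value; }
};

// Works for any type with a suitable area() method.
template <typename Shape>
bool is_area_greater_than_10(const Shape& s)
{
    return s.area() > 10;
}
```

Passing a type without an area() method is a compile-time error rather than a runtime failure.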

An example use case of compile-time polymorphism: always inlining (GCC)

From this stackoverflow question on whether virtual functions can be inlined:

“The only time an inline virtual call can be inlined is when the compiler knows the “exact class” of the object which is the target of the virtual function call. This can happen only when the compiler has an actual object rather than a pointer or reference to an object. I.e., either with a local object, a global/static object, or a fully contained object inside a composite.”

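Compile-time polymorphism, on the other hand, does allow inlining: in each template instantiation the compiler knows the exact type of the object, even when it is passed by reference. Suppose we have a function as follows: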

#include <iostream>

void long_function() {
    // Some piece of logic
    int x{0};
    ++x;
    std::cout << "Function part 1: " << x << std::endl;

    // Some additional piece of logic
    int y{0};
    --y;
    std::cout << "Function part 2: " << y << std::endl;
}

Now suppose that we cannot split the function into smaller units because we want to avoid additional function calls. At the same time, we want to follow the spirit of SOLID principles and keep the code modular and unit testable. For example, we would like to test the two parts of the long_function() separately and mock the two calls. We also don’t want to pass an actual object containing the logic to this function, but only a pointer/reference. And in the future we might want to have different variants of this logic, hence we need polymorphism.

Luckily, we can satisfy all these constraints by combining implicit interfaces with inlining. Define the following class, which implements the two pieces of logic:

struct FunctionLogic {
    inline __attribute__((always_inline)) void function_part_1() {
        int x{0};
        ++x;
        std::cout << "Function part 1: " << x << std::endl;
    }

    inline __attribute__((always_inline)) void function_part_2() {
        int y{0};
        --y;
        std::cout << "Function part 2: " << y << std::endl;
    }
};

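Side note: always_inline is a GCC function attribute (Clang supports it too) that makes the compiler inline the function even when optimizations are disabled. GCC reports an error if such a function cannot be inlined.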

Now create our long_function that takes an object of type T:

template <typename T>
void long_function(T& t) {
    t.function_part_1();
    t.function_part_2();
}

Finally, call long_function(), which makes the compiler instantiate the template for FunctionLogic:

int main() {
    FunctionLogic function_logic;
    long_function(function_logic);
}

Looking at Compiler Explorer with the -O0 flag (no optimization), we observe that long_function() is compiled as if it were written as a single function: the bodies of function_part_1() and function_part_2() appear inline, and the only remaining calls are to operator<<:

void long_function<FunctionLogic>(FunctionLogic&):
    push    rbp
    mov     rbp, rsp
    ...
    mov     QWORD PTR [rbp-24], rax
    mov     DWORD PTR [rbp-28], 0
    add     DWORD PTR [rbp-28], 1
    mov     esi, OFFSET FLAT:.LC0
    mov     edi, OFFSET FLAT:_ZSt4cout
    call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
    mov     rdx, rax
    ...
    mov     QWORD PTR [rbp-8], rax
    mov     DWORD PTR [rbp-12], 0
    sub     DWORD PTR [rbp-12], 1
    mov     esi, OFFSET FLAT:.LC1
    mov     edi, OFFSET FLAT:_ZSt4cout
    call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
    mov     rdx, rax
    ...

As mentioned earlier, we can unit test the two parts of long_function() separately, mock them and, once compiled, still get exactly the same behaviour as before. All this is possible because templates generate code at compile time.
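As a sketch of what such a test could look like, the following mock records which parts were called and in which order. RecordingLogic is a hypothetical name, and long_function() is repeated so that the snippet is self-contained.

```cpp
#include <string>
#include <vector>

template <typename T>
void long_function(T& t)
{
    t.function_part_1();
    t.function_part_2();
}

// Hypothetical test double: instead of doing real work, it logs each call.
struct RecordingLogic
{
    std::vector<std::string> calls;

    void function_part_1() { calls.push_back("part_1"); }
    void function_part_2() { calls.push_back("part_2"); }
};
```

Because long_function() is a template, it accepts RecordingLogic without any changes, and the production instantiation with FunctionLogic is unaffected by the existence of the mock.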

Using concepts (C++20) to define an interface at compile time

C++20 introduced concepts, a powerful tool for constraining and checking template parameters. In the above example, we knew that the FunctionLogic object has to have two methods, function_part_1() and function_part_2(). We didn’t particularly care about their return types. We can write a concept for that:

template <typename T>
concept FunctionLogicConcept = requires (T t) {
    t.function_part_1();
    t.function_part_2();
};

Then we can apply this concept to check that the type T satisfies the concept at compile time:

template <FunctionLogicConcept T>
void long_function(T& t) {
    t.function_part_1();
    t.function_part_2();
}

The good thing about concepts is that a violated requirement is reported at the call site, naming the concept that was not satisfied, instead of as an error deep inside the template body. With IDE integration, you can in practice see almost instantly whether your code satisfies even deeply nested concepts/templates.

Advantages, drawbacks and closing thoughts

Compile-time polymorphism is great: it encourages decoupled code and good unit testing practices, while letting the compiler resolve, and often inline, every call.

I’m not sure one can really know the performance impact of virtual functions without benchmarking the code. From what I know, the cost comes mostly from the indirect call, which can be mispredicted, and from the inlining opportunities it blocks. On the other hand, I can’t help but wonder whether the code bloat resulting from templates can hurt instruction cache performance to the point where virtual functions become the better choice.
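A benchmark does not have to be elaborate to get started. The following is a rough timing sketch, not a rigorous benchmark: all names are hypothetical, and an optimizing compiler may devirtualize or fold these loops, so results should be checked against the generated assembly or a proper benchmarking framework.

```cpp
#include <chrono>
#include <cstdint>

struct Shape
{
    virtual ~Shape() = default;
    [[nodiscard]] virtual double area() const = 0;
};

struct VirtualSquare : public Shape
{
    explicit VirtualSquare(int x) : x_(x) {}
    [[nodiscard]] double area() const override { return x_ * x_; }

   private:
    int x_;
};

struct PlainSquare
{
    int x_;
    [[nodiscard]] double area() const { return x_ * x_; }
};

// Runtime polymorphism: every call goes through the Shape interface.
inline double sum_virtual(const Shape& s, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; ++i) total += s.area();
    return total;
}

// Compile-time polymorphism: the concrete type is known in each instantiation.
template <typename S>
double sum_static(const S& s, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; ++i) total += s.area();
    return total;
}

// Runs f once and returns the elapsed wall-clock time in nanoseconds.
template <typename F>
std::int64_t elapsed_ns(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Timing both sum functions with the same n through elapsed_ns() gives a first impression. A single run is noisy, so repeating the measurement and comparing at realistic optimization levels matters more than any single number.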


Appendix: dependency injection versus composition over inheritance

Explanation follows this stackoverflow question

The two ideas are closely related. The following code snippet is a dependency injection example. We inject a Shape object into a Picture object (a Picture contains an arbitrary Shape), though the two objects exist independently of each other:

struct Picture
{
    Picture(Shape& shape) : shape_(shape)
    {
    }

    double get_area_of_the_shape() const
    {
        return shape_.area();
    }

   protected:
    Shape& shape_;
};

Rectangle r{1,2};
Picture p1{r};
p1.get_area_of_the_shape();  // 2

Square s{4};
Picture p2{s};
p2.get_area_of_the_shape();  // 16
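Dependency injection is also what makes Picture easy to test: a test can inject a stub instead of a real shape. A minimal sketch follows, with the hypothetical name StubShape; Shape and Picture are repeated (with const added where possible) so that the snippet is self-contained.

```cpp
struct Shape
{
    virtual ~Shape() = default;
    [[nodiscard]] virtual double area() const = 0;
};

// Hypothetical stub that reports a fixed, test-chosen area.
struct StubShape : public Shape
{
    explicit StubShape(double area) : area_(area) {}
    [[nodiscard]] double area() const override { return area_; }

   private:
    double area_;
};

struct Picture
{
    // The shape is injected; Picture does not own it and it must outlive the Picture.
    explicit Picture(const Shape& shape) : shape_(shape) {}

    [[nodiscard]] double get_area_of_the_shape() const
    {
        return shape_.area();
    }

   private:
    const Shape& shape_;
};
```

Picture never learns which concrete shape it was given, so the same class works with Rectangle, Square or a stub.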

In composition over inheritance, the object is created within the class, rather than injected. That is, the particular Shape object is created within the Picture object:

struct Picture
{
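    // Alternatively, we could have shape_(new Square{4}).
    // Note: for brevity the raw pointer is never deleted; real code would hold a
    // std::unique_ptr<Shape>, which requires Shape to have a virtual destructor.
    Picture() : shape_(new Rectangle{1,2})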
    {
    }
    
    double get_area_of_the_shape()
    {
        return shape_->area();
    }

    protected:
    Shape* shape_;
};

Picture p{};
p.get_area_of_the_shape();
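In real code, the owned shape would normally be held by a smart pointer so that it is released automatically. A sketch of that variant, assuming Shape has a virtual destructor (the types are repeated so that the snippet is self-contained):

```cpp
#include <memory>

struct Shape
{
    // Needed so that deleting a Rectangle through a Shape pointer is well defined.
    virtual ~Shape() = default;
    [[nodiscard]] virtual double area() const = 0;
};

struct Rectangle : public Shape
{
    Rectangle(int x, int y) : x_(x), y_(y) {}
    [[nodiscard]] double area() const override { return x_ * y_; }

   private:
    int x_;
    int y_;
};

struct Picture
{
    // The Picture creates and owns its shape.
    Picture() : shape_(std::make_unique<Rectangle>(1, 2)) {}

    [[nodiscard]] double get_area_of_the_shape() const
    {
        return shape_->area();
    }

   private:
    std::unique_ptr<Shape> shape_;
};
```

The trade-off compared with injection is that the choice of Rectangle is now fixed inside Picture, which makes it harder to substitute a mock in tests.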