“I need to see,
the truth other men cannot see,
to be things that others can't be!
Give me courage to go where no angel will go!
And I will go!
I need to know!”

— Jekyll & Hyde

Prologue

As a C++ developer, or as someone with some C++ experience, have you ever stopped coding and stared at the monitor, trying to decide which kind of parameter to use? To be specific: should it be pass by value, by const reference, or by r-value reference?

void Func(Object obj);
void Func(const Object& obj);
void Func(Object&& obj);

I know why you stop here: you worry about constructor overhead. That’s a good instinct, but you shouldn’t have to spend time on it every time.

So in this article, I’ll break down the constructor and destructor calls for each of these parameter types, so that you can confidently choose the best one for your code.


Instrument Utilities

Subject class

Before we start, we need a way to visualize an object’s lifecycle. This is simple: just put an output statement in each special member function. So here is our Object class.

#include <iostream>
#include <utility>

using std::cout;
using std::endl;

class Object
{
public:
    Object() : _data(0) { cout << "Default constructor (" << this << ")" << endl; }

    Object(int data) : _data(data)
    {
        cout << "Parameterized constructor (" << this << ")" << endl;
    }

    Object(const Object& other)
    {
        cout << "Copy constructor (" << this << ")" << endl;
        _data = other._data;
    }

    Object(Object&& other) noexcept
    {
        cout << "Move constructor (" << this << ")" << endl;
        _data = other._data;
        other._data = -1; // mark the source as "moved"
    }

    Object& operator=(const Object& other)
    {
        cout << "Copy assignment operator (" << this << ")" << endl;
        if (this != &other)
        {
            _data = other._data;
        }
        return *this;
    }

    Object& operator=(Object&& other) noexcept
    {
        cout << "Move assignment operator (" << this << ")" << endl;
        if (this != &other)
        {
            _data = other._data;
            other._data = -1; // mark the source as "moved"
        }
        return *this;
    }

    ~Object()
    {
        if (_data != -1)
        {
            cout << "Destructor (" << this << ")" << endl;
        }
        else
        {
            cout << "Destructor (moved) (" << this << ")" << endl;
        }
    }

    void Print() const { cout << "Data: " << _data << endl; }

private:
    int _data;
};

We also add a _data field for the parameterized constructor to fill. To mimic a real move constructor and move assignment operator, we use -1 to mark a “moved” object, which shouldn’t be used anymore.
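If you would rather check results programmatically than read console output, the same instrumentation can tally calls instead of printing them. The sketch below is only an illustration under my own assumptions: Counted, Tally and CopyThenMove are hypothetical names that are not part of the utilities above, and the class reuses the same -1 sentinel for a moved object.

```cpp
#include <utility>

// Hypothetical tally of special member calls (not part of the article's utilities).
struct Tally
{
    int ctor = 0;
    int copy = 0;
    int move = 0;
    int dtor = 0;
    int dtorMoved = 0;
};

// Hypothetical counting variant of Object: it counts instead of printing.
class Counted
{
public:
    static Tally& Stats()
    {
        static Tally tally;
        return tally;
    }

    explicit Counted(int data = 0) : _data(data) { ++Stats().ctor; }
    Counted(const Counted& other) : _data(other._data) { ++Stats().copy; }
    Counted(Counted&& other) noexcept : _data(other._data)
    {
        other._data = -1; // same "moved" sentinel as Object
        ++Stats().move;
    }
    ~Counted()
    {
        if (_data != -1) ++Stats().dtor;
        else ++Stats().dtorMoved;
    }

private:
    int _data;
};

// Runs a tiny experiment and returns the tally it produced.
Tally CopyThenMove()
{
    Counted::Stats() = Tally{};
    {
        Counted a(10);
        Counted b(a);            // copy constructor
        Counted c(std::move(b)); // move constructor, b becomes "moved"
    }                            // c, b, a are destroyed here
    return Counted::Stats();
}
```

Calling CopyThenMove() should report one regular construction, one copy, one move, two normal destructions and one destruction of a moved object.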

Scope indicator

The basic thing to know is that in C++, a local object lives in its enclosing scope and is guaranteed to be destroyed when execution reaches the end of that scope. We can use this as a trick to display the span of a scope.

I first learnt this trick from Scott Meyers’s Effective C++, where he mentioned a way to measure the execution time of a scope.
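For reference, the timing variant of this trick might look roughly like the sketch below. It is my own reconstruction of the general idea, not code from the book: ScopeTimer and TimeLoop are hypothetical names, and the timer simply writes the elapsed time into a variable supplied by the caller.

```cpp
#include <chrono>

// Hypothetical RAII timer: the constructor records the start time and the
// destructor stores the elapsed time when the enclosing scope ends.
class ScopeTimer
{
public:
    explicit ScopeTimer(std::chrono::nanoseconds& out)
        : _out(out), _start(std::chrono::steady_clock::now())
    {
    }

    ~ScopeTimer()
    {
        _out = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - _start);
    }

    ScopeTimer(const ScopeTimer&) = delete;
    ScopeTimer& operator=(const ScopeTimer&) = delete;

private:
    std::chrono::nanoseconds& _out;
    std::chrono::steady_clock::time_point _start;
};

// Times a simple loop; the timer is destroyed at the closing brace of the
// inner scope, which is when `elapsed` gets its value.
std::chrono::nanoseconds TimeLoop(int iterations)
{
    std::chrono::nanoseconds elapsed{-1};
    {
        ScopeTimer timer(elapsed);
        volatile long long sum = 0;
        for (int i = 0; i < iterations; ++i)
        {
            sum = sum + i;
        }
    }
    return elapsed;
}
```

The Fence class in this article follows the same pattern, but prints markers instead of measuring time.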

So here is my version, which I call Fence. To tell a plain scope and a function call apart, it takes a scope flag, so it may look a little verbose.

class Fence
{
public:
    Fence(bool scope, const char* message) : _scope(scope)
    {
        if (_scope)
        {
            cout << "========== " << message << endl;
        }
        else
        {
            cout << ">>>>> " << message << endl;
        }
    }

    ~Fence()
    {
        if (_scope)
        {
            cout << "----------" << endl << endl;
        }
        else
        {
            cout << "<<<<<" << endl;
        }
    }

private:
    bool _scope;
};

Then, to simplify the use of this class, we can wrap it in macros. To mark a scope, we can use the following pair.

// Identifiers containing a double underscore are reserved for the
// implementation, so the fence variable uses a trailing underscore instead.
#define BEGIN_SCOPE(message) do { Fence scope_fence_(true, message)
#define END_SCOPE() } while (0)

int main()
{
    BEGIN_SCOPE("A");
    // some code
    END_SCOPE();

    BEGIN_SCOPE("B");
    // some other code
    END_SCOPE();

    // other code
}

To mark a function call, we can use the predefined __func__ identifier to avoid writing the function name twice. And since we record the whole function body, we don’t need to introduce a nested scope.

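// Names starting with an underscore followed by an uppercase letter are
// reserved, so the helper macro is called BEGIN_CALL_IMPL.
#define BEGIN_CALL_IMPL(func) Fence call_fence_(false, func)
#define BEGIN_CALL() BEGIN_CALL_IMPL(__func__)

void Func(/* some parameters */)
{
    BEGIN_CALL();
    // some code
}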

Besides, here is another macro that adds a blank line to the output.

#define BR()                 std::cout << std::endl

Lifecycle Breakdown

I put no comments in the experiment code, so that you can think about the result first. 🤔 All experiments were run with MSVC in Visual Studio 2022, and the Debug and Release profiles produce the same output.

Object creation

First, let’s see how C++ handles object creation. We create the following objects to exercise every constructor and assignment operator.

BEGIN_SCOPE("Create");
    Object obj(10);
    Object obj2 = Object(15);
    Object obj3(obj2);
    Object obj4(std::move(obj3));
    obj = obj2;
    obj = Object(25);
    obj = std::move(obj2);
END_SCOPE();

Running this, we’ll have the following output.

========== Create
Parameterized constructor (0000009D83FDF004)
Parameterized constructor (0000009D83FDF024)
Copy constructor (0000009D83FDF044)
Move constructor (0000009D83FDF064)
Copy assignment operator (0000009D83FDF004)
Parameterized constructor (0000009D83FDF4E4)
Move assignment operator (0000009D83FDF004)
Destructor (moved) (0000009D83FDF4E4)
Move assignment operator (0000009D83FDF004)
Destructor (0000009D83FDF064)
Destructor (moved) (0000009D83FDF044)
Destructor (moved) (0000009D83FDF024)
Destructor (0000009D83FDF004)
----------

It’s a bit long, so I’ll break it down one by one.

First, there is no doubt that Object obj(10); calls the parameterized constructor. Then, for Object obj2 = Object(15);, the = also invokes the constructor rather than the assignment operator, because the statement is a variable definition (an initialization), and the temporary is elided, so no extra move shows up. After that, Object obj3(obj2); and Object obj4(std::move(obj3)); obviously call the copy and move constructors.

A constructor can be seen as a special function call, and soon I’ll talk about parameter passing in functions.

Then, for a regular assignment statement, the corresponding assignment operator is called. And here comes the overhead. If you assign a temporary value to a variable, e.g. obj = Object(25), an extra object is created and then handed over through the move assignment operator. Finally, assigning from a variable, or from a moved variable, only invokes the copy or move assignment operator, as expected.

So for this part, we can conclude that only the assignment of a temporary value causes a little overhead. Although a temporary in an initialization is optimized away, the compiler does not do the same for an assignment. I think that’s what a temporary value is meant to be. However, it invokes the move assignment operator, so it has little impact if you have a good “move”.
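The difference between initializing from a temporary and assigning from one can also be checked with counters. This is a minimal sketch assuming a C++17 compiler (where the elision in the initialization is guaranteed); Probe, Counts and InitThenAssign are hypothetical names introduced only for this illustration.

```cpp
// Hypothetical counter type: records how often each special member ran.
struct Probe
{
    static int constructed;
    static int moveConstructed;
    static int moveAssigned;

    explicit Probe(int) { ++constructed; }
    Probe(Probe&&) noexcept { ++moveConstructed; }
    Probe& operator=(Probe&&) noexcept
    {
        ++moveAssigned;
        return *this;
    }
};
int Probe::constructed = 0;
int Probe::moveConstructed = 0;
int Probe::moveAssigned = 0;

struct Counts
{
    int constructed;
    int moveConstructed;
    int moveAssigned;
};

Counts InitThenAssign()
{
    Probe::constructed = Probe::moveConstructed = Probe::moveAssigned = 0;
    Probe p = Probe(15); // initialization: built directly in p (assumes C++17)
    p = Probe(25);       // assignment: a temporary is built, then move-assigned
    return { Probe::constructed, Probe::moveConstructed, Probe::moveAssigned };
}
```

Under that assumption, the two statements produce two constructions, no move construction, and exactly one move assignment, which matches the output above.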

Return Value Optimization

There is a special case in object creation called return value optimization (RVO). It eliminates the redundant copy of a named or temporary object when it is used as the return value, which means the object is created directly in the caller’s storage. For example, we have the following functions that return an object.

Object CreateA()
{
    BEGIN_CALL();
    return Object(10); // equivalent to: return { 10 };
}

Object CreateB(int option)
{
    BEGIN_CALL();
    Object obj(10);
    Object obj2(20);
    if (option == 0)
    {
        return obj;
    }
    else
    {
        return obj2;
    }
}

And we can write the test.

BEGIN_SCOPE("Return Value Optimization");
    Object obj = CreateA();
    obj = CreateA();
    Object obj2 = CreateB(1);
    obj = CreateB(0);
    CreateA();
END_SCOPE();

The output is as follows.

========== Return Value Optimization
>>>>> CreateA
Parameterized constructor (000000781551F284)
<<<<<
>>>>> CreateA
Parameterized constructor (000000781551F704)
<<<<<
Move assignment operator (000000781551F284)
Destructor (moved) (000000781551F704)
>>>>> CreateB
Parameterized constructor (000000781551F064)
Parameterized constructor (000000781551F084)
Move constructor (000000781551F2A4)
Destructor (moved) (000000781551F084)
Destructor (000000781551F064)
<<<<<
>>>>> CreateB
Parameterized constructor (000000781551F064)
Parameterized constructor (000000781551F084)
Move constructor (000000781551F724)
Destructor (000000781551F084)
Destructor (moved) (000000781551F064)
<<<<<
Move assignment operator (000000781551F284)
Destructor (moved) (000000781551F724)
>>>>> CreateA
Parameterized constructor (000000781551F744)
<<<<<
Destructor (000000781551F744)
Destructor (000000781551F2A4)
Destructor (000000781551F284)
----------

The best situation for RVO is when the returned object is unique, which is our CreateA here. We can see that no extra constructor is invoked for Object obj = CreateA();, and obj = CreateA(); only calls the move assignment operator. This is what we wish for.

However, our program may get more complex. For CreateB, there are two candidates for the return value, so RVO cannot be applied perfectly, but the compiler can still use a move constructor, rather than a copy, to initialize the result in the caller. We can do better by placing obj and obj2 in their corresponding if-else branches, so that perfect RVO can be applied.
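The restructuring suggested above could look like the following sketch. Probe and CreateBranchLocal are hypothetical names, and note the assumption: eliding the move for a named local is optional for the compiler, so whether it disappears depends on the compiler and its flags. What is certain is that no copy is made.

```cpp
// Hypothetical probe type that counts copy and move constructions.
struct Probe
{
    static int copies;
    static int moves;
    int value;

    explicit Probe(int v) : value(v) {}
    Probe(const Probe& other) : value(other.value) { ++copies; }
    Probe(Probe&& other) noexcept : value(other.value) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

// Each branch declares and returns its own local object, so no declaration
// sits in front of a return statement that returns a different object.
Probe CreateBranchLocal(int option)
{
    if (option == 0)
    {
        Probe obj(10);
        return obj;
    }
    else
    {
        Probe obj2(20);
        return obj2;
    }
}
```

With this layout the result is either constructed in place or moved once, never copied.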

One thing to notice is that when RVO is applied, the object is constructed in the callee and destructed in the caller.

The condition for perfect RVO is (I guess) that the declaration of each returned object does not dominate a return statement that returns a different object. That way, there is no conflict in deciding their location in the caller’s scope.

What does dominate mean, then? To put it simply, if A dominates B, then every execution path to B must pass through A first.

If a meticulous reader, you are, then you may ask, why use this verbose if-else instead of a ternary operator? Good question. We can test that.

Object CreateB(int option)
{
    BEGIN_CALL();
    Object obj(10);
    Object obj2(20);
    return option == 0 ? obj : obj2;
}

Surprisingly (or not, if you already know this), it results in a copy constructor call instead of a move!

>>>>> CreateB
Parameterized constructor (00000036E174F364)
Parameterized constructor (00000036E174F384)
Copy constructor (00000036E174F5C4)
Destructor (00000036E174F384)
Destructor (00000036E174F364)
<<<<<

Why? Because the compiler is not that aggressive. The implicit move on return only applies when the operand is the name of a local object, but the ternary operator turns it into an expression (to be more specific, an l-value). So even though the two versions are semantically equivalent, the compiler takes the conservative choice and copies instead of moving.
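If you still want the ternary form, one option is to ask for the move explicitly in both branches. The sketch below compares the two variants with counters; Probe, PickTernary and PickTernaryMoved are hypothetical names used only for this illustration.

```cpp
#include <utility>

// Hypothetical probe type that counts copy and move constructions.
struct Probe
{
    static int copies;
    static int moves;
    int value;

    explicit Probe(int v) : value(v) {}
    Probe(const Probe& other) : value(other.value) { ++copies; }
    Probe(Probe&& other) noexcept : value(other.value) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

Probe PickTernary(int option)
{
    Probe obj(10);
    Probe obj2(20);
    return option == 0 ? obj : obj2; // l-value expression: copy constructor
}

Probe PickTernaryMoved(int option)
{
    Probe obj(10);
    Probe obj2(20);
    // Both operands are cast to r-values, so the move constructor is chosen.
    return option == 0 ? std::move(obj) : std::move(obj2);
}
```

The explicit std::move turns the one copy into one move; it still cannot reach the zero-cost case that a plain return of a single local can get.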

Parameter passing

Then, let’s see how C++ prepares function arguments. To better understand this, you may need a quick look at the stack frame in C/C++.

Stackframe

The arguments are placed on top of the caller’s stack, so that the callee can find them without knowing the caller’s stack layout. This implies that when passing arguments, we are initializing them in the outgoing args segment, so the arguments actually live in the caller’s scope.

Pass by value

To see it in action, let’s define a simple function with a value parameter and test it.

void Pass(Object object)
{
    BEGIN_CALL();
    object.Print();
}

BEGIN_SCOPE("Pass by value");
    Object obj;
    Pass(obj);
    Pass(std::move(obj));
    Pass(Object(20));
END_SCOPE();

The output will be as follows.

========== Pass by value
Default constructor (0000006B984FF868)
Copy constructor (0000006B984FF870)
>>>>> Pass
Data: 0
<<<<<
Destructor (0000006B984FF870)
Move constructor (0000006B984FF870)
>>>>> Pass
Data: 0
<<<<<
Destructor (0000006B984FF870)
Parameterized constructor (0000006B984FF870)
>>>>> Pass
Data: 20
<<<<<
Destructor (0000006B984FF870)
Destructor (moved) (0000006B984FF868)
----------

First, Object obj; calls the default constructor to initialize the object in the local variable segment. When calling Pass(obj), the object is first copied to the outgoing args segment, which invokes the copy constructor. Similarly, we can use std::move to invoke the move constructor instead. And finally, we can pass a temporary object, which constructs the argument in place, with no extra copy or move.

Are we getting it? It is the same as what we saw for object initialization in Object creation.

So we can conclude that pass by value may not be a good choice if we often pass large objects, and that’s why modern IDEs suggest using a const reference instead.
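The three cases can be summarized as numbers. The following is a small sketch with hypothetical names (Probe, Cost, Take and the three Pass... helpers), assuming a C++17 compiler so that the temporary case is guaranteed to construct the argument in place.

```cpp
#include <utility>

// Hypothetical probe type that counts copy and move constructions.
struct Probe
{
    static int copies;
    static int moves;
    int value;

    explicit Probe(int v) : value(v) {}
    Probe(const Probe& other) : value(other.value) { ++copies; }
    Probe(Probe&& other) noexcept : value(other.value) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

struct Cost
{
    int copies;
    int moves;
};

int Take(Probe p) { return p.value; } // by-value parameter

Cost PassLvalue()
{
    Probe::copies = Probe::moves = 0;
    Probe obj(1);
    Take(obj); // the argument is copy-constructed
    return { Probe::copies, Probe::moves };
}

Cost PassMoved()
{
    Probe::copies = Probe::moves = 0;
    Probe obj(2);
    Take(std::move(obj)); // the argument is move-constructed
    return { Probe::copies, Probe::moves };
}

Cost PassTemporary()
{
    Probe::copies = Probe::moves = 0;
    Take(Probe(3)); // the argument is constructed in place (assumes C++17)
    return { Probe::copies, Probe::moves };
}
```

So the cost of a by-value parameter is one copy for a named object, one move for a moved object, and nothing extra for a temporary.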

Pass by (const) reference

As we know, a reference is syntactic sugar for a pointer. Passing a reference is, in practice, passing a pointer, so there is almost no overhead, which is why we like it.

void PassCopy(const Object& object)
{
    BEGIN_CALL();
    object.Print();
}

BEGIN_SCOPE("Pass Copy");
    Object obj;
    PassCopy(obj);
    PassCopy(Object(20));
END_SCOPE();

The output is as follows.

========== Pass Copy
Default constructor (0000009D4A8FF878)
>>>>> PassCopy
Data: 0
<<<<<
Parameterized constructor (0000009D4A8FF874)
>>>>> PassCopy
Data: 20
<<<<<
Destructor (0000009D4A8FF874)
Destructor (0000009D4A8FF878)
----------

We can see that when using a reference, no extra copy or move is needed; only the reference is passed. That is why we prefer references for large objects. To avoid accidentally modifying the argument, we can add const to it.

Note that a plain (non-const) reference cannot bind to a constant object or a temporary value, but a const reference can bind to everything.
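To see the binding rules in action, here is a small sketch using the three signatures from the prologue as overloads. Widget, Which and OnlyConstRef are hypothetical names for this illustration; each function just reports which kind of reference was selected.

```cpp
#include <string>
#include <utility>

struct Widget {};

// Three hypothetical overloads that report which parameter type was chosen.
std::string Which(Widget&) { return "non-const reference"; }
std::string Which(const Widget&) { return "const reference"; }
std::string Which(Widget&&) { return "r-value reference"; }

// With only a const reference parameter, every kind of argument is accepted.
std::string OnlyConstRef(const Widget&) { return "const reference"; }
```

A named object picks the non-const reference, a const object picks the const reference, and a temporary or a std::move result picks the r-value reference. OnlyConstRef accepts all of them.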

Pass by R-value reference

It is rare, but let’s not omit it. Passing by R-value reference requires that the argument is an R-value. Duh.

void PassMove(Object&& object)
{
    BEGIN_CALL();
    object.Print();
}

BEGIN_SCOPE("Pass Move");
    Object obj;
    PassMove(Object(30));
    // PassMove(obj); // compile error
    PassMove(std::move(obj));
END_SCOPE();

Notice that an R-value reference parameter does not accept an L-value, so we must explicitly move our L-value to match the type.

========== Pass Move
Default constructor (0000009D4A8FF878)
Parameterized constructor (0000009D4A8FF874)
>>>>> PassMove
Data: 30
<<<<<
Destructor (0000009D4A8FF874)
>>>>> PassMove
Data: 0
<<<<<
Destructor (0000009D4A8FF878)
----------

Since std::move only works as a type cast, and no actual move construction or assignment happens, our obj is not really moved.
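The same point can be shown with std::string instead of Object. In this sketch, Peek, Steal and AfterPeek are hypothetical helpers: Peek takes an r-value reference but never moves from it, while Steal actually does.

```cpp
#include <string>
#include <utility>

// Takes an r-value reference but only reads through it.
std::size_t Peek(std::string&& text) { return text.size(); }

// Takes an r-value reference and really moves the contents out.
std::string Steal(std::string&& text) { return std::move(text); }

// Returns the local string after Peek has been called on it.
std::string AfterPeek()
{
    std::string s = "still here";
    Peek(std::move(s)); // only a cast to std::string&&, nothing is moved
    return s;
}
```

AfterPeek() still returns the full text, because binding to an r-value reference does not move anything by itself. Only a function like Steal, which constructs or assigns from the reference, leaves the source in a moved-from state.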

Parameter consuming

Sometimes, especially in a constructor, we need to copy the argument to initialize certain members. In this case, which type of parameter has lower overhead? Instead of only calling Print, we now have an initialization or assignment.

void Consume(Object object)
{
    BEGIN_CALL();
    auto o = object;
}

The argument part is clear, as we’ve just talked about it; the only difference is that we now have an additional copy or move constructor call. Of course, passing by value and then copying is a terrible choice, so which one should we use? L-value reference or R-value reference?

But we cannot use an R-value reference alone, as it cannot bind to an L-value, so does that mean the L-value reference is our only choice? The answer is no. If you use ReSharper, you might have seen such a suggestion.

[Image: screenshot of the ReSharper suggestion]

The question can be demonstrated with the following two functions: one passes by const reference and copies, the other passes by value and then moves.

void ConsumeCopy(const Object& object)
{
    BEGIN_CALL();
    auto o = object;
}

void ConsumeValueMove(Object object)
{
    BEGIN_CALL();
    auto o = std::move(object);
}

To tell which one is better, we can run a little test. There are only two kinds of arguments, L-value and R-value, so we call each function once with each kind to see the overhead.

BEGIN_SCOPE("Pass by Reference");
    Object obj;
    ConsumeCopy(obj);
    ConsumeCopy(Object(66));
END_SCOPE();

BEGIN_SCOPE("Modernize Pass by Value");
    Object obj;
    ConsumeValueMove(obj);
    ConsumeValueMove(Object(99));
END_SCOPE();

The result is not that surprising: pass by reference produces less output, and thus seems to be the better choice.

========== Reference
Default constructor (000000E46D4FF788)
>>>>> ConsumeCopy
Copy constructor (000000E46D4FF790)
Destructor (000000E46D4FF790)
<<<<<
Parameterized constructor (000000E46D4FF784)
>>>>> ConsumeCopy
Copy constructor (000000E46D4FF744)
Destructor (000000E46D4FF744)
<<<<<
Destructor (000000E46D4FF784)
Destructor (000000E46D4FF788)
----------

========== Modernize Pass by Value
Default constructor (000000E46D4FF784)
Copy constructor (000000E46D4FF790)
>>>>> ConsumeValueMove
Move constructor (000000E46D4FF738)
Destructor (000000E46D4FF738)
<<<<<
Destructor (moved) (000000E46D4FF790)
Parameterized constructor (000000E46D4FF790)
>>>>> ConsumeValueMove
Move constructor (000000E46D4FF738)
Destructor (000000E46D4FF738)
<<<<<
Destructor (moved) (000000E46D4FF790)
Destructor (000000E46D4FF784)
----------

It may be a little long, so let’s summarize it. Excluding Object obj;, we have the following statistics. Since the destructor of a moved object also has less to clean up, I count it separately.

| | Pass by reference | Pass by value then move | Difference |
| --- | --- | --- | --- |
| (Parameterized) Constructor | 1 | 1 | 0 |
| Copy Constructor | 2 | 1 | -1 |
| Move Constructor | 0 | 2 | +2 |
| Destructor | 3 | 2 | -1 |
| Destructor (moved) | 0 | 2 | +2 |

If both kinds of arguments are passed with roughly equal probability, the overhead depends on the relative cost of copy and move. If copying and moving cost the same, then pass by reference is clearly better. You may choose the other only if a move is much cheaper than a copy.

However, if the argument is always (or most of the time) a temporary value, you may need to reconsider your choice. For example, if you use a string as the name of an object, then this string will most likely come from a temporary value.

| | Pass by reference | Pass by value then move | Difference |
| --- | --- | --- | --- |
| (Parameterized) Constructor | 1 | 1 | 0 |
| Copy Constructor | 1 | 0 | -1 |
| Move Constructor | 0 | 1 | +1 |
| Destructor | 2 | 1 | -1 |
| Destructor (moved) | 0 | 1 | +1 |

In this case, passing by value and then moving can be the better choice.
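In a constructor, the two styles compared above might look like the sketch below. All names here (Probe, HolderByRef, HolderByValue, Cost and the two CostFrom... helpers) are hypothetical, and the temporary case assumes a C++17 compiler, where the argument is constructed in place.

```cpp
#include <utility>

// Hypothetical probe type that counts copy and move constructions.
struct Probe
{
    static int copies;
    static int moves;
    int value;

    explicit Probe(int v) : value(v) {}
    Probe(const Probe& other) : value(other.value) { ++copies; }
    Probe(Probe&& other) noexcept : value(other.value) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

// Style 1: const reference parameter, the member is always copied.
class HolderByRef
{
public:
    explicit HolderByRef(const Probe& p) : _p(p) {}

private:
    Probe _p;
};

// Style 2: value parameter, then moved into the member.
class HolderByValue
{
public:
    explicit HolderByValue(Probe p) : _p(std::move(p)) {}

private:
    Probe _p;
};

struct Cost
{
    int copies;
    int moves;
};

template <typename Holder>
Cost CostFromTemporary()
{
    Probe::copies = Probe::moves = 0;
    Holder h{Probe(99)}; // argument is a temporary
    (void)h;
    return { Probe::copies, Probe::moves };
}

template <typename Holder>
Cost CostFromLvalue()
{
    Probe::copies = Probe::moves = 0;
    Probe named(66);
    Holder h{named}; // argument is a named object
    (void)h;
    return { Probe::copies, Probe::moves };
}
```

For a temporary, HolderByRef pays one copy while HolderByValue pays one move. For a named object, HolderByRef still pays one copy, and HolderByValue pays one copy plus one move, which agrees with the two tables.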

Objects in Array

There is one thing we missed: arrays. What about the objects in an array? For this, we can also write a simple test.

BEGIN_SCOPE("Array");
    Object obj[2] { 1 };
    Object* pObj = new Object[2] { 1 };
    obj[1] = Object(33);
    delete[] pObj; // remember this
    pObj = static_cast<Object*>(malloc(sizeof(Object) * 2));
    pObj[0].Print();
    free(pObj); // don't forget this also
END_SCOPE();

By running it, we’ll have the following output.

========== Array
Parameterized constructor (000000C30472FEB8)
Default constructor (000000C30472FEBC)
Parameterized constructor (00000299414B7028)
Default constructor (00000299414B702C)
Parameterized constructor (000000C30472FEA0)
Move assignment operator (000000C30472FEBC)
Destructor (moved) (000000C30472FEA0)
Destructor (00000299414B702C)
Destructor (00000299414B7028)
Data: 997392451
Destructor (000000C30472FEBC)
Destructor (000000C30472FEB8)
----------

We can see that an array calls the default constructor on each element that is not explicitly initialized, and so does the new[] operator. Correspondingly, their destructors are also called. This introduces a problem: you have to provide a default constructor or explicitly initialize every element on creation. Of course there is a workaround, which is using malloc and free. But in this case no constructor runs either, so the memory holds no live object (hence the garbage value printed above; reading it is formally undefined behavior), and you also lose the destructor.
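If you do go the malloc route, constructors and destructors can still be run by hand with placement new and explicit destructor calls. The following is a minimal sketch under my own assumptions: Item and SumInRawStorage are hypothetical names, and Item deliberately has no default constructor.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type without a default constructor.
struct Item
{
    static int alive; // number of currently constructed items
    int value;

    explicit Item(int v) : value(v) { ++alive; }
    ~Item() { --alive; }
};
int Item::alive = 0;

// Builds `count` items in malloc'ed storage, sums their values, then
// destroys them by hand. Returns -1 if the allocation fails.
int SumInRawStorage(int count)
{
    void* raw = std::malloc(sizeof(Item) * static_cast<std::size_t>(count));
    if (raw == nullptr)
    {
        return -1;
    }

    Item* items = static_cast<Item*>(raw);
    for (int i = 0; i < count; ++i)
    {
        new (items + i) Item(i + 1); // placement new runs the constructor in place
    }

    int sum = 0;
    for (int i = 0; i < count; ++i)
    {
        sum += items[i].value;
    }

    for (int i = count - 1; i >= 0; --i)
    {
        items[i].~Item(); // explicit destructor call, in reverse order
    }

    std::free(raw);
    return sum;
}
```

This keeps construction and destruction balanced without requiring a default constructor, at the price of doing the bookkeeping yourself.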


Epilogue

I have long wished to take a deep look at how C++ handles objects, and now this is the day, when I send all my doubts and daemons on their way, …

Anyway, it helped me understand the object lifecycle in C++, and I hope this post can help you too. ᓚᘏᗢ