This summer, I am brushing up my C++ by going through Alex's excellent tutorial at learncpp.com. I recently went through its chapter on Pointers and References, and I cleared a lot of my previous confusion on the topic. Here are some notes I took.
References in C++
int // a normal int type (not a reference)
int& // an lvalue reference to an int object
const int& // a constant lvalue reference
Both T& ref
and T &ref
are valid ways to define a reference but the first one is preferred.
int x {5};
int& ref {x};
ref = 6;
std::cout << x << '\n'; // 6
Modifying a reference modifies the object it refers to.
int x {5};
int& ref {x};
int y {6};
ref = y;
std::cout << x << '\n'; // 6
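You cannot reseat a reference. Here ref = y does not make ref refer to y; it assigns the value of y to x, which ref still refers to.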
const int x {5};
// int& ref {x}; <-- this is not allowed
const int& ref {x}; // this is allowed, and is the preferred way to reference an object unless you explicitly need to modify the value through the reference
const T& can also be initialized with a literal (or another rvalue), for example const int& ref {5}, whereas T& cannot. Note that 5 is a temporary value here, and its lifetime is extended to match the lifetime of the const T& reference bound to it.
Expression / capability | T& | const T& |
Modifiable Lvalue | ✅ | ✅ |
Rvalue | ⛔️ | ✅ |
Constant Lvalue | ⛔️ | ✅ |
Can modify original object | ✅ | ⛔️ |
What the hell is l-value and r-value?
int x { 5 }; // 5 is an rvalue expression
int y { x }; // x is an lvalue expression
const double d{ 1.2 }; // 1.2 is an rvalue expression
const double e { d }; // d is a non-modifiable lvalue expression
- More formal explanation:
- Lvalue expressions evaluate to an identifiable object.
- Rvalue expressions evaluate to a value.
- More straightforward (to me) explanation:
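- An lvalue (think "locator value") is an expression that names an object with a location in memory, such as a variable.
- An rvalue is a plain value, such as a literal or the temporary result of an expression, that has no identifiable location you can refer to.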
Pointers in C++
Dereference operator *
#include <iostream>
int main()
{
int x{ 5 };
std::cout << x << '\n'; // print the value of variable x
std::cout << &x << '\n'; // print the memory address of variable x
std::cout << *(&x) << '\n'; // print the value at the memory address of variable x (parentheses not required, but make it easier to read)
return 0;
}
// 5
// 0027FEA0
// 5
Pointers
A pointer can be declared via T* ptr. A pointer that is declared but not initialized is called a wild pointer. You can initialize a pointer via T* ptr {&x}.
Assignment in pointer
We can use assignment with pointers in two different ways:
- To change what the pointer is pointing at (by assigning the pointer a new address): ptr = &x;
- To change the value being pointed at (by assigning the dereferenced pointer a new value): *ptr = 6;
#include <iostream>
int main()
{
int x{ 5 };
int* ptr{ &x }; // ptr initialized to point at x
std::cout << *ptr << '\n'; // print the value at the address being pointed to (x's address)
int y{ 6 };
ptr = &y; // change ptr to point at y
std::cout << *ptr << '\n'; // print the value at the address being pointed to (y's address)
return 0;
}
&x returns a pointer to x, not the address of x as a literal.
- On a 32-bit architecture, a pointer is 32 bits (4 bytes).
Null Pointer
When you initialize a pointer like int* ptr{}, you get a null pointer.
You can also make an existing pointer null by assigning ptr = nullptr.
You **can, but should avoid,** initializing a null pointer with T* ptr{0}; or T* ptr{NULL};.
You can test whether a pointer is null with if (ptr == nullptr) or just if (!ptr). Note that if (ptr) tests the opposite, that the pointer is non-null.
- Note: Testing that a pointer is not a null pointer does not always guarantee that the pointer is safe to dereference. A dangling pointer to a destroyed object is not automatically set to nullptr. Therefore it’s your responsibility to detect these cases and ensure those pointers are set to nullptr.
Pointers and const
const int x {5};
int* ptr { &x }; // compile error: cannot convert from const int* to int*
You cannot point a normal (non-const) pointer at a const variable.
Const pointers
A const pointer is a pointer whose address cannot be changed after initialization: T* const ptr {&x};. You can still dereference *ptr and modify the original object x, provided x is non-const.
Const pointer to a const value
You can declare one with const T* const ptr { &x }; even if T x is non-const. In this case neither the pointer nor the dereferenced pointer accepts assignment.
WTH? Here is a cheat sheet
Declaration | int x {5} | const int x {5} |
int* ptr {&x} | Pointer and dereferenced pointer accept assignment | Declaration not allowed |
const int* ptr {&x} | Pointer accepts assignment but dereferenced pointer doesn’t | Pointer accepts assignment but dereferenced pointer doesn’t |
int* const ptr {&x} | Dereferenced pointer accepts assignment but pointer doesn’t | Declaration not allowed |
const int* const ptr {&x} | Neither pointer nor dereferenced pointer accepts assignment | Neither pointer nor dereferenced pointer accepts assignment |
Even simpler cheat sheet
Declaration | int x | const int x |
int* ptr = &x | *ptr = ... ✅; ptr = ... ✅ | 🚫 illegal declaration |
const int* ptr = &x | *ptr = ... ❌; ptr = ... ✅ | *ptr = ... ❌; ptr = ... ✅ |
int* const ptr = &x | *ptr = ... ✅; ptr = ... ❌ | 🚫 illegal declaration |
const int* const ptr = &x | *ptr = ... ❌; ptr = ... ❌ | *ptr = ... ❌; ptr = ... ❌ |
- *ptr = ... → assigning to the pointee
- ptr = ... → reassigning the pointer itself
- ✅ = assignment allowed
- ❌ = assignment not allowed
- 🚫 = declaration is invalid (compile error)
Pass by Reference and Pass by Address
Pass by reference
void addOne(int& y) // y is bound to the actual object x
{
++y; // this modifies the actual object x
}
// int x{5};
// addOne(x);
#include <iostream>
void printAddresses(int val, int& ref)
{
std::cout << "The address of the value parameter is: " << &val << '\n';
std::cout << "The address of the reference parameter is: " << &ref << '\n';
}
int main()
{
int x { 5 };
std::cout << "The address of x is: " << &x << '\n';
printAddresses(x, x);
return 0;
}
// The address of x is: 0x7ffd16574de0
// The address of the value parameter is: 0x7ffd16574de4
// The address of the reference parameter is: 0x7ffd16574de0
T funcName(T& x) and T funcName(const T& x) follow pretty much the same rules as T& and const T&: a parameter that is a reference to non-const can only bind to modifiable lvalues, while a parameter that is a reference to const can bind to modifiable lvalues, non-modifiable lvalues, and rvalues. However, T funcName(const T& x) cannot modify the argument through x.
Pass by Address
void addOne(int* y) // y holds the address of the actual object x
{
++(*y); // dereference y to modify the actual object x (++y would only move the local pointer)
}
int x{5};
addOne(&x);
Pass by value? By reference? And by address
Feature | Pass by Value | Pass by Reference | Pass by Address |
Syntax | void foo(int x) | void foo(int& x) | void foo(int* x) |
Argument Type | Copy of the actual value | Alias to the original variable | Pointer to the original variable |
Called with | foo(a) | foo(a) | foo(&a) |
Can modify original? | ❌ No | ✅ Yes | ✅ Yes (via dereferencing) |
Memory Use | More (creates a copy) | Less (no copy) | Less (only pointer is copied) |
Null Safety | Always safe | Always safe | ❌ Must check for nullptr |
Typical Use Cases | When you don’t want to modify input | When you want to modify input | When dealing with dynamic memory / APIs |
Example Call | foo(5); | foo(a); | foo(&a); |
Inside Function Access | Use as-is (x) | Use as-is (x) | Dereference (*x) |
- Note: **Prefer void foo(T* const value) over void foo(T* value).** And yes, you can also write void foo(const T* const value); it follows the same rules as the pointer cheat sheet earlier in these notes.
When to use pass by value vs pass by reference
- Fundamental types and enumerated types are cheap to copy, so they are typically passed by value.
- Class types can be expensive to copy (sometimes significantly so), so they are typically passed by const reference.
Actually got asked this question in a Tesla interview
Pass by reference vs by address
Pass by const reference has a few other advantages over pass by address.
First, because an object being passed by address must have an address, only lvalues can be passed by address (as rvalues don’t have addresses). Pass by const reference is more flexible, as it can accept lvalues and rvalues.
Second, the syntax for pass by reference is natural, as we can just pass in literals or objects. With pass by address, our code ends up littered with ampersands (&) and asterisks (*).
In modern C++, most things that can be done with pass by address are better accomplished through other methods. Follow this common maxim: “Pass by reference when you can, pass by address when you must”.
Pass by address copies the address
// Consider
void foo(int* ptr)
{
ptr = nullptr;
}
int x{ 5};
int* ptr{ &x };
foo(ptr); // pass the pointer itself; foo receives a copy of the address
std::cout << "ptr is " << (ptr ? "non-null\n" : "null\n"); // ptr is non-null
Therefore, reassigning a pointer parameter to another address inside the function does not reassign the caller's original pointer (because foo() receives a copy of the address).
Pass by address by reference
Instead, if you want the function to change the caller's pointer, you can pass the pointer itself by reference:
void bar(int*& ptr)
{
ptr = nullptr;
}
Return by reference and address
Similar to pass by value, return by value also returns a copy of the object:
std::string returnByValue();
#include <iostream>
#include <string>
const std::string& getProgramName() // returns a const reference
{
static const std::string s_programName { "Calculator" }; // has static duration, destroyed at end of program
return s_programName;
}
int main()
{
std::cout << "This program is named " << getProgramName();
return 0;
}
// This program is named Calculator
Because getProgramName()
returns a const reference, when the line return s_programName
is executed, getProgramName()
will return a const reference to s_programName (thus avoiding making a copy). That const reference can then be used by the caller to access the value of s_programName
, which is printed.
**Using return by reference has one major caveat: the programmer must be sure that the object being referenced outlives the function returning the reference.** Otherwise, the reference being returned will be left dangling (referencing an object that has been destroyed), and use of that reference will result in undefined behavior.
const std::string& getProgramName()
{
const std::string programName { "Calculator" }; // now a non-static local variable, destroyed when function ends
return programName;
}
With this change, the previous program has undefined behavior, since programName is destroyed at the end of the function and the returned reference is left dangling.
Lifetime extension doesn’t work across function boundaries
#include <iostream>
const int& returnByConstReference(const int& ref)
{
return ref;
}
int main()
{
// case 1: direct binding
const int& ref1 { 5 }; // extends lifetime
std::cout << ref1 << '\n'; // okay
// case 2: indirect binding
const int& ref2 { returnByConstReference(5) }; // binds to dangling reference
std::cout << ref2 << '\n'; // undefined behavior
return 0;
}
In this case, a temporary object is created to hold value 5, which function parameter ref binds to. The function just returns this reference back to the caller, which then uses the reference to initialize ref2. Because this is not a direct binding to the temporary object (as the reference was bounced through a function), lifetime extension doesn’t apply. This leaves ref2 dangling, and its subsequent use is undefined behavior.
Don’t return non-const static local variables by reference
You are allowed to do so, but it will often result in unintended behavior:
#include <iostream>
#include <string>
const int& getNextId()
{
static int s_x{ 0 }; // note: variable is non-const
++s_x; // generate the next id
return s_x; // and return a reference to it
}
int main()
{
const int& id1 { getNextId() }; // id1 is a reference
const int& id2 { getNextId() }; // id2 is a reference
std::cout << id1 << id2 << '\n'; // prints 22, not 12: id1 and id2 both refer to the same s_x
return 0;
}
Assigning/initializing a normal variable with a returned reference makes a copy
If a function returns a reference, and that reference is used to initialize or assign to a non-reference variable, the return value will be copied (as if it had been returned by value).
#include <iostream>
#include <string>
const int& getNextId()
{
static int s_x{ 0 };
++s_x;
return s_x;
}
int main()
{
const int id1 { getNextId() }; // id1 is a normal variable now and receives a copy of the value returned by reference from getNextId()
const int id2 { getNextId() }; // id2 is a normal variable now and receives a copy of the value returned by reference from getNextId()
std::cout << id1 << id2 << '\n';
return 0;
}
// 12
It’s okay to return reference parameters by reference
#include <iostream>
#include <string>
// Takes two std::string objects, returns the one that comes first alphabetically
const std::string& firstAlphabetical(const std::string& a, const std::string& b)
{
return (a < b) ? a : b; // We can use operator< on std::string to determine which comes first alphabetically
}
int main()
{
std::string hello { "Hello" };
std::string world { "World" };
std::cout << firstAlphabetical(hello, world) << '\n';
return 0;
}
// Hello
It’s okay for an rvalue passed by const reference to be returned by const reference
When an argument for a const reference parameter is an rvalue, it’s still okay to return that parameter by const reference.
This is because rvalues are not destroyed until the end of the full expression in which they are created.
#include <iostream>
#include <string>
const std::string& foo(const std::string& s)
{
return s;
}
std::string getHello()
{
return "Hello"; // implicit conversion to std::string
}
int main()
{
const std::string s{ foo(getHello()) };
std::cout << s;
return 0;
}
The caller can modify values through the reference
This can be very confusing behavior. Hence, IMHO, make your return type const (for example const int&) unless the caller really needs to modify the object.
#include <iostream>
// takes two integers by non-const reference, and returns the greater by reference
int& max(int& x, int& y)
{
return (x > y) ? x : y;
}
int main()
{
int a{ 5 };
int b{ 6 };
max(a, b) = 7; // sets the greater of a or b to 7
std::cout << a << b << '\n'; // 57: b was the greater value, so b is now 7
return 0;
}
Return By Address
Return by address works almost identically to return by reference, except a pointer to an object is returned instead of a reference to an object. Return by address has the same primary caveat as return by reference -- the object being returned by address must outlive the scope of the function returning the address, otherwise the caller will receive a dangling pointer.
The major advantage of return by address over return by reference is that we can have the function return nullptr if there is no valid object to return. For example, let’s say we have a list of students that we want to search. If we find the student we are looking for in the list, we can return a pointer to the object representing the matching student. If we don’t find any students matching, we can return nullptr to indicate a matching student object was not found.
The major disadvantage of return by address is that the caller has to remember to do a nullptr check before dereferencing the return value, otherwise a null pointer dereference may occur and undefined behavior will result. Because of this danger, return by reference should be preferred over return by address unless the ability to return “no object” is needed.
WTH Sheet, again
const std::string& foo(const std::string& s) {
return s;
}
Code | Safe? | Why |
const std::string s{ foo("Hello") }; | ✅ | "Hello" is converted to a temporary std::string. foo() returns a reference to it, and s is copy-constructed from it before the temporary dies. |
const std::string& s{ foo("Hello") }; | ❌ | "Hello" becomes a temporary std::string that is bound to the parameter inside foo. The returned reference points to that temporary, which dies at the end of the full expression → s is dangling. |
std::string temp = "Hello"; const std::string& ref = foo(temp); | ✅ | temp is a named variable. foo(temp) returns a reference to it, which is safely bound to ref: no temporary, no lifetime issue. |
const std::string& ref = "Hello"; | ✅ | "Hello" is converted to a temporary std::string, but since it is bound directly to a const reference, its lifetime is extended, so this is valid. |
In short, const std::string s{ foo("Hello") }; is good because it makes a copy before the temporary dies, while const std::string& s{ foo("Hello") }; is bad because the temporary dies at the end of the full expression. The difference is the &, if you still couldn’t tell.