4.3 — Object sizes and the sizeof operator

Object sizes

As you learned in lesson 4.1 -- Introduction to fundamental data types, memory on modern machines is typically organized into byte-sized units, with each byte of memory having a unique address. Up to this point, it has been useful to think of memory as a bunch of cubbyholes or mailboxes where we can put and retrieve information, and variables as names for accessing those cubbyholes or mailboxes.

However, this analogy is not quite correct in one regard -- most objects actually take up more than 1 byte of memory. A single object may use 1, 2, 4, 8, or even more consecutive memory addresses. The amount of memory that an object uses is based on its data type.

Because we typically access memory through variable names (and not directly via memory addresses), the compiler is able to hide the details of how many bytes a given object uses from us. When we access some variable x in our source code, the compiler knows how many bytes of data need to be retrieved (based on the type of variable x), and will output the appropriate machine language code to handle that detail for us.

Even so, there are several reasons it is useful to know how much memory an object uses.

First, the more memory an object uses, the more information it can hold.

A single bit can hold 2 possible values, a 0, or a 1:

bit 0
0
1

2 bits can hold 4 possible values:

bit 0 bit 1
0 0
0 1
1 0
1 1

3 bits can hold 8 possible values:

bit 0 bit 1 bit 2
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

To generalize, an object with n bits (where n is an integer) can hold 2^n (2 to the power of n) unique values. Therefore, with an 8-bit byte, a byte-sized object can hold 2^8 (256) different values. An object that uses 2 bytes can hold 2^16 (65536) different values!

Thus, the size of the object puts a limit on the number of unique values it can store -- objects that utilize more bytes can store a larger number of unique values. We will explore this further when we talk more about integers.

Second, computers have a finite amount of free memory. Every time we define an object, a small portion of that free memory is used for as long as the object is in existence. Because modern computers have a lot of memory, this impact is usually negligible. However, for programs that need a large number of objects or a large amount of data (e.g. a game that is rendering millions of polygons), the difference between using 1-byte and 8-byte objects can be significant.

Key insight

New programmers often focus too much on optimizing their code to use as little memory as possible. In most cases, this makes a negligible difference. Focus on writing maintainable code, and optimize only when and where the benefit will be substantive.

Fundamental data type sizes

The obvious next question is “how much memory do objects of a given data type use?”. Perhaps surprisingly, the C++ standard does not define the exact size (in bits) of any of the fundamental types.

Instead, the standard says the following:

  • An object must occupy at least 1 byte (so that each object has a distinct memory address).
  • A byte must be at least 8 bits.
  • The integral types char, short, int, long, and long long have a minimum size of 8, 16, 16, 32, and 64 bits respectively.
  • char and char8_t are exactly 1 byte (at least 8 bits).

Nomenclature

When we talk about the size of a type, we really mean the size of an instantiated object of that type.

In this tutorial series, we will present a simplified view, by making some reasonable assumptions that are generally true for modern architectures:

  • A byte is 8 bits.
  • Memory is byte addressable (we can access every byte of memory independently).
  • Floating point support is IEEE-754 compliant.
  • We are on a 32-bit or 64-bit architecture.

Given the above assumptions, we can reasonably state the following:

Category         Type             Minimum Size       Typical Size
Boolean          bool             1 byte             1 byte
Character        char             1 byte (exactly)   1 byte
                 wchar_t          1 byte             2 or 4 bytes
                 char8_t          1 byte             1 byte
                 char16_t         2 bytes            2 bytes
                 char32_t         4 bytes            4 bytes
Integral         short            2 bytes            2 bytes
                 int              2 bytes            4 bytes
                 long             4 bytes            4 or 8 bytes
                 long long        8 bytes            8 bytes
Floating point   float            4 bytes            4 bytes
                 double           8 bytes            8 bytes
                 long double      8 bytes            8, 12, or 16 bytes
Pointer          std::nullptr_t   4 bytes            4 or 8 bytes

Tip

For maximum portability, you shouldn’t assume that objects are larger than the specified minimum size.

Alternatively, if you want to assume that a type has some non-minimum size (e.g. that an int is at least 4 bytes), you can use static_assert to have the compiler fail a build if it is compiled on an architecture where this assumption is not true. We cover how to do this in lesson 9.6 -- Assert and static_assert.

Related content

You can find more information about what the C++ standard says about the minimum size of various types here.

The sizeof operator

In order to determine the size of data types on a particular machine, C++ provides an operator named sizeof. The sizeof operator is a unary operator that takes either a type or a variable, and returns the size of an object of that type (in bytes). You can compile and run the following program to find out how large some of your data types are:

#include <iomanip> // for std::setw (which sets the width of the subsequent output)
#include <iostream>
#include <climits> // for CHAR_BIT

int main()
{
    std::cout << "A byte is " << CHAR_BIT << " bits\n\n";

    std::cout << std::left; // left justify output

    std::cout << std::setw(16) << "bool:" << sizeof(bool) << " bytes\n";
    std::cout << std::setw(16) << "char:" << sizeof(char) << " bytes\n";
    std::cout << std::setw(16) << "short:" << sizeof(short) << " bytes\n";
    std::cout << std::setw(16) << "int:" << sizeof(int) << " bytes\n";
    std::cout << std::setw(16) << "long:" << sizeof(long) << " bytes\n";
    std::cout << std::setw(16) << "long long:" << sizeof(long long) << " bytes\n";
    std::cout << std::setw(16) << "float:" << sizeof(float) << " bytes\n";
    std::cout << std::setw(16) << "double:" << sizeof(double) << " bytes\n";
    std::cout << std::setw(16) << "long double:" << sizeof(long double) << " bytes\n";

    return 0;
}

Here is the output from the author’s machine (the first line, which reports the number of bits in a byte, is not shown):

bool:           1 bytes
char:           1 bytes
short:          2 bytes
int:            4 bytes
long:           4 bytes
long long:      8 bytes
float:          4 bytes
double:         8 bytes
long double:    8 bytes

Your results may vary based on compiler, computer architecture, OS, compilation settings (32-bit vs 64-bit), etc…

Trying to use sizeof on an incomplete type (such as void) will result in a compilation error.

For gcc users

If you have not disabled compiler extensions, gcc allows sizeof(void) to return 1 instead of producing a diagnostic (see the gcc documentation for the -Wpointer-arith option). We show how to disable compiler extensions in lesson 0.10 -- Configuring your compiler: Compiler extensions.

You can also use the sizeof operator on a variable name:

#include <iostream>

int main()
{
    int x{};
    std::cout << "x is " << sizeof(x) << " bytes\n";

    return 0;
}

On a machine where an int is 4 bytes, this prints:

x is 4 bytes

For advanced readers

sizeof does not include dynamically allocated memory used by an object. We discuss dynamic memory allocation in a future lesson.

Fundamental data type performance

On modern machines, objects of the fundamental data types are fast, so performance while using or copying these types should generally not be a concern.

As an aside…

You might assume that types that use less memory would be faster than types that use more memory. This is not always true. CPUs are often optimized to process data of a certain size (e.g. 32 bits), and types that match that size may be processed quicker. On such a machine, a 32-bit int could be faster than a 16-bit short or an 8-bit char.
