4.1 — Introduction to fundamental data types

Bits, bytes, and memory addressing

In lesson 1.3 -- Introduction to objects and variables, we talked about the fact that variables are names for a piece of memory that can be used to store information. To recap briefly, computers have random access memory (RAM) that is available for programs to use. When a variable is defined, a piece of that memory is set aside for that variable.

The smallest unit of memory is a binary digit (also called a bit), which can hold a value of 0 or 1. You can think of a bit as being like a traditional light switch -- either the light is off (0), or it is on (1). There is no in-between. If you were to look at a random segment of memory, all you would see is …011010100101010… or some combination thereof.

Memory is organized into sequential units called memory addresses (or addresses for short). Similar to how a street address can be used to find a given house on a street, the memory address allows us to find and access the contents of memory at a particular location.

Perhaps surprisingly, in modern computer architectures, each bit does not get its own unique memory address. This is because the number of memory addresses is limited, and the need to access data bit-by-bit is rare. Instead, each memory address holds 1 byte of data. A byte is a group of bits that are operated on as a unit. The modern standard is that a byte is composed of 8 sequential bits.

Key insight

In C++, we typically work with “byte-sized” chunks of data.

The following picture shows some sequential memory addresses, along with the corresponding byte of data:

Memory Addressing

As an aside…

Some older or non-standard machines may have bytes of a different size (from 1 to 48 bits) -- however, we generally need not worry about these, as the modern de-facto standard is that a byte is 8 bits. For these tutorials, we’ll assume a byte is 8 bits.

Data types

Because all data on a computer is just a sequence of bits, we use a data type (often called a type for short) to tell the compiler how to interpret the contents of memory in some meaningful way. You have already seen one example of a data type: the integer. When we declare a variable as an integer, we are telling the compiler “the piece of memory that this variable uses is going to be interpreted as an integer value”.

When you give an object a value, the compiler and CPU take care of encoding your value into the appropriate sequence of bits for that data type, which are then stored in memory (remember: memory can only store bits). For example, if you assign an integer object the value 65, that value is converted to the sequence of bits 0100 0001 and stored in the memory assigned to the object.

Conversely, when the object is evaluated to produce a value, that sequence of bits is reconstituted back into the original value: 0100 0001 is converted back into the value 65.

Fortunately, the compiler and CPU do all the hard work here, so you generally don’t need to worry about how values get converted into bit sequences and back.

All you need to do is pick a data type for your object that best matches your desired use.

Fundamental data types

The C++ language comes with many predefined data types available for your use. The most basic of these types are called the fundamental data types (informally sometimes called basic types or primitive types).

Here is a list of the fundamental data types, some of which you have already seen:

Types | Category | Meaning | Example
float, double, long double | Floating Point | a number with a fractional part | 3.14159
bool | Integral (Boolean) | true or false | true
char, wchar_t, char8_t (C++20), char16_t (C++11), char32_t (C++11) | Integral (Character) | a single character of text | 'c'
short int, int, long int, long long int (C++11) | Integral (Integer) | positive and negative whole numbers, including 0 | 64
std::nullptr_t (C++11) | Null Pointer | a null pointer | nullptr
void | Void | no type | n/a

This chapter is dedicated to exploring these fundamental data types in detail (except std::nullptr_t, which we’ll discuss when we talk about pointers).

Integer vs integral types

In mathematics, an “integer” is a number with no decimal or fractional part, including negative and positive numbers and zero. The term “integral” has several different meanings, but in the context of C++ is used to mean “like an integer”.

The C++ standard defines the following terms:

  • The standard integer types are short, int, long, long long (including their signed and unsigned variants).
  • The integral types are bool, the various char types, and the standard integer types.

All integral types are stored in memory as integer values, but only the standard integer types will display as an integer value when output. We’ll discuss what bool and the char types do when output in their respective lessons.

The C++ standard also explicitly notes that “integer types” is a synonym for “integral types”. However, conventionally, “integer types” is more often used as shorthand for the “standard integer types” instead.

Also note that the term “integral types” only includes fundamental types. This means non-fundamental types (such as enum and enum class) are not integral types, even when they are stored as an integer (and in the case of enum, displayed as one too).

Other sets of types

C++ contains three sets of types.

The first two are built into the language itself (and do not require the inclusion of a header to use):

  • The “fundamental data types” provide the most basic and essential data types.
  • The “compound data types” provide more complex data types and allow for the creation of custom (user-defined) types. We cover these in lesson 12.1 -- Introduction to compound data types.

The distinction between the fundamental and compound types isn’t all that interesting or relevant, so it’s generally fine to consider them as a single set of types.

The third (and largest) set of types is provided by the C++ standard library. Because the standard library is included in all C++ distributions, these types are broadly available and have been standardized for compatibility. Use of the types in the standard library requires the inclusion of the appropriate header and linking in the standard library.

Nomenclature

The term “built-in type” is most often used as a synonym for the fundamental data types. However, Stroustrup and others use the term to mean both the fundamental and compound data types (both of which are built into the core language). Since this term isn’t well-defined, we recommend avoiding it.

A notable omission from the table of fundamental types above is a data type to handle strings (a sequence of characters that is typically used to represent text). This is because in modern C++, strings are part of the standard library. Although we’ll be focused on fundamental types in this chapter, because basic string usage is straightforward and useful, we’ll introduce strings in the next chapter (in lesson 5.7 -- Introduction to std::string).

The _t suffix

Many of the types defined in newer versions of C++ (e.g. std::nullptr_t) use a _t suffix. This suffix means “type”, and it’s a common nomenclature applied to modern types.

If you see something with a _t suffix, it’s probably a type. But many types don’t have a _t suffix, so this isn’t consistently applied.
