C++23

C++23 is feature-complete and all but finalized, so let’s explore the new features from a data science point of view. This is not as large a release as C++20 or C++11, but it is still hefty, with lots of nice new features; it is larger than C++17.

Some of the major new features build on themes from C++20: vastly expanded ranges and views, more constexpr/consteval, module support for the standard library, std::print to supplement std::format, and std::generator to support coroutines. There are also several general features: lambdas and classes get along much better now, and there are new preprocessor additions (!), a stacktrace library, flat maps/sets, new std::optional monadics, deducing this, and many smaller additions.


Posts in the C++ series: 11 14 17 20 23

Updates on C++20

Like C++14, there were quite a few updates to features added in the previous standard. Let’s start by revisiting my major points from C++20, and see how they were updated. The only one without an update was concepts.

Standard lib as a module

Module support in C++ landed in C++20, but there was a glaring omission: the standard library! That’s now been corrected.

Classic C++

#include <vector>
#include <iostream>
...

C++23

import std;

This is fantastic, and shows the power of modules. You don’t need to list a bunch of includes, just this one import, and now the compiler can intelligently select what is needed. It’s so clean!

More Constexpr all the things!

The largest new addition is if consteval, which further expands on consteval from the last release.

There are also more constexpr additions: std::optional, std::variant, and std::unique_ptr (!), along with some of the functions in cmath and cstdlib. This really improves compile-time programming in C++23 over its predecessors.

Ranges redux

More additions to ranges! This is a massive usability improvement. It’s really hard to do much with ranges without a zip, and now it’s there!

auto num = std::vector{1, 2, 3};
auto letters = std::array{'A', 'B', 'C'};

for (const auto [n, l] : std::views::zip(num, letters)) {
    std::println("{} {}", n, l);
}

The additions to std::ranges::* are:

  • to: Convert ranges to containers (including nesting!)
  • iota: Like std::iota
  • shift_left/shift_right: like std::shift_left / std::shift_right
  • fold*: Folding algorithms
  • find_last*: Find last element that matches
  • starts_with/ends_with: Check beginning/end of range with another range
  • contains/contains_subrange: If a range contains an element or another range

And, std::views::*:

  • zip*: Combine ranges (shown above)
  • zip_transform: Feed zip output to a function
  • chunk / chunk_by: Divides a range into chunks, either of a fixed size (the last chunk can be shorter; good for splitting up work, for example) or wherever a predicate on adjacent elements fails
  • adjacent: Make a sliding window (overlapping chunks)
  • adjacent_transform: Apply a function on the adjacent window
  • slide: Runtime version of adjacent, sliding window
  • cartesian_product: Compute a Cartesian product
  • join_with: Flatten a range of ranges into a single range, inserting a delimiter between the pieces
  • repeat: Repeats a value (n times or infinitely)

There still are things missing that were in ranges v3, but a lot of them can be easily implemented using the next feature in this list!

Coroutines - std::generator

Making a generator is one of the first things users likely think of when seeing coroutines in C++20, and now it’s much easier with std::generator (which is an input range). For example, we could create an infinite range, and then use take to get the first 10 items:

import std;

std::generator<int> range(int start = 0) {
    while(true) {
        co_yield start++;
    }
}

int main() {
    for (int i : range() | std::views::take(10)) {
        ...
    }
}

Formatting now has std::print(ln)

This was skipped over for C++20, but now it is here: std::print and std::println! We can finally say goodbye to std::cout << std::format(...) << std::endl.

You can also use std::format and friends on ranges.


New features

Now the major new features of C++23 that don’t expand on C++20.

MDSpan

C++20 deprecated using the comma operator inside subscript expressions, and now the comma is back there as a multidimensional subscript! item[1,2] is now valid C++ that passes two arguments to operator[]. This is used by the brand new std::mdspan, which gives you a multidimensional view of an existing data structure. This means we now have a common vocabulary for passing and using views of multidimensional arrays.

std::vector v{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
auto view = std::mdspan(v.data(), 2, 6);
// view.extent(0) == 2
// view.extent(1) == 6
// view[0,0] == 1

This is only available as a non-owning view (joining std::string_view from C++17 and std::span from C++20) for now. Also, sub-views are not yet included (a likely addition for C++26).

Deducing this

One of the most powerful tiny additions is “deducing this”. This allows you to explicitly take a “self” argument (marked by the this keyword preceding it).

It looks like this:

Classic C++

struct Foo {
    void bar() &;
    void bar() const&;
    void bar() &&;
    void bar() const &&;
};

C++23

struct Foo {
    void bar(this Foo& self);
    void bar(this Foo const& self);
    void bar(this Foo&& self);
    void bar(this Foo const&& self);
};

Why would you want to write it that way? Well, besides being more consistent with the way you specify other parameters, and looking more like languages like Python, there’s a huge reason to do this: you can template on this parameter! For example, the above four overloads can be written as:

struct Foo {
    template<typename Self>
    void bar(this Self&& self) {...}
};

Now you no longer need to make separate const and non-const versions of your methods!

Another use of this is for CRTP. One of the first advanced C++ tricks you might learn is CRTP, which allows you to implement compile-time polymorphism by requiring a subclass to put itself into a template parameter. Deducing this allows you to simply write this as a normal class, since this is deduced as the child type!

Here’s an example, taken from CLI11’s Options classes, with shortened names for clarity:

Classic C++

template<typename CRTP>
struct OptionBase {
    /// [snip]
    CRTP* take_last() {
        auto *self = static_cast<CRTP *>(this);
        self->mo_policy(MOPolicy::TakeLast);
        return self;
    }
};

struct OptionDefaults : OptionBase<OptionDefaults> {
    /// [snip]
};

// Each child passes itself as the template argument
struct Option : OptionBase<Option> {
    /// [snip]
};

C++23

struct OptionBase {
    /// [snip]
    template<typename Self>
    auto take_last(this Self&& self) {
        // self is a reference to the child type, not a pointer
        self.mo_policy(MOPolicy::TakeLast);
        return &self;  // pointer to the child, as in the classic version
    }
};

struct OptionDefaults : OptionBase {
    /// [snip]
};

struct Option : OptionBase {
    /// [snip]
};

Simplifying the base class (no longer a template class), the methods, and the child classes (no longer having to template on themselves) is a huge win!

There are other things that can be done with this as well, like recursive lambdas. By taking a this auto self parameter, you can call self(...) inside the lambda. To read more, see the Deducing this article or the proposal.

std::expected and monadics

C++ is well known for “pay only for what you use”. The best known exception to that rule is exception handling; you pay for it (in binary size, etc.) even if you don’t use it. A different take on error handling is present in Rust: you make the error part of a variant-like return type. This both avoids traditional exception handling and forces the user to explicitly handle all expected errors. Similar to how std::optional is a specialized variant holding either a value or nothing, std::expected is a specialized variant holding either a value or an error. C++ doesn’t have as many syntax features to help you use this, but it’s still a welcome addition. And we do have the monadic methods .and_then, .or_else, .transform, and .transform_error! These allow very elegant chaining of expected return types.

That’s not the only place monadics have been added; std::optional gets them too.

std::optional<int> opt = /* something */;
opt.transform([](int n){...}); // Processes values, skips if nullopt
opt.and_then([](int n){...});  // Similar, but the function returns an optional
opt.or_else([]{...});          // Replaces an empty value; the function returns an optional

Combined with the existing .value_or, these provide a way to work with optionals and expected quite elegantly.

Smaller updates

Lambda functions

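We can now drop the () from a lambda that doesn’t take any parameters even when it has specifiers such as mutable or a trailing return type (previously the () could only be omitted on a bare lambda). Lambdas can be declared static if they don’t capture anything, which avoids passing the implicit object pointer and so saves a register.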

Preprocessor updates

We finally have #elifdef and #elifndef, which you would rather expect since we have #ifdef. #warning is now officially part of the language.
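As a small sketch, here is how #elifdef shortens a typical configuration chain. HYPOTHETICAL_USE_GPU and HYPOTHETICAL_USE_THREADS are made-up macros that a build system might define; neither is defined here, so the last branch is taken.

```cpp
#include <string_view>

// Select a backend name from (hypothetical) configuration macros.
#ifdef HYPOTHETICAL_USE_GPU
constexpr std::string_view backend = "gpu";
#elifdef HYPOTHETICAL_USE_THREADS  // before C++23: #elif defined(HYPOTHETICAL_USE_THREADS)
constexpr std::string_view backend = "threads";
#else
constexpr std::string_view backend = "serial";
#endif
```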

Range based for

This used to be buggy:

for(auto&& val: foo().bar()) {...}

Here, foo returns a temporary object and bar returns a reference to one of its members. The temporary was destroyed before the body of the loop ran, leaving the reference dangling. Now, the lifetime of temporaries created on the right of the colon is guaranteed to last through the body of the loop.

Contains

There are now member functions for std::string and std::string_view to check whether a string contains a substring or a single character! This tiny thing is a staple in most other languages (it even has a keyword, in, in Python) and was sorely lacking in C++. There is a matching new ranges feature for arbitrary ranges, as well. This complements the .contains that associative containers got in C++20.

Even smaller things

Here are a few too small to write up much about:

  • Literal suffix for std::size_t (uz) and the corresponding signed type (z)
  • auto(x) / auto{x}, which do a decay copy (and optimize to nothing on prvalues) (nice post)
  • Class template argument deduction improvements for inheritance (nice post)
  • operator() and operator[] can be static
  • [[assume]] attribute to tell the compiler that an (unevaluated) expression can be assumed true
  • std::to_underlying to convert an enum to its underlying type
  • std::unreachable to mark unreachable code

Other new libraries

We’ve already covered <print>, <expected>, and <mdspan> above.

Stacktraces

C++ now has a built-in library to help with stacktraces, <stacktrace>. For example:

std::cout << std::stacktrace::current() << '\n';

Better support for exceptions is planned for C++26.

Flat maps and sets

There are new flat (dense storage) versions of maps and sets. These keep their elements sorted in contiguous, vector-like storage, which makes lookup and iteration cache-friendly at the expense of slower insertion and removal. In general, these are probably better if you don’t have to modify them much.

Final words

C++23 is a great addition to the language. C++ is still getting Rust-like features (like std::expected), better ranges, and features that deeply impact what you can do (deducing this). Now we need to watch the various compilers to see when we can start using these features.


I normally post reports from the meetings here. But with COVID, this development cycle has been very strange. Here’s what I could find:


C++26

The first C++26 meeting was held hybrid:

A few exciting things are already adopted for C++26, including _ as a placeholder, work on making string formatting support constexpr, a type-erased callable reference, 100 more constexpr functions, and even constexpr sorting (including ranges-based sorts).

C++26 looks to be focused on Contracts, Reflection (maybe?), and


The next ++?

There also is a lot of work going into making a successor language to C++. It’s unclear what that would be, with several groups working on several different languages, like:

  • Val: making a safe version of C++
  • Carbon: A cleanup project by Google, but interop with C++ will be hard due to removed features like constructors.
  • Cpp2: A better C++, 50x safer and 10x simpler (target goal) from Herb Sutter

What makes these different from some other languages (like D, Go, and Rust) is that they are designed to interface smoothly with C++. A great article on these can be found here.


Further reading

