Awesome Unstable Rust Features

Ethan Brierley published on
20 min, 3956 words

Tags: Rust

Introduction

This article describes several unstable Rust compiler features. It is intended to explain the basics of these features without diving into too much detail.

What is unstable Rust?

Rust releases come in three channels: stable, beta and nightly.

Not only does the nightly compiler come out every day, but it is also the only compiler that allows you to unlock unstable Rust features.

This article discusses unstable compiler features. Unstable library features are another topic for another day.

Why use unstable features

Unstable Rust can enable you to write APIs that you could not express in stable Rust. Unstable features are used within both the compiler and standard library for that very reason.

Using unstable features always comes with some risk. They can often behave in unexpected ways, sometimes even breaking Rust's memory safety guarantees and resulting in undefined behaviour. Parts of a feature can be well developed while other parts are left undeveloped.

It is not uncommon for a nightly compiler using unstable features to hit an "internal compiler error", often called an ICE. This happens when the compiler panics during compilation. This could be due to data and queries becoming malformed by incomplete features, or even just hitting a todo! on part of a feature that has not been worked on yet. If you run into an ICE, it is often helpful to check whether the issue is already known and, if it is not, to report it to the bug tracker.

Rust provides no guarantee that it will continue to support its unstable features into the future. As Rust developers, we are spoilt with excellent backwards compatibility and stability. When unstable features are enabled, all of those guarantees are thrown out. What works today may work very differently tomorrow.

I decided to look into unstable features not because I need them to solve a particular problem. I sought them out because I thought they were fun. Using unstable features for me was an interesting way of getting more involved in the development process of the language itself.

A comprehensive list of unstable features can be found in the compiler's source code.

Enabling unstable features

To start using unstable features, the first thing you'll need to do is install a nightly toolchain by running the command:

rustup toolchain install nightly

To use the nightly toolchain, run commands with the +nightly modifier.

<rust-command> +nightly <args>

For example:

cargo +nightly run

Alternatively, you can change your default compiler to nightly so that you won't need to use the +nightly modifier. I've often done this as I haven't found the nightly compiler to be too unstable, even for my projects that also compile fine on stable.

rustup default nightly

Once you are using the nightly compiler you can just start using unstable features. Let's give it a go.

fn main() {
    let my_box = box 5;
}

Which results in this compiler error:

error[E0658]: box expression syntax is experimental; you can call `Box::new` instead
 --> src/main.rs:2:18
  |
2 |     let my_box = box 5;
  |                  ^^^^^
  |
  = note: see issue #49733 <https://github.com/rust-lang/rust/issues/49733> for more information
  = help: add `#![feature(box_syntax)]` to the crate attributes to enable

As is so often the case in Rust, the help message tells us exactly what we need to do. We need to enable the feature using #![feature(box_syntax)].

#![feature(box_syntax)]
fn main() {
    let my_box = box 5;
}

All unstable features need to be enabled with #![feature(..)] before they can be used. If you forget, the compiler is often able to point you in the right direction; however, this is not always the case.

Now let's get started talking about some of the features themselves. I've placed the names of the features that you need to enable in code blocks in the headings for each feature while omitting them from the code snippets to keep them concise.

Control flow, patterns and blocks

destructuring_assignment

It is very common in Rust to destructure a type when binding it to a definition. This is most commonly done with a let binding.

// Create two "variables", one for x, one for y 
let Point { x, y } = Point::random();

Traditionally this pattern has only been possible when instantiating a new definition. destructuring_assignment extends this to work when mutating values. In other words, we can use destructuring without using let.

let (mut x, mut y) = (0, 0);

Point { x, y } = Point::random();
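A slightly fuller sketch of the same idea follows. The `Point` type and its deterministic `Point::new` constructor are assumptions made for illustration (standing in for `Point::random`), and, as in the rest of this article, the feature attribute is left out of the snippet. As far as I know, destructuring assignment has since been stabilised (Rust 1.59), so a recent stable toolchain should accept this as written.

```rust
// Hypothetical type for illustration; `Point::new` stands in for the
// article's `Point::random` so that the result is predictable.
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    fn new(x: i32, y: i32) -> Self {
        Point { x, y }
    }
}

fn reassign() -> (i32, i32) {
    let (mut x, mut y) = (0, 0);

    // Destructure into the existing variables, no `let` required.
    Point { x, y } = Point::new(3, 4);

    // Tuples work as well, which gives a compact swap.
    (x, y) = (y, x);

    (x, y)
}
```

Calling `reassign()` first overwrites `x` and `y` from the struct and then swaps them with a tuple assignment.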

Early return from any block with label_break_value

One of the less well-known Rust features is the fact that loops can break with a value. Like many of the constructs in Rust, loops are not just statements, but expressions.

// Keep asking the user to input a number until they give us a valid one
let number: u8 = loop {
    if let Ok(n) = input().parse() {
        break n;
    } else {
        println!("Invalid number, please input a valid number");
    }
};

label_break_value extends this to work on any labelled block, not just loops. This acts as a kind of early return that works on any block of code, not just function bodies.

To label a block you use a syntax similar to lifetimes.

'block: {
    // This block is now labelled "block"
}

It is already possible to label loops much in the same way.

We can put the same label on our break to early return from that block.

let number = 'block: {
    if s.is_empty() {
        break 'block 0; // Early return from block
    }
    s.parse().unwrap()
};
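Wrapped in a function, the labelled block can be exercised directly. This is a minimal sketch: the function name `parse_or_zero` and the `-1` fallback for unparsable input are assumptions made for the example, not part of the feature. To my knowledge label_break_value has since been stabilised (Rust 1.65), so no feature attribute should be needed on a recent toolchain.

```rust
// Hypothetical helper: returns 0 for empty input, the parsed number
// otherwise, and -1 (an arbitrary fallback) when parsing fails.
fn parse_or_zero(s: &str) -> i32 {
    let number = 'block: {
        if s.is_empty() {
            // Leave the labelled block early; the block evaluates to 0.
            break 'block 0;
        }
        // Otherwise the block evaluates to its final expression.
        s.parse::<i32>().unwrap_or(-1)
    };
    number
}
```

Both the `break 'block 0` and the final expression must have the same type, just as the arms of an `if`/`else` must.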

This feature is not equivalent to goto. It does not have the same damaging effects as goto as it only goes forward and breaks from a block.

Inlining the power of the ? operator using try_blocks

The edition guide uses this example to illustrate how the question mark operator works:

fn read_username_from_file() -> Result<String, io::Error> {
    let f = File::open("username.txt");

    let mut f = match f {
        Ok(file) => file,
        Err(e) => return Err(e),
    };

    let mut s = String::new();

    match f.read_to_string(&mut s) {
        Ok(_) => Ok(s),
        Err(e) => Err(e),
    }
}

Simplifying using the ? operator, results in this equivalent code:

fn read_username_from_file() -> Result<String, io::Error> {
    let mut f = File::open("username.txt")?;
    let mut s = String::new();

    f.read_to_string(&mut s)?;

    Ok(s)
}

? is used in the context of functions to early return an Err if one is encountered. try_blocks unlock the same power but for any block of code rather than just functions. With try_blocks we can inline our read_username_from_file function.

try_blocks relate to ? in the same way that label_break_value relates to return. The RFC for try_blocks mentions label_break_value as a potential way of desugaring try_blocks.

Let's rewrite our read_username_from_file function as a simple let binding with a try block.

let read_username_from_file: Result<String, io::Error> = try {
    let mut f = File::open("username.txt")?;
    let mut s = String::new();

    f.read_to_string(&mut s)?;

    s // the final expression of a try block is wrapped in Ok automatically
};

I love this kind of thing, especially for smaller expressions that can be made easier to read when not extracted into functions.
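For comparison, a similar scoping of ? can be approximated on stable Rust with an immediately invoked closure. This is a different technique from try_blocks, not a desugaring of it, and the `sum_or_zero` function below is a made-up example rather than anything from the RFC.

```rust
use std::num::ParseIntError;

// Hypothetical example: add two numbers given as strings, falling back
// to 0 if either fails to parse. The surrounding function does not
// return a Result, so `?` could not be used in it directly.
fn sum_or_zero(a: &str, b: &str) -> i32 {
    // The closure gives `?` a scope to return from, playing the role
    // that `try { .. }` plays on nightly. Note the explicit `Ok`, which
    // a real try block would add for us.
    let sum = (|| -> Result<i32, ParseIntError> {
        Ok(a.parse::<i32>()? + b.parse::<i32>()?)
    })();

    sum.unwrap_or(0)
}
```

The closure version works, but it needs an explicit return type and `Ok` wrapping, and `return`, `break` and `continue` inside it no longer refer to the enclosing function or loop. A try block avoids those drawbacks.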

inline_const

Currently, the way to get a value to be evaluated at compile time is by defining a constant.

const PI_APPROX: f64 = 22.0 / 7.0;

fn main() {
    let value = func(PI_APPROX);
}

With inline_const we can do the same as an anonymous expression.

fn main() {
    let value = func(const { 22.0 / 7.0 });
}

With this simple example, the const block is almost certainly not necessary, due to compiler optimisations such as constant propagation. With more complex constants, however, it may be helpful to be explicit with the block.

This feature also allows for these blocks to be used in pattern position. match x { 1 + 3 => {} } results in a syntax error while match x { const { 1 + 3 } => {} } does not.
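Here is a small, self-contained sketch of a const block in expression position. The `circle_area` function is a hypothetical stand-in for `func`. As far as I know, const blocks in expression position have since been stabilised (Rust 1.79), while the pattern-position form has not, so this example sticks to expressions.

```rust
// Hypothetical function for illustration: area of a circle using the
// article's 22/7 approximation of pi.
fn circle_area(radius: f64) -> f64 {
    // The block is evaluated at compile time, like a `const` item,
    // but without needing a name.
    let pi_approx: f64 = const { 22.0 / 7.0 };
    pi_approx * radius * radius
}
```

Because the block is a constant context, an expression that cannot be evaluated at compile time (a call to a non-const function, say) is rejected by the compiler rather than silently run at runtime.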

if_let_guard

This feature extends the if guards that you can use on match arms so that they can also use if let.

let_chains

Currently, if let and while let expressions cannot be chained with &&. This feature adds that support.

Traits

associated_type_bounds

Let's take this stable Rust function:

fn fizzbuzz() -> impl Iterator<Item = String> {
    (1..).map(|val| match (val % 3, val % 5) {
        (0, 0) => "FizzBuzz".to_string(),
        (0, _) => "Fizz".to_string(),
        (_, 0) => "Buzz".to_string(),
        (_, _) => val.to_string(),
    })
}

With the associated_type_bounds feature we can use an anonymous type in this context:

fn fizzbuzz() -> impl Iterator<Item: Display> { ... }
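Filling in the elided body gives a version that can actually be run. The `first_n` helper is an assumption added only to make the output easy to inspect. To my knowledge this syntax has since been stabilised (Rust 1.79), so the snippet should compile without a feature attribute on a recent toolchain.

```rust
use std::fmt::Display;

// The caller only learns that each item can be displayed; the fact
// that it is a `String` is hidden behind the anonymous type.
fn fizzbuzz() -> impl Iterator<Item: Display> {
    (1..).map(|val: u32| match (val % 3, val % 5) {
        (0, 0) => "FizzBuzz".to_string(),
        (0, _) => "Fizz".to_string(),
        (_, 0) => "Buzz".to_string(),
        (_, _) => val.to_string(),
    })
}

// Hypothetical helper: render the first `n` items through `Display`,
// which is the only capability the signature promises.
fn first_n(n: usize) -> Vec<String> {
    fizzbuzz().take(n).map(|item| item.to_string()).collect()
}
```

Hiding the item type leaves room to change it later, for example to `Cow<'static, str>`, without breaking callers.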

Let's take a look at this horribly repetitive type signature:

fn flatten_twice<T>(iter: T) -> Flatten<Flatten<T>>
where
    T: Iterator,
    <T as Iterator>::Item: IntoIterator,
    <<T as Iterator>::Item as IntoIterator>::Item: IntoIterator,
{
    iter.flatten().flatten()
}

With this feature, we can write it simply as:

fn flatten_twice<T>(iter: T) -> Flatten<Flatten<T>>
where
    T: Iterator<Item: IntoIterator<Item: IntoIterator>>,
{
    iter.flatten().flatten()
}

Which is far easier for me to reason about.

default_type_parameter_fallback, associated_type_defaults and const_generics_defaults

These features allow you to specify default values for generic type parameters, associated types and const parameters respectively, in more places.

This allows you as a developer to create nicer APIs. If a crate user is not interested in a detail and that item has a default then the detail can be omitted. This also makes it easier to extend APIs without making breaking changes for your users.
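As a minimal sketch of the const parameter case, consider a grid whose dimensions default to 3 by 3. The `Grid` type and the helper functions are invented for this example. As far as I know, const_generics_defaults has since been stabilised (Rust 1.59), so this should compile on a recent stable toolchain.

```rust
// Hypothetical type: a grid of bytes whose size defaults to 3x3.
struct Grid<const W: usize = 3, const H: usize = 3> {
    cells: [[u8; W]; H],
}

impl<const W: usize, const H: usize> Grid<W, H> {
    fn new() -> Self {
        Grid { cells: [[0; W]; H] }
    }

    fn cell_count(&self) -> usize {
        self.cells.iter().map(|row| row.len()).sum()
    }
}

// Users who don't care about the size can just write `Grid`.
fn default_cell_count() -> usize {
    let grid: Grid = Grid::new();
    grid.cell_count()
}

// Overriding only the first parameter keeps the default height.
fn wide_cell_count() -> usize {
    let grid: Grid<5> = Grid::new();
    grid.cell_count()
}
```

Note that the defaults apply where the type is written out, as in the `let grid: Grid` annotations. A bare `Grid::new()` with no annotation still asks the compiler to infer both parameters.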

negative_impls and auto_traits

Both of these features are used in the standard library. The traits Send and Sync are both examples of auto traits.

The trait Send is defined in the standard library like so:

pub unsafe auto trait Send {
    // empty.
}

Note the use of the auto keyword. This tells the compiler to automatically implement the Send trait for any struct/enum/union as long as all the types that make up said type also implement Send.

Auto traits would not be very useful if simply every type always implemented them. That is where negative_impls come in.

negative_impls allows for a type to opt-out from implementing an auto trait. Take for example UnsafeCell. It would be very unsafe for an unrestricted UnsafeCell to be shared across threads, therefore it would be very unsafe for it to be Sync.

impl<T: ?Sized> !Sync for UnsafeCell<T> {}

Note the creative use of ! to express "not Sync".

marker_trait_attr

This feature adds the #[marker] attribute for traits.

Rust disallows defining trait implementations that could overlap. This is so that the compiler will always know which implementation to use because there will always be only one.

Traits marked with #[marker] cannot override anything in their implementations. That way they are allowed to have overlapping implementations because all implementations will be the same.

type_alias_impl_trait, impl_trait_in_bindings and trait_alias

impl Trait tells the compiler to infer a concrete type to replace it with that implements Trait. Currently, impl Trait is only used in the context of function arguments or return types.

type_alias_impl_trait and impl_trait_in_bindings extend the places impl Trait can be used to include type aliases and let bindings respectively.

trait_alias is subtly different to type_alias_impl_trait. Everywhere you use a type alias the type must remain constant. A single concrete type must be inferred by the compiler that works in all those places. Trait aliases are more forgiving as they can be a different type in each place they are used.

fn_traits and unboxed_closures

The three traits Fn, FnMut and FnOnce are known as the fn traits. They are automatically implemented for any functions or closures that you create and are what provide the ability to call them with arguments.

An automatic implementation is currently the only way to implement those traits. The fn_traits feature allows for custom implementations on any type. This is very similar to operator overloading but customising the use of ().

#![feature(unboxed_closures)] // required to implement a function with `extern "rust-call"`
#![feature(fn_traits)]

struct Multiply;

#[allow(non_upper_case_globals)]
const multiply: Multiply = Multiply;

impl FnOnce<(u32, u32)> for Multiply {
    type Output = u32;
    extern "rust-call" fn call_once(self, a: (u32, u32)) -> Self::Output {
        a.0 * a.1
    }
}

impl FnOnce<(u32, u32, u32)> for Multiply {
    type Output = u32;
    extern "rust-call" fn call_once(self, a: (u32, u32, u32)) -> Self::Output {
        a.0 * a.1 * a.2
    }
}

impl FnOnce<(&str, usize)> for Multiply {
    type Output = String;
    extern "rust-call" fn call_once(self, a: (&str, usize)) -> Self::Output {
        a.0.repeat(a.1)
    }
}

fn main() {
    assert_eq!(multiply(2, 3), 6);
    assert_eq!(multiply(2, 3, 4), 24);
    assert_eq!(multiply("hello ", 3), "hello hello hello ");
}

Notice that this is being used to create a hacky version of function overloading and variadic functions.

Sugar

box_patterns and box_syntax

These two features make constructing and destructuring Boxes easier. The box keyword replaces Box::new(..) and allows for dereferencing Boxes when pattern matching.

struct TrashStack<T> {
    head: T,
    body: Option<Box<TrashStack<T>>>,
}

impl<T> TrashStack<T> {
    pub fn push(self, elem: T) -> Self {
        Self {
            head: elem,
            body: Some(box self),
        }
    }

    pub fn peek(self) -> Option<T> {
        if let TrashStack {
            body: Some(box TrashStack { head, .. }),
            ..
        } = self
        {
            Some(head)
        } else {
            None
        }
    }
}

This makes things a little more ergonomic but I don't think there is much chance that this feature will ever be stabilised. It seems to have existed forever with no plan for stabilisation, and with a little discussion about removing the feature instead. box_syntax is used heavily in the compiler's source and a little in the standard library.

It is interesting to note that box does not desugar to Box::new but Box::new is implemented in the standard library with box.

impl<T> Box<T> {
    ...
    pub fn new(x: T) -> Self {
        box x
    }
    ...
}

async_closure

Currently, to be async inside of a closure you have to use an async block.

app.at("/").get(|_| async { Ok("Hi") });

async_closure allows you to mark the closure itself as async just like you would an async function.

app.at("/").get(async |_| Ok("Hi"));

in_band_lifetimes

To use a lifetime it must be explicitly brought into scope.

fn select<'data>(data: &'data Data, params: &Params) -> &'data Item;

With in_band_lifetimes the lifetimes can be used without bringing them into scope first.

fn select(data: &'data Data, params: &Params) -> &'data Item;

Interestingly enough this was how lifetimes used to work pre 1.0.0.

format_args_capture

This allows for named arguments to be placed inside of strings inside any macro that depends on std::format_args!. That includes print!, format!, write! and many more.

let name = "Ferris";
let age = 11;
println!("Hello {name}, you are {age} years old");
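Captured identifiers can also be combined with format specifiers, including a captured width. The two helper functions below are hypothetical and exist only to make the behaviour easy to check. To my knowledge this feature has since been stabilised (Rust 1.58).

```rust
// Hypothetical helper: same message as above, built with `format!`.
fn greeting(name: &str, age: u32) -> String {
    format!("Hello {name}, you are {age} years old")
}

// Hypothetical helper: right-align `name` in a field of `width`
// characters. Both `name` and `width` are captured from scope.
fn padded(name: &str, width: usize) -> String {
    format!("[{name:>width$}]")
}
```

Only plain identifiers can be captured this way. Expressions such as `{user.name}` or `{age + 1}` are not accepted inside the braces and still need to be passed as explicit arguments.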

It is likely that this will be stabilised with or soon after edition 2021.

crate_visibility_modifier

With this feature you can write crate struct Foo rather than pub(crate) struct Foo and have it mean exactly the same thing.

This makes pub(crate) easier to write, encouraging the use of crate visibility when full pub is not necessary.

Types

type_ascription

Take for example the collect method on Iterator. Collect transforms an iterator into a collection.

let word = "hello".chars().collect();
println!("{:?}", word);

This does not compile because Rust is unable to infer the type of word. This can be fixed by replacing the first line with:

let word: Vec<char> = "hello".chars().collect();

With type_ascription the let binding is no longer necessary and one can simply write:

println!("{:?}", "hello".chars().collect(): Vec<char>);

The : Type syntax can be used anywhere to hint to the compiler "I want this type at this point".
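For this particular case, stable Rust already offers an alternative: the "turbofish" syntax, which supplies the type to the generic method rather than ascribing it to the expression. It is a different mechanism from type_ascription and only helps where a generic parameter exists to annotate. The `letters` helper is made up for the example.

```rust
// Hypothetical helper: debug-format the characters of a word without
// an intermediate `let` binding.
fn letters(word: &str) -> String {
    // `::<Vec<char>>` tells `collect` which collection to build.
    format!("{:?}", word.chars().collect::<Vec<char>>())
}
```

Type ascription is more general than this, since it can annotate any expression, not just a call to a generic function.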

never_type

It is possible to define enums with zero variants. One such enum already exists on stable in the standard library.

pub enum Infallible {}

It is possible to use this type in generics and function signatures but never possible for it to be constructed. There are simply no variants to construct.

The unit type, (), would be equivalent to an enum with a single variant. never_type introduces a new type, !, which is equivalent to our Infallible enum with zero variants.

Because ! can never be constructed it can be given special powers. We don't have to handle the case of ! because we have proven it will never exist.

fn main() -> ! {
    loop {
        println!("Hello, world!");
    }
}

Loops without a break "return !" because they don't ever return.

! can be very useful for expressing impossible outcomes in the type system. Take for example the FromStr implementation on this UserName type. Parsing is infallible here because every string is accepted as a user name. This allows us to set the Err type to !.

struct UserName(String);

impl FromStr for UserName {
    type Err = !;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(Self(s.to_owned()))
    }
}

It is then possible to use an empty match on the Err variant because ! has no variants.

let user_name = match UserName::from_str("ethan") {
    Ok(u) => u,
    Err(e) => match e {},
};
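The same empty match can be tried on stable today by using the standard library's Infallible enum in place of !. The sketch below assumes nothing beyond `std`. The `parse_user_name` wrapper is a hypothetical name added so the match can be called from elsewhere.

```rust
use std::convert::Infallible;
use std::str::FromStr;

struct UserName(String);

impl FromStr for UserName {
    // `Infallible` has zero variants, so an `Err` can never be built.
    type Err = Infallible;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(Self(s.to_owned()))
    }
}

// Hypothetical wrapper: unwraps the result without any panic path.
fn parse_user_name(s: &str) -> UserName {
    match UserName::from_str(s) {
        Ok(u) => u,
        // An empty match is exhaustive because there are no variants.
        Err(e) => match e {},
    }
}
```

Unlike `unwrap`, the empty match contains no code that could run at runtime; the compiler has checked that the error case is impossible.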

With the feature exhaustive_patterns the type system becomes smart enough for us to eliminate the Err branch altogether.

let user_name = match UserName::from_str("ethan") {
    Ok(u) => u,
};

We can combine this with destructuring to remove the match leaving a beautiful line of code.

let Ok(user_name) = UserName::from_str("ethan");

Attributes

optimize_attribute

It is possible to specify how you want your binary to be optimised in Cargo.toml using opt-level.

The opt-level setting applies to entire crates, while the optimize attribute can control optimisation for individual items.

#[optimize(speed)]
fn fast_but_large() {
     ...
}
#[optimize(size)]
fn slow_but_small() {
     ...
}

This would be very useful for fine-tuning applications where the trade-off between size and performance is particularly pronounced, such as when using WebAssembly.

stmt_expr_attributes

This feature allows you to place attributes almost everywhere, not just on top-level items. For example, with this feature you are able to place an optimize attribute on a closure.

cfg_version

This feature allows for conditional compilation based on compiler version.

#[cfg(version("1.42"))] // 1.42 and above
fn a() {
    // ...
}

#[cfg(not(version("1.42")))] // 1.41 and below
fn a() {
    // ...
}

This allows crates to make use of the latest compiler features while still keeping fallback support for old compilers.

no_core

It has been possible to opt out of using the standard library using #![no_std] for a while. This is important for applications that don't run in a full environment, such as embedded systems. Embedded systems often don't have an operating system or even dynamic memory, so many of the functions in std wouldn't work.

#![no_core] takes it further by opting out of libcore. That leaves you with almost nothing; you can't even use libc. This makes it very difficult to implement anything useful.

Other

Const Generics

I spoke about the future of const_generics at my talk for Rust Dublin. Rather than me reiterating what I said there, I encourage you to watch that talk.

Macros 2.0

Rust's declarative macros are very powerful; however, some of the rules around macro_rules! have always confused me.

For one, macro_rules! acts as a simple token transformation. It takes a list of tokens and outputs a new list of tokens, nothing smarter than that. The visibility rules end up being those of the place where the macro is called. This makes sense because the code is simply being pasted into that place.

Macros 2.0 is an RFC describing a replacement for macro_rules! with a new construct simply using the keyword macro.

One of the main improvements the new syntax introduces is macro hygiene, which allows macros to use the visibility rules of where they are written rather than where they are called.

generators

Generators/coroutines provide a special kind of function that can be paused during execution to "yield" intermediate values to the caller.

Generators can return multiple values using the yield keyword, each time pausing the function and returning to the caller. A generator can then return a single value after which it can no longer be resumed.

About three years ago I attempted to write an algorithm to traverse an infinite matrix along its diagonals. I found it very difficult to write that with Rust's iterators and ended up giving up.

Here is an implementation using Rust's generators/coroutines along with a number of other features we've discussed already.

#![feature(
    try_blocks,
    generators,
    generator_trait,
    associated_type_bounds,
    type_ascription
)]

use std::{
    iter,
    ops::{Generator, GeneratorState},
    pin::Pin,
};

/// Input
/// [[1, 2, 3]
/// ,[4, 5, 6]
/// ,[7, 8, 9]]
/// Output
/// [1, 2, 4, 3, 5, 7]
fn diagonalize<T>(
    mut matrix: impl Iterator<Item: Iterator<Item = T>>,
) -> impl Generator<Yield = T, Return = ()> {
    move || {
        let mut rows = Vec::new();
        (try {
            rows.push(matrix.next()?);
            for height in 0.. {
                for row in 0..height {
                    if row >= rows.len() {
                        rows.push(matrix.next()?);
                    }
                    yield rows[row].next()?;
                }
            }
        }): Option<()>;
    }
}

fn main() {
    let matrix = (0..).map(|x| iter::once(x).cycle().enumerate());
    let mut diagonals = diagonalize(matrix);
    while let GeneratorState::Yielded(value) = Pin::new(&mut diagonals).resume(()) {
        dbg!(value);
    }
}

It is understandable if you found the above snippet hard to interpret. It makes use of a number of features that you may have just been introduced to.

There is a compelling argument against adding too many new features as they can greatly increase the learning curve.

Generators make it possible to write implementations that are far more difficult or even impossible to write without them.
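To make that comparison concrete, here is a rough sketch of the same diagonal traversal written as a hand-rolled Iterator on stable Rust. The `Diagonals` struct and its fields are assumptions of mine, not a translation the compiler would produce. The point is the bookkeeping: the `height` and `row` counters that the generator keeps implicitly in its loop variables have to be stored and advanced by hand.

```rust
// Hypothetical stable equivalent of the generator above.
struct Diagonals<M, R> {
    matrix: M,     // remaining rows, pulled lazily
    rows: Vec<R>,  // rows seen so far, each partially consumed
    height: usize, // length of the diagonal currently being walked
    row: usize,    // position within the current diagonal
}

fn diagonalize<M, R>(matrix: M) -> Diagonals<M, R>
where
    M: Iterator<Item = R>,
    R: Iterator,
{
    Diagonals { matrix, rows: Vec::new(), height: 0, row: 0 }
}

impl<M, R> Iterator for Diagonals<M, R>
where
    M: Iterator<Item = R>,
    R: Iterator,
{
    type Item = R::Item;

    fn next(&mut self) -> Option<R::Item> {
        // Finished a diagonal: start the next, one element longer.
        if self.row >= self.height {
            self.height += 1;
            self.row = 0;
        }
        // Pull a new row from the matrix the first time it is needed.
        if self.row >= self.rows.len() {
            self.rows.push(self.matrix.next()?);
        }
        let value = self.rows[self.row].next()?;
        self.row += 1;
        Some(value)
    }
}
```

For the 3x3 matrix in the doc comment above, this yields the same sequence as the generator. For a traversal this small the manual version is manageable, but every extra loop or branch in the algorithm becomes another piece of state to track explicitly.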

Generators were added to implement async-await in the standard library. It is most likely that the exact semantics will change before any kind of stabilisation but they are very fun to play with.

Final thoughts

I have to apologise for not including three amazing unstable features: generic associated types, inline asm and specialization. I simply did not feel able to do these features justice in this article but I may try to talk about them in future.

If you wish to read more about an unstable feature the best place to start is the unstable book where most of them are listed. The unstable book then links to a tracking issue which then often, in turn, links to an RFC. With this combination of sources, you can then build up a picture of the details surrounding a feature.

Thank you for reading my first blog post 😃. The best way to support me is by following my Twitter. I am also looking for employment opportunities so please get in touch if you would like to talk about that.