Sam Thursfield: Status update, 16/05/2024 – Learning Async Rust

2024-05-16

This is another month where too many different things happened to stick them all in one post together. So here’s a ramble on Rust, and there’s more to come in a follow-up post.

I first started learning Rust in late 2020. It took 3 attempts before I could start to make functional command-line apps, and the current outcome of this is the ssam_openqa tool, which I work on partly to develop my Rust skills. This month I worked on some intrusive changes to finally start using async Rust in the program.

How it started

Out of all the available modern languages I might have picked to learn, I picked Rust partly for the size and health of its community: every community has its issues, but Rust has no “BDFL” figure and no one corporation that employs all the core developers, both signs of a project that can last a long time. Look at GNOME, which is turning 27 this year.

Apart from the community, learning Rust improved the way I code in all languages, by forcing more risks and edge cases to the surface and making me deal with them explicitly in the design. The ecosystem of crates has most of what you could want (although there is quite a lot of experimentation and therefore “churn”, compared to older languages). It’s kind of addictive to know that when you’ve resolved all your compile time errors, you’ll have a program that reliably does what you want.

There are still some blockers to me adopting Rust everywhere I work (besides legacy codebases). The “cycle time” of the edit+compile+test workflow has a big effect on my happiness as a developer. The fastest incremental build of my simple CLI tool is 9 seconds, which is workable, and when there are compile errors (i.e. most of the time) it’s usually even faster. However, a release build might take 2 minutes. This is 3000 lines of code with 18 dependencies. I am wary of embarking on a larger project in Rust where the cycle time could be problematically slow.

Binary size is another thing, although I’ve learned several tricks to keep ssam_openqa at “only” 1.8MB. Use a minimal arg parser library instead of clap. Use minreq for HTTP. Follow the min-size-rust guidelines. It’s easy to pull in one convenient dependency that brings in a tree of 100 more things, unless you are careful. (This is a problem for C programmers too, but dependency handling in C is traditionally so horrible that we are already conditioned to avoid too many external helper libraries.)

The third thing I’ve been unsure about until now is async Rust. I never immediately liked the model used by Rust and Python of having a complex event loop hidden in the background, and a magic async keyword that completely changes how a function is executed and requires the functions that call it to be async too. You effectively have two *different* languages, the async variant and the sync variant, and when writing library code you might need to provide two completely different APIs to do the same thing, one async and one sync.

That said, I don’t have a better idea for how to do async.

Complicating matters in Rust are the error messages, which can be mystifying if you hit an edge case (see below for where this bit me). So until now I have just used thread::spawn for background tasks, with a std::sync::mpsc channel to pass messages back to the main thread, and blocking IO everywhere. I see other projects doing the same.
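As a rough illustration of that thread-plus-channel pattern, here is a minimal sketch using only the standard library. The WorkerEvent type and the run_worker function are made-up names for this example, not code from ssam_openqa.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message type for this sketch (not taken from ssam_openqa).
#[derive(Debug, PartialEq)]
enum WorkerEvent {
    Progress(u32),
    Done,
}

/// Spawn a background thread, let it report `steps` progress events over a
/// channel, and collect everything on the calling thread.
fn run_worker(steps: u32) -> Vec<WorkerEvent> {
    let (tx, rx) = mpsc::channel();

    let handle = thread::spawn(move || {
        for step in 1..=steps {
            // Blocking work (file or network IO) would happen here.
            tx.send(WorkerEvent::Progress(step)).unwrap();
        }
        tx.send(WorkerEvent::Done).unwrap();
        // `tx` is dropped when the closure returns, which closes the channel.
    });

    // Iterating the receiver blocks until the sender has been dropped.
    let events: Vec<WorkerEvent> = rx.iter().collect();
    handle.join().unwrap();
    events
}
```

The main thread never needs an async runtime here: it just blocks on the receiver, which is simple but means it can only wait on one thing at a time.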

How it’s going

My blissful ignorance came to an end due to changes in a dependency. I was using the websocket crate in ssam_openqa, which embeds its own async runtime so that callers can use a blocking interface in a thread. I guess this is seen as a failed experiment, as the library is now “sluggishly” maintained, the dependencies are old, and the developers recommend tungstenite instead.

Tungstenite seems unusable from sync code for anything more than toy examples; you need an async wrapper such as async-tungstenite (shout out to slomo for this excellent library, by the way). So, I thought, I will need to port my *whole codebase* to use an async runtime and an async main loop.

I tried, and spent a few days lost in a forest of compile errors, but it’s never the correct approach to try and port code “in one shot” and without a plan. To make matters worse, websocket-rs embeds an *old* version of Rust’s futures library. Nobody told me, but there is “futures 0.1” and “futures 0.3.” Only the latter works with the await keyword; if you await a future from futures 0.1, you’ll get an error about not implementing the expected trait. The docs don’t give any clues about this. Eventually I discovered the Compat01As03 wrapper, which lets you convert types from futures 0.1 to futures 0.3. Hopefully you never have to deal with this, as you’ll only see futures 0.1 in libraries with outdated dependencies, but now you know.

Even better, I then realized I could keep the threads and blocking IO around, and just start an async runtime in the websocket processing thread. So I did that in its own MR, gaining an integration test and squashing a few bugs in the process.

The key piece is here:

use tokio::runtime;
use std::thread;

...

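    thread::spawn(move || {
        // Build a single-threaded runtime that lives only on this thread.
        let runtime = runtime::Builder::new_current_thread()
            .enable_io()
            .build()
            .unwrap();

        // Drive the async block to completion, blocking this thread.
        runtime.block_on(async move {
            // Websocket event loop goes here
        });
    });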

This code uses tokio’s runtime::Builder::new_current_thread() to build an async runtime that runs on the current thread; block_on() then runs an async block on that runtime and waits for it to exit. It’s a nice way to bring async “piece by piece” into a codebase that otherwise uses blocking IO, without having to rewrite everything up front.

I have some more work in progress to use async for the two main loops in ssam_openqa. These currently use manual polling loops that periodically check various message queues for events and then sleep for 250ms with thread::sleep(). That works fine in practice for processing low-frequency control and status events, but it’s not the slickest or most efficient way to write a main loop. The classy way to do it is using the tokio::select! macro.
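For comparison, a manual polling loop of that kind might look roughly like the sketch below. It uses only the standard library; the poll_loop and drain functions and the “control”/“status” queue names are invented for illustration and are not the actual ssam_openqa code.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Take everything currently waiting on `rx` without blocking.
/// Returns false once the sending side has hung up.
fn drain(rx: &Receiver<String>, label: &str, seen: &mut Vec<String>) -> bool {
    loop {
        match rx.try_recv() {
            Ok(msg) => seen.push(format!("{}: {}", label, msg)),
            Err(TryRecvError::Empty) => return true,
            Err(TryRecvError::Disconnected) => return false,
        }
    }
}

/// Poll two queues until both senders have hung up, sleeping between sweeps.
/// Returns the messages in the order they were observed.
fn poll_loop(
    control: Receiver<String>,
    status: Receiver<String>,
    interval: Duration,
) -> Vec<String> {
    let mut seen = Vec::new();
    let mut control_open = true;
    let mut status_open = true;

    while control_open || status_open {
        if control_open {
            control_open = drain(&control, "control", &mut seen);
        }
        if status_open {
            status_open = drain(&status, "status", &mut seen);
        }
        if control_open || status_open {
            // Nothing more to do right now: sleep, then sweep again.
            thread::sleep(interval);
        }
    }
    seen
}
```

The cost of this design is latency and wasted wakeups: an event that arrives just after a sweep waits up to a full interval, and the loop wakes up even when nothing has happened. An async select over the same sources would instead wake only when one of them is ready.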

When should you use async Rust?

I was hoping for a simple answer to this question, so I asked my colleagues at Codethink, where we have a number of Rust experts.

The problem is, cooperative task scheduling is a very complicated topic. If I convert my main loop to async, but I use the std library blocking IO primitives to read from stdin rather than tokio’s async IO, can Rust detect that and tell me I did something wrong? Well no, it can’t – you’ll just find that event processing stops while you’re waiting for input. Which may or may not even matter.

There’s no way to automatically detect “syscall which might wait for user input” vs “syscall which might take a lot of CPU time to do something” vs “user-space code which might not defer to the main loop for 10 minutes”, and each of these has the same effect of causing your event loop to freeze.
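To make the freeze concrete, here is a deliberately tiny toy model of cooperative scheduling in plain Rust. It is not how tokio works internally; the Turn, Task, cooperative, hog and run names are all invented for this sketch. Each task is a closure that the scheduler calls once per turn, and a task that refuses to yield simply holds everyone else up.

```rust
/// What a task reports back to the toy scheduler after each turn.
enum Turn {
    Yielded,
    Finished,
}

/// A task is a closure that the scheduler calls once per turn.
type Task = Box<dyn FnMut(&mut Vec<String>) -> Turn>;

/// A well-behaved task: does one unit of work per turn, then yields.
fn cooperative(name: &'static str, units: u32) -> Task {
    let mut done = 0;
    Box::new(move |log: &mut Vec<String>| {
        done += 1;
        log.push(format!("{}{}", name, done));
        if done >= units { Turn::Finished } else { Turn::Yielded }
    })
}

/// A badly-behaved task: does all of its work in a single turn, the way a
/// blocking read or a long computation would inside an async task.
fn hog(name: &'static str, units: u32) -> Task {
    Box::new(move |log: &mut Vec<String>| {
        for i in 1..=units {
            log.push(format!("{}{}", name, i));
        }
        Turn::Finished
    })
}

/// Round-robin over the tasks until all of them have finished,
/// returning the order in which units of work were executed.
fn run(mut tasks: Vec<Task>) -> Vec<String> {
    let mut log = Vec::new();
    while !tasks.is_empty() {
        // Give each task one turn; keep only those that are not finished.
        tasks.retain_mut(|task| matches!((*task)(&mut log), Turn::Yielded));
    }
    log
}
```

Running two cooperative tasks interleaves them (a1, b1, a2, b2). Put a hog between them and the log becomes a1, h1, h2, h3, b1, a2, b2: nothing else makes progress until the hog is done, and the scheduler has no way of knowing in advance which closure will behave that way.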

The best advice I got was to use tokio-console to monitor the event loop and see if any tasks are running longer than they should. This looks like a really helpful debugging tool and I’m definitely going to try it out.

Screenshot of tokio-console

So I emerge from the month a bit wiser about async Rust, no longer afraid to use it in practice, and best of all, wise enough to know that it’s not an “all or nothing” switch: it’s perfectly valid to mix sync and async in different places, depending on what performance characteristics you’re looking for.

