speice.io/_posts/2020-07-08-on-implementing-futures.md

---
layout: post
title: "Tips for Implementing `Future`"
description: ""
category:
tags: [rust]
---
When support for async/await launched in Rust, it came with a couple of technical caveats; it was
deemed more beneficial to release a minimum viable product than wait for a feature-complete release.
So far, this model seems to have worked out well. Asynchronous code originally required thread-local
storage for context tracking, which meant that it could only be used in projects that included the
Rust standard library. It wasn't a hard requirement; nothing about the async design mandated context
tracking in this way. But given that most users of asynchronous code relied on the standard library
anyway, supporting asynchronous `no_std` projects was left as something to be addressed later. After
some fantastic work, thread-local storage is no longer used and there's some incredibly cool work
being done to enable Rust on `no_std` and embedded systems. While asynchronous programming is very
frequently used to model disk or network I/O, the same principles can be applied to monitoring
signals received from GPIO pins.

NOTE: Should I mention something about how cool it is that we can have `async` without needing heap
allocations or type erasure like in every other `async` implementation?

One other missing feature in the initial async support was being able to write traits that contained
`async fn` methods. Normally, when an `async fn` function is declared, the compiler does some magic
to the function signature:
```rust
struct R;
// When you write a function like this:
async fn read_bytes(s: TcpStream) -> R { /* ... */ }
// ...the compiler effectively transforms it into this:
fn read_bytes(s: TcpStream) -> impl Future<Output = R> { /* ... */ }
```
This special return type (the `impl Future` thing) tells the compiler "I have no idea what the
_exact_ return type will be, but it will be something that implements `Future`, just figure it out
for me." If you're writing free functions or `struct` methods, that's no issue; the compiler can
figure everything out for you.

However, this "figure it out for me" mentality doesn't work when used with traits. The reasons are
varied and complex and out of scope for this discussion. But if we want to mix traits and
asynchronous code, we simply need to make sure the trait method returns a type that implements the
`Future` trait:
```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

pub struct Byte(u8);

// Implementing this section of the code is what we'll be talking about.
// vvvvvvvvvvvvvvvvvvv
impl Future for Byte {
    type Output = u8;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        return Poll::Ready(self.0);
    }
}
// ^^^^^^^^^^^^^^^^^^^

pub trait ByteReader {
    fn get_byte(&self) -> Byte;
}

pub async fn my_function(b: impl ByteReader) -> u8 {
    b.get_byte().await
}
```
Because of some Rust-specific issues (the `Pin`/`Unpin` system, unhelpful compiler messages),
implementing `Future` directly can be rather difficult. It's possible to use crates like
`async_trait` to work around the limitation, but if you're interested in building your own futures,
these techniques should make the process at least a bit easier.
# Implement functionality before structure
Don't `impl Future` right away; use a separate function and pass everything in. It's helpful to
de-couple "what you need in order to function" from "how you get those things": are you supposed to
use `#[pin_project]` or `let Self { } = &mut *self` or maybe just `&mut self.value`? Self-pinning
makes things weird, and it's typically safe to deal with those questions later. Two guidelines:

1. Everything that needs to be `poll`-ed should be passed as `Pin<&mut T>`.
2. Everything else should be passed by reference.

Don't call this function from `poll` until it compiles on its own; errors elsewhere in the code can
make it difficult to understand whether the problem is in your "inner" function implementation or in
the `impl Future` implementation.
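As a minimal sketch of the two guidelines (the names `poll_inner` and `Counted` are made up for illustration, and the `u8` output is an arbitrary choice), the logic lives in a free function that takes the thing being polled as `Pin<&mut F>` and everything else by reference. The `impl Future` only has to work out how to hand those pieces over:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical "inner" function: all of the logic lives here.
// The thing being polled arrives as `Pin<&mut F>`; plain state arrives by reference.
fn poll_inner<F: Future<Output = u8>>(
    inner: Pin<&mut F>,
    polls: &mut u32,
    cx: &mut Context<'_>,
) -> Poll<(u8, u32)> {
    *polls += 1;
    match inner.poll(cx) {
        Poll::Ready(byte) => Poll::Ready((byte, *polls)),
        Poll::Pending => Poll::Pending,
    }
}

// Hypothetical wrapper that counts how many times it has been polled.
struct Counted<F> {
    inner: F,
    polls: u32,
}

impl<F: Future<Output = u8> + Unpin> Future for Counted<F> {
    type Output = (u8, u32);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Counted<F>` is `Unpin` because `F` is, so a plain `&mut Self` is available
        let this = &mut *self;
        poll_inner(Pin::new(&mut this.inner), &mut this.polls, cx)
    }
}
```

Here the "how do I get a `Pin<&mut F>`" question is answered by requiring `Unpin`; swapping that for a pin-projection approach later would only touch the `poll` body, not `poll_inner`.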
# Dealing with unfulfilled trait bounds
`AsyncBufRead` isn't implemented for `&R3`, but it _is_ implemented for the dereferenced type
(`R3`). When a trait bound isn't being fulfilled, the errors become a lot more obvious if you try to
deref `self.reader`:
```rust
use futures_io::AsyncBufRead;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

fn poll_once<R1: AsyncBufRead + ?Sized>(mut reader: Pin<&mut R1>, cx: &mut Context<'_>) -> Poll<()> {
    reader.as_mut().poll_fill_buf(cx);
    return Poll::Ready(());
}

struct MyStruct<'a, R2: ?Sized> {
    reader: &'a R2,
}

impl<R3: AsyncBufRead + ?Sized + Unpin> Future for MyStruct<'_, R3> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Important bit is the `*self.reader` here
        poll_once(Pin::new(&mut *self.reader), cx)
    }
}
```
```text
error[E0596]: cannot borrow data in a dereference of `std::pin::Pin<&mut MyStruct<'_, R3>>` as mutable
  --> src/lib.rs:19:28
   |
12 |     reader: &'a R2,
   |             ------ help: consider changing this to be mutable: `&'a mut R2`
...
19 |         poll_once(Pin::new(&mut *self.reader), cx)
   |                            ^^^^^^^^^^^^^^^^^ cannot borrow as mutable

error[E0596]: cannot borrow `self` as mutable, as it is not declared as mutable
  --> src/lib.rs:19:34
   |
18 |     fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
   |             ---- help: consider changing this to be mutable: `mut self`
19 |         poll_once(Pin::new(&mut *self.reader), cx)
   |                                  ^^^^ cannot borrow as mutable
```
Now, we can see that `self` can't be borrowed as mutable (it needs to be marked
`mut self: Pin<&mut Self>`) and that the reader can't be borrowed as mutable (the struct definition
needs `&'a mut R2`). After those are fixed, we're good to go.
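
For reference, here is a sketch of the example with both fixes applied. To keep it runnable with only the standard library, it swaps `futures_io::AsyncBufRead` for a hypothetical stand-in trait, `FillBuf`, that has the same `poll_fill_buf` signature; treat it as an illustration rather than the exact code:

```rust
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical stand-in for `futures_io::AsyncBufRead`, reduced to the single
// method used here so the sketch only needs the standard library.
trait FillBuf {
    fn poll_fill_buf(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<&[u8]>>;
}

fn poll_once<R1: FillBuf + ?Sized>(mut reader: Pin<&mut R1>, cx: &mut Context<'_>) -> Poll<()> {
    // The result is ignored on purpose, as in the original example
    let _ = reader.as_mut().poll_fill_buf(cx);
    Poll::Ready(())
}

struct MyStruct<'a, R2: ?Sized> {
    // Fix 1: the reference itself has to be mutable
    reader: &'a mut R2,
}

impl<R3: FillBuf + ?Sized + Unpin> Future for MyStruct<'_, R3> {
    type Output = ();

    // Fix 2: `mut self`, so that `self.reader` can be reborrowed mutably
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        poll_once(Pin::new(&mut *self.reader), cx)
    }
}
```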
# Don't feel bad about requiring `Unpin`
For trait bounds, don't require it unless you need to, but don't hesitate to add it if the compiler
thinks you should.
```rust
use futures_io::AsyncBufRead;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

fn poll_once<R1: AsyncBufRead + ?Sized>(mut reader: Pin<&mut R1>, cx: &mut Context<'_>) -> Poll<()> {
    reader.as_mut().poll_fill_buf(cx);
    return Poll::Ready(());
}

struct MyStruct<'a, R2: ?Sized> {
    reader: &'a mut R2,
}

impl<R3: AsyncBufRead + ?Sized> Future for MyStruct<'_, R3> {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        poll_once(Pin::new(&mut self.reader), cx)
    }
}
```
The type bounds for `R1` and `R3` seem to be identical, but are actually slightly different. Because
`poll_once` is handed `&mut self.reader`, `R1` is really `&mut R3`, and `AsyncBufRead` is only
implemented for `&mut R3` when `R3` is `Unpin`:
```text
error[E0277]: `R3` cannot be unpinned
  --> src/lib.rs:19:9
   |
6  | fn poll_once<R1: AsyncBufRead + ?Sized>(mut reader: Pin<&mut R1>, cx: &mut Context<'_>) -> Poll<()> {
   |                  ------------ required by this bound in `poll_once`
...
19 |         poll_once(Pin::new(&mut self.reader), cx)
   |         ^^^^^^^^^ the trait `std::marker::Unpin` is not implemented for `R3`
   |
   = note: required because of the requirements on the impl of `futures_io::if_std::AsyncBufRead` for `&mut R3`
help: consider further restricting this bound
   |
15 | impl<R3: AsyncBufRead + ?Sized + std::marker::Unpin> Future for MyStruct<'_, R3> {
   |                                ^^^^^^^^^^^^^^^^^^^^
```
For structs, `Unpin` is automatically implemented as long as every field is `Unpin`. For generic
structs, just make sure that the type bounds contain `Unpin`, or weird things happen when trying to
use them:
```rust
#![allow(unused_mut)]
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

struct CantUnpin<T> {
    items: Vec<T>
}

impl<T: Default> Future for CantUnpin<T> {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        self.items.push(T::default());
        Poll::Ready(())
    }
}

struct CanUnpin<T> {
    items: Vec<T>
}

impl<T: Default + Unpin> Future for CanUnpin<T> {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        self.items.push(T::default());
        Poll::Ready(())
    }
}
```
```text
error[E0596]: cannot borrow data in a dereference of `std::pin::Pin<&mut CantUnpin<T>>` as mutable
  --> src/lib.rs:14:9
   |
14 |         self.items.push(T::default());
   |         ^^^^^^^^^^ cannot borrow as mutable
   |
   = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::pin::Pin<&mut CantUnpin<T>>`
```
Rule of thumb: If you don't know whether it's safe to require `Unpin`, it almost certainly is. Worst
case, you can write a "compile test": put the code in a closure that's never called, so the compiler
still type-checks it but nothing runs.
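
A compile test along those lines might look like the sketch below. The `Items` future and the `assert_unpin` helper are hypothetical names made up for this example; the only point is that the closure body has to type-check even though it never runs:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical future, used only to illustrate the compile test
struct Items<T> {
    items: Vec<T>,
}

impl<T: Default + Unpin> Future for Items<T> {
    type Output = usize;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        self.items.push(T::default());
        Poll::Ready(self.items.len())
    }
}

// Only accepts `Unpin` types, so a call to it that compiles is the "test"
fn assert_unpin<T: Unpin>(_: &T) {}

#[allow(dead_code)]
fn compile_test() {
    // Never called: the closure body is type-checked, but nothing runs
    let _ = || {
        let fut = Items::<u8> { items: Vec::new() };
        assert_unpin(&fut);
    };
}
```

If a type you care about stops being `Unpin`, the build breaks at `assert_unpin` instead of somewhere deep inside a `poll` implementation.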
# Know what the escape hatches are
When used sparingly, either `#[async_trait]` or `BoxFuture` can enable async functionality now in
code that can later be reworked to not need the allocations. Use the escape hatch when you need to,
so that you can continue making incremental improvements later.

Specific trick: use `BoxFuture` for type erasure:
```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use futures::future::BoxFuture;

async fn function1() {}

async fn function2() -> u8 { 0 }

pub struct MyStruct<T> {
    f: BoxFuture<'static, T>
}

impl<T> Future for MyStruct<T> {
    type Output = T;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        self.f.as_mut().poll(cx)
    }
}

pub fn another_function() -> MyStruct<u8> {
    MyStruct { f: Box::pin(async {
        function1().await;
        function2().await
    }) }
}
```
There's one allocation because of `Box::pin()`, but that's it. The `async` block is still an opaque
`impl Future`, yet we can store it in a named struct and return values from it.
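
The same escape hatch is one way to get a future out of a trait method. As a rough sketch (the `BoxedFuture` alias and the `Constant` type are made up for illustration; the alias has the same shape as `BoxFuture`, written out with only the standard library), the `ByteReader` trait from earlier could return a boxed future instead of a hand-written one:

```rust
use std::future::Future;
use std::pin::Pin;

// Std-only alias with the same shape as `futures::future::BoxFuture`
type BoxedFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

pub trait ByteReader {
    // The boxed type can be named, so it can appear in a trait signature
    fn get_byte(&self) -> BoxedFuture<'_, u8>;
}

// Hypothetical implementor that always yields the same byte
pub struct Constant(pub u8);

impl ByteReader for Constant {
    fn get_byte(&self) -> BoxedFuture<'_, u8> {
        let byte = self.0;
        // One allocation per call; the `async` block's concrete type is erased
        Box::pin(async move { byte })
    }
}

pub async fn my_function(b: impl ByteReader) -> u8 {
    let byte = b.get_byte().await;
    byte
}
```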

---
Other thoughts that may be helpful in writing:

Plenty of reasons to write low-level futures code: you feel guilty about every heap allocation
because you wonder if it's really necessary, you have to write traits, or you're working in `no_std`
or no-alloc environments.

These are tools to help the mortals who don't really understand the `Pin` system yet. It's like
fighting the borrow checker; you'll probably figure it out eventually, but eventually doesn't help
you right now.

`Unpin` seems to mostly mean "safe to move", so everything that doesn't interact with the pin system
normally is probably fine.

Also need a note about enum type parameters and pinning.

Principle: pinning is needed so Rust can desugar references across await points. Practically? No
idea how to meaningfully use it, what the purpose of pin project is, or how to actually create a
struct with an internal reference.