Streams and AsyncIterators #9

Closed
opened 2021-12-31 18:26:55 +00:00 by notplants · 3 comments
Owner

Today and yesterday I've done some research about Iterators and Streams (AsyncIterators), to explore the possibility of replacing get_source_until_eof with something that returns an AsyncIterator/Stream instead of a Vec.

The point of this is that for source calls which work with streams, we could write code that processes the stream without waiting for the stream to "complete". This would allow for more efficient code in certain cases (e.g. looking for the latest about message with a certain key, we could short-circuit the code as soon as we find that message, instead of first loading all about messages into memory), as well as for "live-updating" code down the line if we want that.

As somewhat expected, I found you can't use a simple Iterator with async code. Well, you can collect the async results into a Vec and then create a simple Iterator from the Vec, but that skips the benefits described above (which would come from creating an Iterator without first collecting all the results into a Vec).

This is why (I think) we would need some type of AsyncIterator or Stream type. There seems to be much debate in the Rust ecosystem about this, including plans for incorporating an AsyncIterator into future versions of Rust, discussion of async traits, and ranting about async and futures in Rust in general.

The next thing I'm going to look into is using this implementation of Stream:
https://docs.rs/futures/0.3.19/futures/stream/trait.Stream.html

This could give us the two benefits described above, but it does add complexity over using standard sync iterators, so I'm not convinced yet that we should try to use it everywhere (maybe only where it's genuinely helpful or needed, or possibly not at all).
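To make the short-circuit idea concrete, here is a rough sketch that uses only the standard library. `MiniStream` is a hypothetical stand-in that copies the shape of the `poll_next` method of `futures::stream::Stream`; `Messages`, `find_first` and `NoopWaker` are likewise made up for illustration and are not golgi or kuska APIs. The driver polls in a plain loop instead of using a real executor, so treat it as a picture of the control flow rather than production async code.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in with the same shape as `futures::stream::Stream`.
trait MiniStream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Hypothetical source of (key, value) messages that counts how many were pulled.
struct Messages {
    items: Vec<(&'static str, &'static str)>,
    next: usize,
    pulled: usize,
}

impl MiniStream for Messages {
    type Item = (&'static str, &'static str);

    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // `Messages` is `Unpin`, so a plain mutable reference is available.
        let this = self.get_mut();
        // A network-backed source would return `Poll::Pending` here while waiting.
        let item = this.items.get(this.next).copied();
        match item {
            Some(msg) => {
                this.next += 1;
                this.pulled += 1;
                Poll::Ready(Some(msg))
            }
            None => Poll::Ready(None),
        }
    }
}

/// Waker that does nothing; enough for a source that never returns `Pending`.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Pulls messages one at a time and stops as soon as `key` is found.
fn find_first<S>(stream: &mut S, key: &str) -> Option<&'static str>
where
    S: MiniStream<Item = (&'static str, &'static str)> + Unpin,
{
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        match Pin::new(&mut *stream).poll_next(&mut cx) {
            Poll::Ready(Some((k, v))) if k == key => return Some(v),
            Poll::Ready(Some(_)) => continue,
            Poll::Ready(None) => return None,
            // Busy-waiting is for this sketch only; an executor would park the task.
            Poll::Pending => std::thread::yield_now(),
        }
    }
}

fn main() {
    let mut source = Messages {
        items: vec![("name", "alice"), ("image", "&abc"), ("description", "hi")],
        next: 0,
        pulled: 0,
    };
    let found = find_first(&mut source, "image");
    println!(
        "found {:?} after pulling {} of {} messages",
        found,
        source.pulled,
        source.items.len()
    );
}
```

The consumer stops pulling as soon as it has a match, so the rest of the source is never read. With a Vec-returning call, every message would be loaded before the search could start.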

Here are some other links I found while researching, if you want to click around about it:

rust streams -> AsyncIterator
https://github.com/rust-lang/rust/issues/79024

rust async iterator
http://rust-lang.github.io/rfcs/2996-async-iterator.html

async iterator generator discussion
http://rust-lang.github.io/rfcs/2996-async-iterator.html#generator-syntax

no async traits in rust yet (therefore cannot use async function in iterator)
https://rust-lang.github.io/async-book/07_workarounds/05_async_in_traits.html

error[E0706]: functions in traits cannot be declared `async`

cc @glyph

Owner

Great issue, this is the fun stuff!

I suggest you look closely at the async_std::stream::Stream trait (https://docs.rs/async-std/latest/async_std/stream/trait.Stream.html):

An asynchronous stream of values.

This trait is a re-export of futures::stream::Stream and is an async version of std::iter::Iterator.

Then you can check out the StreamExt trait (https://docs.rs/async-std/latest/async_std/prelude/trait.StreamExt.html) for a list of provided methods.

async_std is the async library used by kuska and, by extension, golgi. It's an incredibly professional library. You might also want to read the async-std book (https://book.async.rs/).

I haven't yet read through all the links you posted but I believe those discussions are primarily concerned with bringing async iterators into the standard library.

I also just noticed that we have futures defined as a dependency in golgi but I don't think it's currently required by our code (I just commented it out and the crate compiled). However, it may be required as our code develops - in order to use the features provided by FutureExt.

Author
Owner

Thanks for the second pair of eyes and the references. I made a PR with a working example here: #10 (https://git.coopcloud.tech/golgi-ssb/golgi/pulls/10), although it still needs some polishing.

Author
Owner

Closing this since this is done now.

Reference: golgi-ssb/golgi#9