Add SsbKVT, SsbMessageValue and SsbMessageContent Types #6
Loading…
Reference in New Issue
No description provided.
Delete Branch "kvt"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Serializing into serde_json::Value is working well. Also into
Vec<Value>
. See the function get_async_until_eof for where this takes place.Before moving forward, more consideration of types is needed, as you started to discuss here #2
This PR looks at defining our own types in golgi:
- SsbMessageValue (directly pulled from ssb-validate)
- KVT (pulled from kuska, but instead of a value for content, it uses an SsbMessageValue for content)
One advantage of this approach, is that it makes it easy to make
Vec<KVT>
andVec<SsbMessageValue>
which are nicely typed and easier to work with than just json. Since we should be able to count on messages being validated and well-formatted, perhaps its good to take advantage of this, and convert them into nicely typed objects.It also gives the small advantage that since we are defining our own type outside of, we can give it the name KVT (instead of Feed) no problem. Small but could make code more understandable to developers working with this library.
I also started to look into how we would work with the content field of SsbMessageValue and what patterns we could use for converting it into a typed object.
I didn't get super far with this, but did implement an into_post function for SsbMessageValue which was working and allowed for a call like:
(note the above code still thows an error due to an issue with the borrow checker, but I think it shows a possible pattern still).
for particular functions where we know what type of message they are supposed to return (because we are filtering by type in the query) we could also do this conversion within the method, so the user of the function just gets back a nice
Vec<Post>
and doesn't have to convert it themselves. Although I think Ideally, the user would not get back just aVec<Post>
but actually a variation ofVec<SsbMessageValue>
where the content field is of type Post (so that they have all the info, nicely typed, to do whatever they want with it).This PR meant to continue discussion, curious to hear your thoughts. Whether its preferable to use kuska::Feed , or if its worth it to have our own type, so we can have strongly typed objects with all fields easily accessible. And also if any patterns come to mind for how to have the content field be typed, but differently typed for different types of messages (is this something where we use type inheritance? or how to use enum variants for that? I'm still thinking through this).
cc @glyph
Found a feature in the serde docs which I was missing before:
https://serde.rs/enum-representations.html#internally-tagged
I see that adria was already using this tagging on TypedMessage. Cool.
Means you can simply deserialize a TypedMessage from content like this:
let typed_message: TypedMessage = serde_json::from_value(ssb_message.content)?;
where ssb_message.content has type serde_json::Value. Nice that we don't need any match statement on type - serde handles it for us using the type field as the tag to select the variant to deserialize into.
I added a commit here where I added a TypedSsbMessageValue struct, which is the same as SsbMessageValue, but contains an additional field, typed_message, which has type TypedMessage (and is built from content).
Alternatively, we could also not have a struct like this, and we could just create a TypedMessage on the fly when needed from an SsbMessageValue, via the snippet included above. I decided to keep the content field in TypedSsbMessageValue even though typed_message is essentially the typed version of the same data, because I thought it might be helpful.
Not sure which of these types it makes the most sense for our functions to work with. A TypedSsbMessageValue has the most useful fields (has all the info one could want to work with) but maybe there are tradeoffs?
@notplants
It's great to hear that various deserialization strategies are working and I'm stoked to see the thoughtfulness in your approach! 🖤
On a procedural note, I would prefer to keep these sorts of open discussions in issues (we could have rather continued the discussion in issue #2). Then pull-request discussions stay focused on the specific code implementation under submission. We can keep the discussion going here in this case but just noting it as a request for the future.
Type names and definitions
I suggest our two main message types be named
SsbMessageKVT
andSsbMessageValue
.KVT
alone will still be opaque to many golgi users; prefixing both types withSsbMessage
makes their nature clearer.To avoid unnecessary duplication, we can use the
Feed
type from kuska and give it an alias to add clarity.Regarding
SsbMessageValue
, I suggest definingprevious
asOption<String>
andtime
asf64
so that we don't have to depend on thessb_multiformats
andssb_legacy_msg_data
crates. TheMultihash
andLegacyF64
types would be more relevant if we were writing a server implementation and wanted to ensure strict interoperability with other implementations. In our case,go-sbot
handles that for us.These message types should not be defined in
src/utils.rs
(could besrc/ssb.rs
orsrc/messages.rs
or something similar).Deserialization flow
+1 This is rad.
I don't find the
TypedSsbMessageValue
approach particularly elegant. It seems like a case where we could rather implement a conversion method onSsbMessageValue
for on-demand deserialization when greater type specificity is required. That avoids the need for an extra type which is ever-so-slightly different from another, while also following the pattern you established with.into_post()
.So we could have
into_typed_content()
as the generic conversion implementation andinto_post()
,into_vote()
,into_contact()
etc. as the specific conversion implementations.I'd also consider aliasing
TypedMessage
asSsbMessageContent
to match theSsbMessage...
naming pattern (like a matryoshka doll of SSB message types) and clarify what the type contains.Usage might then end up looking something like this:
WIP: Discussion Of Working With KVTto Add SsbKVT, SsbMessageValue and SsbMessageContent TypesPer our discussion, I changed the types to: SsbKvt, SsbMessageValue and SsbMessageContent,
and implemented the other comments you suggested (remove dependency on multiformat, move from utils.rs to messages.rs)
This PR is now ready for a more detailed review
I also removed whoami_res_parse and publish_res_parse and replaced them with generic string_res_parse which can be used for both.
I've left quite a few comments.
I'm not able to compile the code due to
createLogStream
issues.Let me know when you're ready for a re-review and I'll take a second pass.
Cool to see this coming together!
@ -36,2 +36,4 @@
/// JSON serialization or deserialization error.
SerdeJson(JsonError),
/// Error decoding typed ssb message from content.
ContentTypeDecode(String),
I'm not clear where this error variant is being used. Maybe line 50 in
src/messages.rs
(thefrom_value
method)?If so, then we should rather use the existing
SerdeJson
error variant which encapsulates both serialization and deserialization failures.I think I was using this error for something specifically, then removed it. But I needed another error like this in the next PR, specifically for when the JSON is valid (not a json serialization error) but the format of the content is invalid. Maybe needs a different name?
If the variant is not used in this PR then it's best to remove it. It's difficult for me to reason about the naming if I can't see the code it's called from.
@ -48,6 +50,7 @@ impl std::error::Error for GolgiError {
GolgiError::Rpc(ref err) => Some(err),
GolgiError::Sbot(_) => None,
GolgiError::SerdeJson(ref err) => Some(err),
GolgiError::ContentTypeDecode(ref _err) => None,
In this case you can omit the
ref _err
:GolgiError::ContentTypeDecode(_) => None,
@ -0,0 +32,4 @@
impl SsbMessageValue {
/// Gets the type field of the message content if found,
/// and if not returns "none"
pub fn get_message_type(&self) -> String {
It seems like
Option<String>
is a better return type here.Then L40 can just be
.ok()
instead of.unwrap_or_else(|| "none".to_string());
I made some more changes here, and changed this to an enum in the next PR
@ -0,0 +58,4 @@
#[derive(Serialize, Deserialize, Debug)]
#[serde(deny_unknown_fields)]
#[allow(missing_docs)]
pub struct SsbKVT {
I would rather name this
SsbMessageKVT
to match the other two types.@ -100,0 +103,4 @@
/// Call the `createLogStream` RPC method and return a Vec<Message>
pub async fn log(&mut self) -> Result<String, GolgiError> {
let req_id = self.client.log_req_send().await?;
Compilation of your branch fails for me due to this line. I'm guessing you've added this in a local branch of
kuska
? If so, we should remove thecreateLogStream
feature from this PR and only add that once I'm gotten the required functionality merged intokuska
.yup thats what happened. I'll remove this from golgi for now, was really just using it as a sanity check.
@ -26,2 +26,4 @@
/// * `body` - An array of u8 to be parsed.
pub fn string_res_parse(body: &[u8]) -> Result<String, GolgiError> {
// TODO: cleanup with proper error handling etc.
Ok(std::str::from_utf8(body).unwrap().to_string())
Don't forget to add an error variant to handle this properly ;)
@ -54,0 +104,4 @@
/// * `f` - A function which takes in an array of u8 and returns a Result<T, GolgiError>.
/// This is a function which parses the response from the RpcReader. T is a generic type,
/// so this parse function can return multiple possible types (String, json, custom struct etc.)
pub async fn get_async_until_eof<'a, R, T, F>(
Since we're dealing with a stream here and not an async RPC call we should rather name this function
get_source_until_eof
.Then we're sticking with the naming convention used by
go-sbot
which differentiates between async calls and source (aka stream) calls.^ incorporated the code review in that last commit.
Only change I didn't include was removing ContentTypeDecode and get_message_type, which are changed in the next PR.
@notplants
Thanks for the changes!
The last requested change is to remove the unused error variant and associated code (
ContentTypeDecode(String)
).Ok I've removed the error variant.
Perhaps making small PRs is a bit awkward too, if the PRs are connected and there are multiple parts interacting. Will attempt to split PRs into meaningful coherent chunks as best I can, but sometimes its a bit of a shifting target.
@notplants
Thanks for the removal.
Yeah I totally understand how PR's can balloon, especially in the early phases of a project where you're operating in discovery mode.
Speaking from the perspective of a PR reviewer: the smaller, the better. It's really challenging for me to hold a coherent mental model of the changes when the PR touches many files and adds a lot of new functionality. One new feature per PR might be a nice goal to aim for (with the understanding that this isn't always possible or desirable).
This short guide on authoring PR's has been helpful for me in the past.
You're clear to merge when you're ready :) 🟢