Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive Derives and Attributes #10

Merged
merged 28 commits into from
Dec 7, 2023
Merged

Recursive Derives and Attributes #10

merged 28 commits into from
Dec 7, 2023

Conversation

tadeohepperle
Copy link
Contributor

When adding derives and attributes to a specific type, there is now an option to have them applied recursively to all dependencies of this type.

From the user side, the only thing that changed is that the extend_for_type() function on a DerivesRegistry that is part of the TypegenSettings now has an additional parameter recursive: bool.
Internally the DerivesRegistry maintains two: HashMaps mapping type paths to derives/attributes:

  • specific_type_derives: HashMap<syn::TypePath, Derives>
  • recursive_type_derives: HashMap<syn::TypePath, Derives>

When it comes to creating code and resolving recursive dependencies, these two maps are flattened out into a single specific_type_derives: HashMap<syn::TypePath, Derives>. This is done once, in the beginning of the code generation process, transforming the DerivesRegistry into a FlatDerivesRegistry. Only at this point knowledge of the type dependencies need to be known in form of a PortableRegistry.

This design allows for TypegenSettings to be constructed independently of any type registry.
It supports that each derive and attribute can be either recursive or not.

Test added to check the desired behavior.

@tadeohepperle tadeohepperle requested a review from jsdw November 22, 2023 13:09
.gitignore Outdated
@@ -7,3 +7,5 @@
.idea

Cargo.lock

/artifacts/*.rs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Extra endline

@@ -8,12 +8,19 @@ use crate::transformer::Transformer;

use super::formatting::format_type_description;

pub fn type_description(type_id: u32, type_registry: &PortableRegistry) -> anyhow::Result<String> {
/// if format is enabled, `format_type_description` is applied to the end result.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is exposed, I would suggest having more explicit docs for it. Something along the lines of 'Describes the given type ID..' or similar

Comment on lines 22 to 23
dbg!(_type_id);
dbg!(ty);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could be removed

@@ -46,61 +82,80 @@ mod tests {
}"}
);
}
// todo!("This test with the generics does not fly yet.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a regression introduced by this feature? I would enable it; and then fix it later rather than disabling, what do you think?

}

let (type_id, type_registry) = make_type::<Shape<u8>>();
dbg!(&type_registry);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this here?

/// The `recurse_policy` defines, how to handle cases,
/// where a type has been visited before, and is visited again BEFORE a representation of this type could be computed.
/// If it returns Some(..) we avoid infinite recursion by returning a concrete value.
/// If it returns None, nothing is done in the case of detected recursion and the type falls through.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does falls through mean here? Is the recursion continued as the type is "unresolved"?

For which containers could this happen? The blow "this is dangerous" seems a bit scary, might be worth a # Warning section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, the name is not so good, perhaps recursion_should_continue is better. On these types we should just let the recursion continue instead of intersecting. That is the idea behind it. I think having this on the containers Array, Sequence, Tuple and Compact should be fine, because they cannot recurse infinitely without going through a named type (e.g. a Variant or Composite) that will be intersected when recursed on.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense now! Thanks for the info! I guess we could adjust the docs to state this ^ ? 🙏

@@ -56,7 +58,10 @@ where

match self.cache.borrow().get(&type_id) {
Some(Cached::Recursive) => {
return (self.recurse_policy)(type_id, ty, self);
dbg!(self.cache.borrow());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, is this needed?

/// }>
/// }
/// ```
/// But the recursive resolving would get stuck in the middle, reporting recursion.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this solve the issue?

  • keep a cache of visited types
  • when we encounter a recursion in describing a type, we simply resolve the type to the type's name, iff they match?
struct B { -> We are describing type B
id: u8,
other: Vec<B> -> Encountered type B name, resolve to the name, because this B type name == B the type name we are currently describing
}

Copy link
Contributor Author

@tadeohepperle tadeohepperle Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That usually works. This was exactly what we did before this PR. For example, this worked before without any issue:

struct Human {
            name: String,
            friends: Vec<Human>,
            dad: Box<Human>,
            home: House,
        }

But it does not work in the case of container types being resolved, containing the same container type. Like this, would not work before:

struct A {
    bees: Vec<B>
}
struct B {
    id: u8,
    others: Vec<B>
}

That is because, if the policy you described was applied here, the type which is seen twice in recursion is Vec<B>. But what is the name of this type? The type_path of Vec<B> is empty, so the name in the typical sense is also empty. We could just say the name is "Vec<B>". But then we have not defined B anywhere before and the description of struct A in the example above would be: A { bees: Vec<B> } which leaves the viewer wondering what B means.

I could not find an easier solution to this sadly. :(

/// The solution to this is, to just let some container types like Vec do recursion while others can't.
///
/// The safety of the following logic relies on the assumption that ultimately everything resolves down to a primitive or a struct/enum that is in the cache.
/// It basically just returns true o generic wrapper types.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for instead of o?

description/src/description.rs Outdated Show resolved Hide resolved
description/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines 107 to 108
/// The safety of the following logic relies on the assumption that ultimately everything resolves down to a primitive or a struct/enum that is in the cache.
/// It basically just returns true for generic wrapper types.
Copy link
Collaborator

@jsdw jsdw Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling to really understand this recursion_should_continue thing; perhaps because I don't grasp the problem well enough!

My understanding so far from reading the above is:

We have type info for types like

struct A { bees: Vec<B> } 
struct B { id: u8, others: Vec<B> }

We'd like to print a description like:

struct A { 
    bees: Vec<struct B {
        id: u8,
        others: Vec<B>
    }
> 
} 

So the first time Vec<B> is encountered, we expand B. The second time Vec<B> is encountered, we'll be recursing here, and want to instead print Vec<B> rather than expanding it.

Naively, to me it feels like this could be achieved by running some function on recursion that takes in the type info, here the info about B, and wants back some string to use. Since we know we are recursing in order to run this function, the function looks at the type name, B, and returns that string here.

That is because, if the policy you described was applied here, the type which is seen twice in recursion is Vec

You said this above in response to I think Alex suggesting the same. But wouldn't the type that's seen twice in recursion be B and not Vec<B>? (ie, we would recurse into a Vec again, and then see that the type in the vec was B, and then say oohh wait, I've already encountered B before. We wouldn't go into a vec and instantly decide that the vec itself was resurding would we?)

Copy link
Contributor Author

@tadeohepperle tadeohepperle Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But wouldn't the type that's seen twice in recursion be B and not Vec<B>? (ie, we would recurse into a Vec again, and then see that the type in the vec was B, and then say oohh wait, I've already encountered B before. We wouldn't go into a vec and instantly decide that the vec itself was resurding would we?)

This is exactly the behavior we want to achieve, yes. The entire point of recursion_should_continue is to make
"...we would recurse into a Vec again" possible. If we do not have this "recursion should continue" thing, we see a Vec<B> (which is a distinct type) for the second time, and it says: "Oh wait, I have seen a Vec<B> before, we should return, otherwise we have infinite recursion!", without going into the Vec and checking B itself.

So it just disables the recursion guards for container types to go one step into them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining this!

The bit I missed was that Vec<B> is itself a distinct type, and when we hit the second one we'd trigger the "aah we are recursing" thing, and so need to explicitly opt-in to recursing into this sort of type to print it properly without some special case to print Vec<B> or whatever.

typegen/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@jsdw jsdw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that we should look at the recursion_should_continue function to see whether it will always work as intended or whether there's a more general way to handle this, but looks great overall and I don't want to block this from merging any longer!

(I also wonder about whether our documentation not being "inline struct definitions" and instead printing the separate types (ie more natural rust syntax) might be better / more clear (since it's just rust syntax), but would need to try both and see what felt best, and in case case that'd be a separate PR I expect!)

@tadeohepperle
Copy link
Contributor Author

I got rid of the recursion_should_continue by delegating this responsibility to the recurse_policy or a transformer. This now returns an Option. When None is returned, we just continue recursing. Now we look at the type and if it does not have an identifier, we keep recursing when building a type description. This covers the same cases, the recursion_should_continue function covered before.

I also added better support for printing nested generic types and added tests for this.

@tadeohepperle tadeohepperle merged commit 695b990 into master Dec 7, 2023
2 checks passed
@tadeohepperle tadeohepperle deleted the recursive-derives branch December 7, 2023 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants