Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CloneIn trait for AST nodes #4284

Closed
overlookmotel opened this issue Jun 21, 2024 · 23 comments · Fixed by #4732
Closed

CloneIn trait for AST nodes #4284

overlookmotel opened this issue Jun 21, 2024 · 23 comments · Fixed by #4732
Assignees

Comments

@overlookmotel
Copy link
Contributor

overlookmotel commented Jun 21, 2024

We need to be able to clone AST nodes. But we cannot implement Clone on most AST types because oxc_allocator::Box is not Clone.

There is a good reason for that - Box stores the value it owns in arena allocator, so there is no Box::new method, only Box::new_in which takes an &Allocator param for the arena to allocate it in.

I propose a new trait CloneIn:

trait CloneIn<'alloc> {
    fn clone_in(&self, allocator: &'alloc Allocator) -> Self;
}

Impl for oxc_allocator::Box:

impl<'alloc, T> CloneIn<'alloc> for Box<'alloc, T>
where
    T: CloneIn<'alloc>,
{
    fn clone_in(&self, allocator: &'alloc Allocator) -> Self {
        Box::new_in(self.deref().clone_in(allocator), allocator)
    }
}

(I'm actually not sure if I have this right - maybe 'alloc lifetime should be on fn clone_in<'alloc> not CloneIn<'alloc>)

Same as Clone, CloneIn can be implemented on any type where all its fields/variants are also CloneIn.

We could make a derive macro for CloneIn so can just #[derive(CloneIn)] on AST types.

NB: clone_in should always clone into the arena of the provided allocator, which may be different from the arena that self is in. i.e. you can use clone_in to copy objects from one arena to another.

@Boshen
Copy link
Member

Boshen commented Jul 15, 2024

This is required by Rolldown app mode.

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jul 15, 2024

This is required by Rolldown app mode.

Urgently required? Or just at some point?

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jul 15, 2024

I think maybe this is the correct impl so return value has lifetime of the new allocator, not the old one:

trait CloneIn {
    fn clone_in<'alloc>(&self, allocator: &'alloc Allocator) -> Self<'alloc>;
}

impl<'old_alloc, T> CloneIn for Box<'old_alloc, T>
where
    T: CloneIn,
{
    fn clone_in<'new_alloc>(&self, allocator: &'new_alloc Allocator) -> Self<'new_alloc> {
        Self::new_in(self.deref().clone_in(allocator), allocator)
    }
}

@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

I think maybe this is the correct impl so return value has lifetime of the new allocator, not the old one:

trait CloneIn {
    fn clone_in<'alloc>(&self, allocator: &'alloc Allocator) -> Self<'alloc>;
}

This isn't a valid syntax

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jul 15, 2024

This isn't a valid syntax

Yes, maybe not. But what is? What I posted originally isn't right either.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

Just thinking out loud, Maybe we can use a marker trait to ensure the lifetime:

trait Allocated<'alloc> { /* noop */ }

trait CloneIn<'alloc>: Allocated<'alloc> {
    fn clone_in<'old>(&'old self, allocator: &'alloc Allocator) -> Self;
}

But I think using this is enough:

trait CloneIn<'alloc> {
    fn clone_in<'old_alloc>(&'old_alloc self, allocator: &'alloc Allocator) -> Self;
}

If the implemented lifetime differs from the allocator you'd get a different Self than what you were expecting. So it wouldn't compile.
Intuitively people would implement the right return lifetime since you only have access to the new allocator here. But if the user implements it differently(you may even say wrong) it is on them. But the compiler would stop you from shooting yourself in the foot and I think people would figure it out relatively quickly.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

The marker I described in my previous comment won't work either. It has the same allocator lifetime on both ends. I think we should just rely on the end user to implement it correctly!

Maybe we can use this?

/// SAFETY: Cloned type should exist in the `'alloc`
unsafe trait CloneIn<'alloc> {
    fn clone_in(&self, allocator: &'alloc Allocator) -> Self;
}

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jul 15, 2024

trait CloneIn<'alloc> {
    fn clone_in<'old_alloc>(&'old_alloc self, allocator: &'alloc Allocator) -> Self;
}

I don't think this is right either. 'old_alloc shouldn't be on &self, but on the lifetime param of Self. i.e. without a trait:

impl<'old_alloc> Foo<'old_alloc> {
    fn clone_in<'new_alloc>(&self, allocator: &'new_alloc Allocator) -> Foo<'new_alloc> { /* ... */ }
}

But how to turn it into a trait?

Maybe we can use this?

Does unsafe save us here? Anyway, it must be possible to do this in safe Rust. I just don't know how. I think will need to ask for help on the Rustlang forum.

NB: I'm imagining codegen-ing the impls for all AST types.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

I don't think this is right either. 'old_alloc shouldn't be on &self, but on the lifetime param of Self. i.e. without a trait:

Yes, I noticed my mistake a short while after posting it.

Does unsafe save us here?

No, But If there is no way to describe what we want in the type system then that's a reasonable approach. For derive and codegen we can ensure the safety, unsafe here is to force the end users to be careful with their implementation and read the safety docs first.

I think will need to ask for help on the Rustlang forum.

Let's give it a shot.

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jul 15, 2024

If necessary, we could codegen clone_in methods for all AST node types, without a CloneIn trait, and no #[derive] macro. But I'm sure there must be a way to do this with a trait, and I'd like to know what it is.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

I'd very much like to learn how to describe such a thing in the type system as well. So let's give it a shot.

we could codegen clone_in methods for all AST node types

If the point is to prevent people from implementing CloneIn themselves, I think it would be better to make it internal(by trait bounds) while still exposing the CloneIn trait for use.

@overlookmotel
Copy link
Contributor Author

Got it!

trait CloneIn<'new_alloc> {
    type Cloned;

    fn clone_in(&self, allocator: &'new_alloc Allocator) -> Self::Cloned;
}

impl<'old_alloc, 'new_alloc> CloneIn<'new_alloc> for UnaryExpression<'old_alloc> {
    type Cloned = UnaryExpression<'new_alloc>;

    fn clone_in(&self, allocator: &'new_alloc Allocator) -> Self::Cloned {
        UnaryExpression {
            span: self.span.clone_in(allocator),
            operator: self.operator.clone_in(allocator),
            argument: self.argument.clone_in(allocator),
        }
    }
}

Thanks to: rust-lang/wg-allocators#15 (comment)

@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

Nice, It doesn't necessarily force the return type to be a variant of Self but at least it works as intended.

@overlookmotel
Copy link
Contributor Author

overlookmotel commented Jul 15, 2024

It doesn't necessarily force the return type to be a variant of Self

Yes, that's a shortcoming. Probably there's some wacky type system trick to enforce that, but I don't think it's worth it. You can't prevent someone making Expression::clone_in return an instance of MyTrousers instead of Expression, but then you also can't statically prevent them writing a method body which is loaded full of bugs and panics.

Anyway, we'll codegen the impls for AST nodes, so there will be no MyTrousers mistakes. 😆

@overlookmotel overlookmotel transferred this issue from oxc-project/backlog Jul 15, 2024
@rzvxa
Copy link
Contributor

rzvxa commented Jul 15, 2024

I genuinely burst out laughing when I saw MyTrousers😆

@hyf0
Copy link
Contributor

hyf0 commented Aug 1, 2024

What's the blocker here? It seems the final form has been shown and there are no technical issues.

@hyf0
Copy link
Contributor

hyf0 commented Aug 1, 2024

Would it be more performant if there something like size_hint in CloneIn trait to infer allocator?

@overlookmotel
Copy link
Contributor Author

What's the blocker here? It seems the final form has been shown and there are no technical issues.

No blocker. We've just been working on other things. Is your need for it urgent?

Would it be more performant if there something like size_hint in CloneIn trait to infer allocator?

Sorry I don't get you. How do you mean?

@hyf0
Copy link
Contributor

hyf0 commented Aug 1, 2024

No blocker. We've just been working on other things. Is your need for it urgent?

Not sure how define urgent. But people are writing code like this

https://github.com/rolldown/rolldown/blob/b7dd0c99650d8770a990efa6f86daaa5bc00d469/crates/rolldown_plugin_dynamic_import_vars/src/clone_expr.rs

Sorry I don't get you. How do you mean?

trait CloneIn<'new_alloc> {
    type Cloned;

    fn clone_in(&self, allocator: &'new_alloc Allocator) -> Self::Cloned;

    fn size_hint(&self) -> usize {
       0
    }
}

impl<'old_alloc, 'new_alloc> CloneIn<'new_alloc> for UnaryExpression<'old_alloc> {
    type Cloned = UnaryExpression<'new_alloc>;

    fn clone_in(&self, allocator: &'new_alloc Allocator) -> Self::Cloned {
        // sadly bumpalo doesn't have this API.
        allocator.reserve_additional_bytes(self::size_hint());
        UnaryExpression {
            span: self.span.clone_in(allocator),
            operator: self.operator.clone_in(allocator),
            argument: self.argument.clone_in(allocator),
        }
    }
}

@overlookmotel
Copy link
Contributor Author

Not sure how define urgent. But people are writing code like this
https://github.com/rolldown/rolldown/blob/b7dd0c99650d8770a990efa6f86daaa5bc00d469/crates/rolldown_plugin_dynamic_import_vars/src/clone_expr.rs

Oh my GOD! That's horrendous. Also, it screws up the spans.

fn size_hint(&self) -> usize { 0 }

Ah now I understand. Yes, that would be good. But problem is how do you work out what the size is? In the case of UnaryExpression, argument is an Expression which could be 1 or could be (() => { let loadsOfComplexLogic = foo(bar, qux); return !loadsOfComplexLogic; })() - basically any size.

Only way to work out the size accurately would be to visit all the children of the AST node. I imagine performing that extra visit would cost more than what you save by having the size hint.

@rzvxa
Copy link
Contributor

rzvxa commented Aug 6, 2024

@hyf0 I saw you mentioned this issue, Just want to let you know I've picked this up and it should be expected in the next release.

overlookmotel pushed a commit that referenced this issue Aug 7, 2024
Introduce the trait discussed in #4284.
overlookmotel pushed a commit that referenced this issue Aug 7, 2024
overlookmotel pushed a commit that referenced this issue Aug 7, 2024
overlookmotel pushed a commit that referenced this issue Aug 7, 2024
overlookmotel pushed a commit that referenced this issue Aug 7, 2024
overlookmotel pushed a commit that referenced this issue Aug 7, 2024
@overlookmotel
Copy link
Contributor Author

Implemented in #4732. Very nice work from the man like @rzvxa.

@Dunqing
Copy link
Member

Dunqing commented Aug 9, 2024

BindingIdentifier, IdentifierReference, and other Identifier have been derived Clone trait. So I want to ask should we use clone_in or clone for that? If we should use clone_in everywhere, wouldn't we have to remove Clone?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants