Tracking issue for Vec::extract_if and LinkedList::extract_if #43244

Open
2 of 3 tasks
Gankra opened this issue Jul 14, 2017 · 227 comments
Assignees
Labels
A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. Libs-Tracked Libs issues that are tracked on the team's project board. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Gankra
Contributor

Gankra commented Jul 14, 2017

Feature gate: #![feature(extract_if)] (previously drain_filter)

This is a tracking issue for Vec::extract_if and LinkedList::extract_if, which can be used for random deletes using iterators.

Public API

pub mod alloc {
    pub mod vec {
        impl<T, A: Allocator> Vec<T, A> {
            pub fn extract_if<F>(&mut self, filter: F) -> ExtractIf<'_, T, F, A>
            where
                F: FnMut(&mut T) -> bool,
            {
            }
        }

        #[derive(Debug)]
        pub struct ExtractIf<'a, T, F, #[unstable(feature = "allocator_api", issue = "32838")] A: Allocator = Global>
        where
            F: FnMut(&mut T) -> bool, {}

        impl<T, F, A: Allocator> Iterator for ExtractIf<'_, T, F, A>
        where
            F: FnMut(&mut T) -> bool,
        {
            type Item = T;
            fn next(&mut self) -> Option<T> {}
            fn size_hint(&self) -> (usize, Option<usize>) {}
        }

        impl<T, F, A: Allocator> Drop for ExtractIf<'_, T, F, A>
        where
            F: FnMut(&mut T) -> bool,
        {
            fn drop(&mut self) {}
        }
    }

    pub mod collections {
        pub mod linked_list {
            impl<T> LinkedList<T> {
                pub fn extract_if<F>(&mut self, filter: F) -> ExtractIf<'_, T, F>
                where
                    F: FnMut(&mut T) -> bool,
                {
                }
            }

            pub struct ExtractIf<'a, T: 'a, F: 'a>
            where
                F: FnMut(&mut T) -> bool, {}

            impl<T, F> Iterator for ExtractIf<'_, T, F>
            where
                F: FnMut(&mut T) -> bool,
            {
                type Item = T;
                fn next(&mut self) -> Option<T> {}
                fn size_hint(&self) -> (usize, Option<usize>) {}
            }

            impl<T, F> Drop for ExtractIf<'_, T, F>
            where
                F: FnMut(&mut T) -> bool,
            {
                fn drop(&mut self) {}
            }

            impl<T: fmt::Debug, F> fmt::Debug for ExtractIf<'_, T, F>
            where
                F: FnMut(&mut T) -> bool,
            {
                fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {}
            }
        }
    }
}

Steps / History

Unresolved Questions

  • What should the method be named?
  • Should extract_if accept a Range argument?
  • Missing Send+Sync impls on linked list's ExtractIf, see comment

See #43244 (comment) for a more detailed summary of open issues.

@Mark-Simulacrum Mark-Simulacrum added B-unstable Blocker: Implemented in the nightly compiler and unstable. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Jul 19, 2017
@RalfJung
Member

RalfJung commented Jul 31, 2017

Related issues:
#25477
#34265

bors added a commit that referenced this issue Aug 15, 2017
Add Vec::drain_filter

This implements the API proposed in #43244.

So I spent like half a day figuring out how to implement this in some awesome super-optimized unsafe way, which had me very confident this was worth putting into the stdlib.

Then I looked at the impl for `retain`, and was like "oh dang". I compared the two and they basically ended up being the same speed. And the `retain` impl probably translates to DoubleEndedIter a lot more cleanly if we ever want that.

So now I'm not totally confident this needs to go in the stdlib, but I've got two implementations and an amazingly robust test suite, so I figured I might as well toss it over the fence for discussion.
@bluss
Member

bluss commented Sep 4, 2017

Maybe this doesn't need to include the kitchen sink, but it could have a range parameter, so that it's like a superset of drain. Any drawbacks to that? I guess adding bounds checking for the range is a drawback, it's another thing that can panic. But drain_filter(.., f) can not.
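
A minimal sketch of what a range-restricted variant could look like, written against stable Rust. The helper name `extract_in_range` is hypothetical and it takes a plain `Range<usize>` for brevity; it is not the signature under discussion. The `assert!` is the extra panic path mentioned above.

```rust
use std::ops::Range;

/// Hypothetical helper (not a std API): removes and returns the elements
/// inside `range` for which `pred` returns true, keeping the rest in order.
fn extract_in_range<T, F>(vec: &mut Vec<T>, range: Range<usize>, mut pred: F) -> Vec<T>
where
    F: FnMut(&mut T) -> bool,
{
    let Range { start, mut end } = range;
    // Bounds check: the additional way this variant can panic.
    assert!(start <= end && end <= vec.len(), "range out of bounds");

    let mut removed = Vec::new();
    let mut i = start;
    while i < end {
        if pred(&mut vec[i]) {
            // `remove` shifts the tail on every hit; acceptable for a sketch.
            removed.push(vec.remove(i));
            end -= 1; // the range shrinks together with the vector
        } else {
            i += 1;
        }
    }
    removed
}
```

Elements outside the range are never passed to the predicate, which is what would make such a method a superset of `drain`.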

@rustonaut

rustonaut commented Sep 11, 2017

Is there any chance this will stabilize in some form in the not too distant future?

@rustonaut

rustonaut commented Sep 11, 2017

If the compiler is clever enough to eliminate the bounds checks
in the drain_filter(.., f) case, I would opt for doing this.

(And I'm pretty sure you can implement it in a way
that makes the compiler clever enough; in the worst
case you could have an "in-function specialization",
basically something like if Type::of::<R>() == Type::of::<RangeFull>() { dont;do;type;checks; return })
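
One way the "in-function specialization" idea could be spelled in real Rust is with `std::any::TypeId`. This is only a sketch: the helper `resolve_range` is hypothetical, the `'static` bound is an assumption needed by `TypeId::of`, and whether the optimizer folds the comparison away is likely but not guaranteed.

```rust
use std::any::TypeId;
use std::ops::{Bound, RangeBounds, RangeFull};

/// Hypothetical helper: turns `range` into `(start, end)` for a collection
/// of length `len`, skipping the bounds check when `R` is `RangeFull`.
fn resolve_range<R>(range: R, len: usize) -> (usize, usize)
where
    R: RangeBounds<usize> + 'static,
{
    // Both type ids are fixed per monomorphization, so this branch is a
    // compile-time constant in practice.
    if TypeId::of::<R>() == TypeId::of::<RangeFull>() {
        return (0, len);
    }
    let start = match range.start_bound() {
        Bound::Included(&s) => s,
        Bound::Excluded(&s) => s + 1,
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        Bound::Included(&e) => e + 1,
        Bound::Excluded(&e) => e,
        Bound::Unbounded => len,
    };
    assert!(start <= end && end <= len, "range out of bounds");
    (start, end)
}
```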

@jonhoo
Contributor

jonhoo commented Sep 22, 2017

I know this is bikeshedding to some extent, but what was the reasoning behind naming this drain_filter rather than drain_where? To me, the former implies that the whole Vec will be drained, but that we also run a filter over the results (when I first saw it, I thought: "how is this not just .drain(..).filter()?"). The latter, on the other hand, indicates that we only drain elements where some condition holds.
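
The difference between the two readings can be shown with stable code. In this sketch both function names are made up for illustration; the second one only approximates the proposed behaviour by partitioning an owned vector.

```rust
/// Reading 1: drain everything, then filter what comes out.
/// Returns (yielded elements, what is left in the vector).
fn drain_then_filter(mut v: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    let yielded: Vec<i32> = v.drain(..).filter(|x| x % 2 == 0).collect();
    (yielded, v) // `v` is empty: the odd elements were dropped
}

/// Reading 2: remove only the elements where the condition holds.
/// Returns (removed elements, elements that stay).
fn drain_where_sketch(v: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    v.into_iter().partition(|x| x % 2 == 0)
}
```

With `vec![1, 2, 3, 4]`, both return `[2, 4]` as the first component, but only the second keeps `[1, 3]`.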

@rustonaut

No idea, but drain_where sounds much better and is much more intuitive.
Is there still a chance to change it?

@bluss
Member

bluss commented Sep 23, 2017

.remove_if has been a prior suggestion too

@rustonaut

rustonaut commented Sep 23, 2017

I think drain_where explains it best. Like drain, it returns values, but it does not drain/remove all values, only those for which a given condition is true.

remove_if sounds a lot like a conditional version of remove, which removes a single item by index if a condition is true, e.g. letters.remove_if(3, |n| n < 10); removes the letter at index 3 if it's < 10.

drain_filter, on the other hand, is slightly ambiguous: does it drain and then filter in a more optimized way (like filter_map), or does it drain so that the returned iterator is comparable to the one filter would return?
And if so, shouldn't it be called filtered_drain, since the filter is logically applied first...

@Gankra
Contributor Author

Gankra commented Sep 25, 2017

There is no precedent for using _where or _if anywhere in the standard library.

@jonhoo
Contributor

jonhoo commented Sep 25, 2017

@gankro is there a precedent for using _filter anywhere? I also don't know that that is a reason for not using the less ambiguous terminology. Other places in the standard library already use a variety of suffixes such as _until and _while.

@crlf0710
Member

The "said equivalent" code in the comment is not correct: you have to subtract one from i at the "your code here" site, or bad things happen.

@thegranddesign

IMO it's not filter that's the issue. Having just searched for this (and being a newbie), drain seems to be fairly non-standard compared to other languages.

Again, just from a newbie perspective, the things I would search for if trying to find something to do what this issue proposes would be delete (as in delete_if), remove, filter or reject.

I actually searched for filter, saw drain_filter and kept searching without reading because drain didn't seem to represent the simple thing that I wanted to do.

It seems like a simple function named filter or reject would be much more intuitive.

@thegranddesign

On a separate note, I don't feel as though this should mutate the vector it's called on. It prevents chaining. In an ideal scenario one would want to be able to do something like:

        vec![
            "",
            "something",
            a_variable,
            function_call(),
            "etc",
        ]
            .reject(|i| { i.is_empty() })
            .join("/")

With the current implementation, what it would be joining on would be the rejected values.

I'd like to see both an accept and a reject. Neither of which mutate the original value.

@rpjohnst
Contributor

You can already do the chaining thing with filter alone. The entire point of drain_filter is to mutate the vector.

@thegranddesign

@rpjohnst so I searched here, am I missing filter somewhere?

@rpjohnst
Contributor

Yes, it's a member of Iterator, not Vec.
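
For reference, a small sketch of the non-mutating chaining asked about above, using `Iterator::filter` on stable Rust. The function name `join_non_empty` is made up for this example.

```rust
/// Joins the non-empty parts with "/", leaving `parts` untouched.
fn join_non_empty(parts: &[&str]) -> String {
    parts
        .iter()
        .copied()
        .filter(|s| !s.is_empty()) // keep what a `reject` would not reject
        .collect::<Vec<&str>>()
        .join("/")
}
```

Calling it with `&["", "something", "etc"]` gives `"something/etc"`.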

@Gankra
Contributor Author

Gankra commented Oct 25, 2017

Drain is novel terminology because it represented a fourth kind of ownership in Rust that only applies to containers, while also generally being a meaningless distinction in almost any other language (in the absence of move semantics, there is no need to combine iteration and removal into a single "atomic" operation).

Although drain_filter moves the drain terminology into a space that other languages would care about (since avoiding backshifts is relevant in all languages).
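
To illustrate what "avoiding backshifts" means, here is a safe single-pass sketch. It is not how the standard library implements this; the helper name `extract_compacting` is hypothetical, and unlike the real method it does not return the removed elements in their original order.

```rust
/// Hypothetical helper: removes every element matching `pred` in one pass,
/// moving each kept element at most once instead of shifting the tail on
/// every removal.
fn extract_compacting<T, F>(vec: &mut Vec<T>, mut pred: F) -> Vec<T>
where
    F: FnMut(&mut T) -> bool,
{
    let mut write = 0;
    for read in 0..vec.len() {
        if !pred(&mut vec[read]) {
            // Move the kept element into its final slot.
            vec.swap(write, read);
            write += 1;
        }
    }
    // Kept elements now occupy [0, write) in their original order;
    // everything from `write` onward matched the predicate.
    vec.split_off(write)
}
```

Repeated `Vec::remove` calls cost O(n) each, so a loop built on them is quadratic in the worst case, while this pass is linear.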

@kennytm kennytm changed the title Tracking issue for Vec::drain_filter Tracking issue for Vec::drain_filter and LinkedList::drain_filter Nov 27, 2017
@polarathene

I came across drain_filter in the docs as a Google result for rust consume vec. I know that, due to immutability by default in Rust, filter doesn't consume the data; I just couldn't recall how to approach it so I could manage memory better.

drain_where is nice, but as long as the user is aware of what drain and filter do, I think it's clear that the method drains the data based on a predicate filter.

@jonhoo
Contributor

jonhoo commented Dec 3, 2017

I still feel as though drain_filter implies that it drains (i.e., empties) and then filters. drain_where on the other hand sounds like it drains the elements where the given condition holds (which is what the proposed function does).

@tmccombs
Contributor

tmccombs commented Dec 7, 2017

Shouldn't linked_list::DrainFilter implement Drop as well, to remove any remaining elements that match the predicate?

@Gankra
Contributor Author

Gankra commented Dec 7, 2017

Yes

bors added a commit that referenced this issue Dec 9, 2017
Add Drop impl for linked_list::DrainFilter

This is part of #43244. See #43244 (comment)
@photino

photino commented Dec 10, 2023

I would prefer drain_while, since we already have drain, skip_while, take_while, wait_while.

And the documentation also says that using this method is equivalent to the following code:

let mut i = 0;
while i < vec.len() {
    if some_predicate(&mut vec[i]) {
        let val = vec.remove(i);
        // your code here
    } else {
        i += 1;
    }
}
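
A runnable version of that loop, wrapped in a hypothetical helper (`extract_matching` is not a std function) that collects the removed values. Because the predicate receives `&mut T`, it can also modify the elements it decides to keep.

```rust
/// Hypothetical helper built from the documented equivalent loop.
fn extract_matching<T, F>(vec: &mut Vec<T>, mut some_predicate: F) -> Vec<T>
where
    F: FnMut(&mut T) -> bool,
{
    let mut removed = Vec::new();
    let mut i = 0;
    while i < vec.len() {
        if some_predicate(&mut vec[i]) {
            // Do not advance `i`: the next element has moved into slot `i`.
            let val = vec.remove(i);
            removed.push(val);
        } else {
            i += 1;
        }
    }
    removed
}
```

For example, with `vec![3, 1, 2]` and a predicate that decrements each counter and matches when it reaches zero, the helper returns `[0]` and leaves `[2, 1]` behind.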

@tguichaoua
Contributor

tguichaoua commented Dec 10, 2023

> I would prefer drain_while, since we already have drain, skip_while, take_while, wait_while.
>
> And the documentation also says that using this method is equivalent to the following code:
>
> let mut i = 0;
> while i < vec.len() {
>     if some_predicate(&mut vec[i]) {
>         let val = vec.remove(i);
>         // your code here
>     } else {
>         i += 1;
>     }
> }

From the take_while and skip_while documentation:

After false is returned, take_while()’s job is over,

After false is returned, skip_while()’s job is over,

The _while suffix implies its job is over after the first time the predicate returns false,
something like this:

while predicate(...) {
    // do job
}
// stop job

But as you mention the equivalent code of extract_if/drain_filter/drain_if is different: it doesn't use the predicate in the loop condition.
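
The same distinction shows up between two stable iterator adapters. In this small sketch (function names chosen for the example), `take_while` stops at the first `false`, while `filter` tests every element, which is the behaviour `extract_if` has.

```rust
/// `_while` semantics: the job is over at the first `false`.
fn stop_at_first_false(data: &[i32]) -> Vec<i32> {
    data.iter().copied().take_while(|x| *x < 3).collect()
}

/// `_if` / filter semantics: every element is tested.
fn test_every_element(data: &[i32]) -> Vec<i32> {
    data.iter().copied().filter(|x| *x < 3).collect()
}
```

For `[1, 2, 5, 1, 2]` the first returns `[1, 2]` and the second `[1, 2, 1, 2]`.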

@photino

photino commented Dec 10, 2023

This is a subtle difference. I prefer drain_if now.

> The _while suffix implies its job is over after the first time the predicate returns false, something like this:
>
> while predicate(...) {
>     // do job
> }
> // stop job
>
> But as you mention the equivalent code of extract_if/drain_filter/drain_if is different: it doesn't use the predicate in the loop condition.

@rsalmei

rsalmei commented Dec 11, 2023

I also understand this func in terms of "drain". I think this word should appear in its name somehow.

@cybersoulK

cybersoulK commented Dec 11, 2023

drain/extract already conveys a continuous iteration.
drain_if or extract_if
drain_mapped or extract_mapped (#43244 (comment))

@the8472
Member

the8472 commented Dec 11, 2023

The methods were intentionally renamed because they behave differently compared to drain. Drain keeps draining if you drop the Drain struct. I.e. unlike most iterators it is not lazy.
extract_if on the other hand is lazy and only removes elements as long as the iterator gets spun.
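
The eager behaviour of `Vec::drain` can be observed on stable Rust. In this sketch (the function name is made up) the `Drain` is dropped after yielding a single element, yet the whole range is gone afterwards.

```rust
/// Takes one element out of `drain(1..4)` and then drops the iterator.
fn drop_drain_early(mut v: Vec<i32>) -> Vec<i32> {
    {
        let mut drain = v.drain(1..4);
        let _first = drain.next(); // consume only one element
    } // `drain` is dropped here; the rest of 1..4 is removed anyway
    v
}
```

`drop_drain_early(vec![0, 1, 2, 3, 4])` returns `[0, 4]`. An `extract_if` iterator dropped at the same point would leave the elements it had not visited yet in place.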

@rsalmei

rsalmei commented Dec 11, 2023

> The methods were intentionally renamed because they behave differently compared to drain. Drain keeps draining if you drop the Drain struct. I.e. unlike most iterators it is not lazy.
>
> extract_if on the other hand is lazy and only removes elements as long as the iterator gets spun.

I wasn't aware of that, thanks. It makes sense to be different then.

@Emilgardis
Contributor

Emilgardis commented Dec 11, 2023

Maybe the history section should be updated to highlight the name change and the reason for it a bit better.

@NexusXe

NexusXe commented Mar 18, 2024

Is it possible to point to this feature when a user is trying to use the drain_filter method? This seems like it could cause some confusion.

@rsalmei

rsalmei commented Sep 29, 2024

I'm eagerly anticipating the addition of this feature to the standard library. Could it be stabilized this year?

@joshtriplett joshtriplett added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Nov 18, 2024
@ebkalderon
Contributor

Quick question for anyone familiar with the internals of the unstable extract_if() methods: I noticed that all existing ExtractIf<'a, ...> iterators don't implement the FusedIterator trait. Since these iterators supposedly visit each element exactly once (unless the iterator is dropped partway through iteration, of course), it would seem that they are suitable candidates for implementing FusedIterator. Is this understanding correct? Or is there a reason these iterators aren't fused? Thanks!

@the8472
Member

the8472 commented Nov 18, 2024

They shouldn't suddenly restart, no. So it most likely is an oversight.
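
For readers unfamiliar with the trait: `FusedIterator` is a marker with no methods, so adding it is a one-line impl plus the promise that `next` keeps returning `None` once it has done so. A toy illustration with a made-up `Countdown` iterator, not the actual `ExtractIf` code:

```rust
use std::iter::FusedIterator;

/// Yields n, n-1, ..., 1 and then `None` forever.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0 + 1)
        }
    }
}

// Marker impl: states that `next` never yields `Some` again after `None`.
impl FusedIterator for Countdown {}
```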

@joshtriplett
Member

@rfcbot merge

@rfcbot concern vec-needs-a-range-argument

We discussed this in today's @rust-lang/libs-api meeting. We were in favor of stabilizing these two, and we have consensus on the current names. However, we would like to have a range parameter to Vec::extract_if.

@rfcbot

rfcbot commented Nov 19, 2024

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Nov 19, 2024
@the8472 the8472 self-assigned this Nov 19, 2024
@cuviper
Member

cuviper commented Nov 20, 2024

Not necessarily a blocking concern, but can we remove the where F: FnMut constraint on struct ExtractIf and its Drop and Debug? The drain-on-drop behavior was removed a while ago, so now we only need to be callable for Iterator.
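
A sketch of what that relaxation looks like, using a simplified stand-in type. `ExtractIfSketch` and its fields are hypothetical and use `Vec::remove` internally, which the real implementation does not need to do. The point is only where the `F: FnMut` bound appears.

```rust
use std::fmt;

/// Stand-in for `ExtractIf`: note there is no `where F: FnMut(..)` here.
struct ExtractIfSketch<'a, T, F> {
    vec: &'a mut Vec<T>,
    idx: usize,
    pred: F,
}

// Debug needs nothing from `F`, so it carries no bound on it.
impl<T: fmt::Debug, F> fmt::Debug for ExtractIfSketch<'_, T, F> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ExtractIfSketch")
            .field("remaining", &&self.vec[self.idx..])
            .finish()
    }
}

// Only iteration has to call the predicate, so only this impl is bounded.
impl<T, F> Iterator for ExtractIfSketch<'_, T, F>
where
    F: FnMut(&mut T) -> bool,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        while self.idx < self.vec.len() {
            if (self.pred)(&mut self.vec[self.idx]) {
                // O(n) shift per hit; good enough for a sketch.
                return Some(self.vec.remove(self.idx));
            }
            self.idx += 1;
        }
        None
    }
}
```

Since the sketch has no `Drop` impl, dropping it early simply leaves the unvisited elements in the vector, matching the lazy behaviour described earlier in the thread.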

@the8472
Member

the8472 commented Nov 20, 2024

Sure I'll add that to my PR.

@Amanieu
Member

Amanieu commented Nov 23, 2024

Also another note from the libs-api meeting is that we are still interested in solving more exotic use cases such as ExtractMapped proposed in #43244 (comment). However this may be better done via some form of cursor API rather than an iterator: you would need to manually increment the cursor, but it would give you precise control at each point whether to mutate or remove the value currently being pointed to by the cursor.

Such an API should be proposed as a separate ACP, with inspiration from linked list cursors (#58533) and B-Tree cursors (#107540).
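
To make the idea concrete, here is a rough sketch of what a cursor over a `Vec` might look like. Everything in it (`VecCursor`, `current`, `advance`, `remove_current`, `split_odds`) is invented for illustration and is not an existing or proposed std API.

```rust
/// Hypothetical cursor over a `Vec`.
struct VecCursor<'a, T> {
    vec: &'a mut Vec<T>,
    pos: usize,
}

impl<'a, T> VecCursor<'a, T> {
    fn new(vec: &'a mut Vec<T>) -> Self {
        VecCursor { vec, pos: 0 }
    }

    /// Element under the cursor, or `None` once past the end.
    fn current(&mut self) -> Option<&mut T> {
        self.vec.get_mut(self.pos)
    }

    /// Moves to the next element; the caller decides when to advance.
    fn advance(&mut self) {
        self.pos += 1;
    }

    /// Removes the current element; the cursor then points at its successor.
    fn remove_current(&mut self) -> Option<T> {
        if self.pos < self.vec.len() {
            Some(self.vec.remove(self.pos))
        } else {
            None
        }
    }
}

/// Usage: halve even numbers in place, extract odd ones as strings.
fn split_odds(v: &mut Vec<i32>) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = VecCursor::new(v);
    while let Some(x) = cur.current() {
        if *x % 2 == 0 {
            *x /= 2; // mutate and keep
            cur.advance();
        } else {
            // remove and map in one step
            out.extend(cur.remove_current().map(|n| n.to_string()));
        }
    }
    out
}
```

At each position the caller chooses between mutating, removing, or skipping, which is the precise control an iterator-based `extract_if` cannot offer.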
