Address issues with source with component names and replacing listeners #12701
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
When sourcing a component, the object from the original pipeline is added to the new pipeline as the same object. This creates a situation where there are several attributes that cannot be in sync between the original pipeline and the new pipeline at the same time for this one object:
component.name
component.listener_map
/component.listening_components
fortok2vec
andtransformer
When running
replace_listeners
on a component, the config is not updated correctly if the state of the component is incorrect for the current pipeline (in particular changes that should be applied frommodel.attrs["replace_listener_cfg"]
as used inspacy-transformers
) due to the fact that:find_listeners
relies oncomponent.name
to set the name in thelistener_map
replace_listeners
relies onlistener_map
to determine how to modify the configsIn addition, there are several places where pipeline components are modified and the listener map and/or internal component names aren't currently updated.
In cases where there is a component shared by two pipelines that cannot be in sync, this PR chooses to prioritize the most recently modified or initialized pipeline. There is no actual solution with the current
source
behavior that will make both pipelines usable, so the current pipeline is updated whenever components are added/renamed/removed or the pipeline is initialized for training.Types of change
Bug fix.
Checklist