Currently all default preprocessors are loaded, even when the requested features do not need them. This causes major delays (e.g. parsing runs even when no feature uses its output). A design change is needed so that each feature class can specify which preprocessors it requires before it is executed.
I just finished a first pass on re-designing the execution of ResourceProcessors.
ResourceProcessors are no longer initialized and executed directly from FeatureExtractor.java.
The initialization functions that were in FeatureExtractor.java have been moved to shef.mt.pipelines.DefaultResourcePipeline.
The superclass ResourcePipeline can now receive a list of the required resourceNames and fires only the ResourceProcessors that are defined with one of these resourceNames.
ResourceProcessors that want to be compatible with this should set the this.resourceName class variable to a resource name (e.g. "bparser").
The only ResourceProcessors that were executed by the existing FeatureExtractor were BParser and TopicModelling, so these are the only ones that have been modified to work with the pipeline system.
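To illustrate the filtering described above, here is a minimal sketch of a pipeline that runs only the processors whose resource name was requested. All class names here (SketchProcessor, SketchPipeline, etc.) and the "topicmodel" resource name are hypothetical stand-ins, not the actual classes or names in shef.mt.pipelines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a ResourceProcessor; not the project's real class.
abstract class SketchProcessor {
    protected String resourceName;      // set by each subclass, e.g. "bparser"
    boolean executed = false;

    String getResourceName() { return resourceName; }

    // Placeholder for the real (expensive) processing step.
    void process() { executed = true; }
}

class SketchParser extends SketchProcessor {
    SketchParser() { this.resourceName = "bparser"; }
}

class SketchTopicModel extends SketchProcessor {
    // "topicmodel" is an assumed name, used only for illustration.
    SketchTopicModel() { this.resourceName = "topicmodel"; }
}

// Hypothetical stand-in for ResourcePipeline.
class SketchPipeline {
    private final List<SketchProcessor> processors = new ArrayList<>();

    void register(SketchProcessor processor) { processors.add(processor); }

    // Runs only the processors whose resource name is in the required set
    // and returns the names of the processors that were actually run.
    List<String> run(Set<String> requiredResourceNames) {
        List<String> ran = new ArrayList<>();
        for (SketchProcessor processor : processors) {
            if (requiredResourceNames.contains(processor.getResourceName())) {
                processor.process();
                ran.add(processor.getResourceName());
            }
        }
        return ran;
    }
}
```

Note that in this sketch every processor is still constructed up front; only the call to process() is skipped for processors that were not requested.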
TODO: the solution above only avoids RUNNING the ResourceProcessors. Ideally, ResourceProcessors that are not needed should not be initialized at all (i.e. grammars and tables should not be loaded). This requires either adding a separate "initialize" function to each ResourceProcessor, or implementing some kind of Python-style dynamic class loading, which is tricky in Java.
According to the proposed design, every tool that implements the ResourceProcessor interface will have one additional mandatory function, called initialize. This function will take ONLY one parameter, the PropertiesManager, which is an object that holds all parameters read from the user's customized .properties file.
Each resource processor will now be responsible, within its own class, for acquiring the parameters it needs for its initialization, by asking the PropertiesManager for them directly.
This solves the problem that the resource processors had to be initialized "hard-coded" one by one in FeatureExtractor.java, since each of them had different initialization parameters.
It also requires that we modify the existing processors by moving their initialization code from the FeatureExtractor (or the Pipeline) back to the Processor class.
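The proposed initialize function could look roughly like the sketch below. It is only an illustration under assumptions: SketchPropertiesManager is a hypothetical stand-in for PropertiesManager (whose real API may differ), and the interface name, the processor class and the property key "sketch.parser.grammar" are made up for this example.

```java
import java.util.Properties;

// Hypothetical stand-in for PropertiesManager; the real class's API may differ.
class SketchPropertiesManager {
    private final Properties props;

    SketchPropertiesManager(Properties props) { this.props = props; }

    String getString(String key) { return props.getProperty(key); }
}

// Hypothetical shape of the interface with the proposed mandatory method.
interface InitializableProcessor {
    void initialize(SketchPropertiesManager properties);
    boolean isInitialized();
}

class SketchGrammarProcessor implements InitializableProcessor {
    private String grammarPath;          // resolved only when initialize() is called
    private boolean initialized = false;

    @Override
    public void initialize(SketchPropertiesManager properties) {
        // The processor asks for its own parameter; the key name is an assumption.
        grammarPath = properties.getString("sketch.parser.grammar");
        if (grammarPath == null) {
            throw new IllegalStateException("missing property: sketch.parser.grammar");
        }
        // A real processor would load its grammar or tables here.
        initialized = true;
    }

    @Override
    public boolean isInitialized() { return initialized; }

    String getGrammarPath() { return grammarPath; }
}
```

With this shape, the caller needs no processor-specific knowledge: it passes the same properties object to every processor it decides to initialize, and constructing a processor stays cheap because nothing is loaded until initialize is called.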