Load/run only the preprocessors required by the requested features #5

lefterav · 2013-01-29T10:50:37Z

Currently all default preprocessors are loaded, even if some of them are not required by the requested features. This causes major delays (e.g. when parsing runs although no requested feature needs it). Some design changes are required so that each feature class can specify which preprocessors it needs before it is executed.

lefterav · 2013-02-27T14:00:14Z

We have a version of this in the resource-manager branch. We need to test it shortly, and then it will be ready to merge into master.

lefterav · 2013-02-28T21:48:01Z

I just finished a first pass at redesigning the execution of ResourceProcessors.
ResourceProcessors are no longer initialized and executed directly from FeatureExtractor.java.
The initialization functions that were in FeatureExtractor.java have been moved to shef.mt.pipelines.DefaultResourcePipeline.
The superclass ResourcePipeline can now receive a list of the required resourceNames and fires only the ResourceProcessors that are defined with one of these resource names.
ResourceProcessors that want to be compatible with this should set the this.resourceName class variable to a resource name (e.g. "bparser").
The only ResourceProcessors that were executed by the existing FeatureExtractor were BParser and TopicModelling, so these are the only ones that have been modified to work with the pipeline system.
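
To illustrate the filtering described above, here is a minimal sketch of a pipeline that fires only the processors whose resource name was requested. It is not the code in the resource-manager branch: the class names SketchProcessor, SketchParser, SketchTopicModel and SketchPipeline, the `calls` counter, and the resource name "topicmodel" are all hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for a ResourceProcessor.
abstract class SketchProcessor {
    protected String resourceName; // set by each subclass, e.g. "bparser"
    int calls = 0;                 // counts executions, for illustration only

    String getResourceName() {
        return resourceName;
    }

    void process(String sentence) {
        calls++; // a real processor would parse, tag, etc.
    }
}

class SketchParser extends SketchProcessor {
    SketchParser() {
        this.resourceName = "bparser";
    }
}

class SketchTopicModel extends SketchProcessor {
    SketchTopicModel() {
        this.resourceName = "topicmodel"; // assumed name, not from the project
    }
}

// Hypothetical pipeline: keeps only processors whose name was requested.
class SketchPipeline {
    private final List<SketchProcessor> active = new ArrayList<>();

    SketchPipeline(List<SketchProcessor> all, Set<String> requiredNames) {
        for (SketchProcessor p : all) {
            if (requiredNames.contains(p.getResourceName())) {
                active.add(p);
            }
        }
    }

    void run(String sentence) {
        for (SketchProcessor p : active) {
            p.process(sentence);
        }
    }
}
```

With a required set containing only "bparser", calling run leaves the topic-model processor untouched. Note that in this sketch both processors are still constructed, which is exactly the limitation discussed in the TODO.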

TODO: the above-mentioned solution only avoids RUNNING the ResourceProcessors. ResourceProcessors that are not required should not be initialized at all (i.e. grammars and tables should not be loaded). This requires either adding a separate "initialize" function to each of the ResourceProcessors, or implementing some kind of pythonic dynamic class loading, which is tricky in Java.
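
For the second option, dynamic loading can be approximated in Java with reflection. The following is only a sketch under assumptions: LazyRegistry, LazyParser and LazyTopicModel are hypothetical names, and it assumes that every processor has a no-argument constructor, which the existing processors may not have.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical processors; the constructor stands in for an expensive load.
class LazyParser {
    public LazyParser() {
        // grammars would be loaded here
    }
}

class LazyTopicModel {
    public LazyTopicModel() {
        // topic tables would be loaded here
    }
}

// Hypothetical registry: maps resource names to class names and
// instantiates only the classes that were actually requested.
class LazyRegistry {
    private final Map<String, String> classNames = new LinkedHashMap<>();

    void register(String resourceName, String className) {
        classNames.put(resourceName, className);
    }

    Map<String, Object> instantiate(Set<String> requiredNames) {
        Map<String, Object> created = new LinkedHashMap<>();
        for (String name : requiredNames) {
            String className = classNames.get(name);
            if (className == null) {
                throw new IllegalArgumentException("Unknown resource: " + name);
            }
            try {
                // Reflection: the class is loaded and constructed only now.
                Object instance = Class.forName(className)
                        .getDeclaredConstructor()
                        .newInstance();
                created.put(name, instance);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot load " + className, e);
            }
        }
        return created;
    }
}
```

A caller would register each processor once, for example `registry.register("bparser", LazyParser.class.getName())`, and then pass the set of required resource names to instantiate. Processors with differing constructor parameters do not fit this scheme directly, which is one reason the separate "initialize" function may be the simpler route.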

lefterav · 2013-05-13T10:14:23Z

According to the proposed design, every tool that implements the ResourceProcessor interface will have one additional mandatory function, called initialize. This function will have ONLY one parameter, the PropertiesManager, which is an object that holds all parameters read from the user's customized .properties file.

Each resource processor will now be responsible, within its own class, for acquiring the parameters it needs for its initialization by asking the PropertiesManager for them directly.

This will solve the problem that the resource processors had to be initialized "hard-coded", one by one, in FeatureExtractor.java, since each of them had different initialization parameters.

This will also require that we modify the existing processors by moving their initialization code from the FeatureExtractor (or the Pipeline) back into the Processor class.
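
A rough sketch of what this design could look like follows. It is an assumption-laden illustration, not the final interface: java.util.Properties stands in for the PropertiesManager, and ConfigurableProcessor, GrammarProcessor, TopicProcessor, InitializingPipeline and the property keys "bparser.grammar" and "topic.model" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical version of the proposed interface. java.util.Properties
// is used here in place of the project's PropertiesManager.
interface ConfigurableProcessor {
    String getResourceName();
    void initialize(Properties settings);
    boolean isInitialized();
}

class GrammarProcessor implements ConfigurableProcessor {
    private String grammarPath;

    public String getResourceName() {
        return "bparser";
    }

    // The processor asks for its own parameters; the key is an assumption.
    public void initialize(Properties settings) {
        grammarPath = settings.getProperty("bparser.grammar");
        if (grammarPath == null) {
            throw new IllegalStateException("Missing property: bparser.grammar");
        }
    }

    public boolean isInitialized() {
        return grammarPath != null;
    }

    String getGrammarPath() {
        return grammarPath;
    }
}

class TopicProcessor implements ConfigurableProcessor {
    private String modelPath;

    public String getResourceName() {
        return "topicmodel"; // assumed name
    }

    public void initialize(Properties settings) {
        modelPath = settings.getProperty("topic.model"); // assumed key
        if (modelPath == null) {
            throw new IllegalStateException("Missing property: topic.model");
        }
    }

    public boolean isInitialized() {
        return modelPath != null;
    }
}

// Hypothetical helper: initializes only the processors that are required.
class InitializingPipeline {
    static List<ConfigurableProcessor> initializeRequired(
            List<ConfigurableProcessor> all,
            Set<String> requiredNames,
            Properties settings) {
        List<ConfigurableProcessor> ready = new ArrayList<>();
        for (ConfigurableProcessor p : all) {
            if (requiredNames.contains(p.getResourceName())) {
                p.initialize(settings);
                ready.add(p);
            }
        }
        return ready;
    }
}
```

Because every processor exposes the same initialize signature, the pipeline no longer needs to know which parameters each one takes, and a processor that is not required never reads its settings or loads its resources.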

I hope you approve of this change.

lefterav mentioned this issue Feb 22, 2013

dynamic feature extraction according to the feature xml file #17

Closed

ghost assigned lupo01 Feb 26, 2013
