The format of this file is based on Keep a Changelog, and this project uses Semantic Versioning.
- NameExtractor previously raised a hard-to-understand error message when the location list passed to the Pipeline was empty. It now handles this edge case correctly without raising an exception. (#5, with contributions by @konstin)
0.3.0 (2017-07-07)
- Support for Python 3.4 and later.
- The constructor of PatternExtractor now accepts raw (not compiled) regex strings.
- All definitions from the geoextract.splitters module have been moved to the geoextract module.
- The old Normalizer class has been renamed to BasicNormalizer. Normalizer is now an abstract base class from which BasicNormalizer derives.
- KeyFilterPostprocessor now derives from the new abstract base class Postprocessor.
- WhitespaceSplitter now derives from the new abstract base class Splitter.
- All abstract base classes for pipeline components (Splitter, Normalizer, Extractor, Validator, Postprocessor) now derive from a common abstract base class Component.
- BasicNormalizer.normalize now removes leading and trailing whitespace at the end of the normalization process.
- Normalization, splitting and validation can now be completely disabled by passing False for the corresponding argument of the Pipeline constructor.
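
The class layout described above can be sketched as follows. This is only an illustration built on Python's standard abc and re modules, not geoextract's actual code: the class names mirror the changelog entries, but the constructor argument `subs` and all method bodies are assumptions made for this example.

```python
import re
from abc import ABC, abstractmethod


class Component(ABC):
    """Hypothetical common base class for all pipeline components."""


class Normalizer(Component):
    """Hypothetical abstract interface for string normalizers."""

    @abstractmethod
    def normalize(self, s):
        """Return the normalized form of ``s``."""


class BasicNormalizer(Normalizer):
    """Illustrative normalizer: substitutions first, whitespace stripped last."""

    def __init__(self, subs=None):
        # ``subs`` is an assumed argument: (pattern, replacement) pairs.
        # Raw pattern strings and compiled patterns are both accepted,
        # because re.compile() returns a compiled pattern unchanged.
        self.subs = [(re.compile(p), r) for p, r in (subs or [])]

    def normalize(self, s):
        for pattern, replacement in self.subs:
            s = pattern.sub(replacement, s)
        # Strip leading and trailing whitespace as the final step.
        return s.strip()
```

In this sketch, `BasicNormalizer([(r"\s+", " ")]).normalize("  Main   Street \n")` collapses the inner whitespace and strips the ends, while instantiating the abstract `Normalizer` directly raises a `TypeError`.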
- WindowExtractor and its subclasses ignored the first word of some strings.
- BasicNormalizer (previously called Normalizer) failed when no substitutions were given to the constructor.
- PostalExtractor has been removed.
0.2.0 (2017-05-31)
- Simple web app for providing geo extraction as a web service.
- geoextract.NameValidator is a simple validator that ensures that the named locations referred to by an extracted location (for example a street name) actually exist in the location database.
- geoextract.Normalizer is a versatile string normalizer.
- geoextract.KeyFilterPostprocessor is a simple postprocessor that only keeps
- GeoExtract now uses a pipeline architecture that covers all aspects of the location extraction process.
- geoextract.NameExtractor doesn't take the target names as a constructor argument anymore; instead, they are automatically provided by the pipeline.
- The module geoextract.preprocessing was renamed to geoextract.splitters, and the geoextract.preprocessing.split_components function has been refactored into the geoextract.splitters.WhitespaceSplitter class.
- geoextract.reduce_locations and geoextract.unique_dicts were merged into the pipeline architecture and aren't available as standalone functions anymore.
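
The move from a standalone splitting function to a splitter class, as described for 0.2.0, follows a common refactoring pattern that can be sketched as below. The names are borrowed from the entries above, but the method name `split` and the splitting rule (plain whitespace) are assumptions for illustration only; the rules geoextract actually applies are not described in this file.

```python
def split_components(text):
    """Hypothetical function-style splitter (assumed rule: any whitespace)."""
    return text.split()


class WhitespaceSplitter:
    """Hypothetical class-style equivalent of the function above."""

    def split(self, text):
        # Same assumed rule, now behind a method so that a pipeline can
        # hold a splitter object and swap in other implementations.
        return text.split()
```

The benefit of the class form is that callers depend only on an object with a `split` method, so alternative splitters can be passed in without changing the calling code.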
0.1.0 (2017-05-02)
First release.