Releases: materialsproject/maggma

Added a new JointStore

01 Oct 20:58
  • A new JointStore was added. This Store will aggregate across stores to create a single interface for querying.
  • Fixed issues with tqdm and interactive shells
  • Switched the order of properties and criteria in query to match that of regular pymongo find calls (criteria first, then properties)
  • Fixed and simplified the JSON validator
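
The argument-order change mirrors pymongo, whose Collection.find takes the filter first and the projection second. A minimal sketch of that ordering is shown here. The query function is a toy in-memory stand-in written for illustration, not Maggma's implementation, and the document fields are made up.

```python
def query(docs, criteria=None, properties=None):
    """Toy stand-in: filter by equality criteria, then project properties.

    Argument order follows pymongo's find(filter, projection).
    """
    criteria = criteria or {}
    for doc in docs:
        if all(doc.get(k) == v for k, v in criteria.items()):
            if properties is None:
                yield dict(doc)
            else:
                yield {k: doc[k] for k in properties if k in doc}


# Hypothetical documents, for illustration only.
docs = [
    {"task_id": 1, "formula": "Si", "energy": -5.4},
    {"task_id": 2, "formula": "Fe", "energy": -8.3},
]

# Criteria come first, properties second.
result = list(query(docs, {"formula": "Si"}, ["task_id", "energy"]))
```

Code that passed both arguments positionally in the old order would need to swap them; passing them by keyword avoids the issue.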

Improved Incremental Building; Added MapBuilder example

22 Aug 22:19
  • Added MapBuilder -- simply write a function to map a source store document to a target store document. Supports automatic incremental building at the document level that is resilient to builder interruptions. Intended usage: subclass MapBuilder.
  • Added GroupBuilder as an extension of MapBuilder -- group source docs and produce one target doc from each group.
  • Breaking change: the Runner parameter num_workers was renamed to max_workers to better reflect its function.
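
The idea behind document-level incremental building can be sketched without Maggma at all. The example below is a hedged illustration, not the MapBuilder implementation: stores are plain dicts, incremental_map and to_summary are hypothetical names, and the "last_updated" field is assumed to be the timestamp used to decide what is stale.

```python
from datetime import datetime


def incremental_map(source, target, map_fn, lu_field="last_updated"):
    """Map source docs to target docs, skipping ones already up to date.

    Each target doc is written as soon as it is produced, so an
    interruption only loses the document that was in flight.
    """
    processed = []
    for key, doc in source.items():
        existing = target.get(key)
        if existing is not None and existing[lu_field] >= doc[lu_field]:
            continue  # target copy is already current
        new_doc = dict(map_fn(doc))
        new_doc[lu_field] = doc[lu_field]
        target[key] = new_doc
        processed.append(key)
    return processed


def to_summary(doc):
    # Hypothetical mapping function: one source doc in, one target doc out.
    return {"n_sites": len(doc["sites"])}


source = {
    "a": {"sites": [1, 2], "last_updated": datetime(2020, 1, 1)},
    "b": {"sites": [1, 2, 3], "last_updated": datetime(2020, 1, 2)},
}
target = {}

first = incremental_map(source, target, to_summary)   # builds "a" and "b"
second = incremental_map(source, target, to_summary)  # nothing to do

# Updating one source doc makes only that doc stale.
source["a"] = {"sites": [1], "last_updated": datetime(2020, 2, 1)}
third = incremental_map(source, target, to_summary)   # rebuilds only "a"
```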

Added MongograntStore

07 Jun 18:15

Using mongogrant allows serialization of builders/runners without username/password credentials.

Refined GridFS store

01 May 21:52
8f9968b

We've got a refined GridFS store from the Alsace region. It's of mint variety and properly stores keys in the metadata and handles gzip compression internally.

  • GridFS Store is now compliant with mongo specs
  • GridFS store also handles compression internally
  • Added in progress bars
  • Fixed a potential race condition in the multiprocessing
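
As a rough illustration of what "keys in the metadata plus internal gzip compression" can look like, here is a standard-library sketch. It does not use GridFS or Maggma; pack, unpack, the "compression" metadata flag, and the sample document are all assumptions made for the example.

```python
import gzip
import json


def pack(doc, key="task_id"):
    """Compress a document and keep its key in a separate metadata dict."""
    payload = gzip.compress(json.dumps(doc).encode("utf-8"))
    metadata = {key: doc[key], "compression": "gzip"}
    return payload, metadata


def unpack(payload, metadata):
    """Reverse pack(), decompressing only if the metadata says so."""
    if metadata.get("compression") == "gzip":
        payload = gzip.decompress(payload)
    return json.loads(payload.decode("utf-8"))


# Hypothetical document with a large, repetitive body.
doc = {"task_id": "example-1", "data": [0.0] * 1000}
payload, metadata = pack(doc)
restored = unpack(payload, metadata)
```

Keeping the key outside the compressed payload means a lookup by key never has to decompress anything; only the matching blob is decompressed on read.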

New S3 Store and some fixes

31 Mar 19:32
2ee043f

We're starting to expand functionality. One of the keys for Maggma will be connecting to many different data sources while leveraging the powerful querying in MongoDB. The GridFS concept of an index plus bucket storage maps really well onto other cloud storage platforms that we hope to support.

Changes:

  • Added an AmazonS3 Store
  • Added from_collection in mongostore to instantiate a MongoStore from a PyMongo Collection object
    Warning: This object will not serialize and deserialize properly, so don't expect it to work with MPI building or saving to files.
  • Fixed distinct and groupby to be applicable to all stores.

Almost halfway to 1.0

07 Mar 15:35

Another major release. Almost halfway to 1.0!

  • Both MPI and multiprocessing have been overhauled again, this time to make them more test-friendly
  • Test coverage has increased significantly, from below 50% to ~70%
  • There is a new schema system to automatically validate a collection
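
To make the idea of validating a collection against a schema concrete, here is a deliberately small sketch using only the standard library. It is not Maggma's schema system: the schema format (field name mapped to a Python type), the validate function, and the sample documents are assumptions for illustration.

```python
def validate(doc, schema):
    """Return a list of problems found in doc; an empty list means valid."""
    errors = []
    for field, expected_type in schema.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return errors


# Hypothetical schema and collection.
schema = {"task_id": str, "energy": float}
collection = [
    {"task_id": "a", "energy": -1.5},
    {"task_id": "b", "energy": "n/a"},
    {"energy": -2.0},
]

# Map each document's position to its list of problems, keeping only failures.
invalid = {
    i: errs
    for i, doc in enumerate(collection)
    if (errs := validate(doc, schema))
}
```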

Third Major Release

01 Feb 23:28

We're on our 3rd major release and chugging along. We'll try and adopt a monthly release schedule for now, to ensure we clean up updates in a timely manner.

Major changes since v0.2.0:

  • Removed processor serialization from the runner object to make the JSON output more concise
  • Added a group_by for mongolike Stores
  • Added a VaultStore that uses Vault to grab MongoStore credentials
  • Removed LAVA, as it was not being worked on and was unnecessarily cumbersome
  • Added a way to setup a Schema for a collection and assign that to a Store
  • Updated the multiprocessing to use Pool.imap and take advantage of its worker refreshing
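
For readers unfamiliar with worker refreshing, the standard library exposes it through the maxtasksperchild argument of multiprocessing.Pool: each worker process is replaced after it has completed that many tasks. The sketch below shows the pattern in isolation; process_item, run, and the parameter values are hypothetical and not taken from Maggma's runner.

```python
from multiprocessing import Pool


def process_item(x):
    # Stand-in for real per-item work.
    return x * x


def run(items, num_procs=2, tasks_per_worker=10):
    """Process items lazily with imap, refreshing workers periodically."""
    # maxtasksperchild replaces a worker after it finishes that many tasks,
    # which limits the effect of memory growth in long-running workers.
    with Pool(processes=num_procs, maxtasksperchild=tasks_per_worker) as pool:
        return list(pool.imap(process_item, items))


if __name__ == "__main__":
    print(run(range(5)))
```

Pool.imap yields results in input order while pulling items from the iterable incrementally, so the input does not need to be materialized up front.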

Second major release

01 Jan 23:07

Second major release of Maggma. We'll try and keep monthly releases. Lots of updates this time around.

Major changes since v0.1.0:

  • Changed the default lu_field to "last_updated" from "_lu"
  • Added a default indexing key to Store; it can also be a list of keys
  • Added query, query_one, and distinct functions to Store
  • Updated to a single mixin class for all mongo-like stores
  • Added a CLI command to run a runner that has been serialized to a JSON object
  • Changed lu_key to lu_type, which defaults to datetime dict formats and can also be set to isoformat
  • Added a GridFS store to be able to access documents larger than 16MB
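
The difference between the two last-updated formats mentioned above can be sketched as a pair of converters. This is an illustration only: LU_CONVERTERS, encode_lu, and decode_lu are hypothetical names, and how Maggma stores the value internally is not shown here.

```python
from datetime import datetime

# For each lu_type: (encode for storage, decode back to a datetime).
LU_CONVERTERS = {
    "datetime": (lambda dt: dt, lambda value: value),
    "isoformat": (lambda dt: dt.isoformat(), datetime.fromisoformat),
}


def encode_lu(dt, lu_type="datetime"):
    """Convert a datetime into the stored representation for lu_type."""
    return LU_CONVERTERS[lu_type][0](dt)


def decode_lu(value, lu_type="datetime"):
    """Convert a stored last-updated value back into a datetime."""
    return LU_CONVERTERS[lu_type][1](value)


stamp = datetime(2020, 5, 17, 12, 30)          # arbitrary example timestamp
as_text = encode_lu(stamp, "isoformat")        # ISO 8601 string
round_trip = decode_lu(as_text, "isoformat")   # back to a datetime
```

Decoding to a datetime before comparing keeps "is this document newer?" checks independent of how the timestamp is stored.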

First Release: v0.1.0

09 Nov 02:31

First release of Maggma. This is a generic framework to build data aggregation and analysis pipelines using MongoDB and Python. The goal is to make it easier to write algorithms that scale.