
Application Workflow

Gregory Wiedeman edited this page Sep 9, 2021 · 4 revisions

This is a high-level description of the application's intended workflow, i.e. what the application does, in sequence.

Within this document, "mail source" means any directory structure, file, or other object that contains or represents an email mailbox to be processed and bagged.

Argument processing

Mailbag will accept all of the arguments that bagit accepts, plus mailbag-specific arguments. To reduce work, we plan to reuse the argparse.ArgumentParser configured in bagit-python.

The application will import bagit as a module and then call bagit._make_parser. It will then add an argument group containing all the mailbag-specific arguments. Note that since mailbag will not be bagging in place, the directory argument from bagit will mean "where you want the processed bag directory to be created", and a new mailbag-specific argument will point at the mail source.
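A minimal sketch of the argument-group step, under stated assumptions: the option names `--mail-source` and `--mail-type`, the list of type choices, and the helper `add_mailbag_arguments` are hypothetical and not taken from the mailbag code. To keep the example runnable without bagit installed, a plain `argparse.ArgumentParser` with a `directory` positional stands in for the parser that `bagit._make_parser()` would return.

```python
import argparse


def add_mailbag_arguments(parser):
    """Attach a group of mailbag-specific options to an existing parser.

    The option names below are illustrative assumptions only.
    """
    group = parser.add_argument_group("Mailbag options")
    group.add_argument(
        "--mail-source",
        required=True,
        help="path to the mail source to be processed (hypothetical name)",
    )
    group.add_argument(
        "--mail-type",
        choices=["mbox", "eml", "msg", "pst"],
        default=None,
        help="explicit mail source type; guessed when omitted (hypothetical name)",
    )
    return parser


# Stand-in for the parser bagit would supply; in the application this
# would be the object returned by bagit._make_parser() instead.
parser = argparse.ArgumentParser(prog="mailbag")
parser.add_argument("directory", nargs="+", help="where to create the processed bag")

add_mailbag_arguments(parser)
args = parser.parse_args(["out_bag", "--mail-source", "inbox.mbox"])
```

Keeping the mailbag options in their own group leaves the bagit arguments untouched, so they can be passed through to bagit later without translation.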

Mailbox parsing

Mailbag will determine the type of the mail source based on either an explicit argument describing what type of mail source is being processed, or code that looks at the mail source and picks a type. It will then process the mail source, retrieving the information necessary to create the mailbag.csv file (see draft spec Mailbag Specification_prerelease) and to process and distribute attachments and derivative formats.
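The choice between an explicit type and a detected one could look roughly like the sketch below. It is an illustration only: the function name `pick_mail_type`, the suffix table, and the decision to guess from the file extension are all assumptions; the real detection code may inspect the mail source's contents or directory layout instead.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to mail source type.
KNOWN_SUFFIXES = {".mbox": "mbox", ".eml": "eml", ".msg": "msg", ".pst": "pst"}


def pick_mail_type(mail_source, explicit_type=None):
    """Return the mail source type, preferring an explicit argument."""
    if explicit_type is not None:
        # The caller told us the type, so no detection is attempted.
        return explicit_type
    suffix = Path(mail_source).suffix.lower()
    if suffix not in KNOWN_SUFFIXES:
        raise ValueError(f"cannot determine mail type for {mail_source!r}")
    return KNOWN_SUFFIXES[suffix]
```

Raising an error when the guess fails, rather than defaulting to some type, keeps a wrong guess from silently producing an incorrect mailbag.csv.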

Copying and preparation for bagging

Mailbag will create a folder at the location specified by the "directory" argument. It will create the appropriate file structure, write out mailbag.csv, copy attachments as necessary, and complete any other mailbag-specific tasks required before a bag can be created.
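The preparation step might be sketched as follows. The column names (`message_id`, `subject`), the `attachments` subfolder, and the function `prepare_bag_directory` are placeholders invented for this example; the actual columns and layout are defined by the draft specification and are not reproduced here.

```python
import csv
import shutil
from pathlib import Path


def prepare_bag_directory(directory, rows, attachments=()):
    """Create the target folder, write mailbag.csv, and copy attachments.

    Column names and folder layout are illustrative assumptions.
    """
    target = Path(directory)
    # Refuse to reuse an existing folder, since mailbag does not bag in place.
    target.mkdir(parents=True, exist_ok=False)

    with open(target / "mailbag.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["message_id", "subject"])
        writer.writeheader()
        writer.writerows(rows)

    attachment_dir = target / "attachments"
    attachment_dir.mkdir()
    for source in attachments:
        # copy2 preserves timestamps along with the file contents.
        shutil.copy2(source, attachment_dir / Path(source).name)
    return target
```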

Run bagit

Finally, mailbag will call bagit.make_bag on the location specified by the "directory" argument, passing along all bagit arguments from the parser created in the first step.
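A hedged sketch of this hand-off is shown below. It assumes the parsed namespace carries attributes named `bag_info`, `processes`, and `checksums`, which is how the bagit-python command-line tool is understood to name them; this should be checked against the installed bagit version. The wrapper `run_bagit` is hypothetical, and it takes the bagging function as a parameter so the example runs without bagit installed.

```python
def run_bagit(args, make_bag):
    """Call make_bag once per target directory, forwarding bagit options.

    `make_bag` is expected to be bagit.make_bag; the attribute names read
    from `args` are assumptions about the bagit parser's namespace.
    """
    bags = []
    for directory in args.directory:
        bag = make_bag(
            directory,
            bag_info=getattr(args, "bag_info", None),
            processes=getattr(args, "processes", 1),
            checksums=getattr(args, "checksums", None),
        )
        bags.append(bag)
    return bags


# In the application this would be invoked roughly as:
#   import bagit
#   run_bagit(args, bagit.make_bag)
```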
