
Application Workflow

Mark Wolfe edited this page Sep 10, 2021 · 4 revisions

This page gives a high-level description of the application's intended workflow, i.e. what the application does, in sequence.

Within this document, "mail source" is a term meaning any type of directory structure, file, or other thing that contains or represents an email mailbox to be processed and bagged.

Argument processing

Mailbag will take all the arguments that bagit takes, plus mailbag-specific arguments. To reduce work, we plan to re-use the argparse.ArgumentParser configured in bagit-python.

The application will import bagit as a module, and then call bagit._make_parser. It will then add an argument group containing all the mailbag-specific arguments. Note that since mailbag will not be bagging in-place, the directory argument from bagit will mean "where you want the processed bag directory", and a new mailbag-specific argument will point at the mail source.
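As a rough sketch of the step above, the helper below adds a mailbag argument group to whatever parser it is given, and build_parser shows how it could be applied to the parser returned by bagit._make_parser. The flag names --mail-source and --mail-type, and both function names, are hypothetical placeholders and not settled parts of the design.

```python
import argparse


def add_mailbag_arguments(parser):
    """Attach a group of mailbag-specific options to an existing parser."""
    group = parser.add_argument_group("Mailbag arguments")
    # Hypothetical flag: points at the mail source to be processed.
    group.add_argument(
        "--mail-source",
        required=True,
        help="file or directory containing the mail to process",
    )
    # Hypothetical flag: explicit mail source type; None means "detect it".
    group.add_argument(
        "--mail-type",
        default=None,
        help="type of mail source (detected from the source if omitted)",
    )
    return parser


def build_parser():
    """Re-use the parser configured in bagit-python, then extend it."""
    import bagit  # third-party package: pip install bagit

    # _make_parser is a private bagit function, so it may change between releases.
    return add_mailbag_arguments(bagit._make_parser())
```

Keeping the extension step separate from the bagit import means the mailbag arguments can be exercised against a plain argparse.ArgumentParser without bagit installed.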

Mailbox parsing

The type of mail source will be determined by either (a) an explicit argument describing what type of mail source is being processed, or (b) code that looks at the mail source and picks a type. Mailbag will then process the mail source, retrieving the information necessary to create the mailbag.csv file (see draft spec Mailbag Specification_prerelease) and to process and distribute attachments and derivative formats.
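The two ways of choosing a type could be combined roughly as follows: an explicit type wins, otherwise the code inspects file extensions. The type names, the extension mapping, and the function name detect_source_type are all assumptions made for illustration; the real set of supported mail sources is not defined on this page.

```python
from pathlib import Path

# Hypothetical mapping from file extension to mail source type.
SUFFIX_TO_TYPE = {
    ".mbox": "mbox",
    ".pst": "pst",
    ".eml": "eml",
    ".msg": "msg",
}


def detect_source_type(mail_source, explicit_type=None):
    """Return the mail source type, preferring an explicit argument."""
    if explicit_type is not None:
        return explicit_type

    path = Path(mail_source)
    if path.is_file():
        candidates = [path]
    elif path.is_dir():
        # Walk the directory tree in a stable order and inspect every file.
        candidates = sorted(p for p in path.rglob("*") if p.is_file())
    else:
        raise FileNotFoundError(f"mail source not found: {mail_source}")

    for candidate in candidates:
        source_type = SUFFIX_TO_TYPE.get(candidate.suffix.lower())
        if source_type is not None:
            return source_type
    raise ValueError(f"could not determine mail source type for {mail_source}")
```

Extension sniffing is only one possible heuristic; a directory containing several formats would need a more careful rule than "first match wins".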

Copying and preparation for bagging

Mailbag will create a folder at the location specified by the "directory" arg. It will create the appropriate file structure, write out mailbag.csv, copy attachments as necessary, and complete every other mailbag-specific task required before a bag can be created.
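A minimal sketch of this preparation step, using only the standard library, might look like the following. The column names in FIELDS, the attachments subfolder, and the function name are placeholders; the actual columns and layout are defined by the Mailbag specification, not by this example.

```python
import csv
import shutil
from pathlib import Path

# Placeholder columns; the real ones come from the Mailbag specification.
FIELDS = ["Message-ID", "Subject"]


def prepare_bag_directory(directory, messages, attachments):
    """Create the target folder, write mailbag.csv, and copy attachments."""
    target = Path(directory)
    # Refuse to reuse an existing folder, since mailbag does not bag in place.
    target.mkdir(parents=True, exist_ok=False)

    with open(target / "mailbag.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(messages)

    # Hypothetical layout: all attachments in a single subfolder.
    attachment_dir = target / "attachments"
    attachment_dir.mkdir()
    for source in attachments:
        shutil.copy2(source, attachment_dir / Path(source).name)
    return target
```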

Run bagit

Finally, mailbag will call bagit.make_bag on the location specified by the "directory" arg, passing along all bagit arguments from the parser created in the first step.
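One way to pass the parsed arguments along is sketched below. It assumes the namespace carries bag_info, processes and checksums attributes, as bagit's own command-line entry point uses, and that directory may arrive as a list because bagit's parser accepts one or more directories. The helper names are hypothetical, and using only the first directory is a simplification.

```python
import argparse


def bagit_kwargs_from_args(args):
    """Collect keyword arguments for bagit.make_bag from parsed arguments."""
    return {
        "bag_info": getattr(args, "bag_info", None),
        "processes": getattr(args, "processes", 1),
        "checksums": getattr(args, "checksums", None),
    }


def first_directory(args):
    """Return a single target directory, whether parsed as a list or a string."""
    directory = args.directory
    if isinstance(directory, (list, tuple)):
        return directory[0]
    return directory


def run_bagit(args):
    """Bag the prepared directory using the bagit arguments from the parser."""
    import bagit  # third-party package: pip install bagit

    return bagit.make_bag(first_directory(args), **bagit_kwargs_from_args(args))
```

The getattr defaults mirror the defaults of bagit.make_bag, so the helper also works with a namespace that was not produced by bagit's parser.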