improve documentation of mail collectors and csv parser
based on user feedback
sebix committed Jul 8, 2024
1 parent dba94eb commit 9e55aff
Showing 1 changed file with 18 additions and 4 deletions.
22 changes: 18 additions & 4 deletions docs/user/bots.md
@@ -350,6 +350,7 @@ the line) or not. Defaults to true.
### Generic Mail URL Fetcher <div id="intelmq.bots.collectors.mail.collector_mail_url" />

Extracts URLs from e-mail messages and downloads the content from the URLs.
It uses the [`imbox`](https://github.com/martinrusev/imbox) library.

The resulting reports contain the following special fields:

@@ -360,6 +361,8 @@ The resulting reports contain the following special fields:
- `extra.email_message_id`: The email's message ID.
- `extra.file_name`: The file name of the downloaded file (extracted from the HTTP Response Headers if possible).

The fields can be used by parsers to identify the feed and are not automatically passed on to events.

**Chunking**

For line-based inputs the bot can split up large reports into smaller chunks. This is particularly important for setups
@@ -392,6 +395,10 @@ limitation set `chunk_size` to something like 384000000 (~384 MB).

(optional, boolean) Whether the mail server uses TLS or not. Defaults to true.

**`mail_starttls`**

(optional, boolean) Whether the mail server uses STARTTLS or not. Defaults to false.

**`folder`**

(optional, string) Folder in which to look for e-mail messages. Defaults to INBOX.
@@ -422,6 +429,7 @@ certificate is not found, the IMAP connection will fail on handshake. Defaults t
### Generic Mail Attachment Fetcher <div id="intelmq.bots.collectors.mail.collector_mail_attach" />

This bot collects messages from mailboxes and downloads the attachments.
It uses the [`imbox`](https://github.com/martinrusev/imbox) library.

The resulting reports contain the following special fields:

@@ -432,6 +440,8 @@ The resulting reports contains the following special fields:
- `extra.file_name`: The file name of the attachment, or the file name inside the attached archive if the attachment is
to be uncompressed.

The fields can be used by parsers to identify the feed and are not automatically passed on to events.

**Module:** `intelmq.bots.collectors.mail.collector_mail_attach`

**Parameters (also expects [feed parameters](#feed-parameters)):**
@@ -442,7 +452,7 @@ The resulting reports contains the following special fields:

**`mail_port`**

(optional, integer) IMAP server port: 143 without TLS, 993 with TLS. Defaults to 143.
(optional, integer) IMAP server port: 143 without TLS, 993 with TLS. Default depends on SSL setting.
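As an illustration of a default that "depends on SSL setting", the following is a minimal sketch and not the bot's actual code. The helper names `default_imap_port` and `resolve_port` are hypothetical; only the port numbers 143 and 993 come from the parameter description.

```python
from typing import Optional


def default_imap_port(use_ssl: bool) -> int:
    """Return the conventional IMAP port: 993 with TLS, 143 without."""
    return 993 if use_ssl else 143


def resolve_port(mail_port: Optional[int], use_ssl: bool = True) -> int:
    """Use an explicitly configured port, else fall back to the default."""
    if mail_port is not None:
        return mail_port
    return default_imap_port(use_ssl)


port_with_tls = resolve_port(None, use_ssl=True)
port_without_tls = resolve_port(None, use_ssl=False)
port_explicit = resolve_port(1143, use_ssl=True)
```

An explicitly configured `mail_port` always wins in this sketch; the SSL setting only matters when the port is left unset.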

**`mail_user`**

@@ -456,6 +466,10 @@ The resulting reports contains the following special fields:

(optional, boolean) Whether the mail server uses TLS or not. Defaults to true.

**`mail_starttls`**

(optional, boolean) Whether to use STARTTLS before authenticating to the server. Defaults to false.

**`folder`**

(optional, string) Folder in which to look for e-mail messages. Defaults to INBOX.
@@ -466,7 +480,7 @@ The resulting reports contains the following special fields:

**`attach_regex`**

(optional, string) Regular expression of the name of the attachment. Defaults to csv.zip.
(optional, string) All attachments which match this [regular expression](https://docs.python.org/3/library/re.html#re.search) will be processed. Defaults to `csv.zip`.
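Because the description links to `re.search`, the pattern is presumably matched anywhere in the file name, and an unescaped `.` matches any character. The sketch below shows what that implies for the default `csv.zip`. The helper `matching_attachments` and the sample file names are made up for illustration and are not part of the bot.

```python
import re


def matching_attachments(file_names, attach_regex="csv.zip"):
    """Return the names for which re.search finds the pattern anywhere."""
    pattern = re.compile(attach_regex)
    return [name for name in file_names if pattern.search(name)]


names = [
    "report.csv.zip",    # contains "csv.zip" literally
    "report.csv",        # no "zip" after "csv"
    "data-csv_zip.txt",  # "." in the pattern matches "_"
    "events.CSV.ZIP",    # matching is case-sensitive by default
]

# Default pattern: unanchored, and "." is a wildcard.
selected = matching_attachments(names)

# Stricter, hypothetical pattern: literal dots, anchored at the end.
strict = matching_attachments(names, r"\.csv\.zip$")
```

Under these assumptions the default also selects `data-csv_zip.txt`, so escaping the dots and anchoring the pattern may be worthwhile when file names are not fully under your control.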

**`extract_files`**

@@ -1697,8 +1711,8 @@ available with their index.

**`skip_header`**

(optional, boolean/integer) Whether to skip the first N lines of the input (True -> 1, False -> 0). Lines starting
with `#` will be skipped additionally, make sure you do not skip more lines than needed!
(optional, boolean/integer) Number of lines to skip at the start of the input (true is equivalent to 1, false to 0). Lines starting
with `#` are skipped in addition, so make sure you do not skip more lines than needed! Defaults to false/0.
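The interaction between `skip_header` and `#` lines can be sketched as follows. This is a simplified illustration, not the parser's implementation: the function `apply_skip_header` is hypothetical, and it assumes that comment lines are removed before the first N remaining lines are dropped.

```python
def apply_skip_header(text, skip_header=False):
    """Drop '#' lines, then skip the first N remaining lines.

    skip_header may be a boolean or an integer; int() maps
    True to 1 and False to 0.
    """
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    return lines[int(skip_header):]


sample = (
    "# generated 2024-07-08\n"
    "ip,port\n"
    "192.0.2.1,80\n"
    "192.0.2.2,443\n"
)

kept_default = apply_skip_header(sample)        # only the '#' line is dropped
kept_true = apply_skip_header(sample, True)     # the column header is dropped too
kept_two = apply_skip_header(sample, 2)         # one data row is lost as well
```

With the comment line already ignored, `skip_header: 2` removes a data row in this sketch, which is the situation the warning above is about.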

**`time_format`**
