MongoDB module #6283

Merged
merged 8 commits into from
Apr 10, 2018

Conversation

Contributor

@kvch kvch commented Feb 5, 2018

I added a new Filebeat module for MongoDB, which I created while testing the Filebeat module generator tooling.

This is the dashboard I created:
[Image: filebeat-mongodb-overview]

It does not look good, as I haven't been able to get real error messages from my instance.

@kvch kvch added the module, review, Filebeat, and in progress labels Feb 5, 2018
description: >
  Contains fields from MongoDB logs.
fields:
  - name: timestamp
Contributor

Should we use @timestamp for this?

Contributor Author

What do you mean?

Contributor

Defining timestamp here means it will be mongodb.log.timestamp in the event. But normally the timestamp parsed from the log line, i.e. the time the event happened, ends up in @timestamp. Is this a different timestamp here?

Contributor Author

I think a separate timestamp field is required, because it stores the actual time the log was generated. @timestamp is the time it was processed. These two times are not necessarily the same.

Contributor

We have similar behaviour in other modules, for example Kafka. There, the timestamp of the actual log line is stored in @timestamp, and the time the log line was read is stored in read_timestamp. The name read_timestamp is not great, but we don't have a better one yet.

If you look at the ingest pipeline of the Kafka module, you will see that the timestamps are moved around so that @timestamp holds the log line's own timestamp after processing.
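
The timestamp handling described here can be sketched roughly as follows. This is a minimal Python simulation, not the actual ingest pipeline: the helper name `move_timestamps`, the source field name `mongodb.log.timestamp` as a flat key, the ISO-8601 input format, and the sample values are all assumptions made for illustration.

```python
from datetime import datetime


def move_timestamps(event: dict, source_field: str = "mongodb.log.timestamp") -> dict:
    """Hypothetical helper: keep the read time in read_timestamp and
    put the log line's own time into @timestamp."""
    out = dict(event)
    # 1. Preserve the time the line was read by the shipper.
    out["read_timestamp"] = out["@timestamp"]
    # 2. Parse the timestamp extracted from the log line
    #    (assumed format: ISO-8601 with milliseconds and a UTC offset).
    parsed = datetime.strptime(out.pop(source_field), "%Y-%m-%dT%H:%M:%S.%f%z")
    # 3. Overwrite @timestamp with the time the event actually happened.
    out["@timestamp"] = parsed.isoformat(timespec="milliseconds")
    return out


# Illustrative event; the values are made up.
event = {
    "@timestamp": "2018-02-05T13:45:00.000+00:00",
    "mongodb.log.timestamp": "2018-02-05T13:44:56.657+0100",
}
result = move_timestamps(event)
```

After the call, `result["@timestamp"]` carries the time from the log line and `result["read_timestamp"]` carries the original read time, so neither value is lost.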

    Context of message
  example: initandlisten
  type: keyword
- name: message
Contributor

We should use the common message field here.

Contributor Author

I dropped type: text.

Contributor

I was thinking this should go into the top level message field. But looking at other modules I see we don't do this yet.

The behaviour I would expect is that the extracted message part of the log line ends up in the top level message field, and the raw log line ends up under, for example, log.message or log.raw. I'm fine with keeping it as is, since other modules don't do this differently yet.

If we have it here, it probably makes sense to keep it as text and not keyword.

Contributor Author

I re-added text.

@kvch kvch force-pushed the feature/filebeat/mongodb-module branch from 6a4b180 to 06c3a57 Compare February 6, 2018 15:43
@kvch kvch removed the in progress label Feb 9, 2018
description: >
  Severity level of message
example: I
type: keyword
Contributor

This is interesting, as so far I have assumed severity is a sortable integer in ECS. What is used here as severity seems to be closer to log.level. @MikePaquette FYI

Contributor Author

I came up with the field names based on the MongoDB logging guide: https://docs.mongodb.com/manual/reference/log-messages/ It might provide more context on the subject.
However, I agree that it is a bit weird in MongoDB. If you look at the section "Severity", you can see the taxonomy of log levels. So I am not sure why it's named severity.
Does it cause a problem in ECS?

Contributor

I think we can keep it as is, since it's under the mongodb namespace. If we map this to ECS later, I'd expect it to end up in log.level. No need to solve this now.
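
If the severity letter were mapped to a level name later, it could look roughly like the sketch below. The letters F, E, W, I and D are the ones listed in the "Severity" section of the MongoDB log message reference; the helper name `to_log_level`, the lowercase level names, and the pass-through of unknown letters are assumptions for illustration, not something this PR implements.

```python
# Severity letters as listed in the MongoDB log message reference.
# The lowercase level names on the right are an assumed naming choice.
SEVERITY_TO_LEVEL = {
    "F": "fatal",
    "E": "error",
    "W": "warning",
    "I": "informational",
    "D": "debug",
}


def to_log_level(severity: str) -> str:
    """Hypothetical helper: map a MongoDB severity letter to a level name."""
    # Unknown letters are passed through unchanged rather than guessed.
    return SEVERITY_TO_LEVEL.get(severity, severity)


level = to_log_level("I")
```

Keeping the raw letter in the mongodb namespace and deriving the level name separately means the original value is never lost.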

@kvch kvch force-pushed the feature/filebeat/mongodb-module branch 2 times, most recently from b7eb3bb to 9f51a27 Compare February 15, 2018 14:32
Contributor Author

kvch commented Feb 15, 2018

@ruflin I think it's ready for a new round of review. Travis fails because of X-Pack installation problems.

Contributor

@ruflin ruflin left a comment

PR looks good to me. I would suggest we release this first as beta, get some feedback and then push it to GA in the next minor?

@@ -0,0 +1,8 @@
- module: mongodb
  # All logs
Contributor

What does "All logs" mean here? Does MongoDB only have one type of log?

Contributor Author

Yes, that's all I found in their docs: https://docs.mongodb.com/manual/reference/log-messages/
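
For context, the single log type can be split into the fields discussed in this review with a pattern along these lines. This is a rough Python sketch, not the grok pattern from the PR: it assumes the `<timestamp> <severity> <component> [<context>] <message>` layout from the MongoDB docs linked above, and the names `LINE_RE` and `parse_line` as well as the sample line are made up for illustration.

```python
import re

# Assumed layout: <timestamp> <severity> <component> [<context>] <message>
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+"      # e.g. 2018-02-05T13:44:56.657+0100
    r"(?P<severity>[A-Z])\s+"      # single severity letter, e.g. I
    r"(?P<component>\S+)\s+"       # e.g. CONTROL
    r"\[(?P<context>[^\]]+)\]\s+"  # e.g. [initandlisten]
    r"(?P<message>.*)$"            # rest of the line
)


def parse_line(line: str):
    """Hypothetical helper: return the extracted fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


# Illustrative line, not taken from a real instance.
sample = "2018-02-05T13:44:56.657+0100 I CONTROL  [initandlisten] MongoDB starting"
fields = parse_line(sample)
```

Lines that do not follow the layout return `None` instead of a partial result, which makes parse failures easy to spot.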

@@ -0,0 +1,41 @@
{
  "description": "Pipeline for parsing MongoDB log logs",
Contributor

"MongoDB log logs" -> I think I'm starting to understand why you called the fileset log. What exactly is in these logs?

@kvch kvch force-pushed the feature/filebeat/mongodb-module branch from 9f51a27 to bd64df9 Compare March 27, 2018 10:08
@kvch kvch force-pushed the feature/filebeat/mongodb-module branch from 78eedde to 96f80d0 Compare March 28, 2018 17:54
Contributor Author

kvch commented Apr 10, 2018

Everything is green.

@ruflin ruflin merged commit 6f99a96 into elastic:master Apr 10, 2018