Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/awsfirehosereceiver] Add support for CloudWatch logs #35077

Merged

Conversation

gavincabbage
Copy link
Contributor

@gavincabbage gavincabbage commented Sep 9, 2024

Description: Add support for CloudWatch logs in JSON format.

The receiver can now be configured to receiver either cwmetrics or cwlogs record type for CloudWatch metrics logs, respectively. The former is only valid for metrics receivers and the latter for logs receivers.

Some changes were necessary to the default record type handling to conform to the contracts required by the generated tests and ensure the default config returned by the factory was usable. Previously the default record type was cwmetrics but this is not sensible for logs receivers.

To accommodate this, the default record type is now telemetry-type-specific. That is, for metrics it is cwmetrics and for logs it is cwlogs. Each receiver makes this determination on its own. Therefore an empty string is now the default returned by the factory and an empty string is also acceptable in the users config. It is no longer required to specify a record type, although it is still supported, so this is not a breaking change.

Link to tracking Issue: n/a

Testing: Unit tests covering new code as well as manual smoke testing against AWS Firehose and CloudWatch themselves.

Documentation: Component README update.

Copy link

linux-foundation-easycla bot commented Sep 9, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@gavincabbage gavincabbage changed the title receiver/awsfirehosereceiver: add cwlogs [receiver/awsfirehosereceiver] Add support for CloudWatch logs Sep 9, 2024
@github-actions github-actions bot requested a review from Aneurysm9 September 9, 2024 16:35
@gavincabbage gavincabbage force-pushed the awsfirehosereceiver/cwlogs branch 2 times, most recently from d3d4f74 to 977ffa7 Compare September 9, 2024 16:52
@gavincabbage gavincabbage force-pushed the awsfirehosereceiver/cwlogs branch from 977ffa7 to 9739a67 Compare September 17, 2024 13:59
@gavincabbage gavincabbage marked this pull request as ready for review September 17, 2024 13:59
@gavincabbage gavincabbage requested review from a team and dashpole September 17, 2024 13:59
@gavincabbage gavincabbage force-pushed the awsfirehosereceiver/cwlogs branch from 9739a67 to 60e448f Compare September 19, 2024 13:18
@gavincabbage gavincabbage requested a review from a team as a code owner September 19, 2024 13:18
@gavincabbage gavincabbage force-pushed the awsfirehosereceiver/cwlogs branch from 60e448f to 5c19051 Compare September 26, 2024 20:19
Copy link
Member

@mwear mwear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a first pass through this PR. It very closely mirrors what currently exists for metrics. I would like to have eyes from someone with more expertise on this. @Aneurysm9 can you possibly take a look or recommend someone who can?

receiver/awsfirehosereceiver/logs_receiver.go Outdated Show resolved Hide resolved
receiver/awsfirehosereceiver/logs_receiver.go Outdated Show resolved Hide resolved
receiver/awsfirehosereceiver/logs_receiver.go Outdated Show resolved Hide resolved
@gavincabbage
Copy link
Contributor Author

Thanks for the feedback @mwear! I believe addressed your comments and the test failures.

I did a bit of refactoring around the default record type logic to conform to the contracts required by the generated tests and have updated the description accordingly.

@gavincabbage
Copy link
Contributor Author

There are a couple CI failures still but it is unclear to me what the problem is.

@jackgopack4
Copy link
Contributor

There are a couple CI failures still but it is unclear to me what the problem is.

so for build-and-test/checks the test output says "run make generate and commit the changes"

I tried that on my local machine and it generated some changes to receiver/awsfirehosereceiver/generated_component_test.go and receiver/awsfirehosereceiver/generated_package_test.go

Might not solve the cross-compile stage but maybe would check off one of the failing tests

Copy link
Contributor

@jackgopack4 jackgopack4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left suggestions for what the make generate command outputted on my machine. Looks like that might help you pass the build-and-test/check stage

}
// Multiple logs in each record separated by newline character
for datumIndex, datum := range bytes.Split(record, []byte(recordDelimiter)) {
if len(datum) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will datum ever be an invalid JSON? If so, could be worth checking to ensure it is a valid JSON before calling json.Unmarshal on line 58

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is unlikely as this is the wrapper coming from CloudWatch and not the user's log payload itself (which could be anything).

Out of curiosity though, how would you check for valid JSON besides trying to unmarshal it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right as long as unmarshal has enough safeguards it's probably fine.

}

// isValid validates that the cWMetric has been unmarshalled correctly.
func (u Unmarshaler) isValid(log cwLog) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any other requirements for correctness here, or is nonempty fields sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. All these attributes are AWS IDs so they're arbitrary strings. I based this validation on what I saw the metric path doing. That path adds more attributes than I have here because metric payloads have more information (e.g. region, which is not included in logs payloads) but otherwise followed the established pattern.

"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver/internal/unmarshaler"
)

// The logsConsumer implements the firehoseConsumer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The logsConsumer implements the firehoseConsumer
// A logsConsumer implements the firehoseConsumer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Omitted The/A entirely to follow the usual godoc approach of starting with the subject name (not that it matters much for an unexported type).

@@ -0,0 +1,7 @@
awsfirehose:
endpoint: 0.0.0.0:4433
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the default endpoint is localhost now instead of 0.0.0.0, would it work with localhost?

#34394

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed this question.

This receiver works just fine with localhost with the caveat that Firehose needs to be able to reach it (Firehose also requires an HTTPS endpoint, I believe). In my testing I've used ngrok to facilitate this.

receiver/awsfirehosereceiver/generated_component_test.go Outdated Show resolved Hide resolved
receiver/awsfirehosereceiver/generated_component_test.go Outdated Show resolved Hide resolved
receiver/awsfirehosereceiver/generated_component_test.go Outdated Show resolved Hide resolved
receiver/awsfirehosereceiver/generated_package_test.go Outdated Show resolved Hide resolved
receiver/awsfirehosereceiver/generated_package_test.go Outdated Show resolved Hide resolved
@gavincabbage
Copy link
Contributor Author

Thanks @jackgopack4! Sorry I missed that bit about make generate 🤦

Besides the cross compile issue, I think I've addressed everything you called out, with either a change or a question/response.

Copy link
Contributor

@jackgopack4 jackgopack4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for humoring my suggestions

@dmitryax dmitryax merged commit 9dbe5e0 into open-telemetry:main Oct 9, 2024
156 checks passed
@github-actions github-actions bot added this to the next release milestone Oct 9, 2024
sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this pull request Dec 17, 2024
…telemetry#35077)

**Description:** Add support for CloudWatch logs in JSON format. 

The receiver can now be configured to receiver either `cwmetrics` or
`cwlogs` record type for CloudWatch metrics logs, respectively. The
former is only valid for metrics receivers and the latter for logs
receivers.

Some changes were necessary to the default record type handling to
conform to the contracts required by the generated tests and ensure the
default config returned by the factory was usable. Previously the
default record type was `cwmetrics` but this is not sensible for logs
receivers.

To accommodate this, the default record type is now
telemetry-type-specific. That is, for metrics it is `cwmetrics` and for
logs it is `cwlogs`. Each receiver makes this determination on its own.
Therefore an empty string is now the default returned by the factory and
an empty string is also acceptable in the users config. It is no longer
required to specify a record type, although it is still supported, so
this is not a breaking change.

**Link to tracking Issue:** n/a

**Testing:** Unit tests covering new code as well as manual smoke
testing against AWS Firehose and CloudWatch themselves.

**Documentation:** Component README update.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants