Added support for running integration tests on Windows. #411

Merged: 1 commit, Sep 7, 2022

Contributor

@rawahars rawahars commented Aug 15, 2022

Summary

This commit adds the docker-compose projects, fluent-bit configurations, and PowerShell scripts needed to run the integration tests on Windows.

The design of the integration tests and the constraints that shaped it are detailed in the following sections.

Constraints and mitigations

Using docker-compose to orchestrate the tests imposes a few limitations. Each limitation and its mitigation is described below.

  • Fluent-bit for Windows has to receive logs from other containers over TCP sockets, because Unix sockets are not well supported. Fluent-bit starts first and listens on a port. Application containers start afterwards, use the fluentd logging driver, and connect to the fluent-bit IP address and port.
    • To accommodate this ordering, we use separate docker-compose profiles. The core profile contains the fluent-bit service. The test profile contains the other services, which generate test data.
    • Each core service has a health check that verifies the fluent-bit process is running.
    • We run the core profile first in detached mode and wait for the service to become healthy.
    • We then run the test profile, which sends data to the fluent-bit instance started earlier.
    • If the fluent-bit container exits before or during a test, the test containers cannot connect to the TCP port and the test errors out.
  • For the same reason, we need to know the IP address of the fluent-bit container.
    • We therefore use a static NAT network that is part of the docker-compose configuration. The network uses a static link-local subnet, which avoids IP conflicts.
    • A static IP address is assigned to the FLB container.
  • Upstream does not support async DNS when DNS queries must be tried against multiple servers from a server list. Upstream issue: "dns resolution for plugins using async mode does not consider all the DNS servers on the host" (fluent/fluent-bit#5862).
    • As a workaround, we use the system resolver for DNS queries. This is done by setting net.dns.resolver to LEGACY in the config.
    • This workaround applies only when using NAT network mode.
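
The start-up ordering described above (bring up the core profile detached, wait for the fluent-bit container to report healthy, then start the test profile) can be sketched roughly as follows. This is a minimal Python illustration, not the PowerShell used in this PR. The container name fluent-bit, the profile names core and test, the timeout values, and all helper names are assumptions made for the example. It also assumes the docker CLI with the compose plugin is on PATH.

```python
import subprocess
import time
from typing import Callable


def docker_health_status(container: str) -> str:
    """Return the health status docker reports for a container, e.g. 'healthy'."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def wait_until_healthy(
    get_status: Callable[[], str],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status until it returns 'healthy' or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if get_status() == "healthy":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def run_core_then_test(container: str = "fluent-bit") -> None:
    """Start the core profile, wait for health, then start the test profile.

    The container and profile names are hypothetical placeholders.
    """
    subprocess.run(["docker", "compose", "--profile", "core", "up", "-d"], check=True)
    if not wait_until_healthy(lambda: docker_health_status(container)):
        raise RuntimeError(f"{container} did not become healthy in time")
    subprocess.run(["docker", "compose", "--profile", "test", "up"], check=True)
```

Passing the status lookup, sleep, and clock in as parameters keeps the polling loop testable without a running Docker daemon.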

Implementation design

Each plugin test performs the following steps:

  • For a few plugins, create the test resources using CFN.
  • Clean up the destination if required, so that the data in the destination comes from the current test only.
  • Set the environment variables needed for the test.
  • Run the core services of docker-compose.
  • Run the test data generation services of docker-compose.
  • Wait for some time for the data to be populated in the destinations.
  • Run the validation docker-compose projects.

If any step fails, the test errors out at that step and the remaining steps do not run.
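
The fail-fast sequencing above can be illustrated with a short sketch. This is a hypothetical Python outline, not the actual run-integ.ps1 logic. The step names mirror the list above, and the actions are placeholders that a real run would replace with CFN, docker-compose, and validation calls.

```python
from typing import Callable, Iterable, List, Tuple

# A step is a human-readable name plus a zero-argument action.
Step = Tuple[str, Callable[[], None]]


class StepFailed(RuntimeError):
    """Raised when a step in the test sequence fails."""


def run_steps(steps: Iterable[Step]) -> List[str]:
    """Run steps in order and stop at the first one that raises."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            # Abort immediately so later steps never run against a bad state.
            raise StepFailed(f"step '{name}' failed: {exc}") from exc
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions only; each lambda stands in for real work.
    plugin_steps: List[Step] = [
        ("create test resources", lambda: None),
        ("clean up destination", lambda: None),
        ("set environment variables", lambda: None),
        ("run core services", lambda: None),
        ("run test data generation services", lambda: None),
        ("wait for data", lambda: None),
        ("run validation", lambda: None),
    ]
    print(run_steps(plugin_steps))
```

Wrapping the original exception in StepFailed keeps the name of the failing step in the error message while preserving the underlying cause.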

Testing

The integration tests were triggered multiple times using the run-integ.ps1 script.

Description of changes:

Added support for running integration tests on Windows


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@rawahars rawahars requested a review from a team as a code owner August 15, 2022 15:16
[INPUT]
    Name forward
    Listen 0.0.0.0
    Port 24542
Contributor


the standard fluent port is 24224, why not use that?

Contributor Author


That way, port 24224 stays available for customers to send logs to FLB. We use a separate port for the traffic that is usually sent over Unix sockets on Linux, which prevents any port contention.

Contributor


> So that port can be used by the customers to send logs to FLB.

Sorry, I don't understand this at all. This is an integ test, so customers are not sending any logs to it. We send the logs, and I think we have full control over the port?

Contributor Author


Yes, that is correct. We can switch to 24224 as well. I used 24542 here because the proposal for FireLens is to use it as the dedicated port.
If you prefer that we use 24224, I can change it in the next revision.


3 participants