chore: observability and security for HTTP gateway #1825

Merged
shumkov merged 58 commits into v1.0-dev from chore/dashmate/secure-api
May 8, 2024

@shumkov shumkov commented Apr 20, 2024

Issue being fixed or feature implemented

Currently, Evonode handles HTTP requests with an outdated version of Envoy. This version contains many security vulnerabilities and is not configured properly to serve as an edge proxy for all incoming HTTP traffic. To control and limit the load, it should provide mechanisms to limit incoming requests. We also have no visibility into HTTP traffic in general or into specific requests to Drive and DAPI.

What was done?

  • Updated Envoy and its configuration to the latest version, 1.30.1
  • Renamed DAPI Envoy service to Platform Gateway since it handles requests not just for DAPI but for Drive as well
  • Added configuration platform.gateway.maxConnections to limit the maximum number of active connections to the Gateway. When we get close to the limit, we disable keep-alive; when we reach it, we stop accepting requests.
  • Added configuration platform.gateway.maxHeapSizeInBytes to release free memory and disable keep-alive when we get close to the limit, and to reject incoming connections when we reach it.
  • Added configuration platform.gateway.upstreams.*.maxRequest to limit active requests to a specific upstream (DAPI, Drive, etc.)
  • Added configuration platform.gateway.metrics to expose Envoy Prometheus metrics.
  • Added configuration platform.gateway.admin to enable Envoy admin interface (must be enabled for metrics).
  • Added configuration platform.gateway.listeners.*.http2.maxConcurrentStreams to limit max concurrent streams per incoming HTTP2 connection.
  • Added configuration platform.gateway.log.level to define verbosity of Envoy application logs.
  • Added configuration platform.gateway.log.accessLogs to define any number of access logs. Supported destinations are stdout, stderr and file. Supported formats are text and json. Default output templates are defined for each format and can be overridden for any specific access log.
  • Replaced the global HTTP request rate limiter with an IP-based one, so it now limits requests per IP rather than across all incoming requests. Whitelisting and blacklisting are now supported as well.
  • Added configuration platform.gateway.rateLimiter.metrics to expose rate limiter Prometheus metrics.
  • Added configuration platform.drive.abci.metrics to expose Drive ABCI Prometheus metrics.
  • Renamed the dapi_tx_filter_stream Docker Compose service to dapi_core_streams since it exposes multiple streaming endpoints.
  • Configured Envoy to sanitize incoming request path
  • Configured Envoy to limit the connection and request max duration and idle timeouts
  • Configured Envoy to limit http2 connection and stream window sizes
  • Rejected external requests to getProofs, since this is an internal endpoint between DAPI and Drive and can be misused to put a huge load on the system.
  • Reduced new connection read and write buffers to reduce memory consumption
  • Forced more efficient HTTP2 connections between Envoy and upstreams
  • Configured Envoy to expose Prometheus metrics with default path instead of /stats/prometheus
  • Added metrics for Drive ABCI queries
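
The connection limit and the per-IP rate limiter described above can be illustrated with a small model. This is only a sketch of the described behaviour, not the Envoy or dashmate implementation: Envoy enforces these limits itself. The names `connectionAction`, `PerIpRateLimiter`, `nearLimitRatio`, `requestsPerWindow` and `windowMs` are hypothetical, the "close to the limit" threshold is an assumed value, and a fixed-window counter is used for simplicity because the description does not state which algorithm the real rate limiter uses.

```javascript
// Illustrative model only. Thresholds, names and the fixed-window algorithm
// are assumptions and are not taken from the gateway configuration.

// Decide what to do with a new connection given the configured maximum.
// `nearLimitRatio` is a hypothetical threshold for "close to the limit".
function connectionAction(activeConnections, maxConnections, nearLimitRatio = 0.95) {
  if (activeConnections >= maxConnections) {
    return 'reject';
  }
  if (activeConnections >= maxConnections * nearLimitRatio) {
    return 'disable-keep-alive';
  }
  return 'accept';
}

// Minimal per-IP fixed-window limiter with a whitelist and a blacklist.
class PerIpRateLimiter {
  constructor({ requestsPerWindow, windowMs, whitelist = [], blacklist = [] }) {
    this.requestsPerWindow = requestsPerWindow;
    this.windowMs = windowMs;
    this.whitelist = new Set(whitelist);
    this.blacklist = new Set(blacklist);
    this.windows = new Map(); // ip -> { start, count }
  }

  // Returns true if a request from `ip` at time `nowMs` is allowed.
  allow(ip, nowMs = Date.now()) {
    if (this.blacklist.has(ip)) return false; // always rejected
    if (this.whitelist.has(ip)) return true; // never limited

    const current = this.windows.get(ip);
    if (!current || nowMs - current.start >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.windows.set(ip, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count < this.requestsPerWindow) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

Because each IP has its own counter, one client exhausting its budget does not affect requests from other addresses, which is the difference from the previous global limiter.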

How Has This Been Tested?

Ran multiple testing scenarios on the ouzo devnet to verify each configuration option, with metrics visualised in Grafana.

Breaking Changes

None

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

@shumkov shumkov added this to the v1.0.0 milestone Apr 30, 2024
@shumkov shumkov marked this pull request as ready for review April 30, 2024 13:02
@shumkov shumkov changed the title chore: observability and security for HTTP traffic chore: observability and security for HTTP gateway Apr 30, 2024
lklimek previously approved these changes May 8, 2024
@QuantumExplorer QuantumExplorer left a comment

Approving without review.

@shumkov shumkov merged commit e624e02 into v1.0-dev May 8, 2024
27 checks passed
@shumkov shumkov deleted the chore/dashmate/secure-api branch May 8, 2024 16:45

3 participants