
Performance of WebMock::Util::Headers.normalize_headers #1027

Closed
baweaver opened this issue May 18, 2023 · 4 comments

Comments

@baweaver
Contributor

While trying to sort out some Capybara speed issues, we ran memory profiling and noticed an exceptionally hot path in the `normalize_headers` method. We took a more detailed look at it and came up with this experiment to see what interesting information we could find:

require 'benchmark/ips'
require 'memory_profiler'

# Recreating the Header utils
module WebMock
  module Util
    class Headers
      # Hoist these so we don't end up with recreated objects
      STANDARD_HEADER_DELIMITER = '-'.freeze
      NONSTANDARD_HEADER_DELIMITER = '_'.freeze
      JOIN = ', '.freeze

      # What exists today, unchanged
      def self.normalize_headers(headers)
        return nil unless headers
        array = headers.map { |name, value|
          [name.to_s.split(/_|-/).map { |segment| segment.capitalize }.join("-"),
           case value
            when Regexp then value
            when Array then (value.size == 1) ? value.first.to_s : value.map {|v| v.to_s}.sort
            else value.to_s
           end
          ]
        }
        Hash[*array.inject([]) {|r,x| r + x}]
      end

      def self.just_return_headers(headers)
        headers
      end

      # Experiment to try and speed that up
      def self.enhanced_normalize_headers(headers)
        return nil unless headers

        # Using `each_with_object` to reduce object creations and directly build
        # the new headers hash.
        headers.each_with_object({}) do |(name, value), new_headers|
          # Moved the name normalization into a new method mostly to give a name to what it's
          # doing here.
          new_headers[normalize_name(name)] =
            case value
            when Regexp then value
            when Array then (value.size == 1) ? value.first.to_s : value.map(&:to_s).sort
            else value.to_s
            end
        end
      end

      # Good deal of time savings on that process though
      def self.normalize_name(name)
        name
          .to_s
          # By using `tr` here we can normalize the headers ahead of time
          .tr(NONSTANDARD_HEADER_DELIMITER, STANDARD_HEADER_DELIMITER)
          # ...which will be faster because `split` on a `String` is a good deal faster
          # than on a `Regexp`.
          .split(STANDARD_HEADER_DELIMITER)
          # Now since this is state we control here we can mutate it with `map!` to again
          # avoid extra object creations.
          .map!(&:capitalize)
          # Then join it back together. Most of these constants are to avoid recreating
          # these strings hundreds of thousands of times in larger codebases. Ours I
          # believe was in the millions.
          .join(STANDARD_HEADER_DELIMITER)
      end
    end
  end
end
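As a quick standalone illustration (a sketch mirroring the `normalize_name` logic above, not part of the original experiment), here is what the `tr`/`split`/`map!`/`join` pipeline produces for a few representative header names:

```ruby
# Inline copy of the normalization pipeline, for demonstration only.
def demo_normalize_name(name)
  name.to_s
      .tr('_', '-')          # "content_type" -> "content-type"
      .split('-')            # -> ["content", "type"]
      .map!(&:capitalize)    # -> ["Content", "Type"] (mutates in place)
      .join('-')             # -> "Content-Type"
end

puts demo_normalize_name(:content_type)  # => "Content-Type"
puts demo_normalize_name("x-request-id") # => "X-Request-Id"
puts demo_normalize_name("ETAG")         # => "Etag"
```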

# Headers pulled from a request on our suite to see how they're handled
headers = {
  "x-powered-by"=>["Express"],
  "x-request-id"=>["80763389-a6a7-4495-aef2-d26290b1ef6f"],
  "timing-allow-origin"=>["*"],
  "content-type"=>["application/json; charset=utf-8"],
  "content-length"=>["17"],
  "etag"=>["W/\"11-pEowaKzwbXVLfeOv1lndFqAk+sg\""],
  "date"=>["Thu, 18 May 2023 17:46:46 GMT"],
  "connection"=>["keep-alive"],
  "keep-alive"=>["timeout=5"]
}

# Then see how they fare against each other
Benchmark.ips do |x|
  x.report('old') { WebMock::Util::Headers.normalize_headers(headers) }
  x.report('new') { WebMock::Util::Headers.enhanced_normalize_headers(headers) }
  x.report('none') { WebMock::Util::Headers.just_return_headers(headers) }

  x.compare!
end

# Results:
#
# Warming up --------------------------------------
#                  old     7.745k i/100ms
#                  new    10.272k i/100ms
# Calculating -------------------------------------
#                  old     77.489k (± 3.6%) i/s -    387.250k in   5.004044s
#                  new    103.976k (± 2.5%) i/s -    523.872k in   5.041797s

# Comparison:
#                  new:   103976.0 i/s
#                  old:    77489.0 i/s - 1.34x  slower

# While 1.34x does not seem large, we had seen reports of a significant number of object
# allocations and heavy memory usage from WebMock in our larger monorepo, so while hunting
# for the sources we investigated fixing this, saw some interesting gains, and wanted to
# upstream them.

# Let's take a quick look at memory footprints for 10k parses.
# Why 10k? We'd seen 20-50k requests running through this on a single spec run at one
# point, so it's a very hot path for a larger monolithic app. (We're also investigating
# why in the world that request count is so high.)
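To check the claim that splitting on a String delimiter beats splitting on a Regexp, here is a quick stdlib-only micro-benchmark (a sketch, separate from the numbers reported above):

```ruby
require 'benchmark'

name = 'x_request-id'

# Sanity check: both approaches produce the same segments.
raise unless name.split(/_|-/) == name.tr('_', '-').split('-')

n = 500_000
Benchmark.bm(20) do |x|
  x.report('regexp split')      { n.times { name.split(/_|-/) } }
  x.report('tr + string split') { n.times { name.tr('_', '-').split('-') } }
end
```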

puts "Old Variant Memory Profile", '=' * 50, ''

MemoryProfiler.report {
  10_000.times { WebMock::Util::Headers.normalize_headers(headers) }
}.pretty_print

puts

puts "New Variant Memory Profile", '=' * 50, ''

MemoryProfiler.report {
  10_000.times { WebMock::Util::Headers.enhanced_normalize_headers(headers) }
}.pretty_print

# While I could give the raw report, this is the condensed version, formatted for
# readability, of what I found interesting:
{
  "Allocated Memory by Gem in Bytes" => {
    old: 60_640_360,
    new: 38_080_360
  },

  # Pretty dramatic reduction on Hash, and elimination of MatchData
  # in terms of memory. Didn't get a huge boost on String, though
  # testing this with `gsub` to avoid extra strings is much slower
  # so probably not worth it.
  "Allocated Memory by Class" => {
    old: {
      String: 23_840_000,
      Hash: 21_600_360,
      MatchData: 10_080_000,
      Array: 5_120_000,
    },
    new: {
      String: 21_600_360,
      Hash: 9_280_000,
      Array: 7_200_000,
    }
  },

  "Allocated Objects by Class" => {
    old: {
      String: 540_009,
      Hash: 380_000,
      MatchData: 60_000,
      Array: 10_000,
    },
    new: {
      String: 540_009,
      Hash: 180_000,
      Array: 10_000,
    }
  }
}
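Working just from the condensed figures above, the overall allocation savings come out to roughly a third (simple arithmetic on the reported totals):

```ruby
# Totals from the "Allocated Memory by Gem in Bytes" section above.
old_bytes = 60_640_360
new_bytes = 38_080_360

saved   = old_bytes - new_bytes
percent = (100.0 * saved / old_bytes).round(1)

puts "Saved #{saved} bytes across 10k calls (#{percent}% less allocated memory)"
# => Saved 22560000 bytes across 10k calls (37.2% less allocated memory)
```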

There are some important caveats and nuances to mention here:

  1. We are a very large Rails monolith
  2. That monolith runs a React frontend on Vite
  3. Vite makes a crapton of requests (~20k in one spec on occasion)
  4. We're additionally auditing why in the world it does that and trying to cut that count

That said, this does represent a performance improvement in what can be a hot path in larger applications.

Looking for feedback and/or interest level in this type of change before I go pushing any PRs on it.

baweaver pushed a commit to baweaver/webmock that referenced this issue Jun 6, 2023
Optimizes the `WebMock::Util::Headers.normalize_headers` method to:

* Create fewer objects
* Avoid using Regexp where possible
* Hoist constants

More details can be found in the related issue, including performance
benchmarks:

bblimke#1027
@baweaver
Contributor Author

baweaver commented Jun 6, 2023

Opened a PR which addresses this:

#1029

@vanderhoop

vanderhoop commented Jun 13, 2023

> Vite makes a crapton of requests (~20k in one spec on occasion)
> We're additionally auditing why in the world it does that and trying to cut that count.

@baweaver are you running the conventional vite dev server, or transpiling production-like assets and using the vite "preview" server?

If it's the former, I would suspect an incredibly long daisy chain of necessary module fetches, which over HTTP/1.1 (the default if you haven't set up local certs) could really slow down your Capybara specs. The recommended optimization would be to generate assets and serve them via the Vite "preview" server.

If you're already using the preview server, I'd be super curious to continue chatting about what you find on the rails perf slack (handle: vanderhoop) or elsewhere.

@technicalpickles
Contributor

@vanderhoop we're using vite-rails which ends up running the conventional vite-dev server. I looked, and I'm not seeing an option for the preview server. I can file issues there about that.

I think there's still value in improving the performance of this particular method in any event.

If anyone reading this is looking for resources about Ruby & Memory, I've found How Ruby Uses Memory — SitePoint to be a great guide. It touches a bit on how object allocation can affect performance.

@bblimke
Owner

bblimke commented Aug 19, 2023

@baweaver thank you for this great research and analysis. Awesome work! The PR is merged.

@bblimke bblimke closed this as completed Aug 19, 2023