Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

pmontrasio · 2018-11-19T16:09:31Z

I found a x10 performance regression in the commit "Fixed: Faker.Util.pick/1 work with Enum type now" of October 4.

I have a test suite running in about 3 minutes with Faker 0.11.0 (both on my laptop, my coworkers laptops and on Travis)

$ grep faker mix.lock
  "faker": {:hex, :faker, "0.11.0", "57a712b0261123f0475fdadda359a57488429cb9f5bf4fbf973da2d8848cc2c6", [:mix], [], "hexpm"},
$ time ./run_backend_tests.sh
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 183.1 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 95299

real	3m5.132s
user	2m11.619s
sys	0m6.967s

After the upgrade to 0.11.1 the tests run in 30 minutes (again on our laptops and on Travis)

I traced back the regression to commit a57d499.
With the commit before that (Fixed: Faker.StarWars doctests - b3e933c) my tests still run in 3 minutes

$ grep faker mix.lock
  "faker": {:git, "https://github.com/igas/faker", "b3e933c0934b70f917552f16384eb664fd5a32df", [ref: "b3e933c0934b70f917552f16384eb664fd5a32df"]},
$ mix test
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 184.9 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 752599

real	3m16.822s
user	2m37.335s
sys	0m11.350s

but with a57d499 they run in half an hour.

$ grep faker mix.lock
  "faker": {:git, "https://github.com/igas/faker", "a57d499821375f3b239358f8bfac11bdfc68d06a", [ref: "a57d499821375f3b239358f8bfac11bdfc68d06a"]},
$ time mix test
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 1816.4 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 921450

real	30m18.396s
user	21m39.484s
sys	8m26.721s

I reverted the commit on my fork of Faker and configured mix.exs to use it. My tests run in 3 minutes again.

$ grep faker mix.lock
  "faker": {:git, "https://github.com/pmontrasio/faker", "45c9b9941eae358720d3fb2dfb3bcc894548d4a3", [ref: "45c9b9941eae358720d3fb2dfb3bcc894548d4a3"]},
$ time mix test
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 184.5 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 621815

real	3m6.463s
user	2m13.733s
sys	0m6.123s

Commit a57d499 replaced

@spec pick([any]) :: any
  def pick(list) do
    Enum.at(list, Faker.random_between(0, Enum.count(list) - 1))

with

@spec pick(Enum.t()) :: any
  def pick(enum) do
    enum
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(enum) - 1))

I didn't benchmark Enum.to_list/1 yet but it seems that the slow down is due to it.
Faker's tests from the HEAD of master and my code keep working with the "list" version. Is the new version strictly necessary?

I tested with this code with no improvements (still 30 minutes). I verified it used both versions adding some IO.puts.

@spec pick([any]) :: any
  def pick(list) when is_list(list) do
    Enum.at(list, Faker.random_between(0, Enum.count(list) - 1))
  end

@spec pick(Enum.t()) :: any
  def pick(enum) do
    enum
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(enum) - 1))

I didn't check the implementation of Enum.to_list/1 but maybe it returns immediately if the argument is a list (), so it's a no op in that case.

Given the lack of test failures, are there some cases for the conversion from Enum to list is necessary?

The text was updated successfully, but these errors were encountered:

keown · 2018-11-19T16:11:40Z

+1

pmontrasio · 2018-11-19T16:48:00Z

I noticed that the tests of Faker pick doesn't receive a list argument only when the argument is a range of integers.
I made a benchmark using benchee for two ranges generated by the tests. In the source directory of Faker:

$ grep benchee mix.exs 
      {:benchee, "~> 0.11", only: :dev},

$ mix deps.get

$ mkdir samples

$ cat > samples/run.exs <<EOF
range_1 = (0..10000)
range_2 = (61090588800..61122211199)

Benchee.run(%{
  "(0..10000) with Enum.to_list/1" => fn -> range_1
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(range_1) - 1))
  end,

  "(0..10000) without Enum.to_list/1" => fn -> range_1
    |> Enum.at(Faker.random_between(0, Enum.count(range_1) - 1))
  end
})

Benchee.run(%{
  "(61090588800..61122211199) with Enum.to_list/1" => fn -> range_2
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(range_2) - 1))
  end,

  "(61090588800..61122211199) without Enum.to_list/1" => fn -> range_2
    |> Enum.at(Faker.random_between(0, Enum.count(range_2) - 1))
  end
})
EOF

$ mix run samples/run.exs

I split the output for ease of reading

Operating System: Linux"
CPU Information: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
Number of Available Cores: 8
Available memory: 31.19 GB
Elixir 1.6.1
Erlang 20.0

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 μs
parallel: 1
inputs: none specified
Estimated total run time: 14 s


Benchmarking (0..10000) with Enum.to_list/1...
Benchmarking (0..10000) without Enum.to_list/1...

Name                                        ips        average  deviation         median         99th %
(0..10000) without Enum.to_list/1      838.47 K        1.19 μs  ±2836.71%           1 μs           2 μs
(0..10000) with Enum.to_list/1           2.88 K      347.47 μs    ±14.56%         344 μs         491 μs

Comparison: 
(0..10000) without Enum.to_list/1      838.47 K
(0..10000) with Enum.to_list/1           2.88 K - 291.34x slower

So Enum.to_list has a big hit on performances, but for the other range, which is much larger, it's much worse

Operating System: Linux"
CPU Information: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
Number of Available Cores: 8
Available memory: 31.19 GB
Elixir 1.6.1
Erlang 20.0

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 μs
parallel: 1
inputs: none specified
Estimated total run time: 14 s


Benchmarking (61090588800..61122211199) with Enum.to_list/1...
Benchmarking (61090588800..61122211199) without Enum.to_list/1...

Name                                                        ips        average  deviation         median         99th %
(61090588800..61122211199) without Enum.to_list/1      835.88 K      0.00000 s  ±2721.80%      0.00000 s      0.00000 s
(61090588800..61122211199) with Enum.to_list/1        0.00046 K         2.16 s    ±50.10%         1.63 s         3.41 s

Comparison: 
(61090588800..61122211199) without Enum.to_list/1      835.88 K
(61090588800..61122211199) with Enum.to_list/1        0.00046 K - 1806443.19x slower

Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum. Fixes elixirs#213 and fixes elixirs#219.

Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum. Fixes #213 and fixes #219.

igas · 2018-11-21T11:39:20Z

Hi, please test 0.11.2

pmontrasio · 2018-11-21T12:12:10Z

My tests are quick again with 0.11.2. Thank you!

igas · 2018-11-21T19:57:47Z

No worries. It's @aptinio & @elbow-jason who done the job.

aptinio added a commit to aptinio/faker that referenced this issue Nov 21, 2018

Fix performance regression

4eea7eb

Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum. Fixes elixirs#213 and fixes elixirs#219.

aptinio added a commit to aptinio/faker that referenced this issue Nov 21, 2018

Fix performance regression

b8ec859

Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum. Fixes elixirs#213 and fixes elixirs#219.

aptinio mentioned this issue Nov 21, 2018

Fix performance regression #221

Merged

2 tasks

igas closed this as completed in #221 Nov 21, 2018

igas pushed a commit that referenced this issue Nov 21, 2018

Fixed: Performance regression on Faker.Util.pick/1 (#221)

85a6ce4

Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum. Fixes #213 and fixes #219.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

pmontrasio commented Nov 19, 2018

keown commented Nov 19, 2018

pmontrasio commented Nov 19, 2018 •

edited

Loading

igas commented Nov 21, 2018

pmontrasio commented Nov 21, 2018

igas commented Nov 21, 2018

Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

Comments

pmontrasio commented Nov 19, 2018

keown commented Nov 19, 2018

pmontrasio commented Nov 19, 2018 • edited Loading

igas commented Nov 21, 2018

pmontrasio commented Nov 21, 2018

igas commented Nov 21, 2018

pmontrasio commented Nov 19, 2018 •

edited

Loading