Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

Closed
pmontrasio opened this issue Nov 19, 2018 · 5 comments · Fixed by #221
Closed

Performance regression after commit a57d499821375f3b239358f8bfac11bdfc68d06a #219

pmontrasio opened this issue Nov 19, 2018 · 5 comments · Fixed by #221

Comments

@pmontrasio
Copy link
Contributor

I found a x10 performance regression in the commit "Fixed: Faker.Util.pick/1 work with Enum type now" of October 4.

I have a test suite running in about 3 minutes with Faker 0.11.0 (both on my laptop, my coworkers laptops and on Travis)

$ grep faker mix.lock
  "faker": {:hex, :faker, "0.11.0", "57a712b0261123f0475fdadda359a57488429cb9f5bf4fbf973da2d8848cc2c6", [:mix], [], "hexpm"},
$ time ./run_backend_tests.sh
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 183.1 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 95299

real	3m5.132s
user	2m11.619s
sys	0m6.967s

After the upgrade to 0.11.1 the tests run in 30 minutes (again on our laptops and on Travis)

I traced back the regression to commit a57d499.
With the commit before that (Fixed: Faker.StarWars doctests - b3e933c) my tests still run in 3 minutes

$ grep faker mix.lock
  "faker": {:git, "https://github.com/igas/faker", "b3e933c0934b70f917552f16384eb664fd5a32df", [ref: "b3e933c0934b70f917552f16384eb664fd5a32df"]},
$ mix test
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 184.9 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 752599

real	3m16.822s
user	2m37.335s
sys	0m11.350s

but with a57d499 they run in half an hour.

$ grep faker mix.lock
  "faker": {:git, "https://github.com/igas/faker", "a57d499821375f3b239358f8bfac11bdfc68d06a", [ref: "a57d499821375f3b239358f8bfac11bdfc68d06a"]},
$ time mix test
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 1816.4 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 921450

real	30m18.396s
user	21m39.484s
sys	8m26.721s

I reverted the commit on my fork of Faker and configured mix.exs to use it. My tests run in 3 minutes again.

$ grep faker mix.lock
  "faker": {:git, "https://github.com/pmontrasio/faker", "45c9b9941eae358720d3fb2dfb3bcc894548d4a3", [ref: "45c9b9941eae358720d3fb2dfb3bcc894548d4a3"]},
$ time mix test
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 184.5 seconds
688 tests, 0 failures, 5 skipped

Randomized with seed 621815

real	3m6.463s
user	2m13.733s
sys	0m6.123s

Commit a57d499 replaced

@spec pick([any]) :: any
  def pick(list) do
    Enum.at(list, Faker.random_between(0, Enum.count(list) - 1))

with

@spec pick(Enum.t()) :: any
  def pick(enum) do
    enum
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(enum) - 1))

I didn't benchmark Enum.to_list/1 yet but it seems that the slow down is due to it.
Faker's tests from the HEAD of master and my code keep working with the "list" version. Is the new version strictly necessary?

I tested with this code with no improvements (still 30 minutes). I verified it used both versions adding some IO.puts.

@spec pick([any]) :: any
  def pick(list) when is_list(list) do
    Enum.at(list, Faker.random_between(0, Enum.count(list) - 1))
  end

@spec pick(Enum.t()) :: any
  def pick(enum) do
    enum
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(enum) - 1))

I didn't check the implementation of Enum.to_list/1 but maybe it returns immediately if the argument is a list (), so it's a no op in that case.

Given the lack of test failures, are there some cases for the conversion from Enum to list is necessary?

@keown
Copy link

keown commented Nov 19, 2018

+1

@pmontrasio
Copy link
Contributor Author

pmontrasio commented Nov 19, 2018

I noticed that the tests of Faker pick doesn't receive a list argument only when the argument is a range of integers.
I made a benchmark using benchee for two ranges generated by the tests. In the source directory of Faker:

$ grep benchee mix.exs 
      {:benchee, "~> 0.11", only: :dev},

$ mix deps.get

$ mkdir samples

$ cat > samples/run.exs <<EOF
range_1 = (0..10000)
range_2 = (61090588800..61122211199)

Benchee.run(%{
  "(0..10000) with Enum.to_list/1" => fn -> range_1
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(range_1) - 1))
  end,

  "(0..10000) without Enum.to_list/1" => fn -> range_1
    |> Enum.at(Faker.random_between(0, Enum.count(range_1) - 1))
  end
})

Benchee.run(%{
  "(61090588800..61122211199) with Enum.to_list/1" => fn -> range_2
    |> Enum.to_list()
    |> Enum.at(Faker.random_between(0, Enum.count(range_2) - 1))
  end,

  "(61090588800..61122211199) without Enum.to_list/1" => fn -> range_2
    |> Enum.at(Faker.random_between(0, Enum.count(range_2) - 1))
  end
})
EOF

$ mix run samples/run.exs

I split the output for ease of reading

Operating System: Linux"
CPU Information: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
Number of Available Cores: 8
Available memory: 31.19 GB
Elixir 1.6.1
Erlang 20.0

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 μs
parallel: 1
inputs: none specified
Estimated total run time: 14 s


Benchmarking (0..10000) with Enum.to_list/1...
Benchmarking (0..10000) without Enum.to_list/1...

Name                                        ips        average  deviation         median         99th %
(0..10000) without Enum.to_list/1      838.47 K        1.19 μs  ±2836.71%           1 μs           2 μs
(0..10000) with Enum.to_list/1           2.88 K      347.47 μs    ±14.56%         344 μs         491 μs

Comparison: 
(0..10000) without Enum.to_list/1      838.47 K
(0..10000) with Enum.to_list/1           2.88 K - 291.34x slower

So Enum.to_list has a big hit on performances, but for the other range, which is much larger, it's much worse

Operating System: Linux"
CPU Information: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
Number of Available Cores: 8
Available memory: 31.19 GB
Elixir 1.6.1
Erlang 20.0

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 μs
parallel: 1
inputs: none specified
Estimated total run time: 14 s


Benchmarking (61090588800..61122211199) with Enum.to_list/1...
Benchmarking (61090588800..61122211199) without Enum.to_list/1...

Name                                                        ips        average  deviation         median         99th %
(61090588800..61122211199) without Enum.to_list/1      835.88 K      0.00000 s  ±2721.80%      0.00000 s      0.00000 s
(61090588800..61122211199) with Enum.to_list/1        0.00046 K         2.16 s    ±50.10%         1.63 s         3.41 s

Comparison: 
(61090588800..61122211199) without Enum.to_list/1      835.88 K
(61090588800..61122211199) with Enum.to_list/1        0.00046 K - 1806443.19x slower

aptinio added a commit to aptinio/faker that referenced this issue Nov 21, 2018
Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum.

Fixes elixirs#213 and fixes elixirs#219.
aptinio added a commit to aptinio/faker that referenced this issue Nov 21, 2018
Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum.

Fixes elixirs#213 and fixes elixirs#219.
@aptinio aptinio mentioned this issue Nov 21, 2018
2 tasks
@igas igas closed this as completed in #221 Nov 21, 2018
igas pushed a commit that referenced this issue Nov 21, 2018
Removed `Enum.to_list/1` call since `Enum.at/2` works fine with any enum.

Fixes #213 and fixes #219.
@igas
Copy link
Member

igas commented Nov 21, 2018

Hi, please test 0.11.2

@pmontrasio
Copy link
Contributor Author

My tests are quick again with 0.11.2. Thank you!

@igas
Copy link
Member

igas commented Nov 21, 2018

No worries. It's @aptinio & @elbow-jason who done the job.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants