List append benchmark #857
Comments
I think we currently do not deallocate at the end, so one can also benchmark just the append cycle:

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    std::vector<int32_t> a = {0, 1, 2, 3, 4};
    int32_t n = 100000000;
    // Time only the append loop; the vector is freed after t2, at the end of main.
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int32_t i = 0; i < n; i++) {
        a.push_back(i + 5);
    }
    auto t2 = std::chrono::high_resolution_clock::now();
    // Print an element so the loop cannot be optimized away.
    std::cout << a[n] << std::endl;
    // Elapsed time in milliseconds.
    std::cout << "Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << std::endl;
    return 0;
}
```

and timing:

```
$ clang++ -std=c++17 -Ofast a.cpp
$ time ./a.out
100000000
Time: 153
./a.out  0.07s user 0.10s system 93% cpu 0.177 total
```

So I think our benchmark above is probably quite solid; LPython seems faster on my computer.
I also tried Ubuntu 18.04 on Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz:
|
Let's try without |
I added results without |
Great. Makes sense. If we are beating other compilers without |
I also tried Ubuntu 18.04.6 on Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz:
From Apple M1 Macbook Pro macOS Monterey 12.5
The first three are normal-mode compilations, the next three have optimizations enabled, and the last one is Python. So, all in all, the top is |
Please, could someone possibly share how we are computing the |
Divide each time by the smallest time across all the results you measured on your machine.
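The normalization described above can be sketched as follows. This is only a minimal illustration, not the script used for the results in this thread; the function name `relative_times` and the timings are hypothetical placeholders.

```python
def relative_times(timings):
    """Divide every timing by the smallest one, so the fastest entry becomes 1.0."""
    fastest = min(timings.values())
    return {name: t / fastest for name, t in timings.items()}


# Placeholder timings in seconds (made up for illustration, NOT measured results).
timings = {"compiler_a": 0.25, "compiler_b": 0.5, "interpreter": 2.0}

relative = relative_times(timings)
for name in sorted(relative, key=relative.get):
    print(f"{name:12s} {timings[name]:6.2f} s  {relative[name]:5.2f}x")
```

The fastest entry always gets a relative value of 1.0, and every other entry shows how many times slower it is on the same machine.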
Benchmark on Apple M1 Air 2020 (Monterey):
|
From the output of |
|
Result on Intel® Core™ i5-8250U CPU @ 1.60GHz × 8 (OS: Ubuntu 22.04 LTS)
|
Machine: Mac Air M1(2020)
|
Result on AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx @1.600 GHz, Ubuntu 20.04.4 LTS
LPython version: 0.3.0-350-g6d29003ea
Platform: Linux
Default target: x86_64-unknown-linux-gnu
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Python 3.9.7 |
Numba benchmark:

```python
from timeit import default_timer as clock
from numba import njit

@njit(nogil=True, cache=True)
def test_list():
    a = [0, 1, 2, 3, 4]
    n = 100000000
    for i in range(n):
        a.append(i + 5)
    print(a[n])

# The first calls include JIT compilation and warm-up; only the last call is timed.
test_list()
test_list()
test_list()
t1 = clock()
test_list()
t2 = clock()
print(t2-t1)
```

On my computer:

```
$ python b.py
100000000
100000000
100000000
100000000
0.2843555830186233
```

So it takes 0.28s.
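The warm-up-then-time pattern above, combined with the "run many times, take the lowest number" approach mentioned elsewhere in this thread, can be sketched in plain Python. This is a hedged illustration only: `append_benchmark` and `best_of` are hypothetical helper names, the code runs in the regular interpreter (no Numba or LPython), and `n` is far smaller than the 100000000 used in the thread so it finishes quickly.

```python
from timeit import default_timer as clock


def append_benchmark(n):
    """Append n integers to a small list and return one element as a check."""
    a = [0, 1, 2, 3, 4]
    for i in range(n):
        a.append(i + 5)
    return a[n]


def best_of(func, repeats, *args):
    """Call func(*args) `repeats` times; return (lowest elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        t1 = clock()
        result = func(*args)
        t2 = clock()
        best = min(best, t2 - t1)
    return best, result


if __name__ == "__main__":
    # Assumption: 1_000_000 appends and 5 repeats, chosen only to keep the demo short.
    elapsed, value = best_of(append_benchmark, 5, 1_000_000)
    print(value)
    print(f"best of 5: {elapsed:.4f} s")
```

Taking the minimum rather than the mean reduces the influence of one-off noise such as other processes being scheduled during a run.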
Here is a simple benchmark for appending to a list in Python:
and C++:
Results on Apple M1 Max (I ran each benchmark many times, took the lowest numbers):
Versions:
Thanks @czgdp1807 for implementing lists in our LLVM backend (#835)! This is just a first implementation, but I already like the results. :)