Change index at DataFrame creation in get_agent_vars_dataframe #1586

Merged
merged 1 commit into from
Jan 21, 2023

Conversation

Tortar
Contributor

@Tortar Tortar commented Jan 20, 2023

This is slightly faster, as one would expect. Code to reproduce:

import timeit

setup = "import pandas as pd\nq = list(range(1000)); t = {'a': q, 'b': q}"
# Two steps: build the frame, then set the index
a = timeit.timeit("s = pd.DataFrame.from_records(t); s = s.set_index(['a', 'b'])", setup, number=1000)
# One step: set the index at creation
b = timeit.timeit("s = pd.DataFrame.from_records(t, index=['a', 'b'])", setup, number=1000)
print(a, b)
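
For reference, a minimal sketch of why the two timed statements are interchangeable: both should produce the same frame with a two-level index. It assumes a pandas version where `from_records` accepts a dict, as the benchmark above does. The extra column `c` is hypothetical and only there so the result keeps a data column.

```python
import pandas as pd

q = list(range(1000))
t = {"a": q, "b": q, "c": q}  # "c" is a hypothetical extra data column

# Two steps: build the frame, then move "a" and "b" into the index.
two_step = pd.DataFrame.from_records(t).set_index(["a", "b"])

# One step: let from_records build the index while creating the frame.
one_step = pd.DataFrame.from_records(t, index=["a", "b"])

# Raises AssertionError if the two frames differ.
pd.testing.assert_frame_equal(two_step, one_step)
```

If the assertion passes, the change only affects how the frame is built, not what callers receive.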

@codecov

codecov bot commented Jan 21, 2023

Codecov Report

Base: 82.07% // Head: 82.06% // Decreases project coverage by -0.02% ⚠️

Coverage data is based on head (dcbc90b) compared to base (c7b2ccf).
Patch has no changes to coverable lines.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1586      +/-   ##
==========================================
- Coverage   82.07%   82.06%   -0.02%     
==========================================
  Files          18       18              
  Lines        1389     1388       -1     
  Branches      271      271              
==========================================
- Hits         1140     1139       -1     
  Misses        205      205              
  Partials       44       44              
Impacted Files            Coverage Δ
mesa/datacollection.py    95.29% <ø> (-0.06%) ⬇️

@Tortar Tortar changed the title from "Change index at DataFrame creation in DataCollector.get_agent_vars_dataframe" to "Change index at DataFrame creation in get_agent_vars_dataframe" on Jan 21, 2023
@rht
Contributor

rht commented Jan 21, 2023

This is slightly faster ...

By how much?

@Tortar
Contributor Author

Tortar commented Jan 21, 2023

About 25% in this case. I expected more, but it seems that indexing also has a cost at creation time, and the more columns you have, the less it matters.
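
A small sketch of how a figure like this can be derived from the two `timeit` results. The thread does not say how the percentage was computed, so this assumes it means the relative reduction in elapsed time; `percent_faster` is a hypothetical helper and the timings are made up for illustration.

```python
def percent_faster(baseline_s: float, candidate_s: float) -> float:
    """Relative reduction in elapsed time, as a percentage of the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100.0


# Hypothetical timings, not measurements from this thread.
print(f"{percent_faster(2.0, 1.5):.0f}% faster")  # prints "25% faster"
```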

@Tortar
Contributor Author

Tortar commented Jan 21, 2023

Actually, it is almost the same with more columns:

import timeit

setup = "import pandas as pd\nq = list(range(10000)); t = {x: q for x in 'abcdefghilmn'}"
# Two steps: build the frame, then set the index
a = timeit.timeit("s = pd.DataFrame.from_records(t); s = s.set_index(['a', 'b'])", setup, number=100)
# One step: set the index at creation
b = timeit.timeit("s = pd.DataFrame.from_records(t, index=['a', 'b'])", setup, number=100)
print(a, b)

This still gives me about a 15% improvement with 12 columns.

import timeit

setup = "import pandas as pd\nq = list(range(10000)); cols = 'abcdefghilmnopqrstuvz123456'; t = {x: q for x in cols}"
# Two steps: build the frame, then set the index
a = timeit.timeit("s = pd.DataFrame.from_records(t); s = s.set_index(['a', 'b'])", setup, number=100)
# One step: set the index at creation
b = timeit.timeit("s = pd.DataFrame.from_records(t, index=['a', 'b'])", setup, number=100)
print(a, b)

This still gives me about a 30% improvement with 27 columns (the `cols` string has 27 characters).
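
To check how the gap changes with the number of columns without editing the setup string each time, the benchmarks above could be folded into one helper. This is only a sketch: `compare` and the `c0`, `c1`, ... column names are hypothetical, and the timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd


def compare(n_rows: int, n_cols: int, number: int = 20) -> tuple[float, float]:
    """Time both approaches on n_cols integer columns; the first two become the index."""
    q = list(range(n_rows))
    # Hypothetical column names c0, c1, ...; the thread uses single characters.
    t = {f"c{i}": q for i in range(n_cols)}
    idx = ["c0", "c1"]
    two_step = timeit.timeit(
        lambda: pd.DataFrame.from_records(t).set_index(idx), number=number
    )
    one_step = timeit.timeit(
        lambda: pd.DataFrame.from_records(t, index=idx), number=number
    )
    return two_step, one_step


for n_cols in (2, 12, 27):
    a, b = compare(10_000, n_cols)
    print(n_cols, round(a, 4), round(b, 4))
```

The column counts 2, 12 and 27 mirror the three cases timed in this thread.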

@rht
Contributor

rht commented Jan 21, 2023

LGTM

@rht rht merged commit e3cc4d4 into projectmesa:main Jan 21, 2023
@jackiekazil jackiekazil added this to the v1.2.0 Taylor milestone Feb 27, 2023
@jackiekazil jackiekazil mentioned this pull request Mar 7, 2023