Split AgentSet into map and do to separate return types #2237
In general I'm in favor of splitting methods when they have different outputs. I think in this case I also support it.
I almost feel like AgentSet should be the input to some function here. Like
On a totally different note, I'm taking a few days off Mesa until Monday. Don't let my absence block anything.
MESA is not some DataFrame library, but we can draw inspiration from their API. When working with dataframes, it makes sense that each operation returns a new dataframe: you are manipulating data. In MESA, sometimes we want to manipulate agents (so we use
For a further motivation for this PR, please check this comment, which gives a clear example of how confusing method chaining can become. So rather than controlling the return type with a keyword argument (as we currently do), it seems easier to split the two return types into two separate methods. This makes it easier to see what is going on. Note that I also use the exact same separation between
I find the AgentSet and DF to be analogous for the use case, i.e. the method chaining. But even in the DF libraries, there are disagreements. I forgot to emphasize that the pandas API choice and the Polars API choice are actually very different from each other. It seems that this PR is taking the Polars approach. That is understandable for Polars, because it is written in Rust and needs consistent typing.
One last concern is that
I am fine with giving apply a different name if we can come up with one. However, a method name should be a verb in my view because it does something. So, I am not in favor of changing it to
As an aside, I don't care about LLMs, so for me that is not a good argument for changing the name. In my experience they produce crappy code. Students can use them in my MESA exam, but those that rely on them exclusively all fail the exam. In my own usage, I only find them useful for boilerplate stuff. But even then, it is often quicker for me to just write it out with more traditional autocomplete.
I think the naming is much more confusing than I initially thought. In pandas, "apply" executes a function along rows (default) or columns. We don't have that in AgentSet. The pandas "map" function executes element-wise, which might be much more appropriate. However, pandas does not have a "map" function for groupby objects. The built-in function "map" also applies a function to an iterable and results in a new iterator of the return values. So I think "map" might actually be a much better name, with the exception that pandas groupby doesn't use that name. Would you still be okay with that name @quaquel? (For consistency we then should rename apply to map for groupby.)
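For reference, the behavior of the built-in map mentioned above can be seen in a short sketch. The list of names is made up for illustration and has nothing to do with Mesa itself.

```python
# Made-up input data, used only to illustrate the built-in.
names = ["ada", "grace", "alan"]

# map() applies the function to each element of the iterable and returns
# a lazy iterator over the return values; nothing is computed yet.
lengths = map(len, names)

# Consuming the iterator yields one return value per input element.
lengths_list = list(lengths)
print(lengths_list)  # [3, 5, 4]
```

This is the same "one result per element" contract that a method named map on AgentSet would suggest to readers, except that the PR description says AgentSet.map returns the results directly.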
Changing
Oh I like
I have changed it (3 maintainers in favor seems enough).
…2237) * separate original do into map and do in AgentSet based on their difference in return type.
Motivation
In #2220 @Corvince suggested separating AgentSet.do into 2 separate methods based on their return type. AgentSet.do executes the callable/method/function and returns the original AgentSet. AgentSet.map executes the callable/method/function and returns the results. At the moment both behaviors are in AgentSet.do and can be controlled with the return_results keyword argument.
Implementation details
The code is straightforward. I wrap all kwargs to AgentSet.do into a dict and check if return_results is in it. If it is, a DeprecationWarning is issued. The rest of the code proceeds as normal.
Usage
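The split and the deprecation path described in this PR can be sketched roughly as follows. ToyAgent and ToyAgentSet are hypothetical stand-ins written for this illustration, not Mesa's actual classes; the real AgentSet stores its agents differently, and the warning text here is an assumption.

```python
from __future__ import annotations

import warnings
from typing import Any, Callable, Iterable


class ToyAgent:
    """Hypothetical agent, used only for this illustration."""

    def __init__(self, wealth: int) -> None:
        self.wealth = wealth

    def earn(self, amount: int) -> int:
        self.wealth += amount
        return self.wealth


class ToyAgentSet:
    """Hypothetical stand-in for AgentSet; not Mesa's implementation."""

    def __init__(self, agents: Iterable[ToyAgent]) -> None:
        self._agents = list(agents)

    def map(self, method: str | Callable, *args: Any, **kwargs: Any) -> list[Any]:
        """Call method on every agent and return the list of results."""
        if isinstance(method, str):
            return [getattr(a, method)(*args, **kwargs) for a in self._agents]
        return [method(a, *args, **kwargs) for a in self._agents]

    def do(self, method: str | Callable, *args: Any, **kwargs: Any) -> Any:
        """Call method on every agent and return the set, so calls can be chained."""
        # Old keyword is still accepted, but warns and is removed from kwargs
        # so that it is not forwarded to the agents.
        if "return_results" in kwargs:
            return_results = kwargs.pop("return_results")
            warnings.warn(
                "return_results is deprecated; use map() to get the results",
                DeprecationWarning,
                stacklevel=2,
            )
            if return_results:
                return self.map(method, *args, **kwargs)
        self.map(method, *args, **kwargs)  # results are discarded on purpose
        return self


agents = ToyAgentSet([ToyAgent(1), ToyAgent(2)])
same_set = agents.do("earn", 10)            # returns the set itself
wealths = agents.map("earn", 1)             # returns the results: [12, 13]
current = agents.map(lambda a: a.wealth)    # callables work too: [12, 13]
```

With the two methods separated, the return type is visible from the method name alone: do can be chained, map ends the chain with a list of results.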