AgentSet: Allow selecting a fraction of agents in the AgentSet #2253
Performance benchmarks: |
what is the motivation for adding this to the agentset? |
That seems useful, thanks! The only worry I have is how this behaves if a user specifies both n and p. That probably should raise an error? Or maybe there is a good name that could incorporate both p and n? So if it is between 0 and 1 use a fraction and if it is a whole number above 1 use that number? |
Sorry, was still working on other features (and my actual model), wrote it up.
Yeah I was thinking about that. Maybe just don't do that (and we mention it in the docstring)? If you just want to select a fraction of
Very interesting idea, but maybe in this case explicit is better than implicit. Except if you can come up with a killer name. |
I like the clarity of p. So my suggestion would be to raise a value error if both n and p are passed |
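A minimal sketch of that suggestion, assuming a hypothetical helper `resolve_count` (not part of Mesa) that turns `n` or `p` into a number of agents and rejects the ambiguous case where both are given:

```python
def resolve_count(total, n=None, p=None):
    """Hypothetical helper: how many of `total` agents to select."""
    if n is not None and p is not None:
        # Explicit is better than implicit: refuse ambiguous input.
        raise ValueError("Pass either n or p, not both.")
    if p is not None:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be between 0 and 1.")
        return round(total * p)
    if n is not None:
        return min(n, total)
    return total  # neither given: select everything
```

Calling `resolve_count(10, n=3, p=0.2)` raises, while `resolve_count(10, p=0.2)` and `resolve_count(10, n=3)` each resolve to a single unambiguous count.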
see the few minor comments and once unit tests are added, this is good to go. |
Allow selecting a fraction of agents in the AgentSet.
for more information, see https://pre-commit.ci
Also add the Raises ValueError and Note about not shuffling by default.
Okay, I:
However, I noticed that there's an important difference between Currently Why? Because if you take these two use cases:
The latter is used way more than the former. And it will be way more logical if you select by type. So I would suggest applying fraction afterwards, on the selected AgentSet after all other operations are done. Then you could still do both:
# Select the agents with "wealth" less than 5, and at most 20% of total agents
agents.select(fraction=0.2).select(lambda agent: agent.wealth < 5)
# Select the agents with "wealth" less than 5, and then 20% of those agents
agents.select(lambda agent: agent.wealth < 5, fraction=0.2)
# or, equivalently:
agents.select(lambda agent: agent.wealth < 5).select(fraction=0.2)
But now the one that's more used and more intuitive will go well by default. Totally other options could be:
|
@EwoutH I'm also wondering about this. Not saying that this shouldn't be in the library, but a concrete example could give some illustration. Is this used in your project? |
This was the thing I wanted to do:
# Randomly select 40% of the agents from the AgentSet and give them a license
model.agents.shuffle().select(fraction=0.4).set('has_license', True)
I needed to do this:
n_license = round(len(model.agents) * license_chance)
model.agents.shuffle().select(n=n_license).do(lambda agent: setattr(agent, 'has_license', True))
With #2254 it got simplified to:
n_license = round(len(model.agents) * license_chance)
model.agents.shuffle().select(n=n_license).set('has_license', True)
It's not a huge use case, but it's nice. Especially that you don't need to break the chain. Combine it with a function and it gets really powerful though. Assume I want to distribute some cars around (I know a certain percentage of all people has a car), but only to agents with licenses.
agents.select(lambda a: a.has_license, fraction=car_chance).set('has_car', True)
Without the fraction, this would have been:
n_car = round(len(model.agents) * car_chance)
model.agents.shuffle().select(n=n_car).set('has_car', True)
So yeah, it's not a huge use case. Maybe it adds some complexity. There's a unique application for fraction as upper limit (cap), as currently implemented, and a unique application for doing it afterwards. I need to think about this a bit longer. |
Right, |
Good catch! I see two possibilities now. Either just change the special meaning from 0 to -1. I don't know if there was a good use case for 0, but it's rather strange for 0 to indicate all agents. The more holistic approach would be to split select into a filter function and a sample function. This would also simplify the logic and solve the "before or after" question (which was present but unconsidered before fraction was introduced) |
The brain is so interesting that after a night's sleep you look at it again and you think oh, and it all clicks together. Now I just have to write it up, rewrite the code, tests and examples. Can’t wait for 2026/2027 where with a voice message a bot does that automatically. Long story short: There’s a special use case for when filtering, you want a certain number or fraction at most. Especially the fraction should happen right there in the function, because after the function is done, you don’t know how large the For all other cases (before, after) a
Obviously the way to go. I was thinking |
I think having an (agent for agent in agentset) should already give you an iterator over the agentset. Definitely worth exploring that more, but certainly way out of scope for this PR //Edit |
n is removed with a fallback
max (int | float, optional): The maximum amount of agents to select. Defaults to infinity.
- If an integer of 1 or larger, the first n matching agents are selected.
- If a float between 0 and 1, at most that fraction of the original agents is selected.
I updated this PR to replace
Some details:
Tests are updated. Please double check the internal If we decide this is the way to go, I will update the PR description. I plan on adding a separate But that would be separate PR. |
I am unsure about using a single keyword for both the number and the percentage, but I won't object to it either. I would change the name, however. It would be nice to see a quick overview of what the API is now becoming just for clarity.
I hate the language, but, yes, we can pick up useful ideas and give them a better name. |
Any suggestions (either these or another)? |
I like |
Currently it does round, do you think it shouldn't? |
If it's an upper limit I think it should always round down/floor |
Difficult one. Because if you describe it as "selecting a fraction" I would expect it to select the closest match. I think in many practical scenarios the closest selection to the fraction you wanted is most logical. |
If we go with |
Valid argument for "selecting a fraction", but for selecting "at most" 33% I would not expect it to select 40% |
Exactly. That's why I think it's a good name (if we floor), because people will always have different expectations for "selecting a fraction" with respect to rounding. |
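To make the rounding question concrete, here is a small worked example. The set size of 5 and the fraction 0.33 are illustrative values chosen to match the "33% versus 40%" point above, not numbers from the PR:

```python
import math

total_agents = 5   # illustrative set size
fraction = 0.33    # "at most 33%"

rounded = round(total_agents * fraction)       # nearest whole agent
floored = math.floor(total_agents * fraction)  # never exceeds the fraction

print(rounded, rounded / total_agents)  # 2 agents, which is 40% of the set
print(floored, floored / total_agents)  # 1 agent, which is 20% of the set
```

With rounding, the result overshoots the requested cap; with flooring, it stays at or below it, which fits an "at most" reading.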
I renamed |
PR description is updated, including the usage examples |
@projectmesa/maintainers ready to go? (would like to merge myself) |
(keeping the branch in case of regressions) |
This PR updates the `select` method in the `AgentSet` class by replacing the `n` parameter with a more versatile `at_most` parameter. The `at_most` parameter allows for selecting either a specific number of agents or a fraction of the total agents when provided as an integer or a float, respectively. Additionally, backward compatibility is maintained by supporting the deprecated `n` parameter, which will trigger a warning when used.

### Motive
Previously, the `select` method only allowed users to specify a fixed number of agents (`n`) to be selected. The new `at_most` parameter extends this functionality by enabling the selection of agents based on a proportion of the total set, which is particularly useful in scenarios where relative selection is desired over absolute selection.

### Implementation
- **`at_most` Parameter:**
  - Accepts either an integer (to select a fixed number of agents) or a float between 0.0 and 1.0 (to select a fraction of the total agents).
  - `at_most=1` selects one agent, while `at_most=1.0` selects all agents.
  - If a float is provided, it determines the maximum fraction of agents to be selected from the total set. It rounds down to the nearest number of whole agents.
- **Backward Compatibility:**
  - The deprecated `n` parameter is still supported, but it now serves as a fallback for `at_most` and triggers a deprecation warning.
- **Behavior Notes:**
  - `at_most` serves as an upper limit on the number of selected agents. If additional filtering criteria are provided, the final selection may include fewer agents.
  - For random sampling, users should shuffle the `AgentSet` before applying `at_most`.
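The rules described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the code from the PR: the helper name `resolve_at_most` and the warning text are hypothetical, and only the int/float distinction, the rounding down, and the deprecated `n` fallback are taken from the description.

```python
import math
import warnings


def resolve_at_most(total, at_most=None, n=None):
    """Hypothetical helper: turn `at_most` into a maximum agent count."""
    if n is not None:
        # Assumed wording; the description only says a warning is triggered.
        warnings.warn(
            "The parameter 'n' is deprecated. Use 'at_most' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if at_most is None:
            at_most = n  # `n` only acts as a fallback
    if at_most is None:
        return total  # no cap
    if isinstance(at_most, float):
        # A float is a fraction of the whole set, rounded down.
        return math.floor(total * at_most)
    return min(at_most, total)  # an int is an absolute count
```

Note the type-sensitive edge case from the description: with ten agents, `resolve_at_most(10, at_most=1)` gives one agent, while `resolve_at_most(10, at_most=1.0)` gives all ten.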
### Usage Examples
To randomly select a fraction, add a `shuffle()`:
Combining with sorting:
The most powerful feature is that you can combine `at_most` with additional criteria:
You can also use it with chaining:
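A minimal, self-contained sketch of these four patterns follows. It uses a toy stand-in class, `MiniAgentSet`, written only for this illustration; it is not Mesa's `AgentSet`, and the `Person` class, the attribute names, and the descending default of `sort` are assumptions. Only the `at_most` behavior (an int is a count, a float is a floored fraction of the set, both acting as an upper limit) follows the description above.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Person:
    wealth: int
    has_license: bool = False


class MiniAgentSet:
    """Toy stand-in for an agent collection; not Mesa's AgentSet."""

    def __init__(self, agents, rng=None):
        self.agents = list(agents)
        self.rng = rng if rng is not None else random.Random(42)

    def __len__(self):
        return len(self.agents)

    def shuffle(self):
        agents = self.agents[:]
        self.rng.shuffle(agents)
        return MiniAgentSet(agents, self.rng)

    def sort(self, key, ascending=False):
        # Assumed default: highest value first.
        ordered = sorted(self.agents, key=lambda a: getattr(a, key),
                         reverse=not ascending)
        return MiniAgentSet(ordered, self.rng)

    def select(self, filter_func=None, at_most=None):
        if at_most is None:
            limit = len(self.agents)
        elif isinstance(at_most, float):
            limit = math.floor(len(self.agents) * at_most)  # fraction of the set
        else:
            limit = at_most  # absolute count
        picked = []
        for agent in self.agents:
            if len(picked) >= limit:
                break  # the cap is reached, stop early
            if filter_func is None or filter_func(agent):
                picked.append(agent)
        return MiniAgentSet(picked, self.rng)

    def set(self, attr, value):
        for agent in self.agents:
            setattr(agent, attr, value)
        return self


people = MiniAgentSet(Person(wealth=w) for w in range(10))

# Random fraction: shuffle first, then cap at 20% of the set.
random_fifth = people.shuffle().select(at_most=0.2)

# Combining with sorting: the three wealthiest agents.
richest_three = people.sort("wealth").select(at_most=3)

# Combining with criteria: wealth below 5, capped at 20% of the whole set.
poor_capped = people.select(lambda a: a.wealth < 5, at_most=0.2)

# Chaining: give a random 40% of agents a license.
people.shuffle().select(at_most=0.4).set("has_license", True)
```

In this sketch the float cap is computed from the size of the set that `select` is called on, so `poor_capped` holds at most two agents even though five agents match the filter.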