intro tutorial: Analysing model reporters #1955

Merged (6 commits) on Jan 15, 2024

EwoutH
Member

@EwoutH EwoutH commented Jan 11, 2024

Some enhancements to the Mesa Introductory Tutorial:

  1. Visualization Library Update: Updated the docs to mention Seaborn instead of matplotlib for data visualization.
  2. Model Efficiency Improvement: Reduced the number of agents in batch runs, resulting in quicker model executions and generating more insightful data without compromising the educational value.
  3. New Agent Reporter: Introduced a new agent reporter steps_not_given, which tracks the number of consecutive steps an agent hasn't transacted. This addition enriches the tutorial's analytical depth, demonstrating how to handle multiple reporters.
  4. Improved Tutorial Layout and Structure: Enhanced the text layout and structure of the batch run section for better readability and understanding.
  5. General Steps for Analysis: Added a section outlining general steps for analyzing model results, providing a structured approach for new users to follow and apply in their modeling endeavors.

It also fixes Readthedocs not having enough runtime to finish running the notebook, by focussing the batch_run on runs with fewer agents (which run faster).

Tested and renders correctly in Readthedocs. Please review and merge. Feel free to squash while merging.
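
To make the steps_not_given reporter from item 3 concrete, here is a minimal plain-Python sketch of the counter logic. It does not use Mesa: the MoneyAgent class, the run() helper, and the record layout are assumptions made for illustration, not the tutorial's actual code.

```python
import random


class MoneyAgent:
    """Toy agent; a plain-Python stand-in, not a Mesa Agent subclass."""

    def __init__(self, unique_id):
        self.unique_id = unique_id
        self.wealth = 1
        self.steps_not_given = 0  # consecutive steps without giving money away

    def step(self, others, rng):
        if self.wealth > 0:
            # Give one unit to a randomly chosen other agent and reset the counter.
            rng.choice(others).wealth += 1
            self.wealth -= 1
            self.steps_not_given = 0
        else:
            # Nothing to give this step, so the streak grows by one.
            self.steps_not_given += 1


def run(num_agents=10, num_steps=20, seed=42):
    """Run the toy model and collect one record per agent per step."""
    rng = random.Random(seed)
    agents = [MoneyAgent(i) for i in range(num_agents)]
    records = []
    for step in range(num_steps):
        order = agents[:]
        rng.shuffle(order)  # random activation order
        for agent in order:
            others = [a for a in agents if a is not agent]
            agent.step(others, rng)
        for agent in agents:
            records.append(
                {
                    "Step": step,
                    "AgentID": agent.unique_id,
                    "Wealth": agent.wealth,
                    "Steps_not_given": agent.steps_not_given,
                }
            )
    return agents, records


agents, records = run()
```

Each record holds two agent-level values (wealth and the streak counter), which is the kind of multi-reporter table the tutorial section is meant to analyse.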


codecov bot commented Jan 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (a15ce9f) 79.45% compared to head (07ce5b9) 79.45%.
Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1955   +/-   ##
=======================================
  Coverage   79.45%   79.45%           
=======================================
  Files          15       15           
  Lines        1285     1285           
  Branches      285      285           
=======================================
  Hits         1021     1021           
  Misses        225      225           
  Partials       39       39           


@rht rht marked this pull request as ready for review January 11, 2024 09:27
@rht
Contributor

rht commented Jan 11, 2024

I set the PR to ready for review so that I can read the rendered tutorial on the RTD CI.

@EwoutH
Member Author

EwoutH commented Jan 11, 2024

@rht There's a KeyboardInterrupt in the batch_run. Was already there on the current docs, and I can't reproduce it locally. Do you know what's happening there? Maybe something with the multithreading?

@EwoutH
Member Author

EwoutH commented Jan 11, 2024

Maybe the runtime is just too long for Read the Docs

@Corvince
Contributor

While you are at it, can you also include a super().__init__() call in the model? Otherwise we will trigger the warning.

Also, I find it suboptimal that another warning is raised, a FutureWarning regarding AgentSet. I think it's a bit overwhelming for first-timers, especially since the model doesn't make any direct use of the AgentSet feature (only indirectly in the schedulers).
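
As a generic illustration of the super().__init__() point, the sketch below shows how a base class can end up warning when a subclass skips the parent initializer. BaseModel, register(), and the warning text are hypothetical stand-ins; this is not Mesa's implementation or its actual warning.

```python
import warnings


class BaseModel:
    """Hypothetical framework base class (an assumption, not mesa.Model)."""

    def __init__(self):
        self._initialized = True
        self.agents = []

    def register(self, agent):
        # Fall back to lazy setup, but tell the user what went wrong.
        if not getattr(self, "_initialized", False):
            warnings.warn(
                "BaseModel.__init__() was not called; add super().__init__()",
                RuntimeWarning,
            )
            self.agents = []
            self._initialized = True
        self.agents.append(agent)


class ForgetfulModel(BaseModel):
    def __init__(self, n):
        self.n = n  # parent initializer skipped: base-class state is missing


class CarefulModel(BaseModel):
    def __init__(self, n):
        super().__init__()  # sets up base-class state before anything else
        self.n = n


def count_warnings(model_cls):
    """Return how many warnings are raised when registering one agent."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model_cls(3).register("agent-0")
    return len(caught)
```

Calling count_warnings(ForgetfulModel) surfaces the warning, while count_warnings(CarefulModel) stays silent, which is the behaviour a first-time reader of the tutorial should see.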

@Corvince
Contributor

Corvince commented Jan 11, 2024

Also also: Can you fix the reference to the matplotlib installation in the beginning? The tutorial uses seaborn. And we have to install seaborn manually for the colab version, since it's not a requirement of mesa.

"outputs": [],
"source": [
"# Create a point plot with error bars\n",
"g = sns.pointplot(data=results_filtered, x=\"N\", y=\"Gini\", linestyle='none')\n",
Contributor

This seems to be slightly duplicated from the previous paragraph

g = sns.scatterplot(data=results_filtered, x="N", y="Gini")
g.set(
    xlabel="Number of agents",
    ylabel="Gini coefficient",
    title="Gini coefficient vs. number of agents",
);

Describing both seems to be an exercise in showcasing the feature of Seaborn, because they are essentially showing the same info. But if I have to choose which one to keep, the sns.pointplot with error bars seems to be more informative.

{
"cell_type": "markdown",
"source": [
"In this case it looks like the Gini coefficient increases slower for smaller populations. This can be because of different things, either because the Gini coefficient is a measure of inequality and the smaller the population, the more likely it is that the agents are all in the same wealth class, or because there are less interactions between agents in smaller populations, which means that the wealth of an agent is less likely to change."
Contributor

The Gini coefficient being the way it is, is due to preferential attachment. The inequality scales faster than linearly against the number of nodes. I don't think this paragraph has incorporated preferential attachment in the explanation.
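
For readers following the Gini discussion, here is a small sketch of one standard discrete formula for the Gini coefficient of a list of wealth values. The function name gini and the handling of the all-zero case are assumptions for illustration; the tutorial's own reporter may be written differently.

```python
def gini(wealths):
    """Gini coefficient of a list of non-negative wealth values.

    0 means perfect equality; with n agents the maximum is (n - 1) / n,
    reached when a single agent holds everything.
    """
    x = sorted(wealths)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        return 0.0  # assumption: treat "no wealth at all" as perfectly equal
    # Poorest agents get the largest weights (n - i), richest the smallest.
    b = sum(xi * (n - i) for i, xi in enumerate(x)) / (n * total)
    return 1 + (1 / n) - 2 * b


equal = gini([1, 1, 1, 1])      # everyone holds the same wealth
unequal = gini([0, 0, 0, 4])    # one agent holds everything
```

Because the upper bound (n - 1) / n depends on the number of agents, comparing Gini values across different population sizes needs some care, independent of the mechanism that generates the inequality.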

Member Author

Bit in doubt. Sometimes it's useful to show the same thing in two different ways to engrain the idea (in this case that it's just a dataframe that can be plotted in different ways).

Contributor

Did you mean to reply to #1955 (comment)? If so, you have to explicitly say it's a different way to visualize the same data.

@rht
Contributor

rht commented Jan 11, 2024

> Maybe the runtime is just too long for Read the Docs

I assume it's because AgentSet is slightly slower than the previous version without it?

With fewer agents the model runs are quicker, so the batch_run method finishes earlier. The resulting data is also more interesting.
Add a steps_not_given agent reporter to the model that's used in the batch_run() function. This way agent plots can be discussed.
@EwoutH EwoutH force-pushed the tutorial_improvement_2024 branch from 428c25a to 07ce5b9 Compare January 11, 2024 14:55
@EwoutH
Member Author

EwoutH commented Jan 11, 2024

Okay fixed all the stuff (see commit messages), ready for another round of review!

@quaquel
Member

quaquel commented Jan 12, 2024

> > Maybe the runtime is just too long for Read the Docs
>
> I assume it's because AgentSet is slightly slower than the previous version without it?

I quickly ran 2.2 against 2.1.5 for a few example models yesterday. We indeed sacrificed a bit of performance for the convenience of AgentSet.

@EwoutH
Member Author

EwoutH commented Jan 12, 2024

I updated the PR description. It renders correctly in Readthedocs.

Please review and merge. Feel free to squash while merging!

@tpike3 tpike3 added the docs Release notes label Jan 13, 2024
Member

@tpike3 tpike3 left a comment

Very nice additions on the data analysis thanks!

@tpike3 tpike3 merged commit f21d242 into projectmesa:main Jan 15, 2024
12 checks passed
@EwoutH EwoutH mentioned this pull request Jan 16, 2024
@EwoutH
Member Author

EwoutH commented Jan 16, 2024

Note to self: Need to talk about nested / multidimensional aggregation somewhere. Like how to aggregate over multiple agents, over multiple iterations (and possibly over multiple timesteps). When to do what and how to do it properly.

(after having seen students just throwing all their data on a big pile and drawing a single average out of it, and then reporting a confidence interval which means god knows what)

Ideally maybe a full stack “how to perform experiments and report results with agent-based models”. Including:

  • Experimental setup (full factorial, scenario based, alternatives)
  • Metrics (KPIs) and the datacollector
  • Data aggregation and visualization

With the theory and how to do it in Mesa sandwiched together.
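
The nested aggregation described in this note can be sketched as follows: first aggregate over agents within each iteration, then summarize across iterations. The data values, the summarize() helper, and the normal-approximation interval (z = 1.96) are all assumptions for illustration; with only a handful of iterations a t-based interval would be more appropriate.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical final-step steps_not_given value per agent, for three iterations.
runs = {
    0: [0, 2, 1, 1],
    1: [3, 0, 4, 1],
    2: [0, 0, 1, 1],
}


def summarize(runs, z=1.96):
    """Aggregate over agents per iteration, then across iterations."""
    # Level 1: one number per iteration (mean over that iteration's agents).
    per_run = {iteration: mean(values) for iteration, values in runs.items()}
    # Level 2: the iteration is the unit of replication, so n = number of runs.
    run_means = list(per_run.values())
    center = mean(run_means)
    std_err = stdev(run_means) / sqrt(len(run_means))
    return per_run, center, (center - z * std_err, center + z * std_err)


per_run, center, interval = summarize(runs)

# The "big pile" alternative treats all 12 agent values as independent samples,
# which understates uncertainty because agents within a run are correlated.
pooled = [value for values in runs.values() for value in values]
pooled_std_err = stdev(pooled) / sqrt(len(pooled))
```

Here the interval describes variation between iterations, which is usually what a reader wants from a stochastic model; the pooled standard error answers a different and rarely meaningful question.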

5 participants