-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rng fixes #428
Rng fixes #428
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall a nice set of improvements! A few minor suggestions here already, and I'll continue playing with it and report back should anything meaningful come up.
starsim/demographics.py
Outdated
@@ -344,7 +344,7 @@ def standardize_fertility_data(self): | |||
|
|||
def initialize(self, sim): | |||
super().initialize(sim) | |||
low = sim.pars.n_agents+1 # TODO: or 0? | |||
low = 0 # Was sim.pars.n_agents + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know collosions will occur if reusing slots from the initial population.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, increasing slot_scale
previously guaranteed no slot duplicates in the limit, but with this change we would no longer have such a guarantee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some testing (see #436), and agree the previous version had fewer collisions (though still a lot!), with only a tiny performance penalty for slot_scale=5
(~3%). Reverted.
starsim/modules.py
Outdated
if val.initialized is not True: # Catches False and 'partial' | ||
val.initialize(module=self, sim=sim, force=True) # Actually a dist | ||
else: | ||
print(f'TEMP: tried to reinitialize {val}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjust before final.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
if not self.pars: | ||
self.pars.n_contacts = 10 | ||
if 'n_contacts' in self.pars: # Convert from n_contacts to probability | ||
self.pars.p = self.pars.pop('n_contacts')/n_agents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I hadn't realized that n_contacts
is normalized by the population size rather than being used as an absolute number. Benefits and drawbacks to each approach, but we should rename the parameter in light of population scaling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #437
starsim/network.py
Outdated
self.graph = graph | ||
self.pars = ss.omerge(pars) | ||
self.dist = ss.Dist(distname='StaticNet') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As my first time seeing this code, I don't know what type of Dist will result from this statement. If I were to ask this dist for rvs
, would they be normally distributed? Uniform? Etc. I suspect here it's just being used to access the underlying rng and machinery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, it will give:
ValueError: Dist.rvs() failed: no valid NumPy/SciPy function found in this Dist. Has it been created and initialized correctly?
target = np.random.permutation(source) | ||
source = self.get_source(inds, n_contacts) | ||
target = self.dist.rng.permutation(source) | ||
self.dist.jump() # Reset the RNG manually # TODO, think if there's a better way |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe RandomNet is RNG safe anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not, but there's more and less safe versions :)
self.diseases = ss.ndict(diseases, type=ss.Disease) | ||
self.networks = ss.ndict(networks, type=ss.Network) | ||
self.connectors = ss.ndict(connectors, type=ss.Connector) | ||
self.interventions = ss.ndict(interventions, type=ss.Intervention, strict=False) # strict=False since can be a function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider a strict=True
default that reverts to False
if the user provides a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not Dist(..., strict=False)
, this is ss.ndict(..., strict=False)
, which raises an exception if an object is the "wrong" type. But here we expect some interventions might be function
instead of ss.Intervention
, so we don't want to be strict. (Will probably refactor soon anyway!)
self.networks = ss.ndict(networks, type=ss.Network) | ||
self.connectors = ss.ndict(connectors, type=ss.Connector) | ||
self.interventions = ss.ndict(interventions, type=ss.Intervention, strict=False) # strict=False since can be a function | ||
self.analyzers = ss.ndict(analyzers, type=ss.Analyzer, strict=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Analyzers are unlikely to be using random numbers anyway, so strict
shouldn't really matter. But curious on why False
for a default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Weird, I thought I already replied to this!) This is strict
for ndict
, not for Dist
. This prevents strict type checking, which caused this to fail if you passed it a function instead of an ss.Analyzer
object.
starsim/sim.py
Outdated
@@ -447,9 +463,9 @@ def validate_post_init(self): | |||
TBC whether we keep this or incorporate the checks into the init methods | |||
""" | |||
# Make sure that there's a contact network if any diseases are present | |||
if self.diseases and not self.networks: | |||
if self.diseases and not self.networks: # TODO: handle NCDs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YES! Many NCDs and even some disese modeling applications will not have networks. It's just a warning, but wouldn't want to give users the impression that something is wrong if this is as intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, just removed for now. It should really be a warning as it's trying to process beta.
kw = sc.mergedicts(dict(pars=dict(diseases='sir', networks='random')), kwargs) | ||
sim = Sim(**kw) | ||
pars = sc.mergedicts(dict(diseases='sir', networks='random'), kwargs) | ||
sim = Sim(pars) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like the demo to at least mention the mainstream approach to specifying modules so it doesn't appear as though the string-based approach is the only way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think anyone whose sole knowledge of how to run Starsim comes from looking at the demo code probably should be restricted to using string arguments 🙃 But agree that we should make sure the rest of the docs, tests, and examples illustrate the full range of possible usages.
tests/test_base.py
Outdated
@@ -6,7 +6,7 @@ | |||
import sciris as sc | |||
import numpy as np | |||
import starsim as ss | |||
import matplotlib.pyplot as plt | |||
import pylab as pl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've had this discussion previously, but I think important matplotlib.pyplot is preferable here. If you browse to Matplotlib's page on pylab here (https://matplotlib.org/stable/api/pylab.html) they say:
pylab is a historic interface and its use is strongly discouraged.
The use of pylab is discouraged for the following reasons:
from pylab import * imports all the functions from matplotlib.pyplot, numpy, numpy.fft, numpy.linalg, and numpy.random, and some additional functions into the global namespace.
Such a pattern is considered bad practice in modern python, as it clutters the global namespace. Even more severely, in the case of pylab, this will overwrite some builtin functions (e.g. the builtin sum will be replaced by numpy.sum), which can lead to unexpected behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine† :) I've replaced it with import matplotlib.pyplot as pl
, which hopefully will annoy everyone equally :)
† The article seems to conflate import pylab as pl
with from pylab import *
-- the latter I definitely agree is bad!
Wait, just seeing the |
Additional RNG fixes for #426. Also adds an SIS disease class among other fixes.