-
-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TEP007] Decay and merge isotopic abundance dataframe #757
Conversation
tardis/model/base.py
Outdated
@@ -189,6 +191,33 @@ def abundance(self): | |||
abundance.columns = range(len(abundance.columns)) | |||
return abundance | |||
|
|||
def decay(self, day, normalize=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should happen during the initialization of the model.
The behavior should be:
If there is isotope_abundance
as parameter, decay these with time_explosion
and merge them with the normal abundances. If we do this, save the normal abundances and the undecayed abundances somewhere on the model (example raw_abundance
, raw_isotope_abundance
)
8d1b968
to
9073eff
Compare
tardis/model/base.py
Outdated
isotope_abundance = self.raw_isotope_abundance.decay(time_explosion) | ||
|
||
#Set atomic_number as index in isotopic_abundance dataframe | ||
isotope_abundance = isotope_abundance.reset_index().drop( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the many reset_index
- not sure that you need those.
tardis/model/base.py
Outdated
|
||
#Merge abundance dataframes | ||
modified_df = isotope_abundance.append(self.abundance) | ||
modified_df = modified_df.reset_index().set_index('atomic_number') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here. you do not need to make something an index to have groupby
work in it.
tardis/model/base.py
Outdated
'mass_number', axis=1).set_index(['atomic_number']) | ||
|
||
#Merge abundance dataframes | ||
modified_df = isotope_abundance.append(self.abundance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you definitely do not want to append - but add the decayed dataframe to the abundances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wkerzendorf I have now used pd.concat()
in the latest commit .
- If you mean
pd.add()
, then according to what I could find ,index values
for bothDataframe
must be equal, then only result is correct, otherwise, values not present in one dataframe , are filled withNaN
values pd.merge()
andpd.join()
operation results in addition along columns.
So, I think , it leaves me withpd.concat()
, and I have also added a test to ensure if it works correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This goes into the right direction, however I think we should extend the current abundances
property to be more flexible so that we don't require to call a decay
method.
tardis/model/base.py
Outdated
@@ -189,6 +196,33 @@ def abundance(self): | |||
abundance.columns = range(len(abundance.columns)) | |||
return abundance | |||
|
|||
def decay(self, time_explosion, normalize=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want a decay function at all. Maybe we need a helper function at some point but in general we don't need this.
We don't need functionality to 'redecay' a model because we will always use time_explosion
as the time of the decay.
tardis/model/base.py
Outdated
@@ -72,7 +72,14 @@ def __init__(self, velocity, homologous_density, abundance, time_explosion, | |||
self.v_boundary_outer = v_boundary_outer | |||
self.homologous_density = homologous_density | |||
self._abundance = abundance | |||
self.decayed = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed.
tardis/model/base.py
Outdated
self.time_explosion = time_explosion | ||
|
||
self.raw_abundance = self._abundance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should discuss and define at some point the names of all abundance
variables involved.
There is input: abundance
and isotope_abundance
(should be attached to model for diagnosis and debugging purposes)
There is output: Model.abundance
which should only contain the region defined by the velocity boundaries (as it currently does) AND which I think should contain the merged result if isotope_abundance
is passed as an optional argument.
Then we have self._abundance
which looks like it's currently the reference to the input (for diagnosis)
So what we basically should decide is, how do we cache the result of the decay (do we even cache it?) and how do we attach the input to the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vg3095 I think the decay is extremely quick - maybe just a property would be fine. Actually can you profile how quick?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wkerzendorf If by profile
, meaning time taken by decay function here
I made a dummy dataframe, then used decay on it. (with 2 shells/columns)
It takes around 0.005 seconds for 10 elements and around 0.01 seconds for 100 elements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wkerzendorf I have now used cProfile
module, If you want to know , no. of function calls and time taken for decay method.
(Updated) No of shells/columns - 20
-
For 10 elements
33189 function calls (32937 primitive calls) in 0.030 seconds -
For 20 elements
40899 function calls (40647 primitive calls) in 0.034 seconds -
For 50 elements
64029 function calls (63777 primitive calls) in 0.050 seconds -
For 100 elements
102579 function calls (102327 primitive calls) in 0.067 seconds
I'd suggest to add a |
As of now, these are the variables associated with
|
Travis fail is due to time out in Mac build . |
I don't think we need |
tardis/model/base.py
Outdated
@@ -189,6 +199,25 @@ def abundance(self): | |||
abundance.columns = range(len(abundance.columns)) | |||
return abundance | |||
|
|||
def as_atomic_numbers(self, normalize=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should be part of IsotopeAbundance
tardis/io/decay.py
Outdated
|
||
#Merge abundance dataframes | ||
modified_df = pd.concat([isotope_abundance, abundance]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want you to add them here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wkerzendorf For pd.add()
, index values
for both Dataframe
must be equal, then only result is correct, otherwise, index values not present in one dataframe , are filled with NaN
values
tardis/model/base.py
Outdated
@@ -63,7 +63,7 @@ class Radial1DModel(HDFWriterMixin): | |||
def __init__(self, velocity, homologous_density, abundance, time_explosion, | |||
t_inner, luminosity_requested=None, t_radiative=None, | |||
dilution_factor=None, v_boundary_inner=None, | |||
v_boundary_outer=None): | |||
v_boundary_outer=None, isotope_abundance=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this up
tardis/model/base.py
Outdated
@@ -73,6 +73,13 @@ def __init__(self, velocity, homologous_density, abundance, time_explosion, | |||
self.homologous_density = homologous_density | |||
self._abundance = abundance | |||
self.time_explosion = time_explosion | |||
|
|||
self.raw_abundance = self._abundance | |||
self.raw_isotope_abundance = isotope_abundance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is None - you should make an empty frame
@wkerzendorf I have changed it . Travis fail is due to timeout in Mac build. I think you will have to re-trigger Mac build, after some time. |
@wkerzendorf |
tardis/io/decay.py
Outdated
return df | ||
|
||
def as_atomic_numbers(self, abundance, t, normalize=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thi should be called - merge_isotopes
and should only perform the groupby operation.
tardis/io/decay.py
Outdated
""" | ||
#Drop mass_number coloumn in isotopic_abundance dataframe | ||
isotope_abundance = self.decay(t).reset_index( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use groupby to sum up over all mass numbers.
tardis/io/decay.py
Outdated
isotope_abundance = self.decay(t).reset_index( | ||
level='mass_number').drop('mass_number', axis=1) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then return here. The rest should happen somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wkerzendorf the rest
means normalizing and returning as model abundance dataframe
format should be moved to other separate function (as_atomic_numbers
) and only groupby here(merge_isotopes
)
tardis/model/base.py
Outdated
@@ -73,6 +73,15 @@ def __init__(self, velocity, homologous_density, abundance, time_explosion, | |||
self.homologous_density = homologous_density | |||
self._abundance = abundance | |||
self.time_explosion = time_explosion | |||
|
|||
self.raw_abundance = self._abundance | |||
if isotope_abundance is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is None make an empty one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wkerzendorf I have made one , see line 82-84
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use self.raw_isotope_abundance = isotope_abundance or pd.DataFrame()
tardis/model/base.py
Outdated
|
||
|
||
if hasattr(config.model, 'isotope_abundance'): | ||
isotope_abundance = config.model.isotope_abundance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no that's wrong - we have not decided on a config option and that should also come later!
tardis/model/base.py
Outdated
self.raw_abundance = self._abundance | ||
if isotope_abundance is not None: | ||
self.raw_isotope_abundance = isotope_abundance | ||
self._abundance = self.raw_isotope_abundance.as_atomic_numbers( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add this line to the abundance
property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yeganer If I put this line in under abundance
property, then this line would be executed every time model.abundance
is called , and I think it should only be called 1 time during initialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As your profiling showed, executing this for 100 elements takes less than 0.02 seconds. In my opinion it's okay to call it every time. especially as it should always reflect the current state of the model.
Currently tardis objects are not designed to be changed, nevertheless if one were to change time_explosion
on the model, the isotopic abundances have to be recalculated. The easiest way to achieve this is by doing this (very fast) computation every time. I think you can count the number of accesses to model.abundance
during a typical tardis run with one hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think after some small changes this PR is done.
Please add one commit dedicated to PEP8 issues currently in this file. There are tools for commandline and some editors also have plugins for this job.
I know you didn't write a lot of the code that contains style errors but I always try to fix all PEP8 issues whenever I edit a file :)
tardis/io/decay.py
Outdated
return self.groupby('atomic_number').sum() | ||
|
||
def as_atomic_numbers(self, abundance, normalize=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd call this method something like merge_with
or only merge
as that reflects what this function does, merge Isotopes with a normal DataFrame.
Maybe we can call the other argument other
so we can easily distinguish between self
and other
.
tardis/io/decay.py
Outdated
return df | ||
|
||
def merge_isotopes(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not happy with the name of this function, maybe we should call it as_atoms
? That's short and reflects that the output is only atoms instead of isotopes.
@wkerzendorf I think , in last meeting, I could not convey my message properly.
I think, you confused with the names , as almost the names are reversed , and thought |
@vg3095 this is outdated right? |
@wkerzendorf Yes, this is outdated. I will do a rebase |
@vg3095 are you sure that is not in the codebase - as we merged the uniform case? |
@wkerzendorf I have rebased it now. You can review it. |
@wkerzendorf, @vg3095 - so, do we still need to merge this or not? I am confused since I thought that all the decay stuff has already been merged. |
@wkerzendorf @unoebauer Yes, it needs to be merged. This is the only PR related to decay stuff. Rest all were related to config options for isotope. |
I have added a
decay
method underModel
class , which will taketime_explosion
as parameter , and then decayisotopic abundance frame
, and returnsmerged dataframe
.See this notebook for detailed output
Jupyter Notebook