This is a small development plan for allowing any Dataset::Derived to be a parent, as long as one ancestor remains Dataset::Full. This is the first step of the new dataset pipelines project.
There are three identified steps to be taken in Atlas:
Every dataset can be a parent, not just full datasets
Validation of complete parent/grandparent
Chain fallback locations to grandparents etc to pick up the correct curves
I will describe what needs to happen for each of them.
Prologue
Before we start, let's clean up some deprecated code and methods with confusing names from the Dataset and Dataset::Derived models.
Initializer Inputs
For as long as I can remember these have been deprecated, but the code is still present in the Dataset::Derived model. These inputs were used to initialise Derived sets that did not originate from ETLocal and thus had no graph values.
Remove the InitializerInput model
Remove any references to and validations of InitializerInput from the Dataset::Derived model. This includes everything surrounding uses_deprecated_initializer_inputs
Remove any affected specs
PARENT_VALUE
In Runtime, Atlas defines methods that can be called for the dataset within nodes and edges to build the present graph (so for Refinery). This includes methods like EB and PRIMARY_PRODUCTION. The method PARENT_VALUE has been unused for a long time and references an obsolete csv file, demands/parent_values.csv, within the dataset. The name of this method could become confusing while we work on this project, as well as in the future, so I'd like to get rid of it.
Remove PARENT_VALUE from Atlas::Runtime
Remove parent_values method from Dataset
Check for any of these obsolete csvs still hanging around in ETSource
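The last check could be scripted. The snippet below is a minimal sketch, assuming every dataset lives in its own folder below one datasets root; the helper name `obsolete_parent_value_files` and that layout are assumptions for illustration, not part of Atlas or ETSource.

```ruby
# Hypothetical helper: lists leftover demands/parent_values.csv files
# below a datasets root, as paths relative to that root.
# Assumes one folder per dataset; adjust the pattern if the layout differs.
def obsolete_parent_value_files(root)
  Dir.glob('**/demands/parent_values.csv', base: root).sort
end
```

Calling it with the path to the datasets folder returns an empty array once all obsolete files are gone.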
1. Every dataset can be a parent, not just full datasets
Our first step does not include validation of a Full ancestor yet. We are merely setting up the hierarchy.
In the Dataset::Derived model, the parent should be able to be a Derived dataset:
def parent
  Dataset.find(base_dataset) # Was Dataset::Full.find(base_dataset)
end
The same goes for validate_presence_of_base_dataset.
Adjust parent method
Adjust validate_presence_of_base_dataset method
Adjust affected specs
2. Validation of complete parent/grandparent
Because a Full dataset is the only one that can use EB methods, and those are still in use, we need to ensure that one ancestor is still a Dataset::Full and the energy_balance method of the Derived dataset is delegated to that set. Otherwise Refinery will not be able to build the graph.
Add a spec that creates a hierarchy of three datasets Full -> Derived -> Derived and test the energy_balance method on the grandchild. (It could very well be that this will pass directly)
Add a spec that creates a hierarchy of three datasets Derived -> Derived -> Derived and test the creation of the grandchild. Test if the creation throws a validation error. (This we will build in the next todo)
Add validation method validate_presence_of_full_ancestor. It will be something like this:
def has_full_parent?
  Dataset::Full.exists?(base_dataset) || parent.has_full_parent?
end

def validate_presence_of_full_ancestor
  return if has_full_parent?

  errors.add(:base_dataset, 'has no Full parent')
end
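To see how this recursion behaves across a whole chain, here is a self-contained sketch in plain Ruby. The classes `FullSet` and `DerivedSet` and the method `full_ancestor` are hypothetical stand-ins, not the Atlas models. The sketch also assumes that a Derived set at the root of a chain has no parent (`nil`), so it uses safe navigation to end the search there.

```ruby
# Stand-in for Dataset::Full (hypothetical, not the Atlas class).
class FullSet
  attr_reader :key, :energy_balance

  def initialize(key, energy_balance)
    @key = key
    @energy_balance = energy_balance
  end

  # A Full dataset ends the ancestor search.
  def full_ancestor
    self
  end
end

# Stand-in for Dataset::Derived (hypothetical, not the Atlas class).
class DerivedSet
  attr_reader :key, :parent

  def initialize(key, parent)
    @key = key
    @parent = parent
  end

  # Walks up the chain; returns nil when no Full dataset is found.
  def full_ancestor
    parent&.full_ancestor
  end

  def has_full_parent?
    !full_ancestor.nil?
  end

  # Delegates to the Full ancestor, however far up the chain it sits.
  def energy_balance
    ancestor = full_ancestor
    raise ArgumentError, "#{key} has no Full parent" unless ancestor

    ancestor.energy_balance
  end
end
```

With a chain Full -> Derived -> Derived, the grandchild reports a Full parent and returns the Full set's energy balance. With Derived -> Derived -> Derived, `has_full_parent?` is false, which is the case the validation should reject.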
3. Chain fallback locations to grandparents etc to pick up the correct curves
A Dataset::Derived may be incomplete because it picks up incomplete items from its parent, like missing curves.
For curves and other csvs the PathResolver will look in all supplied locations (dataset folders). For each Derived dataset the PathResolver is initialised with the method resolve_paths, which returns an array of locations that can be inspected in order of importance. We should alter this method to recursively look at the parent sets. It will be something like this:
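A minimal sketch of such a recursive lookup, in plain Ruby and independent of Atlas. The `DatasetNode` struct, its `dataset_dir` attribute and the `locate` helper are hypothetical stand-ins used for illustration; they are not the actual Dataset::Derived or PathResolver API.

```ruby
# Hypothetical stand-in for a dataset: its own folder plus an optional parent.
DatasetNode = Struct.new(:dataset_dir, :parent) do
  # Own folder first, then the parent's folders, recursively up to the root.
  def resolve_paths
    [dataset_dir] + (parent ? parent.resolve_paths : [])
  end
end

# Returns the first existing file for `relative`, searching the folders
# in the given order, or nil when no folder contains it.
def locate(paths, relative)
  paths.map { |dir| File.join(dir, relative) }
       .find { |file| File.exist?(file) }
end
```

Because the child's own folder comes first, a curve present in the child wins over the same curve in a parent or grandparent, while a curve missing from the child and its parent is still found in the grandparent.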
Update the resolve_paths method in Dataset::Derived
Update and write new specs to ensure correct behaviour of located curves etc.
Epilogue: Atlas::Scaler
We have now made it possible for Derived datasets to have children. However, the Scaled datasets (these are Derived datasets with a scaler attached) look only at their direct parent for scaling. If this direct parent is not Full, this could lead to some fallout. I did not check this thoroughly, as I'd like to discuss our vision on scaled datasets first.
@mabijkerk Let's discuss in our brainstorm how we see the future of scaled datasets and check if and how they are still in use.
NB: if there is time to remove more stuff, I'd like to get rid of the old Presets still present in Atlas. These represent what we now have as featured scenarios, in csv format. They have not been used for years.