Reimplement Dataset and InferenceData using DimensionalData #191
Here's an example of how this looks:
julia> using ArviZ
julia> idata = load_arviz_data(:radon)
InferenceData with groups:
> posterior
> posterior_predictive
> log_likelihood
> sample_stats
> prior
> prior_predictive
> observed_data
> constant_data
julia> idata.posterior
Dataset with dimensions:
Dim{:chain} Sampled Int64[0, 1, 2, 3] ForwardOrdered Irregular Points,
Dim{:draw} Sampled Int64[0, 1, …, 498, 499] ForwardOrdered Irregular Points,
Dim{:g_coef} Sampled PyCall.PyObject[PyObject 'intercept', PyObject 'slope'] ForwardOrdered Irregular Points,
Dim{:County} Sampled PyCall.PyObject[PyObject 'AITKIN', PyObject 'ANOKA', …, PyObject 'WRIGHT', PyObject 'YELLOW MEDICINE'] ForwardOrdered Irregular Points
and 7 layers:
:g Float64 dims: Dim{:chain}, Dim{:draw}, Dim{:g_coef} (4×500×2)
:za_county Float64 dims: Dim{:chain}, Dim{:draw}, Dim{:County} (4×500×85)
:b Float64 dims: Dim{:chain}, Dim{:draw} (4×500)
:sigma_a Float64 dims: Dim{:chain}, Dim{:draw} (4×500)
:a Float64 dims: Dim{:chain}, Dim{:draw}, Dim{:County} (4×500×85)
:a_county Float64 dims: Dim{:chain}, Dim{:draw}, Dim{:County} (4×500×85)
:sigma Float64 dims: Dim{:chain}, Dim{:draw} (4×500)
with metadata Dict{Symbol, Any} with 6 entries:
:inference_library_version => "3.9.2"
:sampling_time => 18.097
:tuning_steps => 1000
:created_at => "2020-07-24T18:15:12.191355"
:arviz_version => "0.9.0"
:inference_library => "pymc3"
There are some numpy dtypes that just never seem to be converted on the Julia side, and this seems unlikely to change soon. This can be avoided if we have a converter from netcdf directly to our
Are we still going to have a concat function that creates a new idata object?
Yes, |
This PR reimplements `Dataset` to be a `DimensionalData.AbstractDimStack` that behaves identically to `DimensionalData.DimStack`, and `InferenceData` as a keyed collection of `Dataset`s. DimensionalData is the closest thing in the Julia ecosystem to an xarray replacement, in that it has a structure like `xarray.Dataset`. Its API is quite different from xarray's, so this is a major breaking change.

Additional major differences:
- `concat!` has been removed from the API and package, as `InferenceData` is now immutable.
- `dims` and `coords` can in general be any collection indexed by symbols, typically `Dict{Symbol}` or `NamedTuple`. Complete type-inferrability is only possible with `NamedTuple`. For Python functions, however, `Dict`s are better behaved, since PyCall doesn't map `NamedTuple`s to Python's `dict` type.
- `Symbol`s should be used in most places instead of strings. Exceptions are when the string corresponds to a coordinate/index or a plotting label.
- `InferenceData` and `Dataset` can be transparently converted to `arviz.InferenceData` and `xarray.Dataset` with no copying of data, so there is no appreciable loss in efficiency by having the storage be in Julia.

Fixes #128 and #141
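
To illustrate the point above about `NamedTuple` versus `Dict{Symbol}` for `dims` and `coords`, here is a minimal sketch using only Base Julia. The names `coords_nt`, `coords_dict`, and `get_chain` are hypothetical and are not part of the ArviZ API; the sketch only compares what the compiler can infer for a symbol lookup on each kind of collection.

```julia
# Hypothetical coordinate collections holding the same data two ways.
coords_nt = (chain = 0:3, g_coef = [:intercept, :slope])
coords_dict = Dict{Symbol,Any}(:chain => 0:3, :g_coef => [:intercept, :slope])

# The same symbol-indexed lookup works for either collection.
get_chain(coords) = coords[:chain]

# Ask the compiler what return type it infers for each argument type.
nt_type = Base.return_types(get_chain, (typeof(coords_nt),))[1]
dict_type = Base.return_types(get_chain, (typeof(coords_dict),))[1]

println("NamedTuple lookup infers: ", nt_type)
println("Dict lookup infers: ", dict_type)
```

A `NamedTuple` records each field's type in its own type, so the lookup can be resolved to a concrete type at compile time, whereas a `Dict{Symbol,Any}` only promises `Any` for its values. This is the trade-off described above: `NamedTuple` for inferrability in Julia code, `Dict` when the collection has to cross into Python through PyCall.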