Currently our InferenceData just wraps arviz.InferenceData for dispatch. This is convenient because it is low maintenance and highly functional: we get every class method implemented on the Python side for free. However, the data itself is stuck in xarray Datasets. That is fine for Python users who may already know xarray, but for Julia users who do not use xarray, accessing or modifying the underlying data is not as friendly as we would like.
An alternative is to reimplement the InferenceData schema using only Julia packages. This comes with a maintenance cost but makes the package more usable for Julia users. By providing conversions to and from the Python implementation of InferenceData, and using PyCall's no-copy array passing, we can construct xarray views of our InferenceData and still access all existing xarray-based functionality.
To implement the schema, we need to choose (at least) one representation for arrays with named dimensions and named keys, and one for collections of these arrays into datasets with named variables and attributes. The old solution to named dimensions and named keys was AxisArrays.jl, which MCMCChains.jl is built on. This is probably not the best choice going forward, as a rich ecosystem of packages that provide either named dimensions or named keys has been developing. See JuliaCollections/AxisArraysFuture#1 for useful discussion.
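To make the requirements above concrete, here is a minimal sketch of the two layers using only Base Julia. The names LabeledArray, LabeledDataset, and lookup are hypothetical, chosen for illustration; they are not the API of AxisArrays.jl or any other package mentioned in this issue, and a real implementation would presumably delegate to one of those packages.

```julia
# Hypothetical types for illustration only; not an existing package API.

# An array whose dimensions are named and whose axes carry keys.
struct LabeledArray{T,N,D<:NamedTuple}
    data::Array{T,N}
    dims::D  # dimension name => keys along that dimension
    function LabeledArray(data::Array{T,N}, dims::D) where {T,N,D<:NamedTuple}
        length(dims) == N ||
            throw(DimensionMismatch("need one entry in dims per array dimension"))
        all(length(k) == s for (k, s) in zip(values(dims), size(data))) ||
            throw(DimensionMismatch("key lengths must match the array size"))
        return new{T,N,D}(data, dims)
    end
end

# Look up a single element by key along every dimension.
function lookup(A::LabeledArray; selectors...)
    idx = map(keys(A.dims)) do name
        i = findfirst(==(selectors[name]), A.dims[name])
        i === nothing && throw(KeyError(selectors[name]))
        i
    end
    return A.data[idx...]
end

# A collection of labeled arrays with named variables and attributes.
struct LabeledDataset{V<:NamedTuple}
    variables::V             # variable name => LabeledArray
    attrs::Dict{String,Any}  # free-form metadata
end

mu = LabeledArray(reshape(collect(1.0:8.0), 2, 4), (chain = 1:2, draw = 1:4))
theta = LabeledArray(zeros(2, 4, 3),
                     (chain = 1:2, draw = 1:4, school = ["A", "B", "C"]))
posterior = LabeledDataset((mu = mu, theta = theta),
                           Dict{String,Any}("created_by" => "sketch"))

# An InferenceData-like object is then just a set of named groups.
idata = (posterior = posterior,)

lookup(idata.posterior.variables.mu; chain = 2, draw = 3)  # 6.0
```

The sketch only covers construction and scalar lookup by key; slicing, broadcasting, and conversion to xarray are exactly the parts that an existing named-dimension package would be expected to supply.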
In particular, it was noted on Slack that AxisSets.jl provides a KeyedDataset type that contains collections of KeyedArrays from AxisKeys.jl, which wrap NamedDimsArrays from NamedDims.jl. Because AxisSets and NamedDims are going into production at Invenia Labs, they are battle tested and maintained by dedicated teams. It is also more likely, then, that our users will be familiar with one or more of these packages going forward. Therefore, I propose that our Julian InferenceData be a collection of Dataset objects that wrap AxisSets.jl's KeyedDataset objects.
DimensionalData has been downloaded 50x more than AxisSets in the last 6 months and has similar functionality to xarray, so I've begun work in #191 using DimensionalData as the backing for a pure Julia InferenceData.