
manual creation of dataset #20

Open
peendebak opened this issue Aug 19, 2016 · 8 comments
@peendebak

The following code creates a dataset, but only the z array is shown when the dataset is printed.

There are two issues here:

import numpy
import qcodes
from qcodes.tests.data_mocks import *  # provides DataArray and new_data

def DataSet2D(location=None):
    # DataSet with one 2D array with 4 x 6 points
    yy, xx = numpy.meshgrid(range(4), range(6))
    zz = xx**2 + yy**2
    # outer setpoint should be 1D
    xx = xx[:, 0]
    x = DataArray(name='x', array_id='x', label='X', preset_data=xx, is_setpoint=True)
    y = DataArray(name='y', array_id='y', label='Y', preset_data=yy, set_arrays=(x,),
                  is_setpoint=True)
    z = DataArray(name='z', array_id='z', label='Z', preset_data=zz, set_arrays=(x, y))
    # return new_data(arrays={'x': x, 'y': y, 'z': z}, location=location)  # this would fail
    return new_data(arrays=[x, y, z], location=location)

d = DataSet2D()
print(d)

@alexcjohnson @giulioungaretti

@giulioungaretti
Member

@peendebak @eendebakpt yeah, it must not be a dict, because iterating over a dict yields keys, and we clearly want to add the data_array objects.
But man, the code/docs are misleading.
There is a data_set.arrays attribute, which is indeed an {array_id: data_array} mapping.

But actually the data is all there.
Maybe a bug in the repr function of the dataset ?
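A minimal illustration of the dict pitfall described above (placeholder strings stand in for the DataArray objects):

```python
# Iterating over a dict yields its keys, so passing arrays as a dict
# hands new_data the array_id strings instead of the DataArray objects.
arrays = {'x': 'x_data_array', 'y': 'y_data_array', 'z': 'z_data_array'}

iterated = [item for item in arrays]  # what a loop over the dict sees
assert iterated == ['x', 'y', 'z']    # keys only
assert list(arrays.values()) == ['x_data_array', 'y_data_array', 'z_data_array']
```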

@giulioungaretti
Member

[screenshot: 2016-08-19-153217_845x598_scrot]

@giulioungaretti
Member

the bug is in def _clean_array_ids(self, arrays), and precisely in the return statement.

If any arrays have the same action_index, then they will not all exist in the action_id_map (whose meaning goes beyond my understanding). Maybe @alexcjohnson can shed some light on it?
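A hypothetical sketch of the failure mode (the real _clean_array_ids in qcodes differs; plain dicts stand in for DataArray objects here): if action_id_map is keyed by each array's action_indices, arrays sharing the same indices overwrite each other, so only the last one survives.

```python
def build_action_id_map(arrays):
    # Keyed by action_indices: duplicate keys silently overwrite earlier
    # entries, so only the last array with a given index survives.
    return {a['action_indices']: a['array_id'] for a in arrays}

# Manually created arrays all end up with the same (empty) action_indices.
arrays = [
    {'array_id': 'x', 'action_indices': ()},
    {'array_id': 'y', 'action_indices': ()},
    {'array_id': 'z', 'action_indices': ()},
]
action_id_map = build_action_id_map(arrays)
assert len(action_id_map) == 1
assert action_id_map[()] == 'z'  # only z is left, matching the repr bug
```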

@giulioungaretti
Member

@eendebakpt @peendebak I guess that any data created by the loop will always have different action ids.

@peendebak
Author

@giulioungaretti @alexcjohnson The following does work. The issue is indeed with the action_id_map. Do we want to solve this, or wait until the entire DataSet object gets fixed?

def DataSet2D(location=None):
    # DataSet with one 2D array with 4 x 6 points
    yy, xx = numpy.meshgrid(range(4), range(6))
    zz = xx**2 + yy**2
    # outer setpoint should be 1D
    xx = xx[:, 0]
    x = DataArray(name='x', array_id='x', label='X', preset_data=xx, is_setpoint=True)
    y = DataArray(name='y', array_id='y', label='Y', preset_data=yy, set_arrays=(x,),
                  is_setpoint=True)
    z = DataArray(name='z', array_id='z', label='Z', preset_data=zz, set_arrays=(x, y))

    print('new data...')
    dd = new_data(arrays=[], location=location)
    dd.add_array(x)
    dd.add_array(y)
    dd.add_array(z)
    return dd

d = DataSet2D()
print(d)

@alexcjohnson

@peendebak thanks for bringing this up.

The bandaid solution to the __repr__ problem would be to check whether action_id_map actually points to all the arrays before trying to use it. Or, I guess, to give each array unique action_indices (in the order the arrays were provided!) inside _clean_array_ids; that would solve it for this particular case.
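A sketch of that bandaid, under the same simplifying assumptions as above (plain dicts stand in for DataArray objects; the real _clean_array_ids differs):

```python
def assign_unique_action_indices(arrays):
    # Hypothetical bandaid: give each array a unique action_indices tuple
    # in the order the arrays were provided, so the resulting
    # action_id_map covers every array instead of collapsing duplicates.
    for i, a in enumerate(arrays):
        a['action_indices'] = (i,)
    return {a['action_indices']: a['array_id'] for a in arrays}

arrays = [{'array_id': n, 'action_indices': ()} for n in ('x', 'y', 'z')]
action_id_map = assign_unique_action_indices(arrays)
assert len(action_id_map) == 3
assert [action_id_map[(i,)] for i in range(3)] == ['x', 'y', 'z']
```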

But really, per my TODO we should get action_id_map out of DataSet entirely, its function is really internal to a Loop so DataSet shouldn't know anything about it.

One difficulty with this, and the reason I think @MerlinSmiles used action_id_map in __repr__ in the first place, is that we'd like to maintain the order of arrays within a DataSet so that it tells you the order of acquisition. Currently they're unordered because DataSet.arrays is a dict, so only the action_ids give an order. We can change it to an OrderedDict or something to solve this.

That change would need to be propagated to our storage formats. @AdriaanRol I think we talked about this at some point; I don't know if there's a natural way to do this within HDF5? Right now I believe action_id_map is lost when you save and reload a DataSet, so the order of the __repr__ entries will be undefined after that, even if you made the DataSet with a regular Loop in the first place.
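A sketch of the proposed change: collections.OrderedDict remembers insertion order, so DataSet.arrays could report acquisition order without consulting action_ids. (Plain dicts only started guaranteeing insertion order in Python 3.7, well after this thread.)

```python
from collections import OrderedDict

# Arrays stored in acquisition order; iteration reproduces that order.
arrays = OrderedDict()
arrays['x'] = 'x_array'
arrays['y'] = 'y_array'
arrays['z'] = 'z_array'
assert list(arrays) == ['x', 'y', 'z']
```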

@AdriaanRol

@alexcjohnson
I approve of using OrderedDict for the arrays 👍 .

From an HDF5/h5py technical perspective, the h5py Group works like a dictionary. The way I would encode this is by adding a list of array_ids that records the order of the arrays. That way it is easy to both store and extract them in the proper order (quite natural in any case).
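The proposed encoding can be simulated with a plain dict standing in for an h5py Group (which also behaves like a dictionary); the '_array_id_order' key name is an assumption for illustration, not an actual qcodes or h5py name:

```python
def save_group(ordered_arrays):
    # Store each array under its array_id, plus a separate list that
    # records the acquisition order (the Group itself is unordered).
    group = {array_id: data for array_id, data in ordered_arrays}
    group['_array_id_order'] = [array_id for array_id, _ in ordered_arrays]
    return group

def load_in_order(group):
    # Reading back via the order list recovers the original sequence.
    return [(array_id, group[array_id]) for array_id in group['_array_id_order']]

saved = save_group([('x', [0, 1]), ('y', [2, 3]), ('z', [4, 5])])
assert [aid for aid, _ in load_in_order(saved)] == ['x', 'y', 'z']
```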

Additionally, I would like to have a good example dataset and a good test that checks whether two datasets are identical, so I can verify that I write and read correctly. (Most importantly, this test will tell me what actually defines the dataset.)

In microsoft/Qcodes#179 I am currently passing all tests for writing and saving simple data, but I have not included things like the action id (which may explain why it does not yet work with the loop). The tests I use are based on test_format, which tests the gnuplot formatter.

tl;dr

  • OrderedDict 👍 , action_ids 👎
  • OrderedDict in hdf5 -> easy to implement
  • required -> way to test if working correctly

@giulioungaretti giulioungaretti added the bug Something isn't working label Oct 11, 2016
@giulioungaretti giulioungaretti self-assigned this Oct 11, 2016
@giulioungaretti
Member

microsoft/Qcodes#162 won't happen if the madness in action_id_map gets fixed, which in turn will probably fix this.

@jenshnielsen jenshnielsen transferred this issue from microsoft/Qcodes Feb 9, 2023