Deprecate json_import and json_export and switch testing to databroker-pack #479

Closed
prjemian opened this issue Jan 13, 2021 · 9 comments · Fixed by #503
Labels: task (Something to be done.)
Milestone: 1.4.2

Comments

prjemian commented Jan 13, 2021

Refactor the unit tests to use databroker-pack and databroker-unpack.

The PendingDeprecationWarning about Broker.insert applies to utils.json_export() and utils.json_import(). AFAIK, these are not used by any beam line at APS; they are only used here for unit testing. There is a newer way to do this, using the databroker-pack command-line tool.

Originally posted by @prjemian in #475 (comment)
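
For reference, a packed catalog can be read back without Broker.insert() at all. A minimal sketch (not the actual test code), assuming a catalog named apstools_test has already been unpacked with databroker-unpack:

import databroker

# "apstools_test" is a hypothetical catalog name; reading runs this way
# never touches the deprecated Broker.insert() API.
cat = databroker.catalog["apstools_test"]
for uid in list(cat)[:3]:
    run = cat[uid]
    print(uid[:7], run.metadata["start"]["plan_name"])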

@prjemian prjemian added the task Something to be done. label Jan 13, 2021
@prjemian prjemian added this to the 1.4.1 milestone Jan 13, 2021
@prjemian prjemian modified the milestones: 1.4.1, 1.4.2 Jan 19, 2021
@prjemian

Add the unpack step to the GitHub Actions workflow.

@gfabbris

We've done some of this in polartools.manage_database and added the unpack step to Actions. The one aspect I don't fully understand is that the pytest run for manage_database only passes if the databroker-unpack step stays in the GitHub Actions workflow (APS-4ID-POLAR/polartools#34).

@prjemian

That's a good example of a problem this project may also face. Likely it relates to how GitHub Actions provides the working directory. Perhaps the YAML file is placed in a directory that GitHub discards after pytest finishes?

@gfabbris

The polartools.manage_database.from_databroker function should do pretty much the same as the databroker-unpack command-line tool, but if I don't run the command-line tool first, from_databroker seems to fail. I suspect this is an issue with where the YAML file is placed, but I need to dig a bit more into databroker-unpack to understand it.
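
One way to check the placement hypothesis is to ask databroker where it searches for catalog YAML files; a small sketch:

import databroker

# If databroker-unpack wrote its YAML file outside these directories (or a
# CI runner discarded one of them), the catalog will not be discoverable.
print(databroker.catalog_search_path())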

@prjemian

In tests/test_export_json.py, the get_db() function creates a msgpack-backed catalog in /tmp that can be used for this process. Slight modifications:

import databroker

from apstools.utils import json_import

def get_db(json_file, zip_file):
    """Create a temporary catalog and fill it from the JSON export."""
    db = databroker.temp()
    datasets = json_import(json_file, zip_file)
    insert_docs(db, datasets)
    return db

def insert_docs(db, datasets, verbose=False):
    """Insert each run's documents using the v1 Broker.insert() API."""
    db = db.v1
    for i, h in enumerate(datasets):
        if verbose:
            print(f"{i+1}/{len(datasets)} : {len(h)} documents")
        for k, doc in h:
            db.insert(k, doc)

Then, in that same directory:

dcat = get_db("data.json", "bluesky_data.zip")

from apstools.utils import listruns
listruns(db=dcat)
catalog name: temp
========= ========================== ======= ======= ========================================
short_uid date/time                  exit    scan_id command                                 
========= ========================== ======= ======= ========================================
3e89a55   2019-05-24 10:47:11.731741 success 131     count(detectors=['adsimdet'], num=1)    
2edf5d0   2019-04-12 12:58:41.239802 success 2       scan(detectors=['noisy_det'], num=8, ...
ffb80ba   2019-05-06 15:03:00.163182 success 102     count(detectors=['noisy'], num=100)     
0e8188e   2019-05-06 15:02:56.365410 success 101     count(detectors=['noisy'], num=100)     
a729093   2019-05-06 16:39:06.248241 success 127     count(detectors=['scaler'], num=1)      
67b7ef3   2019-05-06 15:22:21.472708 success 107     count(detectors=['scaler', 'noisy'], ...
22db858   2019-05-06 15:02:51.853075 success 100     count(detectors=['noisy'], num=100)     
f0a39ab   2019-04-11 16:05:49.970579 success 1       scan(detectors=['noisy_det'], num=8, ...
0a87c46   2019-05-06 15:25:21.014055 success 109     count(detectors=['scaler'], num=5)      
7ef69ea   2019-04-11 15:51:39.778569 success 2       scan(detectors=['noisy_det'], num=8, ...
4d41c06   2019-04-11 15:59:06.373829 success 1       scan(detectors=['noisy_det'], num=8, ...
837ffac   2019-05-06 15:02:46.729177 success 99      count(detectors=['noisy'], num=100)     
50cd05b   2019-04-11 16:05:50.739003 success 3       scan(detectors=['noisy_det'], num=8, ...
64d4ed4   2019-05-06 15:59:29.680473 success 113     count(detectors=['scaler_channels_ch ...
75f68f4   2019-05-06 15:01:16.722168 success 94      count(detectors=['noisy'], num=10)      
389cf14   2019-04-12 12:58:40.786560 success 1       scan(detectors=['noisy_det'], num=8, ...
bb7e048   2019-04-12 12:59:16.131039 success 2       scan(detectors=['noisy_det'], num=8, ...
616de31   2019-04-12 10:25:34.788890 success 2       scan(detectors=['noisy_det'], num=8, ...
9af10cf   2019-05-06 15:01:25.681316 success 95      count(detectors=['noisy'], num=100)     
2551749   2019-04-12 10:25:34.315701 success 1       scan(detectors=['noisy_det'], num=8, ...
========= ========================== ======= ======= ========================================
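
For the unit tests, this could be wrapped in a pytest fixture. A sketch (fixture and test names are hypothetical), assuming get_db() from the snippet above is available to the test module:

import pytest

from apstools.utils import listruns

@pytest.fixture()
def json_catalog():
    # fresh temporary catalog per test, filled from the JSON export
    return get_db("data.json", "bluesky_data.zip")

def test_listruns(json_catalog):
    listruns(db=json_catalog)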

@prjemian

Similar for the USAXS test data:

import json
import zipfile

import databroker

def get_test_data(json_file, zip_file):
    """Get document streams as a dict from a zip file."""
    with zipfile.ZipFile(zip_file, "r") as fp:
        buf = fp.read(json_file).decode("utf-8")
        return json.loads(buf)

objs = get_test_data("usaxs_docs.json.txt", "usaxs_docs.json.zip")
ucat = databroker.temp()
insert_docs(ucat, objs.values(), verbose=True)  # insert_docs() defined above

print(f"{ucat.v2.name = }")
print(f"{len(ucat.v2) = }")
listruns(db=ucat)

with this output:

1/10 : 4 documents
2/10 : 7 documents
3/10 : 7 documents
4/10 : 37 documents
5/10 : 7 documents
6/10 : 37 documents
7/10 : 41 documents
8/10 : 27 documents
9/10 : 37 documents
10/10 : 7 documents
ucat.v2.name = 'temp'
len(ucat.v2) = 10
catalog name: temp
========= ========================== ======= ======= ========================================
short_uid date/time                  exit    scan_id command                                 
========= ========================== ======= ======= ========================================
2ffe4d8   2019-05-02 17:45:33.937294 success 108     tune_mr()                               
3554003   2019-05-02 15:38:37.612823 success 103     tune_ar()                               
fdf496e   2019-04-23 14:52:04.605015 success 27      run_Excel_file()                        
1996598   2019-05-02 17:48:29.729382 success 110     Flyscan(pos_X=60, pos_Y=160, thickne ...
e5d2cbd   2019-05-02 18:17:58.932330 success 1       snapshot()                              
6cfeb21   2019-05-02 15:38:30.190181 success 102     tune_m2rp()                             
ddffefc   2019-05-02 17:48:20.934118 success 109     measure_USAXS_Transmission(detectors ...
99fe9e0   2019-04-23 16:09:54.520233 success 2       TuneAxis.tune()                         
b0aa643   2019-05-02 15:38:56.536864 success 104     tune_a2rp()                             
555a604   2019-05-02 16:53:31.423197 success 2       count(detectors=['scaler0'], num=1)     
========= ========================== ======= ======= ========================================

@prjemian

Then created the catalog configuration YAML files with steps such as:

cd /tmp/
mv ./tmpf8cvsiqs ./apstools_test_data
cd ./apstools_test_data
mv data documents
cat <<EOF >catalog.yml
sources:
  packed_catalog:
    args:
      paths:
      - /tmp/apstools_test_data/documents/*.msgpack
      root_map: {}
    driver: bluesky-msgpack-catalog
    metadata:
      generated_by:
        library: databroker_pack
        version: 0.3.0
      relative_paths:
      - ./documents/*.msgpack
EOF
cd ..
databroker-unpack inplace /tmp/apstools_test_data apstools_test_data
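
After unpacking, a quick check (a sketch, not part of the recorded steps) confirms databroker can discover the new catalog by name:

import databroker

# Assumes the unpack step above succeeded; force_reload() rescans the
# catalog search path so the new YAML file is picked up.
databroker.catalog.force_reload()
assert "apstools_test_data" in list(databroker.catalog)
print(len(list(databroker.catalog["apstools_test_data"])), "runs")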

@prjemian

These catalogs are now available:

(bluesky_2021_1) prjemian@zap:/tmp$ databroker-pack --list-catalogs
class_2021_03
apstools_test_data
class_data_examples
usaxs_test_data

prjemian commented Mar 22, 2021

Pack each using:

databroker-pack apstools_test_data --all /tmp/apstools_test --copy-external
databroker-pack usaxs_test_data --all /tmp/usaxs_test --copy-external

then make a .zip file of each directory.
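
The zipping can be scripted with the standard library; a sketch using the paths from the commands above:

import shutil

# creates /tmp/apstools_test.zip and /tmp/usaxs_test.zip
for packed in ("/tmp/apstools_test", "/tmp/usaxs_test"):
    shutil.make_archive(packed, "zip", packed)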

prjemian added commits that referenced this issue on Mar 22, 2021