get json document stream(s) for testing #136

Closed

prjemian opened this issue May 8, 2019 · 11 comments

prjemian commented May 8, 2019

Useful with #128, for example.

prjemian added this to the milestone-2019-09 milestone May 8, 2019
prjemian self-assigned this May 8, 2019

prjemian commented May 8, 2019

Code to collect data from databroker, assuming a MongoDB configuration:

import datetime
from pyRestTable import Table
from databroker import Broker

db = Broker.named("mongodb_config")

all_docs = []
plans_seen = []

t = Table()
t.addLabel("uid")
t.addLabel("plan_name")
t.addLabel("datetime")
t.addLabel("len(text)")

for h in db(since="2019-03-02 01:00"):
    plan_name = h.start["plan_name"]
    if plan_name in plans_seen:
        # just a sampling of the latest run from each different plan
        continue

    plans_seen.append(plan_name)
    ts = h.start["time"]
    dt = datetime.datetime.fromtimestamp(ts)
    uid = h.start["uid"]

    # stringify the run's full (name, document) stream
    docs = str(list(db.get_documents(h)))
    all_docs.append(docs)

    row = [uid.split("-")[0], ]   # first part of the uid is enough to identify
    row.append(plan_name)
    row.append(dt)
    row.append(len(docs))
    t.addRow(row)

print(t)

with open("/tmp/test_docs.txt", "w") as fp:
    fp.write(",\n".join(all_docs))

prjemian commented May 8, 2019

Here is a data file (full document streams) and a list of recent USAXS plans (representative of current conditions only):

usaxs_docs.txt

========  ==========================  ==========================  =========
uid       plan_name                   datetime                    len(text)
========  ==========================  ==========================  =========
e5d2cbdc  snapshot                    2019-05-02 18:17:58.932330  147891
19965989  Flyscan                     2019-05-02 17:48:29.729382  131093
ddffefc1  measure_USAXS_Transmission  2019-05-02 17:48:20.934118  131541
2ffe4d87  tune_mr                     2019-05-02 17:45:33.937294  154386
555a6047  count                       2019-05-02 16:53:31.423197  131167
b0aa6435  tune_a2rp                   2019-05-02 15:38:56.536864  155243
3554003e  tune_ar                     2019-05-02 15:38:37.612823  157691
6cfeb213  tune_m2rp                   2019-05-02 15:38:30.190181  148217
99fe9e07  TuneAxis.tune               2019-04-23 16:09:54.520233  163658
fdf496ee  run_Excel_file              2019-04-23 14:52:04.605015  131167
========  ==========================  ==========================  =========

Note: The X-ray beam was off so the data have no scientific meaning.

prjemian commented May 8, 2019

Depositing the data file in this issue rather than dumping >1 MB into the repo. Might grab smaller data for the repo later, but this will work for now.

prjemian closed this as completed May 8, 2019

prjemian commented May 8, 2019

This data will be useful when refactoring the newfile() command.

prjemian commented May 8, 2019

That output was not interpretable as JSON.
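
The str() of a Python list is repr-style text (single quotes, tuples), which json.loads rejects. A minimal demonstration:

import json

doc = {"plan_name": "tune_mr", "time": 1556833533.9}
text = str([("start", doc)])   # the shape of str(list(db.get_documents(h)))

try:
    json.loads(text)           # repr-style quoting is not valid JSON
except json.JSONDecodeError as exc:
    print("not JSON:", exc)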

prjemian reopened this May 8, 2019

prjemian commented May 8, 2019

Might look at this related issue: export from one (mongodb) databroker and import into another (sqlite).

prjemian commented May 9, 2019

Databroker API docs: http://nsls-ii.github.io/databroker/api.html

It seems it is not so easy to copy from one db to another for testing. Consider suitcase? Perhaps with a JSON backend?

suitcase-jsonl is very new (v0.1.1): https://github.com/NSLS-II/suitcase-jsonl
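
A sketch of what that might look like, assuming suitcase-jsonl follows the common suitcase export(gen, directory, file_prefix=...) signature (worth verifying against the package docs):

import suitcase.jsonl
from databroker import Broker

db = Broker.named("mongodb_config")
h = db[-1]   # most recent run

# write the run's (name, document) pairs as newline-delimited JSON
artifacts = suitcase.jsonl.export(db.get_documents(h), "/tmp", file_prefix="test-")
print(artifacts)   # maps artifact labels to the file path(s) written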

prjemian commented May 9, 2019

NumpyEncoder might be the trick. See:

https://github.com/NSLS-II/suitcase-jsonl/blob/master/suitcase/jsonl/tests/tests.py#L12-L13

from event_model import NumpyEncoder
...
expected = [json.loads(json.dumps(doc, cls=NumpyEncoder))
            for doc in documents]
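
A quick self-contained check that NumpyEncoder handles the numpy objects the stock encoder rejects:

import json
import numpy as np
from event_model import NumpyEncoder

doc = {"data": {"intensity": np.arange(3)}}

try:
    json.dumps(doc)                     # ndarray is not JSON-serializable
except TypeError as exc:
    print("plain json.dumps fails:", exc)

print(json.dumps(doc, cls=NumpyEncoder))   # arrays become lists: [0, 1, 2]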

prjemian commented May 9, 2019

Slight edits from above:

import datetime
import json
from pyRestTable import Table
from databroker import Broker
from event_model import NumpyEncoder

SINCE = "2019-03-02 01:00"


def routine():
    db = Broker.named("mongodb_config")

    known_plans = {}

    t = Table()
    t.addLabel("uid")
    t.addLabel("plan_name")
    t.addLabel("datetime")
    t.addLabel("len(docs)")  # now the number of documents, not text length

    for h in db(since=SINCE):
        plan_name = h.start["plan_name"]
        if plan_name in known_plans:
            # just a sampling of the latest run from each different plan
            continue

        print(plan_name)
        ts = h.start["time"]
        dt = datetime.datetime.fromtimestamp(ts)
        uid = h.start["uid"]

        # keep the full (name, document) stream, keyed by plan name
        docs = list(db.get_documents(h))
        known_plans[plan_name] = docs

        row = [uid.split("-")[0], ]
        row.append(plan_name)
        row.append(dt)
        row.append(len(docs))
        t.addRow(row)

    print(t)

    # NumpyEncoder converts numpy objects so the whole dict serializes as JSON
    with open("/tmp/test_docs.txt", "w") as fp:
        json.dump(known_plans, fp, cls=NumpyEncoder)


if __name__ == "__main__":
    routine()
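
To confirm the round trip, the file reads back with plain json; a minimal sketch using the file written above (each entry is a list of [name, document] pairs, keyed by plan name):

import json

with open("/tmp/test_docs.txt") as fp:
    known_plans = json.load(fp)

for plan_name, docs in known_plans.items():
    names = [name for name, _doc in docs]   # JSON stores the pairs as lists
    print(plan_name, len(docs), "documents, starting:", names[:3])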

prjemian commented May 9, 2019

Now able to read that JSON but still cannot insert it into the sqlite Broker backend. Something else is bolloxed; that's not for this issue, though. Here's the error:

  File "/home/mintadmin/Apps/anaconda/envs/bluesky/lib/python3.6/site-packages/databroker/headersource/sqlite.py", line 257, in _insert_events
    values[desc_uid])
sqlite3.InterfaceError: Error binding parameter 107 - probably unsupported type.

The parameter values[desc_uid] is a list. The error happens in a thread that times out in 5 s, so it is not easy to capture more details.
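
For context, sqlite3 binds only scalar parameter types (None, int, float, str, bytes); passing a list reproduces the same InterfaceError:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (seq_num)")

try:
    # a list is not a supported sqlite3 parameter type
    conn.execute("INSERT INTO events VALUES (?)", ([1, 2, 3],))
except sqlite3.InterfaceError as exc:
    print(exc)   # Error binding parameter 0 - probably unsupported type.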

prjemian commented May 9, 2019

Revised JSON files: the full set and just one tune_mr scan, respectively.

usaxs_docs.json.txt
tune_mr.json.txt
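
A sketch of how a test might consume the single-scan file, assuming tune_mr.json.txt holds the [name, document] pairs for that one scan:

import json
from collections import Counter

with open("tune_mr.json.txt") as fp:
    documents = json.load(fp)

def replay(callback, documents):
    """Feed the captured stream to any bluesky-style callback(name, doc)."""
    for name, doc in documents:
        callback(name, doc)

# example: count the document types in the captured stream
counts = Counter()
replay(lambda name, doc: counts.update([name]), documents)
print(counts)   # e.g. start, descriptor, event, ..., stop counts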
