PoC mantid reducer #122

Draft. Wants to merge 3 commits into base: main

Conversation

Tom-Willemsen
Contributor

Proof of concept for mantid reduction as part of a bluesky plan, using the FIA API.

Closes #107

Tom-Willemsen marked this pull request as draft on February 28, 2025 at 10:08
@Tom-Willemsen
Contributor Author

Tom-Willemsen commented Feb 28, 2025

Initial thoughts:

  • Currently we have to binary-search for the last-submitted job and hope that it is ours. Not great. I don't think we can viably run automated system tests with this unless we can reliably get job IDs.
  • Latency is about 10-15 seconds, plus however long the actual mantid reduction script takes. It is highly variable: sometimes just a few seconds, sometimes up to 30s. I guess this might just be a function of load on the cluster.
  • Currently passing various data around as JSON/strings. This isn't very efficient (and FIA outright rejects medium-large spectra-maps if we just do the dumb "embed them in the script" approach). Need to do something cleverer, likely with input files. Unclear if the FIA API currently has infrastructure to help us do this (beyond just saving out a full nexus file, waiting for it to show up on the archive, and getting mantid to read that, but that'll be very slow).
  • Various bits of the FIA API are only really documented in their source code right now, e.g. exactly what you have to print to get "results" to show up in the output.
  • Right now we're abusing output_files to return answers that aren't filenames. That's probably bad.
  • I think this probably is viable, but only for use cases with quite long runs. 15s latency is way too much for e.g. reflectometry alignment scans.
  • FIA will briefly give bad HTTP responses when it gets redeployed, so we'll need to put in some careful retry logic to get around this. Have asked my contact what kinds of retry delays/timeouts we might need.
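
For readers unfamiliar with the "binary search for the last-submitted job" workaround in the first point, here is a minimal sketch of what such a search might look like. It is not the code in this PR. It assumes job IDs are contiguous, increasing integers, and that there is some way to ask whether a given ID exists; `job_exists` is a hypothetical callable standing in for that query, not a real FIA API function.

```python
from typing import Callable, Optional


def find_latest_job_id(
    job_exists: Callable[[int], bool], start: int = 1
) -> Optional[int]:
    """Return the highest job ID for which job_exists is True, or None.

    Assumes every ID <= latest exists and every ID > latest does not.
    job_exists is a hypothetical stand-in for a query against the job API.
    """
    if not job_exists(start):
        return None

    # Gallop upwards with doubling steps until we overshoot the latest job.
    lo, step = start, 1
    while job_exists(lo + step):
        lo += step
        step *= 2
    hi = lo + step

    # Invariant: job_exists(lo) is True and job_exists(hi) is False.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if job_exists(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

Even when the search itself is correct, it only finds the newest job, not necessarily ours: anyone else submitting between our submission and the search wins the race, which is why this approach is unreliable for automated tests.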

@Tom-Willemsen
Contributor Author

Some answers from my mantid/FIA contact...

Latency is about 10-15 seconds, plus however long the actual mantid reduction script takes. It is highly variable: sometimes just a few seconds, sometimes up to 30s. I guess this might just be a function of load on the cluster.

This is expected and not easy to reduce.

Currently passing various data around as JSON/strings. This isn't very efficient (and FIA outright rejects medium-large spectra-maps if we just do the dumb "embed them in the script" approach). Need to do something cleverer, likely with input files. Unclear if the FIA API currently has infrastructure to help us do this (beyond just saving out a full nexus file, waiting for it to show up on the archive, and getting mantid to read that, but that'll be very slow).

The reduction process will need to map the instrument data area soon anyway, so we can just dump arrays there rather than embedding them.
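
A minimal sketch of the "dump arrays there rather than embedding them" idea, using only the standard library. Everything here is an assumption for illustration: the helper names are hypothetical, a temporary directory stands in for the mapped instrument data area, and JSON is just a convenient format (a binary array format would likely be a better fit for large spectra-maps).

```python
import json
import tempfile
from pathlib import Path
from typing import List


def dump_array(data_dir: Path, name: str, values: List[List[float]]) -> Path:
    """Write a 2-D array to a file in the shared directory and return its path."""
    path = data_dir / f"{name}.json"
    path.write_text(json.dumps(values))
    return path


def build_script(array_path: Path) -> str:
    """Build a reduction script that loads the file instead of embedding the data."""
    return (
        "import json\n"
        f"with open({str(array_path)!r}) as f:\n"
        "    spectra_map = json.load(f)\n"
    )


# Example: a temporary directory stands in for the shared data area.
with tempfile.TemporaryDirectory() as tmp:
    path = dump_array(Path(tmp), "spectra_map", [[1.0, 2.0], [3.0, 4.0]])
    script = build_script(path)
    namespace: dict = {}
    exec(script, namespace)  # stands in for the remote reduction run
```

The submitted script then stays a few lines long regardless of how large the array is, which sidesteps the size limit hit by the embedding approach.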

Various bits of the FIA API are only really documented in their source code right now, e.g. exactly what you have to print to get "results" to show up in the output.

Yes, the documentation is basically just the source code at present, and what we're doing is "correct" in the sense that it works.

Right now we're abusing output_files to return answers that aren't filenames. That's probably bad.

There was some suggestion of using outputs rather than output_files in future, but fundamentally what we're doing isn't "wrong".

I think this probably is viable, but only for use cases with quite long runs. 15s latency is way too much for e.g. reflectometry alignment scans.

Yes, this is true: mantid reductions won't be viable for short runs, and this is unlikely to change in the short term. Long term there might be other solutions.

FIA will briefly give bad HTTP responses when it gets redeployed, so we'll need to put in some careful retry logic to get around this. Have asked my contact what kinds of retry delays/timeouts we might need.

It should be up again within seconds. The suggestion is a retry loop that retries roughly 20 times at 5s intervals before erroring if the API is still dead.
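
A minimal sketch of the suggested retry loop, assuming "20 times at 5s intervals" means 20 attempts in total. The function name, the choice of exceptions to retry on, and the injectable `sleep` parameter are illustrative assumptions, not part of this PR or of the FIA API; `request` is any zero-argument callable that performs the HTTP call.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    attempts: int = 20,
    delay_s: float = 5.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it succeeds or the attempts are used up.

    Only exceptions listed in retry_on trigger a retry; anything else
    propagates immediately. Which exceptions a redeploy actually produces
    depends on the HTTP client in use, so retry_on is an assumption.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                sleep(delay_s)  # wait before the next attempt
    raise RuntimeError(
        f"API still unavailable after {attempts} attempts"
    ) from last_error
```

With the defaults this gives up after roughly 95 seconds of waiting. Inside a bluesky plan a blocking `time.sleep` is probably not appropriate, so the wait would likely need to be expressed in whatever non-blocking form the plan uses; the `sleep` parameter is there to make that swap (and testing) easy.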

Tom-Willemsen changed the title from "[DO NOT MERGE] PoC mantid reducer" to "PoC mantid reducer" on Mar 31, 2025
Successfully merging this pull request may close these issues.

[Investigate/PoC] DAE reducer which uses mantid via FIA API