-
Notifications
You must be signed in to change notification settings - Fork 1
Read timeout stopped measurements. #233
Comments
As stated in #231 (comment):
Thought we had a longer timeout than 2 seconds. Needs investigating why this had only a 2 second timeout. |
The default (communications) timeout is set to 10 s for all EPICS PVs: ipython-8idiuser/profile_bluesky/startup/instrument/framework/initialize.py Lines 123 to 125 in 7d5ee99
|
At APS, empirical evidence shows that beam lines all around the ring experience random slow response times for EPICS The USAXS instrument had a very similar issue with this last week. Looks like the same thing to me. So the question is: @danielballan, @tcaswell, @MikeHart85: Your comments and suggestions are most welcome. |
Here all the places that
|
Looking through the
|
Specifically:
|
Is this a local problem or an |
@prjemian Is it better at the current stage to simply ignore the time out and make sure the scan continues? I have a hard time imaging that the users would need the storage ring signal for data analysis. |
@prjemian E038 crashed at measurement 14145 at 5:27 pm on 10/4 It looks like another timeout issue I believe Error from terminal
Logger
edit: Shortly after I restarted I saw this come up in the terminal, not sure if unusual or not Terminal message
|
E039 crashed at measurement 2242 at 21:06, 10/04:
Error from terminal
Error from logger file
|
Scan restarted at 23:14, 10/04 |
Reports of random caget timeouts from a variety of EPICS clients are common yet not frequent. See this issue for some discussion: bluesky/ophyd#776 See this issue comment for comparison of longer timeout v. retry: bluesky/ophyd#788 (comment) (summary: the preference depends on the EPICS client) |
@prjemian E040 crashed at measurement 06615 at 9:25 am on 10/5 Error from terminal
Logger
|
Yikes! Monday operations? I've got a branch open instead of master. |
Confirmed this is the active issue on which to report. |
Scan aborted on 9:48 pm, 41 min after the initial jam. No update from the terminal since then. |
Instead of using a countdown timer (see #238), the countdown timer is one aspect of the |
NOTE: |
code snippet: from ophyd.status import Status
import time
def thing(attempts=5, timeout_s=1, wait_s=2):
success = False
for attempt in range(attempts):
status = Status(timeout=timeout_s)
print("starting to wait...")
expired = time.time() + wait_s
while not status.done and time.time() < expired:
time.sleep(0.01)
if not status.done:
status.set_finished() # successfully
time.sleep(1e-6) # brief pause for status internals
print(status)
if status.success:
print("Timer finished.")
success = True
break
else:
print(attempt+1, "time expired")
if not success:
print(f"did not succeed after {attempts} attempts") examples
|
In a plan, use |
@prjemian H053 crashed at measurement 10396 at 1:43 pm on 10/6 Error from terminal
Logger
|
This error: This error: This error: |
@prjemian New Bluesky code jammed after the first measurement. Logger file shows ERROR Logger file
|
Anything on the console?
…On Tue, Oct 6, 2020, 8:12 PM Qingteng Zhang ***@***.***> wrote:
New Bluesky code jammed after the first measurement. Logger file shows
ERROR
Logger file
|2020-10-06 17:07:47.345|INFO|16271|bluesky-session|shutters|16|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/devices/shutters.py
|2020-10-06 17:07:47.354|INFO|16271|bluesky-session|slits|9|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/devices/slits.py
|2020-10-06 17:07:47.537|INFO|16271|bluesky-session|soft_glue_fpga|15|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/devices/soft_glue_fpga.py
|2020-10-06 17:07:47.609|INFO|16271|bluesky-session|tables|9|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/devices/tables.py
|2020-10-06 17:07:47.678|INFO|16271|bluesky-session|lambda_750k|14|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/devices/lambda_750k.py
|2020-10-06 17:07:47.705|INFO|16271|bluesky-session|rigaku_ufxc|14|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/devices/rigaku_ufxc.py
|2020-10-06 17:07:47.750|INFO|16271|bluesky-session|alignments|16|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/alignments.py
|2020-10-06 17:07:47.752|INFO|16271|bluesky-session|shutters|15|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/shutters.py
|2020-10-06 17:07:47.754|INFO|16271|bluesky-session|flyscan|11|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/flyscan.py
|2020-10-06 17:07:47.756|INFO|16271|bluesky-session|flux_calculations|14|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/flux_calculations.py
|2020-10-06 17:07:47.758|INFO|16271|bluesky-session|functions|9|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/functions.py
|2020-10-06 17:07:47.760|INFO|16271|bluesky-session|lineup_tweak|12|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/lineup_tweak.py
|2020-10-06 17:07:47.762|INFO|16271|bluesky-session|move_diodes|18|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/move_diodes.py
|2020-10-06 17:07:47.763|INFO|16271|bluesky-session|move_sample|19|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/move_sample.py
|2020-10-06 17:07:47.766|INFO|16271|bluesky-session|pv_registers|17|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/pv_registers.py
|2020-10-06 17:07:47.827|INFO|16271|bluesky-session|xpcs_acquire|11|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/plans/xpcs_acquire.py
|2020-10-06 17:07:47.833|INFO|16271|bluesky-session|explorer|14|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/utils/explorer.py
|2020-10-06 17:07:47.833|INFO|16271|bluesky-session|collection|23|MainThread| - Disabling log output to the console now. Log output will still be recorded in log files. Re-enable with this console-only command:
logger.addHandler(console_log_writer)
|2020-10-06 20:04:30.374|INFO|16271|bluesky-session|emails|13|MainThread| - /home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/instrument/utils/emails.py
|2020-10-06 20:04:30.377|DEBUG|16271|bluesky-session|pv_registers|86|bluesky-run-engine| - Backing up RIGAKU500K_NoGap detector Beam Params
|2020-10-06 20:04:30.448|DEBUG|16271|bluesky-session|pv_registers|104|bluesky-run-engine| - Detector RIGAKU500K_NoGap Beam Params are Backed up
================ ========== ===========
value from to
================ ========== ===========
254.1 8idi:Reg11 8idi:Reg141
149.4 8idi:Reg12 8idi:Reg142
13.386 8idi:Reg13 8idi:Reg143
-0.2315 8idi:Reg14 8idi:Reg144
876.71777020394 8idi:Reg15 8idi:Reg145
148.45656772884 8idi:Reg16 8idi:Reg146
250.0 8idi:Reg17 8idi:Reg147
72.0 8idi:Reg18 8idi:Reg148
0.0 8idi:Reg19 8idi:Reg149
0.56746170980549 8idi:Reg20 8idi:Reg150
================ ========== ===========
|2020-10-06 20:04:30.460|INFO|16271|bluesky-session|pv_registers|208|bluesky-run-engine| - ******** detector number 46.0 ************************
|2020-10-06 20:04:30.502|DEBUG|16271|bluesky-session|pv_registers|133|bluesky-run-engine| - Detector RIGAKU500K_NoGap Beam Params are restored
================ =========== ==========
value from to
================ =========== ==========
254.1 8idi:Reg141 8idi:Reg11
149.4 8idi:Reg142 8idi:Reg12
13.386 8idi:Reg143 8idi:Reg13
-0.2315 8idi:Reg144 8idi:Reg14
876.71777020394 8idi:Reg145 8idi:Reg15
148.45656772884 8idi:Reg146 8idi:Reg16
250.0 8idi:Reg147 8idi:Reg17
72.0 8idi:Reg148 8idi:Reg18
0.0 8idi:Reg149 8idi:Reg19
0.56746170980549 8idi:Reg150 8idi:Reg20
================ =========== ==========
|2020-10-06 20:04:31.507|INFO|16271|bluesky-session|pv_registers|213|bluesky-run-engine| - Moving RIGAKU to the direct beam position
|2020-10-06 20:04:43.426|INFO|16271|bluesky-session|bp_Test_Bluesky|36|bluesky-run-engine| - J054_00001_CPMV_6C_7v1_Water_att00_Try1
|2020-10-06 20:04:57.052|INFO|16271|bluesky-session|xpcs_acquire|49|bluesky-run-engine| - AD_Acquire starting
|2020-10-06 20:04:57.053|INFO|16271|bluesky-session|xpcs_acquire|60|bluesky-run-engine| - file_path = /home/8ididata/2020-3/qzhang202010/J054_00001_CPMV_6C_7v1_Water_att00_Try1/
|2020-10-06 20:04:57.079|INFO|16271|bluesky-session|xpcs_acquire|318|bluesky-run-engine| - calling full_acquire_procedure()
|2020-10-06 20:04:57.090|INFO|16271|bluesky-session|xpcs_acquire|147|bluesky-run-engine| - detNum=46, det_pars={'ccdHardwareColSize': 512, 'ccdHardwareRowSize': 1024, 'ccdxsense': 1, 'ccdzsense': 1, 'harmonic': 1, 'dpix': 0.076, 'saturation': 3, 'adupphot': 1.0, 'efficiency': 1.0, 'gain': 1, 'blemish': 1, 'flatfield': 0, 'distortion': 0, 'parasitic': 0, 'lld': 0, 'manufacturer': 'RIGAKU500K_NoGap'}
|2020-10-06 20:04:57.222|INFO|16271|bluesky-session|xpcs_acquire|282|bluesky-run-engine| - scaler should be autocounting now
|2020-10-06 20:04:57.378|INFO|16271|bluesky-session|rigaku_ufxc|129|bluesky-run-engine| - Shutter will be controlled by UFXC if shutter is left in the **showbeam** state.
|2020-10-06 20:04:57.412|INFO|16271|bluesky-session|rigaku_ufxc|33|bluesky-run-engine| - process_info status: running
|2020-10-06 20:04:57.416|INFO|16271|bluesky-session|rigaku_ufxc|77|bluesky-run-engine| - command: echo FILE:F:J054_00001_CPMV_6C_7v1_Water_att00_Try1 | nc rigaku1.xray.aps.anl.gov 10000 - process info: pid=11528, info=psutil.Process(pid=11528, name='sh', started='20:04:56')
|2020-10-06 20:04:57.701|INFO|16271|bluesky-session|rigaku_ufxc|33|bluesky-run-engine| - process_info status: running
|2020-10-06 20:04:57.701|INFO|16271|bluesky-session|rigaku_ufxc|77|bluesky-run-engine| - command: echo EXPOSURE | nc rigaku1.xray.aps.anl.gov 10000 - process info: pid=11534, info=psutil.Process(pid=11534, name='sh', started='20:04:57')
|2020-10-06 20:05:02.633|INFO|16271|bluesky-session|xpcs_acquire|308|Thread-697| - DM workflow kickoff starting: analysis:True file:/home/8ididata/2020-3/qzhang202010/J054_00001_CPMV_6C_7v1_Water_att00_Try1/J054_00001_CPMV_6C_7v1_Water_att00_Try1_0001-100000.hdf
|2020-10-06 20:05:02.645|ERROR|16271|bluesky-session|bp_Test_Bluesky|120|bluesky-run-engine| - timeout while collecting J054_00001_CPMV_6C_7v1_Water_att00_Try1 (attempt 1 of 5)
|2020-10-06 20:05:02.645|INFO|16271|bluesky-session|bp_Test_Bluesky|125|bluesky-run-engine| - waiting 300 seconds before retry or continue ...
|2020-10-06 20:05:03.501|INFO|16271|bluesky-session|xpcs_acquire|313|Thread-697| - DM workflow kickoff done
|2020-10-06 20:05:03.501|INFO|16271|bluesky-session|xpcs_acquire|314|Thread-697| - b'id=4fdeb260-fb67-4323-b509-bccc00f6ac77 owner=8idiuser status=pending startTime=1602032703.46 startTimestamp=2020/10/06 20:05:03 CDT \n'
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#233 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AARMUMFDNJZCFTN5P7ARGOTSJO6A5ANCNFSM4SCWH73Q>
.
|
@prjemian It stopped updating after the first measurement and Rigaku is idle. Should I check out the older version or would you like to take a look at it? Output from Console
|
The sample is already loaded and I'm already home. Will be in standby all night. Please let me know how we should proceed |
Adjusted the code to watch the status object after acquire started, now working, started run with 20k measurements. Monitor the log files for |
With recent changes, the plan logs timeouts (10 s now), waits the prescribed time (5 minutes now) and then continues.
Doesn't seem to be retrying but that is a lower priority than the priority to keep running. We could lengthen the timeouts to 60s (1m). Perhaps encounter them even less frequently. |
FYI, the code should look a bit simpler and more like other bluesky code that subscribes to a signal with a related status object. The The example code also uses a userCalc record to simulate the detector acquire and complete signals. The detector completion is manipulated manually in a different client. Here's the revised example:
|
I put the new code in a file next to the existing |
Scan crashed at 14239/20000 due to what appears to be a timeout. Did not go to Try2. Terminal
Logger
|
Try the |
Rigaku interface hung again. The Terminal
Logger
|
Part of this issue is related to the Rigaku LabView PC, which has since been switched to the dedicated Linux server. Closing the issue for now and will reopen if problem resurfaces. |
@prjemian C034 crashed at measurement 08539 at 4:21 am on 10/03
Error from terminal
Originally posted by @Kacperswitalski in #231 (comment)
The text was updated successfully, but these errors were encountered: