BTHofmann2023 additions #305

steinnymir · 2023-11-29T10:48:06Z

Bug fixes and utilities added during the beamtime at FLASH in November 2023.

Additions:

A pretty print of the processor, including some statistical information about the data loaded. Still requires testing for non-FLASH loaders.
Filter the dataframe before binning so that only the required columns are used. This avoids errors caused by problematic values in other columns (e.g. a column that is entirely NaN).
Added the option of loading data sequentially in the FLASH loader. This helps to track errors that are otherwise lost in the parallel processes.
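The sequential-loading option can be pictured with a minimal sketch. It is not the FLASH loader's code: `read_file`, `load_all`, and the `sequential` flag are hypothetical names, and a thread pool stands in for whatever parallel backend the loader actually uses. The point is only that a plain loop can attach the offending file name to the error.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file(path: str) -> int:
    """Hypothetical reader: fails on a 'corrupt' file, else returns a row count."""
    if "corrupt" in path:
        raise ValueError(f"cannot parse {path}")
    return 100


def load_all(paths: list[str], sequential: bool = False) -> list[int]:
    """Load every file, either in a plain loop or in a worker pool."""
    if sequential:
        results = []
        for path in paths:
            try:
                results.append(read_file(path))
            except Exception as exc:
                # Re-raise with the file name attached so the failure is easy to locate.
                raise RuntimeError(f"loading failed for {path!r}") from exc
        return results
    # Parallel path: results come back in input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(read_file, paths))
```

Calling `load_all(paths, sequential=True)` on a run that fails in parallel mode would then stop at the first bad file and name it in the exception message.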

Bugs:

Added error handling for binning a column that contains only NaNs. The error raised by numba was otherwise unclear.
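A pre-binning check of this kind might look like the sketch below. It is an illustration under stated assumptions, not the code in this PR: `check_binnable` is a hypothetical helper, and the dataframe is modelled as a plain dict of column lists so the example runs without pandas, dask, or numba.

```python
import math


def check_binnable(columns: dict[str, list[float]], axes: list[str]) -> None:
    """Raise a readable error if a requested axis is missing or has no valid values."""
    for axis in axes:
        if axis not in columns:
            raise KeyError(f"axis {axis!r} not found in dataframe columns")
        values = columns[axis]
        # An empty column or one holding only NaNs cannot be binned.
        if not values or all(math.isnan(v) for v in values):
            raise ValueError(
                f"axis {axis!r} contains no valid (non-NaN) values and cannot be binned"
            )
```

Raising before the data reaches the compiled binning routine lets the message name the offending axis instead of surfacing a typing error from deeper in the stack.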

rettigl · 2023-11-29T10:54:03Z

sed/binning/binning.py

+    # filter dataframe to use only the columns needed for the binning
+    df = df[axes]


I saw this before in your commits. I don't think this is helpful, as it introduces another graph layer. Did you check whether this really improves computation time? In my tests, it slowed things down. Or why did you introduce this in the first place?
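A question like this can be settled with a small timing harness. The sketch below uses only the standard library and a pure-Python stand-in for binning, so its numbers say nothing about dask or the real binning code; `bin_counts` and `bin_with_filter` are hypothetical names, and only the measurement pattern is meant to carry over.

```python
import timeit


def bin_counts(rows, axes):
    """Stand-in for binning: count rows per tuple of axis values."""
    counts = {}
    for row in rows:
        key = tuple(row[a] for a in axes)
        counts[key] = counts.get(key, 0) + 1
    return counts


def bin_with_filter(rows, axes):
    """Variant that first drops every column not named in axes."""
    filtered = [{a: row[a] for a in axes} for row in rows]
    return bin_counts(filtered, axes)


rows = [{"x": i % 5, "y": i % 3, "extra": float("nan")} for i in range(10_000)]
axes = ["x", "y"]

# Take the best of several repeats to reduce noise from other processes.
t_plain = min(timeit.repeat(lambda: bin_counts(rows, axes), number=5, repeat=3))
t_filtered = min(timeit.repeat(lambda: bin_with_filter(rows, axes), number=5, repeat=3))
print(f"plain: {t_plain:.4f} s, filtered: {t_filtered:.4f} s")
```

For the real case, the two lambdas would be replaced by the binning call with and without the `df = df[axes]` line, run on the same dataframe.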


rettigl · 2023-12-01T20:49:07Z

filter the dataframe before binning to only look at the required columns. This avoids errors when there are issues with values in other columns (i.e. if a column is entirely nans)

I don't quite understand this issue, and as I have commented, it makes computation of the dataframe, and thus binning, slower. Can you elaborate on this problem? I tried to reproduce by adding a NaN column to the df, which worked without problems...

steinnymir · 2023-12-01T20:53:47Z

it makes computation of the dataframe, and thus binning, slower.

Did you test this? Does it really make it slower?

I cannot recall exactly what the problem was, but I will certainly get back to it during analysis. The problem was with a column whose dtype was not accepted by numba. I don't know why numba needed to check the dtypes of all columns, but this was a quick fix that solved the problem.

rettigl · 2023-12-01T22:21:29Z

it makes computation of the dataframe, and thus binning, slower.

did you test this? does it really make it slower?

I cannot recall exactly what the problem was, but will get back to it during analysis certainly. The problem was with a column which had dtypes not accepted by numba. I don't know why numba needed to test all dtypes of all columns, but this was a quick fix which solved the problem.

For mpes, I get something like this pretty consistently:

Binning is also notably slower. For FLASH dataframes the difference is smaller, but still present.

Note the added graph layer:

rettigl · 2024-02-05T22:14:36Z

Tests for the repr methods are failing, for several reasons: first, you don't check whether the dataframe is loaded; second, you don't check whether the columns you request exist (e.g. for non-FLASH loaders).
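Both checks can be sketched in a few lines. This is not the processor class from the repository: `Processor` here is a hypothetical stand-in, the dataframe is modelled as a dict of column lists, and the column names `dldPosX` and `pulseId` are chosen only for illustration.

```python
class Processor:
    """Hypothetical stand-in for a processor holding an optional dataframe."""

    def __init__(self, dataframe=None):
        # Here a "dataframe" is simply a dict mapping column names to lists.
        self.dataframe = dataframe

    def __repr__(self) -> str:
        # Guard 1: nothing loaded yet, so there is nothing to summarize.
        if self.dataframe is None:
            return "Processor(no data loaded)"
        lines = [f"Processor with {len(self.dataframe)} columns"]
        # Guard 2: report statistics only for columns that are actually present.
        for name in ("dldPosX", "pulseId"):
            values = self.dataframe.get(name)
            if values:
                lines.append(f"  {name}: min={min(values)}, max={max(values)}")
        return "\n".join(lines)
```

With these guards, `repr(Processor())` returns a short placeholder instead of raising, and a loader that lacks one of the columns simply omits that line.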

steinnymir added 9 commits November 21, 2023 16:42

expose parquet metadata

5a8f9b0

implement get_run_info and bugfix

684eb19

augment info prints

1606f06

augment info prints

8c1ee97

error catching for empty h5 channel

e0beeff

fixed failing loading empty per_pulse channels

264630c

filter dataframe prior to binning

0fbd2d9

catch numba typing error

e8cd06a

add mono photon energy calculator

e776f33

rettigl reviewed Nov 29, 2023


steinnymir and others added 6 commits November 29, 2023 11:58

Merge branch 'main' into BTHofmann2023

c3a5f82

Merge branch 'BTHofmann2023' into update_flsah_channels

54305b6

Merge pull request #306 from OpenCOMPES/update_flsah_channels

5a65e80

add mono photon energy calculator

fix monochromator_photon_energy

b9e98d2

inting

bf79089

remove unused function

702203f

zain-sohail mentioned this pull request Mar 20, 2024

Unify axis naming and definition #138

Closed

zain-sohail linked an issue Mar 20, 2024 that may be closed by this pull request

Extract statistical data from FLASH parquets #171

Open

zain-sohail mentioned this pull request May 5, 2024

Html representation of processor and metadata in notebooks #395

Merged

zain-sohail mentioned this pull request May 13, 2024

Expand on repr/repr html #400

Open
