Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add main functions for the computation of Kr maps #866

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

carhc
Copy link
Collaborator

@carhc carhc commented Mar 8, 2024

This PR addresses issue #863 (ICAROS: Calibration map creation) and lies on top of PR #865 (Data preparation functions).

It contains the main and auxiliary functions needed for the core of Icaro city. Many different functions have been defined in order to prepare, fit and produce the Kr calibration maps.

WIP:

  • Complete tests
  • Define regularization.
  • Define how to select map bins that fall within the detector volume (maybe using DB).
  • Fix the prescription for the p-value calculation.
  • Cosmetics and unify style. Many of the functions here have been changing almost daily for the last week (even today!), so there may still be inconsistencies, especially regarding the functions message for parameters and returns.

@jahernando
Copy link
Collaborator

jahernando commented Apr 4, 2024

Comments & Corrections:

  1. get_number_of_bins ok
  2. get_binned_data what is does is: get_bin_counts_and_event_bin_id, change name!
  3. update_dst extend: update_dst_with_event_bin_id
  4. in calculate_residuals line 611, it is missing KrFitFunction in lin_function?
  5. fit_and_fill_map, I think the try is not necessary.
  6. fit_and_fill_map, I will remove the else associated with if not map_bin['in_active'] or not map_bin['has_min_counts']:, and put the rest of the code (most of the function body) in the first identation level.
  7. fit_and_fill_map, L736, name = get_par_name_from_fittype(fittype = fittype), you are transforming the fit paramters into (E0, LT), so this is not necessary [it is already solved in a posterior commit!].
  8. icaro city is not commented, you need to indicate the meaning of the parameters, in particular what is nStimeprofile
  9. icaro city, I understand it is still a draft, but nevertheless, we should add a check function after the map_builder and map_evol main functions.
  10. icaro city, L80, L81, about the sinks, what are the arguments: "pointlike_event", "pmap_passed"
  11. compute_map, has some hardwired values, this is not acceptable, these parameters should be in a configuration file, even if they are almost never modified [Some modifications done in a later commit].
  12. What is the decision about computing the chi2?
  13. Most of the functions have not an associated test!

Tests:

  • ok get_number_of_bins

@jahernando
Copy link
Collaborator

This is a large commit, better do small commits with specific changes. Otherwise is difficult to follow the flow of changes.

I understant that:

  • some (empty) functions are introduced in the main flow to check the map and the map evolution internally
  • some functions have been renamed to gain in clarity
  • some arguments are not passed by default to gain robustness

Do I miss other changes?

There is a comment of many lines related with time-series, I understant this is the time-evolution part, that will be re-adapted in a new PR.

@carhc Are you preparing to commit more tests of the functions used in the code?

@bpalmeiro bpalmeiro linked an issue Oct 18, 2024 that may be closed by this pull request
6 tasks
@carhc carhc marked this pull request as draft December 30, 2024 10:14
@carhc carhc force-pushed the map_components branch 2 times, most recently from 063b20c to c7606ed Compare December 30, 2024 10:34
Copy link
Collaborator

@gonzaponte gonzaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being done with peras

Comment on lines +202 to +203
geom_comb = itertools.product(b_center[1], b_center[0])
r_values = np.array([np.sqrt(x**2+y**2)for x, y in itertools.product(b_center[1], b_center[0])])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually take x as the outer look and y as the inner one. Here geom_comb will loop over x first, I suggest to change it, if it doesn't have any implications.

Also, you have two instances of this cartesian product. I suggest making a list and use it in the next line and below in the assignment to the columns

Comment on lines +205 to +212
kr_map['bin'] = bin_index
kr_map['counts'] = counts
kr_map['R'] = r_values
kr_map[['Y', 'X']] = pd.DataFrame(geom_comb)
kr_map['in_active'] = kr_map['R'] <= r_max
kr_map['has_min_counts'] = kr_map['counts'] >= n_min
kr_map['fit_success'] = False
kr_map['valid'] = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are going to enumerate most columns anyway, why not create the dataframe using the syntax:
pd.DataFrame(dict(col0 = values0, ...)?

Comment on lines +245 to +266
def get_XY_bins(n_bins : np.array,
XYrange : Tuple[float, float]):
'''
Returns the bins that will be used to make the map.

Parameters
---------
b_nins: np.array
array of len = 2 containing the number of bins in X and Y
XYrange: Tuple[float, float]
Limits (mm) of X and Y for the map computation

Returns
---------
bins: Tuple[np.array, np.array]
Bins in each direction (X,Y) (square map).
'''
bins_x = np.linspace(*XYrange, n_bins[0]+1)
bins_y = np.linspace(*XYrange, n_bins[1]+1)
return bins_x, bins_y


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this function is used in many many places, I don't think it's worth it.

Comment on lines +352 to +358

function = fit_output.fn

res = y - function(x)
std = res.std()

return res, std
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this more compact: it can fit in one line.

p-value for the Shapiro-Wilk normality test.
'''

pval = stats.shapiro(residuals)[1] if (len(residuals) > 10) else 0.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is 10 a magic number?

Comment on lines +445 to +446
k = map_bin.bin

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superfluous variable, please remove

map_bin['res_std'] = std
map_bin['chi2'] = chi2
map_bin['pval'] = pval
map_bin['fit_success'] = True if ier in [1, 2, 3, 4] else False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
map_bin['fit_success'] = True if ier in [1, 2, 3, 4] else False
map_bin['fit_success'] = ier in range(1, 5)

map_bin['chi2'] = chi2
map_bin['pval'] = pval
map_bin['fit_success'] = True if ier in [1, 2, 3, 4] else False
map_bin['valid'] = map_bin['fit_success'] & map_bin['has_min_counts'] & map_bin['in_active']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for booleans it's more idiomatic to use and instead of &

Comment on lines +480 to +501
def find_outliers(maps : pd.DataFrame,
x2range : Tuple[float, float] = (0, 2)):
'''
For a given maps and deserved range, it returns a mask where values are
within the interval.

Parameters
---------
maps: pd.DataFrame
Map to check the outliers
x2range : Tuple[float, float]
Range for chi2

Returns
---------
mask: pd.Series
Mask.
'''
mask = in_range(maps.chi2, *x2range)
return mask


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is also superfluous

Comment on lines +523 to +524
outliers = maps.in_active
outliers &= np.logical_not(find_outliers(new_map, x2range))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
outliers = maps.in_active
outliers &= np.logical_not(find_outliers(new_map, x2range))
outliers = maps.in_active & ~in_range(new_map.chi2, *x2range))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ICAROS: calibration map creator
4 participants