Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

City for Kr calibration #829

Draft
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

bpalmeiro
Copy link
Member

This PR aims to serve as a first step towards the joint efforts of the Kr Calibration Team towards the standardization and update of the procedure for the Kr maps production, validation and posterior usage. It proposes a first draft of the city to focus the efforts and set the guidelines.

The city will read Kr kdsts and perform a series of quality selections and data checks, then a set of fits to extract both the energy and lifetime map and last, a computation of the evolution of the relevant parameters through the run.

@bpalmeiro bpalmeiro marked this pull request as draft January 30, 2023 15:41
@soleti soleti changed the title City for Kr callibration City for Kr calibration Feb 13, 2023


@city
def icaro(files_in, file_out, compression, detector_db, run_number,
bootstrap, quality_ranges, band_sel_params, map_params, krevol_params):
def icaro(files_in, file_out, compression, event_range, print_mod,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you introduce event_range and print_mod? If I'm not wrong, the aim is that all the events are run in one interaction of the pipe. Therefore, those variables don't make sense in that context.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below there is:

fl.slice(*event_range, close_all=True),
print_every(print_mod)                ,

so I added them to the list of arguments (as I saw in other cities). If we remove them, we should also change those lines I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that regard, I think that the best example for the concept I mean is Eutropia, that is also taking the whole range of events as a whole and doing a single iteration. Do you agree, @gonzaponte?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the city needs the parameter event_range. That's probably why Eutropia takes it even though it doesn't use it (Although skimming through the code I didn't see why it is needed). But yes, in this case, we don't want to iterate over single events so the Eutropia example is a good one.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I removed those lines and the print_mod argument for now.

def icaro(files_in, file_out, compression, event_range, print_mod,
detector_db, run_number, bootstrap, quality_ranges,
band_sel_params, map_params,
r_fid, nStimeprofile, x_range, y_range):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check, shouldn't we group the variables in dicts in the config file? I think this is easier to read but not strongly opinionated.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I just took a look at other cities and I saw that it was done this way. You mean grouping r_fid and nStimeprofile in a krevol_params dictionary and then call the function like this:

add_krevol(**krevol_params)

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, something of that sort, but again, it's something we can discuss in the meeting, not a big deal for me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only use dictionaries as arguments whenever there are large groups of parameters that impact a specific part of the flow. For instance, if your arguments affect only the time evolution and they are quite a few, I would group them into a dictionary, otherwise, I would leave them as individual arguments.

There is no clear cut though, so waiting for the city to be more advanced will give us a better perspective.

@bpalmeiro bpalmeiro force-pushed the Kr_calibration_city branch from 00323e2 to 966db40 Compare March 23, 2023 15:59
@carhc
Copy link
Collaborator

carhc commented Apr 12, 2023

Some general comments on my first draft:

The idea was to have the map core, the function in charge of performing the fits for each bin and return an object that we could use. I created a krmap_components file in which I put almost everything, because I dind't want to flood the city file with hundreds of lines of code, making it worse in terms of readability...

  1. It begins with the map class containing the elements we discussed, even though it may change or not used in the end (maybe a dataframe is enough...). The success of the map computation is described by "valid" attribute, which should ideally be True for a bin after the whole process.

  2. Then some functions used in the map core are defined to get the number of bins, define a certain binning, split the events into the different bins... Also the fit functions and some other for statistcs.

  3. The map_builder, with a relatively small set of arguments, is the map core. With the previously defined functions:

  • It chooses the binning, assigns each event to its corresponding bin (updating the DST with 2 new columns: xbin_index and ybin_index) and initializes the map object.

  • The "valid" attribute in the map is updated: if a bin reaches the min. number of events, its set to True, and if not, to False.

  • Looping on the bins, if the bin has the min.number of events, it performs the E vs DT fit. The parameters, errors, statistics, etc are stored. The valid value is updated again if the fit has bin successful. If the bin does not have enough events, then it sets the value of e0, lt, res, etc to nan in the object.

  • After the whole process, it checks the ratio of successful bins in case many of them have failed.

  1. Some things are now hardcoded in this version but soon will be replaced, for instance, distances, geometries in general, etc which will be the values in the database. It's noted in the functions presenting this "problem", like XYrange in get_XY_bins or dt0 in linear_function.

  2. I'm already cleaning things and this is not at all a final version of the map core, of course, but I wanted to at least have this preliminary thing here so you could express any comment, opinion, etc. And then everything can be modified following your remarks.

  3. The description of the functions sometimes is not complete or maybe it doesn't have the right structure for IC. Of course this will change in following versions. I tried to leave line comments inside the functions for clarity.

  4. Names of variables, functions, etc might be misleading sometimes, or just not adequate... It worked for me but of course everything can change depending or your thoughts.

Copy link
Collaborator

@gonzaponte gonzaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look at anything in detail, but at first glance looks ok. I think we can safely assume that there is symmetry in x-y and remove all those bits dealing with potentially different x- and y-bins. The most important thing is to look into changing the format of KrMap. If you feel like there are potential problems with that we can discuss it.

from invisible_cities.core.core_functions import in_range, shift_to_bin_centers


class KrMap():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more inclined to use a dataclass

@dataclass
class KrMap:
    x_bins : np.ndarray, #1D
    y_bins : np.ndarray, #1D
    data   : pd.DataFrame

with an appropriate init method. I have the feeling that it will make everything easier. If you find that some other part of the code will be more difficult to implement, let me know and we will discuss.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this now! I'll tell you if I struggle with anything.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question here, maybe due to my limited knowledge, but why are we creating a new KrMap class instead of using the ASectorMap class that already exists and was used in ICAROS?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope we will end up with a similar dataclass, but refactored into something better. As I see it, the field MapInfo in ASectorMap should be unraveled into attributes, and the fields of type pd.DataFrame should be combined into a single pd.DataFrame.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see, ASectorMap is already defined in reco/corrections.py but as far as I can see it's never used in the codebase. We should decide if we want to expand that, or create a new dataclass (KrMap?) and delete the unused ASectorMap. If we want to keep KrMap, I believe it needs a chi2 attribute as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked too much into the whole organization of the code, but I believe we are discarding all the data model form icaros.

If we want to keep KrMap, I believe it needs a chi2 attribute as well.

This can be a column of the dataframe help by KrMap

Comment on lines +81 to +82
def get_XY_bins(n_bins : int,
XYrange : Tuple[float, float] = (-490, 490)):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need this function? seems a bit trivial.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is... I initially added it in order to have customizable range of the map, in case someone wanted it that way, but it's true that maybe it could be inserted into other function. In any case, I guess the XYrange parameter would have to appear before, in the kr selection part, so here it would not be necessary to have the DB values (or hardcoded values, as they are now), just the parameter used before.

Comment on lines +138 to +139
bin_centers : list
List contaning np.arrays with the center of the bins
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not returned

Comment on lines +261 to +262
binsizes_x = bins_x[1:] - bins_x[:-1] # Compute bin sizes
binsizes_y = bins_y[1:] - bins_y[:-1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use np.diff

from .. types.ic_types import MasksContainer

from . components import city
from . components import print_numberofevents
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

print_numberofevents, quality_check and kr_selection are not defined in components.py, so the code gives ImportError

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left the rest of the city as it was before my commits, I only included the imports that I needed for my part and the compute_map function, but did not modify anything else, so it should be OK when the different parts of the code are included... Right now here it's the skeleton that Brais drafted and some of those imports are not done yet (It's not that they are not in components.py because they are somewhere else, I guess they are nowhere right now).

from . components import dst_from_files
from . components import quality_check
from . components import kr_selection
from . components import map_builder
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map_builder and add_krevol are defined in icaro.py and not in components (I followed the convention that if a function is used only in the module it stays in the module itself)

@bpalmeiro bpalmeiro linked an issue Feb 27, 2024 that may be closed by this pull request
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ICAROS: IC city structure
4 participants