-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat : add PU files #38
Conversation
- add pu files, adapt correction will available dicts - add Muon_RunDv1 to data json files - improve file splitting for dask/lxplus - add autorebin for plotting tools - remove redudant puwei file fix : python3.7->3.10 in pipeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Ming-Yan the PR looks mostly fine. I have mainly comments on the README, one question about the logic used in the job splitting using the indices in runner.py, and spotted an issue with the CI pipelines for python 3.10
else: | ||
mins = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this else
(and the if
directly above) is correct...
For the simple case that we have only one sample with 150 files, i.e. 3 batches of 50 files [0:50,50:100,100:150]:
If I only managed to finish the first job (i.e. the one with findex = 0
), and want to restart the remaining ones with --index (0,1)
. For findex == 1
, it correctly sets mins
to 50, but then after incrementing findex
by 1 in L 668, it would set mins
to 0 instead of 100 because of the if condition in L 659, or do I misunderstand something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These conditions are set outside the while loop, which means when the findex increased, the mins would be automatically update with maxs
as what written in the while loop
if int(args.index.split(",")[1]) == findex:
mins = findex * 50
else:
mins = 0
The conditions above did not check after entering the while loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I have missed that part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I play around the bottom scripts but not seeing any suspicious thing there
import json
with open("metadata/data_Winter22_mu_BTV_Run3_2022_Comm_v1.json") as f:
sample_dict = json.load(f)
index="1,3"
findex = int(index.split(",")[1])
for sindex,sample in enumerate(sample_dict.keys()):
if sindex<int(index.split(",")[0]):continue
if int(index.split(",")[1])==findex:mins=findex*100
else:mins=0
while mins < len(sample_dict[sample]):
splitted = {}
maxs = mins + 100
splitted[sample] = sample_dict[sample][mins:maxs]
mins = maxs
findex=findex+1
print(sample,splitted)
- update readme - bugfix for dileptt,semileptt workflow - add init to PU directory - update descriptions
run3 changes
Other