-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding custom modifiers to cabinetry #382
Comments
@pariRieck kindly notified me that I didn't post any code related to the validation problem with cabinetry. With the modified workspace: ws_test.json
{
"channels": [
{
"name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
"samples": [
{
"name": "LQ",
"data": [
-0.05606459826231003,
-0.12818728387355804,
-0.2890488803386688,
-0.32937803864479065,
-0.31295812129974365,
-0.19933012127876282,
-0.37242090702056885,
-0.17492030560970306,
0.04373674467206001,
-0.06333830207586288,
-0.15742823481559753,
-0.22937218844890594,
-0.2610972225666046,
-0.2189781814813614,
-0.3174850046634674,
-0.4120495915412903,
-0.15686370432376862,
-0.021678276360034943,
-0.0045187827199697495,
-0.01344318501651287,
-0.02579873986542225,
-0.04982074350118637,
-0.07583940774202347,
0.004755964037030935,
-0.003164847381412983,
-0.023233694955706596,
4.7682831905149214e-07
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
0.008668658155208378,
0.02022405434559139,
0.09189963471064777,
0.055041314193191596,
0.05193619411764286,
0.05250817228062847,
0.12196702526253103,
0.09533493472770015,
0.04304259063548334,
0.008735758281095813,
0.021724028820482494,
0.07366337499255725,
0.13916412487069324,
0.05304763182865001,
0.0740552658748865,
0.09788409139292459,
0.0856684157637054,
0.031515757273781884,
0.0028551501550830223,
0.00452889907684713,
0.01178630152077521,
0.01785971625171565,
0.029409669632190254,
0.008514955090679089,
0.005638608568332566,
0.022561821378473186,
3.3716854266200553e-07
]
},
{
"data": null,
"name": "Signal_norm",
"type": "normfactor"
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "f1",
"type": "customfunc",
"data": {
"expr": "srqt(LQ_sig)"
}
}
]
},
{
"name": "Ztautau",
"data": [
436.16748046875,
488.24713134765625,
592.1552124023438,
689.1245727539062,
541.2816772460938,
382.540283203125,
399.01214599609375,
159.18698120117188,
86.46419525146484,
7.16304349899292,
13.451751708984375,
19.981090545654297,
17.612442016601562,
11.224842071533203,
8.581535339355469,
9.507404327392578,
3.3369927406311035,
1.5719773769378662,
9.999999717180685e-10,
9.999999717180685e-10,
0.1462988257408142,
1.0335540771484375,
1.1950773000717163,
0.6874598264694214,
0.5342414975166321,
0.7435778975486755,
0.34987324476242065
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
20.5831328383989,
12.685052357006015,
20.653157956521852,
12.434290751560727,
7.3875447209423735,
5.693878050538804,
4.438010832782596,
2.27962780062575,
1.7460472807358358,
3.2293445779617964,
2.937584300663341,
3.8563300760376307,
2.5811459297683035,
2.132442170617347,
1.1302369417272697,
1.158050378970395,
0.4328057657996733,
0.293585895025791,
1.2549671742314705,
0.6123342689795961,
0.42450101635581405,
0.6192830685386423,
0.6034584276428155,
0.2628000515705097,
0.26773556125933473,
0.5429703570495382,
0.20594280974269555
]
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "Zjetsnorm",
"type": "normsys",
"data": {
"hi": 1.1,
"lo": 0.9
}
}
]
},
{
"name": "ttbar",
"data": [
3.1341841220855713,
12.720019340515137,
19.068552017211914,
23.117305755615234,
20.081254959106445,
18.199512481689453,
21.11097526550293,
5.927351474761963,
2.1456751823425293,
9.777939796447754,
36.38203048706055,
58.69355773925781,
55.95661926269531,
55.57005310058594,
38.96144485473633,
45.38947296142578,
14.888962745666504,
5.933634281158447,
7.021153926849365,
30.105974197387695,
41.13555145263672,
38.76812744140625,
40.98802185058594,
26.757863998413086,
23.775829315185547,
9.067926406860352,
3.494324207305908
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
0.7048574962698888,
1.447747788335472,
1.934288774132403,
2.036424385977694,
2.7116461600003383,
1.6235258929340821,
1.7616352447362071,
0.937151105504632,
0.5434934672310043,
1.2492619788650297,
2.364247978851889,
3.534628792227429,
4.882973770900739,
3.3520027176752483,
2.5269571859100153,
2.539433692589492,
1.4886337537773227,
1.3047660819005515,
1.0198372166654062,
2.1343265167845056,
3.6262796121476577,
2.765231829052504,
3.5557478234557185,
1.9370341684741987,
3.161288352499607,
1.1018118503488556,
0.7184444459774411
]
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "ttnorm",
"type": "normsys",
"data": {
"hi": 1.1,
"lo": 0.9
}
},
{
"name": "ttbar-syst-ME",
"type": "normsys",
"data": {
"hi": 1.2540423950845079,
"lo": 0.7459576049154921
}
},
{
"name": "ttbar-syst-ME",
"type": "histosys",
"data": {
"hi_data": [
5.589620550961276,
13.99669448298217,
23.05127199221935,
26.189907309427298,
22.649043407791225,
17.228079477101723,
18.26573635509351,
5.676765489357286,
2.122144502273853,
10.905925305753641,
39.10644624315509,
57.07296865144896,
62.12938274594555,
57.49210270903877,
34.77925045594372,
39.80091365068339,
14.733990610146979,
6.3252451428130225,
8.138022528148616,
30.679739376006957,
39.23730053717147,
42.34591514099523,
34.866447106988176,
22.082967859637037,
23.489714640163406,
6.983240303823213,
3.2344820494254867
],
"lo_data": [
0.6787476932098668,
11.443344198048104,
15.08583204220448,
20.04470420180317,
17.513466510421665,
19.170945486277184,
23.95621417591235,
6.17793746016664,
2.1692058624112054,
8.649954287141867,
33.657614730966,
60.314146827066665,
49.78385577944508,
53.64800349213311,
43.14363925352894,
50.97803227216817,
15.04393488118603,
5.542023419503872,
5.904285325550115,
29.532209018768434,
43.033802368101966,
35.19033974181727,
47.1095965941837,
31.432760137189135,
24.061943990207688,
11.15261250989749,
3.7541663651863297
]
}
},
{
"name": "ttbar-syst-ISR",
"type": "normsys",
"data": {
"hi": 1.220732355329648,
"lo": 0.7792676446703519
}
},
{
"name": "ttbar-syst-ISR",
"type": "histosys",
"data": {
"hi_data": [
2.498763994006645,
13.236306487707676,
19.29230991126219,
22.833228247251583,
24.941273081593412,
18.464820350901913,
20.110985620174425,
6.736669155932718,
3.6911596602628554,
9.457252447743462,
33.866664046268596,
57.86392692014527,
61.54608874899057,
51.63878535918566,
40.261401546171015,
43.00489522688136,
14.475398692042585,
6.010084991843683,
5.240978017858997,
29.013586505569137,
38.59096029569867,
43.213121490802266,
34.89366445424466,
26.35626576140594,
26.585479022782987,
10.249787376359388,
4.099461211408759
],
"lo_data": [
3.7696042501644977,
12.203732193322598,
18.844794123161638,
23.401383263978886,
15.221236836619479,
17.934204612476993,
22.110964910831434,
5.1180337935912075,
0.6001907044222032,
10.098627145152046,
38.8973969278525,
59.52318855837036,
50.367149776400055,
59.501320841986214,
37.66148816330164,
47.7740506959702,
15.302526799290423,
5.857183570473212,
8.801329835839734,
31.198361889206254,
43.680142609574766,
34.323133392010234,
47.08237924692722,
27.159462235420232,
20.966179607588106,
7.8860654373613155,
2.8891872032030577
]
}
},
{
"name": "ttbar-syst-PS",
"type": "normsys",
"data": {
"hi": 1.2050235431816265,
"lo": 0.7949764568183735
}
},
{
"name": "ttbar-syst-PS",
"type": "histosys",
"data": {
"hi_data": [
3.3534550281978777,
12.033856595625718,
21.30899227415003,
24.509509817305503,
25.8481115859143,
18.5719450620786,
21.83413761556733,
7.246702915986389,
2.957846598160289,
7.25676576413713,
38.48344633986547,
56.8197268584179,
59.926051710955605,
59.05327869163685,
41.963464931389396,
42.0285541082128,
15.447970060940204,
7.209872812613123,
5.943613148226668,
25.23561613233863,
35.01958528729912,
41.148430421680196,
34.2366790814458,
24.59702612672,
24.277325919171936,
7.490115417310103,
4.371238319149527
],
"lo_data": [
2.914913215973265,
13.406182085404556,
16.8281117602738,
21.725101693924966,
14.314398332298591,
17.827079901300305,
20.38781291543853,
4.608000033537537,
1.3335037665247698,
12.299113828758378,
34.28061463425562,
60.567388620097724,
51.98718681443502,
52.08682750953503,
35.95942477808326,
48.75039181463876,
14.329955430392804,
4.657395749703771,
8.098694705472063,
34.976332262436756,
47.251517617974315,
36.387824461132304,
47.739364619726075,
28.918701870106172,
23.274332711199158,
10.645737396410599,
2.617410095462289
]
}
}
]
},
{
"name": "Zll",
"data": [
0.8972583413124084,
7.064095973968506,
3.542963981628418,
0.8430342674255371,
2.118805170059204,
3.334984064102173,
1.5606133937835693,
0.050484806299209595,
0.05545007809996605,
9.999999717180685e-10,
0.013289040885865688,
0.08560466766357422,
0.061958350241184235,
9.999999717180685e-10,
9.999999717180685e-10,
0.004126517102122307,
9.999999717180685e-10,
9.999999717180685e-10,
9.999999717180685e-10,
0.0003875246911775321,
9.999999717180685e-10,
9.999999717180685e-10,
9.999999717180685e-10,
9.999999717180685e-10,
9.999999717180685e-10,
9.999999717180685e-10,
9.999999717180685e-10
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
0.3401330686174674,
4.852262569144132,
2.2531419087598037,
0.3680943155314295,
1.1551888083782842,
2.360606035437923,
1.551998700276235,
0.03246505830827842,
0.0518583933421947,
0.0,
0.010456572068689982,
0.0491447290895443,
0.043815842545414034,
4.777328058235764,
0.0,
0.007661113437351681,
0.0,
0.0,
0.0,
0.000387524686374085,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0
]
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "Zjetsnorm",
"type": "normsys",
"data": {
"hi": 1.1,
"lo": 0.9
}
}
]
},
{
"name": "Wjets",
"data": [
25.98953628540039,
148.0875701904297,
466.211181640625,
292.91748046875,
119.51535034179688,
112.79962921142578,
104.71670532226562,
17.376272201538086,
37.127410888671875,
0.8891265392303467,
4.487545013427734,
0.7926654815673828,
5.391210079193115,
3.3131606578826904,
9.999999717180685e-10,
5.14890718460083,
9.999999717180685e-10,
0.5045974850654602,
9.999999717180685e-10,
0.6733123660087585,
0.3701707720756531,
0.19689325988292694,
0.17677272856235504,
0.23742935061454773,
0.010014912113547325,
9.999999717180685e-10,
9.999999717180685e-10
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
18.95457349884378,
70.00297990502952,
177.35522110928375,
91.98004964094874,
33.965511736785935,
26.088142090666643,
12.079689416039482,
12.11578110729692,
24.444370096270514,
0.9677067665549353,
1.974704473289153,
2.0124259685957564,
2.2705996231545655,
1.746895662738218,
19.601065862247825,
1.9193211214366754,
0.6241604219430951,
0.3235011961482161,
0.0,
0.4626067886785151,
0.2209095013802317,
0.09795756807217645,
0.1699487101937834,
0.4260972795655261,
0.010014912457366916,
0.0,
0.0
]
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "Wjetsnorm",
"type": "normsys",
"data": {
"hi": 1.1,
"lo": 0.9
}
}
]
},
{
"name": "singletop",
"data": [
0.4650433659553528,
3.586678981781006,
5.3690314292907715,
6.5188493728637695,
6.794899940490723,
5.230123996734619,
6.455552101135254,
1.416122555732727,
0.9780453443527222,
2.19258451461792,
5.467511177062988,
9.99152946472168,
12.938706398010254,
11.11521053314209,
6.745467185974121,
8.185778617858887,
2.5264368057250977,
1.9584273099899292,
0.30111247301101685,
1.7843446731567383,
5.020862579345703,
3.012359142303467,
3.426671266555786,
1.3434967994689941,
2.187044620513916,
0.59363853931427,
0.6312496066093445
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
0.2718062271031703,
0.7923469562463225,
0.9118307475001273,
1.002149339337283,
0.9966891769592305,
0.9332358116647141,
1.0168364172907858,
0.4762117878606902,
0.3712258868717275,
0.5923058979371326,
0.8984444292653833,
1.2518221626946582,
1.3480464625279092,
1.3001132731189493,
1.016808287743095,
1.1206117848615749,
0.6771452482618175,
0.6090799639166785,
0.2144043880922945,
0.529493511178739,
0.8710017207447484,
0.6197534547258692,
0.7132123735532212,
0.418982653955289,
0.5525910152642597,
0.39046022026138144,
0.3224762788264719
]
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "singletopnorm",
"type": "normsys",
"data": {
"hi": 1.1,
"lo": 0.9
}
}
]
},
{
"name": "multiboson",
"data": [
5.772249698638916,
13.322222709655762,
19.44859504699707,
20.207860946655273,
18.15636444091797,
14.918745994567871,
20.061100006103516,
8.208643913269043,
6.3627095222473145,
0.16463185846805573,
0.9877825975418091,
0.8409641981124878,
0.5360387563705444,
0.9228699207305908,
1.1696816682815552,
0.6535868048667908,
0.44904208183288574,
0.678593635559082,
0.022182032465934753,
0.10862438380718231,
0.08747895807027817,
0.09262492507696152,
0.039744794368743896,
0.03034309111535549,
0.07780832052230835,
0.009849405847489834,
0.008091447874903679
],
"modifiers": [
{
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs2thh",
"type": "staterror",
"data": [
0.7273612161108358,
1.120670974516059,
1.251605905903741,
1.406438716475429,
1.2567012483720807,
1.1372110327058536,
1.1542905228691978,
0.7829390382587853,
0.6832217717405377,
0.14013336880554994,
0.1979738831978867,
0.22565851796019412,
0.2772155648937874,
0.21525601444031867,
0.30849609485526763,
0.14112416269366374,
0.12435971213698106,
0.43622231998384436,
0.0068668539426901066,
0.06510054634385384,
0.028647073491025403,
0.03377145617810225,
0.019076840232356063,
0.027789509477681693,
0.026461215915870768,
0.010656429555943602,
0.006163517185353037
]
},
{
"name": "lumi",
"type": "normsys",
"data": {
"hi": 1.02,
"lo": 0.98
}
},
{
"name": "multibosonnorm",
"type": "normsys",
"data": {
"hi": 1.1,
"lo": 0.9
}
}
]
}
]
}
],
"measurements": [
{
"name": "rec_smim_vs_Nbjet_HighMassOS",
"config": {
"parameters": [
{
"name": "LQ_sig",
"inits": [
0
],
"bounds": [
[
0,
100
]
]
}
],
"poi": "LQ_sig"
}
}
],
"observations": [
{
"name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
"data": [
472.42572021484375,
673.0277099609375,
1105.7955322265625,
1032.72900390625,
707.9483032226562,
537.0233764648438,
552.9171142578125,
192.1658935546875,
133.13348388671875,
20.18732452392578,
60.78990936279297,
90.38542175292969,
92.4969711303711,
77.36933898925781,
38.831180572509766,
68.88927459716797,
20.546157836914062,
10.64723014831543,
7.0030951499938965,
32.448814392089844,
46.76036071777344,
43.103553771972656,
45.82628631591797,
29.05659294128418,
26.584938049316406,
10.414992332458496,
4.483538627624512
]
}
],
"version": "1.0.0"
} and executing: import json
import pyhf
from add_custom_modifier import add_custom_modifier
### LQ signal sqrt attempt
new_params = {
'LQ': { 'inits': (1.0,), 'bounds': ((-5.0, 5.0),) },
}
expanded_pyhf = add_custom_modifier('customfunc', ['LQ'], new_params)
with open('ws_test.json') as fin:
ws_test = json.load(fin)
model_test = pyhf.Model(ws_test, modifier_set=expanded_pyhf, poi_name = 'LQ', validate=True, batch_size=1) I get the following error: Traceback (most recent call last):
File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/pyhf/schema/validator.py", line 93, in validate
return validator.validate(spec)
File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/jsonschema/validators.py", line 304, in validate
raise error
jsonschema.exceptions.ValidationError: {'name': 'f1', 'type': 'customfunc', 'data': {'expr': 'srqt(LQ)'}} is not valid under any of the given schemas
Failed validating 'anyOf' in schema['properties']['channels']['items']['properties']['samples']['items']['properties']['modifiers']['items']:
{'anyOf': [{'$ref': '#/definitions/modifier/histosys'},
{'$ref': '#/definitions/modifier/lumi'},
{'$ref': '#/definitions/modifier/normfactor'},
{'$ref': '#/definitions/modifier/normsys'},
{'$ref': '#/definitions/modifier/shapefactor'},
{'$ref': '#/definitions/modifier/shapesys'},
{'$ref': '#/definitions/modifier/staterror'}]}
On instance['channels'][0]['samples'][0]['modifiers'][3]:
{'data': {'expr': 'srqt(LQ)'}, 'name': 'f1', 'type': 'customfunc'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/terabig/muellerr/Leptoquarks/fitting/test_modifier.py", line 45, in <module>
model_test = pyhf.Model(ws_test, modifier_set=expanded_pyhf, poi_name = 'LQ', validate=True, batch_size=1)
File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/pyhf/pdf.py", line 766, in __init__
schema.validate(self.spec, self.schema, version=self.version)
File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/pyhf/schema/validator.py", line 95, in validate
raise pyhf.exceptions.InvalidSpecification(err, schema_name)
pyhf.exceptions.InvalidSpecification: {'name': 'f1', 'type': 'customfunc', 'data': {'expr': 'srqt(LQ)'}} is not valid under any of the given schemas.
Path: channels[0].samples[0].modifiers[3]
Instance: {'name': 'f1', 'type': 'customfunc', 'data': {'expr': 'srqt(LQ)'}} Schema: model.json |
Hi @rmnmllr, thanks for all the information provided here! This issue looks to me independent of |
You can also pass in your own schema to validate against, but we should have custom function expressions allowable at some point. |
Thank you! Yes I'm aware that this relates to model_cabi, data_cabi = cabinetry.model_utils.model_and_data(ws_test) and I can only disable the validation going into the
|
cabinetry/src/cabinetry/model_utils.py Lines 66 to 71 in 87a452b
We can add a |
it's in our docs: https://scikit-hep.org/pyhf/_generated/pyhf.schema.Schema.html#pyhf.schema.Schema so you can always pass in the path to your custom schema and have it pass validation. |
Timeline: unclear. We expect it to be in for 0.8.0 but there's a few things in the way first (multi-POI, etc..) |
@rmnmllr please have a look at #385, which adds the relevant arguments to I added a |
Cool, thank you! I hope I can take a look before the holidays starting the day after tomorrow! |
Yes, that should work. You can also install it directly via: pip install git+https://github.com/scikit-hep/cabinetry.git@feat/custom-modifier-models |
Ok I cleaned up/simplified my workspace: {
"channels": [
{
"name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
"samples": [
{
"name": "LQ",
"data": [
-0.05606459826231003,
-0.12818728387355804,
-0.2890488803386688,
-0.32937803864479065,
-0.31295812129974365
],
"modifiers": [
{
"name": "f1",
"type": "customfunc",
"data": {
"expr": "-srqt(m1)"
}
}
]
}
]
}
],
"measurements": [
{
"name": "rec_smim_vs_Nbjet_HighMassOS",
"config": {
"parameters": [
{
"name": "LQ",
"inits": [
0
],
"bounds": [
[
0,
10
]
]
}
],
"poi": "m1"
}
}
],
"observations": [
{
"name": "rec_smim_vs_Nbjet_HighMassOSs2thh",
"data": [
472.42572021484375,
673.0277099609375,
1105.7955322265625,
1032.72900390625,
707.9483032226562
]
}
],
"version": "1.0.0"
} and I can confirm that I can create the model with cabinetry by doing: import json
import cabinetry
from add_custom_modifier import add_custom_modifier
new_params = { 'm1': { 'inits': (1.0,), 'bounds': ((-5.0, 5.0),) } }
expanded_pyhf = add_custom_modifier('customfunc', ['m1'], new_params)
with open('test_mod_ws.json') as fin:
ws_test = json.load(fin)
model, data = cabinetry.model_utils.model_and_data(ws_test, validate=False, modifier_set=expanded_pyhf) Now if I try: cabinetry.fit.fit(model, data) I get a long list of errors:
which I guess is expected because |
Hi @rmnmllr, this issue is due to a typo: instead of For completeness (and since I may want to find this again in the future myself), here's another small setup that works with #385 and scikit-hep/pyhf#1991: import cabinetry
from pyhf.experimental.modifiers import add_custom_modifier
cabinetry.set_logging()
spec = {
"channels": [
{
"name": "singlechannel",
"samples": [
{
"name": "signal",
"data": [10, 10],
"modifiers": [
{
"name": "f1",
"type": "customfunc_sq",
"data": {"expr": "sqrt_mu**2"},
},
],
}
],
},
],
"measurements": [{"name": "meas", "config": {"parameters": [], "poi": ""}}],
"observations": [{"name": "singlechannel", "data": [10.0, 10.0]}],
"version": "1.0.0",
}
new_params = {
"sqrt_mu": {"inits": (1.0,), "bounds": ((-5.0, 5.0),)},
}
expanded_pyhf = add_custom_modifier("customfunc_sq", ["sqrt_mu"], new_params)
model, data = cabinetry.model_utils.model_and_data(
spec, validate=False, modifier_set=expanded_pyhf
)
data = [15] * 2 + model.config.auxdata
fit_results = cabinetry.fit.fit(model, data) |
Arghh, thanks for spotting this - works now! |
It may be a good idea to catch this in |
Hi @alexander-held, How do I implement the custom modifier function with the configuration schema of cabinetry? I assume it's in the I tried adding: "Data": {
"type": "object"
},
"Type": {
"type": "string"
} to the With General:
Measurement: "rec_smim_vs_Nbjet_HighMassOS"
HistogramFolder: "outputs/"
POI: "Signal_norm"
InputPath: "fitter-2022-12-14-rebin3/{SamplePath}{VariationPath}.root:{RegionPath}"
Regions:
- Name: "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets"
RegionPath: "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets"
Samples:
- Name: "Data"
SamplePath: "data"
Data: True
- Name: "LQ"
SamplePath: "LQ-gen-betaL33_1p5TeV"
NormFactors:
- Name: "f1"
Samples: "LQ"
Data: { "expr": "-srqt(m1)" }
Type: "customfunc"
Nominal: 0
Bounds: [0, 100] and executing: import cabinetry
config = cabinetry.configuration.load("myconfig.yml")
ws = cabinetry.workspace.build(config)
cabinetry.workspace.save(ws, workspace_path) I get for the saved {
"channels": [
{
"name": "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets",
"samples": [
{
"data": [
-2.8630847930908203,
-0.28795331716537476,
0.019545821473002434
],
"modifiers": [
{
"data": [
0.26342490191721557,
0.09063038118573631,
0.05747892970426467
],
"name": "staterror_rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets",
"type": "staterror"
},
{
"data": null,
"name": "f1",
"type": "normfactor"
}
],
"name": "LQ"
}
]
}
],
"measurements": [
{
"config": {
"parameters": [
{
"bounds": [
[
0,
100
]
],
"inits": [
0
],
"name": "f1"
}
],
"poi": "Signal_norm"
},
"name": "rec_smim_vs_Nbjet_HighMassOS"
}
],
"observations": [
{
"data": [
109277.0,
10338.0,
1582.0
],
"name": "rec_smim_vs_Nbjet_HighMassOSs0tem_0bjets"
}
],
"version": "1.0.0"
} Maybe I'm making a mistake at implementation, any help would be greatly appriciated! |
Another problem I encounter. If I use your example above and modify: import cabinetry
from add_custom_modifier import add_custom_modifier
cabinetry.set_logging()
spec = {
"channels": [
{
"name": "singlechannel",
"samples": [
{
"name": "signal",
"data": [
-1,
-1
],
"modifiers": [
{
"name": "f1",
"type": "customfunc_sq",
"data": {"expr": "sqrt(sqrt_mu)"},
},
],
}
],
},
],
"measurements": [{"name": "meas", "config": {"parameters": [], "poi": ""}}],
"observations": [{"name": "singlechannel", "data": [
15,
15
]}],
"version": "1.0.0",
}
new_params = {
"sqrt_mu": {"inits": (1.0,), "bounds": ((-5.0, 5.0),)},
}
expanded_pyhf = add_custom_modifier("customfunc_sq", ["sqrt_mu"], new_params)
model, data = cabinetry.model_utils.model_and_data(
spec, validate=False, modifier_set=expanded_pyhf
)
# data = [15] * 2 + model.config.auxdata
fit_results = cabinetry.fit.fit(model, data) I get the "estimated distance to minimum too large" error:
Are negative values in the samples not working? |
Hi @rmnmllr,
this is currently not supported. The settings that
Where are you using
One likely source of this issue is that the model predicts negative yields somewhere, resulting in a NaN likelihood (see also scikit-hep/pyhf#2093). MIGRAD (which we use to minimize) is not very good at recovering from that. You can clip the yield to prevent that (see the
This was just a toy example with no deep meaning, I probably picked that to easily compare to an expression-less setup where I scale by |
Ok, makes sense - thank you for your answers!
Ah sorry, I copy-pasted from the wrong file - it's in the
You mean by setting the |
I have not tried out samples with negative yields, I could not point spontaneously to any issue (as long as the total stays positive) but it may be worth some sanity checks (e.g. positive yield with negative normalization attached instead). |
I just manage to do it by using our values: import cabinetry
from add_custom_modifier import add_custom_modifier
cabinetry.set_logging()
spec = {
"channels": [
{
"name": "singlechannel",
"samples": [
{
"name": "signal",
"data": [
-0.05606459826231003,
-0.12818728387355804,
-0.2890488803386688,
-0.32937803864479065,
-0.31295812129974365
],
"modifiers": [
{
"name": "f1",
"type": "customfunc",
"data": {"expr": "-sqrt(sqrt_mu)"},
},
],
}
],
},
],
"measurements": [{"name": "meas", "config": {"parameters": [], "poi": ""}}],
"observations": [
{
"name": "singlechannel",
"data": [
472.42572021484375,
673.0277099609375,
1105.7955322265625,
1032.72900390625,
707.9483032226562
]
}
],
"version": "1.0.0",
}
new_params = {
"sqrt_mu": {"inits": (1.0,), "bounds": ((-5.0, 5.0),)},
}
expanded_pyhf = add_custom_modifier("customfunc", ["sqrt_mu"], new_params)
model, data = cabinetry.model_utils.model_and_data(
spec, validate=False, modifier_set=expanded_pyhf
)
fit_results = cabinetry.fit.fit(model, data) which gives:
I can't really expalin why the trial before didn't work. Working towards the end game of getting a limit_results = cabinetry.fit.limit(model, data) produces the following error:
But maybe this is taking it a step too far for the moment..? |
means you need a bigger starting bracket, it has to be chosen such that CLs=0.05 lies within the POI values you are starting with but |
Hi @alexander-held, {
"data": null,
"name": "Signal_norm",
"type": "normfactor"
}, to: {
"name": "f0",
"type": "customfunc",
"data": {
"expr": "sqrt(mu)"
}
} However this method is quite hacky and I struggle a lot with minimization errors. You can check my cabinetry setup here with the |
We can make this easier in How would you like to be specifying these things in the config? If we can find a good syntax, it should not be too much work to propagate that through for workspace building. Perhaps something like the following? NormFactors:
- Name: "mu"
Samples: "sample_1"
Nominal: 1
Bounds: [0, 10]
- Name: "f0"
Samples: "sample_2"
Expression: "sqrt(mu)" |
That sounds and looks great to me! If you create your model like this, would the
but I guess this will be handled by |
hey @alexander-held, I managed to implement the custom scaling functions by modifying the workspace (as suggested by you above) as follows. In our analysis we have two terms that need to be scaled differently: interference by g2 and BSM by g2**2. The following code does that in the workspace: def add_custom_func(workspace, equation="g2**2"):
"""Adds a custom function `equation` to the `pyhf` workspace.
Note: POI not changed here, change happens in `run_cabinetry`
Args:
workspace (dict): model given by `cabinetry.workspace.build(config)`
equation (str, optional): equation to scale the signal, POI needs to be called `g2`
Returns:
workspace (dict): modified workspace
"""
for channel in workspace["channels"]:
for sample in channel["samples"]:
if sample["name"] == "LQ_BSM":
for i, modifier in enumerate(sample["modifiers"]):
if modifier["type"] == "normfactor":
sample["modifiers"][i]["name"] = "customfunc_" + channel["name"] + "_BSM"
sample["modifiers"][i]["type"] = "customfunc"
sample["modifiers"][i]["data"] = { "expr": equation }
elif sample["name"] == "LQ_Int":
for i, modifier in enumerate(sample["modifiers"]):
if modifier["type"] == "normfactor":
sample["modifiers"][i]["name"] = "customfunc_" + channel["name"] + "_Int"
sample["modifiers"][i]["type"] = "customfunc"
sample["modifiers"][i]["data"] = { "expr": "g2" }
else:
continue
return workspace (from here) General:
Measurement: "rec_smim_vs_Nbjet_HighMassOS"
HistogramFolder: "outputs/"
POI: "g2"
InputPath: "to_be_set_at_execution/{SamplePath}{VariationPath}.root:{RegionPath}"
Regions:
- Name: "HighMassOSs2thh_0bjets"
RegionPath: "rec_smim_vs_Nbjet_HighMassOSs2thh_0bjets"
...
Samples:
- Name: "Data"
SamplePath: "asimov-Sh2211"
Data: True
- Name: "LQ_Int"
SamplePath: "to_be_filled_at_execution"
- Name: "LQ_BSM"
SamplePath: "to_be_filled_at_execution"
...
Systematics:
...
NormFactors:
- Name: "IntAndBSM"
Samples: ["LQ_Int", "LQ_BSM"]
Nominal: 0
Bounds: [-20, 20]
... with the following code snippet from here: ...
if custom_func:
ws = add_custom_func(ws)
inits = ws["measurements"][0]["config"]["parameters"][0]["inits"][0]
bounds = ws["measurements"][0]["config"]["parameters"][0]["bounds"][0]
new_params = {
"g2": {
'inits': (inits,),
'bounds': ((bounds[0], bounds[1]),)
}
}
expanded_pyhf = add_custom_modifier('customfunc', ['g2'], new_params)
model, data = cabinetry.model_utils.model_and_data(ws, validate=False, modifier_set=expanded_pyhf)
... First results are promising, and testing the individual terms separately also worked. The only thing I'm uncertain of is the scaling of two input samples with one POI |
Are you removing the
I am not sure what you mean by this list? |
Isn't this adjusted here: new_params = {
"g2": {
'inits': (inits,),
'bounds': ((bounds[0], bounds[1]),)
}
}
expanded_pyhf = add_custom_modifier('customfunc', ['g2'], new_params) At least in the model workspace I don't see any normalizing factor anymore e.g.:
but I see a init value (another?) in:
Maybe I'm getting confused by the similar terms..
I mean the list that gets saved under NormFactors:
- Name: "IntAndBSM"
Samples: ["LQ_Int", "LQ_BSM"] # <-
Nominal: 0
Bounds: [-20, 20] Depending on the order of the terms in the list, the post-fit yield error is non-zero for the second entry but 0.0 for the first. |
That is probably the case, at least I would not know where else the modifier disappears. In that case the name "add" is a bit misleading though (cc @kratsg / @lukasheinrich who can presumably confirm). Having the parameter left over in the config should not be a problem as long as it appears nowhere else in the workspace anyway, so also the init of zero there should not be the problem. What is the best-fit value for
You can get the samples in the right order from |
It's indeed the post-fit yields, sorry for not specifying earlier.
I don't think they change, I couldn't fully test limits etc. due to some troubles with (most likely) our systematics |
That parameter being zero explains why you get zero yields for the samples it acts on. I imagine the lower bound for that parameter is also zero? This would make the uncertainty estimate unreliable, so you may want to relax the bounds in that case. |
Hey @alexander-held
For our analysis we would like to have a sqrt(mu) dependence of our LQ interference term. We found these custom modfiers of @lukasheinrich here that enables pyhf to accept definitions of custom functions (see also this pyhf feature branch). I tried a simple example with pyhf to verify that it also works with
sqrt
:which gives the desired output:
Now it would be great if we can also use this feature in cabinetry. I tried already by manipulating a workspace JSON created by cabinetry and reading it back in, but obviously while trying to validate the workspace it fails since it can't find the modifiers.
I know that there might be some development (or merging) necessary on pyhf but it would be very benificial for us if we can start a harmonisation somewhere/somehow!
Many thanks already!
Roman
(more on my studies to custom modifiers can be found on this branch, our working cabinetry setup that we want to modfiy with
srqt
can be found on this branch)The text was updated successfully, but these errors were encountered: