I hope you are well. I was trying to understand the fine-tuning part of the "fine_tune_multi_label.ipynb" notebook.
A few questions:
Q 1. - What is the order of the 50 ATT&CK labels defined in the CLASSES variable?
Q 2. - Why is it recommended not to change the code in that particular cell?
Q 3. - If somebody wants to change the classes to fine-tune the model on some other ATT&CK labels, what is the correct method to do so, and in what order should the labels be placed?
Q 4. - If somebody wants to increase the number of classes, what is the correct approach?
Q1 - They are in lexical order, but the choice of order is arbitrary. What matters is that the order of the classes determines how the labels are vectorized, i.e. turned from strings like "T1003.001" into binary label vectors with one position per class. For example, the vector [1, 0, 0, 0, 0, ...] means that the associated technique is the first item in CLASSES: T1003.001.
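To make the position-based encoding concrete, here is a minimal sketch in plain Python. It is only an illustration, not the notebook's actual code: the `vectorize` helper is hypothetical, and the class list is shortened to the first five entries of CLASSES.

```python
# Shortened label list for illustration; the notebook's CLASSES has 50 entries.
CLASSES = ['T1003.001', 'T1005', 'T1012', 'T1016', 'T1021.001']


def vectorize(labels, classes=CLASSES):
    """Return a 0/1 vector with a 1 at the position of each label in `classes`.

    Hypothetical helper, not part of the notebook.
    """
    present = set(labels)
    unknown = present - set(classes)
    if unknown:
        raise ValueError(f"labels not in classes: {sorted(unknown)}")
    return [1 if c in present else 0 for c in classes]


# A single technique sets exactly one position.
first = vectorize(['T1003.001'])

# A multi-label example sets several positions.
multi = vectorize(['T1005', 'T1016'])

# The same label lands in a different position if the class order changes.
reordered = vectorize(['T1003.001'], classes=list(reversed(CLASSES)))

print(first)      # [1, 0, 0, 0, 0]
print(multi)      # [0, 1, 0, 1, 0]
print(reordered)  # [0, 0, 0, 0, 1]
```

The last call shows why the order matters: the label is the same, but the vector that represents it depends entirely on where that label sits in the list.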
Q2 - The notebook says not to modify that cell because we have already fine-tuned SciBERT using that vectorization scheme. This notebook is intended for continuing to fine-tune with additional training data for the same set of labels. If you change the order of the labels, additional fine-tuning will be counterproductive, because the model would have to relearn what each position in the label vector represents.
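A small sketch of that mismatch, again in plain Python with hypothetical names (`decode`, `ORIGINAL`, `REORDERED`) and a shortened label list. It only shows how one output vector is read under two different label orders; it does not involve the model itself.

```python
# Label order the model was (hypothetically) fine-tuned with.
ORIGINAL = ['T1003.001', 'T1005', 'T1012', 'T1016']

# The same labels in a different order.
REORDERED = ['T1016', 'T1012', 'T1005', 'T1003.001']


def decode(vector, classes):
    """Map a 0/1 vector back to label strings using the given class order."""
    return [c for c, bit in zip(classes, vector) if bit == 1]


# Suppose the model outputs this vector for some sentence.
prediction = [1, 0, 0, 0]

as_trained = decode(prediction, ORIGINAL)
as_reordered = decode(prediction, REORDERED)

print(as_trained)    # ['T1003.001']
print(as_reordered)  # ['T1016']
```

The model's output is unchanged, but after reordering, position 0 is interpreted as a different technique, so new training data encoded with the new order would contradict what the model has already learned.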
Q3 - If you want to fine-tune SciBERT using different labels, you should look at the model-development/train_multi_label.ipynb notebook. That notebook illustrates how to start with an upstream SciBERT checkpoint and fine-tune it on the training data in data/tram2-data/multi_label.json.
Q4 - Same as for Q3. You'll also want to set up the MITRE Annotation Toolkit for labeling your additional training data. See: https://github.com/center-for-threat-informed-defense/tram/wiki/Data-Annotation
Hi Reader,
Thanks in advance for your support.
For reference, CLASSES:
CLASSES = [
'T1003.001', 'T1005', 'T1012', 'T1016', 'T1021.001', 'T1027',
'T1033', 'T1036.005', 'T1041', 'T1047', 'T1053.005', 'T1055',
'T1056.001', 'T1057', 'T1059.003', 'T1068', 'T1070.004',
'T1071.001', 'T1072', 'T1074.001', 'T1078', 'T1082', 'T1083',
'T1090', 'T1095', 'T1105', 'T1106', 'T1110', 'T1112', 'T1113',
'T1140', 'T1190', 'T1204.002', 'T1210', 'T1218.011', 'T1219',
'T1484.001', 'T1518.001', 'T1543.003', 'T1547.001', 'T1548.002',
'T1552.001', 'T1557.001', 'T1562.001', 'T1564.001', 'T1566.001',
'T1569.002', 'T1570', 'T1573.001', 'T1574.002'
]