CU-8694gzbn3 k fold metrics #432

Merged: mart-r merged 55 commits into master from CU-8694gzbn3-k-fold-metrics on Jun 19, 2024
Conversation

@mart-r (Collaborator) commented May 10, 2024

Add k-fold metrics.

In order to do that, I had to add a sort of state-caching system.
For the metrics to make sense, we need to fine-tune using k-1 folds and get metrics for the k-th fold. But before moving on to the next fold we need to restore the model to its previous state; otherwise we'd be getting metrics on data we've already trained on.

Seems kind of ready, but would definitely need some comments.
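
Roughly, the per-fold flow looks like the sketch below. This is a minimal, self-contained illustration of the idea only, not the actual implementation in this PR: here the "state cache" is just a deepcopy of the whole model, whereas the PR snapshots (and restores) only the relevant CDB state, optionally via disk.

```python
# Minimal sketch of k-fold metrics with state isolation between folds.
# NOT the PR's implementation: `train` and `evaluate` are caller-supplied
# callables, and the state cache here is a plain deepcopy.
import copy
from typing import Callable, Dict, List, Sequence, TypeVar

Fold = TypeVar("Fold")
Model = TypeVar("Model")


def k_fold_metrics(model: Model,
                   folds: Sequence[Fold],
                   train: Callable[[Model, Sequence[Fold]], None],
                   evaluate: Callable[[Model, Fold], Dict[str, float]]
                   ) -> List[Dict[str, float]]:
    per_fold = []
    for i, held_out in enumerate(folds):
        train_folds = list(folds[:i]) + list(folds[i + 1:])  # the k-1 training folds
        candidate = copy.deepcopy(model)   # work on a copy so the base state is preserved
        train(candidate, train_folds)      # fine-tune on the k-1 folds
        per_fold.append(evaluate(candidate, held_out))  # metrics on the held-out fold only
        # the next iteration starts from the untouched `model` again
    return per_fold
```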

EDIT:
In light of issues with the tests running (I never got a full understanding of why it was happening - read more in the comments below), the following additional changes were made along with this PR:

  • Removing old/archived/legacy tests
    • These tests were never run, since the folders they were in (tests/medmentions/ and tests/archive_tests) didn't have __init__.py files
    • However, they would be picked up by the new system (more about this below)
  • Changes to the main workflow
    • Split the unit tests into two halves (see the sketch after this list)
      • By listing all the test files and only running half of them at a time
        • This is done automatically at GHA run time
      • This ensures all the tests are run and that they pass
    • Changed the timeout to 25 minutes per half of the tests being run
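
For illustration, the split could be done roughly like this (a sketch only, not the actual workflow code; the tests/ layout and the TEST_HALF variable are assumptions):

```python
# Hypothetical sketch: list all test files and emit one half of them,
# selected by an environment variable / job matrix entry at GHA run time.
import os
from pathlib import Path

test_files = sorted(str(p) for p in Path("tests").rglob("test_*.py"))
mid = (len(test_files) + 1) // 2
halves = [test_files[:mid], test_files[mid:]]

half_index = int(os.environ.get("TEST_HALF", "0"))  # 0 or 1, set per workflow job
print(" ".join(halves[half_index]))  # e.g. passed on to `python -m unittest`
```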

@tomolopolis (Member):

Task linked: CU-8694gzbn3 Improve metrics (i.e k-fold)

@baixiac (Member) left a comment

Btw, I have observed memory increases after consecutive training sessions, even when using a freshly initialised cat instance in between, so every so often I had to restart our training server to bring the memory allocation back down to normal. So possibly the need to split the tests into two runs to make the whole suite pass coincidentally points to a code smell in cat.train_supervised and/or cat.train. I haven't observed a similar issue with transformer_ner.train, which uses a different training paradigm.

class FoldCreator:

    def __init__(self, mct_export: MedCATTrainerExport, nr_of_folds: int,
                 use_annotations: bool) -> None:
Review comment (Member):
Better to have a docstring for the arguments, e.g. for the meaning of nr_of_folds - is it num_of_folds?
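
Something along these lines would probably do; the wording is purely illustrative and the argument descriptions are my guesses at the intent:

```python
"""Create folds from a MedCATtrainer export.

Args:
    mct_export (MedCATTrainerExport): The MedCATtrainer export to be split into folds.
    nr_of_folds (int): The number of folds to create (i.e. the k in k-fold).
    use_annotations (bool): Whether to split by annotations rather than by documents.
"""
```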

@mart-r (Collaborator, Author) commented May 22, 2024

> Btw, I have observed memory increases after consecutive training sessions, even when using a freshly initialised cat instance in between, so every so often I had to restart our training server to bring the memory allocation back down to normal. So possibly the need to split the tests into two runs to make the whole suite pass coincidentally points to a code smell in cat.train_supervised and/or cat.train. I haven't observed a similar issue with transformer_ner.train, which uses a different training paradigm.

Thanks for letting me know! I've not observed that explicitly, but it does sound like something that could very well be causing the issue here.
I've got a feeling the issue is memory related, but a) the tests run fine locally; b) there is no documented memory usage limit for GHA workflows as far as I can tell; and c) there is no error describing why the workflow is being cancelled.

In any case, there may very well be a memory leak, I suppose. Or, for some reason, some dereferenced variables may not be getting garbage collected. Since you've observed something like this, it may warrant me looking into it a little further to see if/when/why this is happening, and potentially improving the memory situation - especially if there is an actual memory leak.

@baixiac (Member) commented May 22, 2024

The GH runner specs can be found here. Of course, you may be able to get a better machine on CREATE and use it as a self-hosted runner.

@mart-r (Collaborator, Author) commented May 22, 2024

> The GH runner specs can be found here. Of course, you may be able to get a better machine on CREATE and use it as a self-hosted runner.

Thanks! My previous googling only led to the billing page, which described how many minutes/hours and how many concurrent jobs can be run, so I hadn't seen this one.

creator = get_fold_creator(mct_export_data, k, split_type=split_type)
folds = creator.create_folds()
per_fold_metrics = get_per_fold_metrics(cat, folds, *args, **kwargs)
return get_metrics_mean(per_fold_metrics)
Review comment (Member):

Would it be possible to include standard deviations for the metrics in the returned stats, in addition to the means? Users may want to know whether the model performs consistently across different folds.

Reply (Member):

Agreed - that would be nice to have. New ticket here: CU-8694vbw6y
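
For reference, the aggregation could be extended along these lines (a sketch only; the per-fold metrics structure and key handling are assumptions, not the module's actual types):

```python
# Hypothetical sketch: report mean and standard deviation per metric across folds.
from statistics import mean, stdev
from typing import Dict, List, Tuple


def get_metrics_mean_and_std(per_fold_metrics: List[Dict[str, float]]
                             ) -> Dict[str, Tuple[float, float]]:
    result: Dict[str, Tuple[float, float]] = {}
    for key in per_fold_metrics[0]:
        values = [fold[key] for fold in per_fold_metrics]
        # stdev needs at least two folds; fall back to 0.0 for k == 1
        result[key] = (mean(values), stdev(values) if len(values) > 1 else 0.0)
    return result
```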

@tomolopolis (Member) commented Jun 18, 2024

On the test issues - if a CREATE GHA runner would be useful we can provision one, but test splitting in half, where the total number of tests is now more than , i.e. we're running tests twice, is a little odd.
To summarise, there's potentially:

  • a memory leak in the cdb code
  • state between tests causing one or more tests to hang

Fixing the above is outside the scope of this PR, and it is likely an issue that's been around for a while. Let's prioritise looking into it, though.

@tomolopolis (Member) left a comment

lgtm

Review comment (Member):

There are various methods that perhaps should be moved out of the cdb module now that this utils module exists. I was thinking of the cdb merge functionality that Adam wrote a while ago. Thoughts?

Reply (Collaborator, Author):

The CDB merge functionality is in medcat.utils.cdb_utils:
https://github.com/CogStack/MedCAT/blob/master/medcat/utils/cdb_utils.py

However, I added the CDB state code in a different utility module because I felt there was enough state-related code to justify keeping it separate.
With that said, there would also be valid arguments for keeping all CDB-related utilities together in one utility module, so I'm open to making this change as well.

Furthermore, there may very well be more CDB-related things currently in the medcat.cdb module that would be better suited to the medcat.utils.cdb_utils module. The former is certainly on the larger side as a whole.
I think a new CU task could be in order.

@mart-r (Collaborator, Author) commented Jun 18, 2024

> On the test issues - if a CREATE GHA runner would be useful we can provision one, but test splitting in half, where the total number of tests is now more than , i.e. we're running tests twice, is a little odd. To summarise, there's potentially:
>
>   • a memory leak in the cdb code
>   • state between tests causing one or more tests to hang
>
> Fixing the above is outside the scope of this PR, and it is likely an issue that's been around for a while. Let's prioritise looking into it, though.

I did try to look into this briefly at the time (when Xi brought up the memory issues). I wasn't able to fully figure out the circumstances in which the memory issue arose, but I can try to dedicate more time to figuring this out in the context of another task.

@baixiac (Member) commented Jun 18, 2024

Yes, I can confirm the "memory issue" on consecutive trainings has been around for a long time and is beyond the scope of this PR.

@mart-r (Collaborator, Author) commented Jun 18, 2024

> Yes, I can confirm the "memory issue" on consecutive trainings has been around for a long time and is beyond the scope of this PR.

I've created a CU task:
https://app.clickup.com/t/8694vdx65
If you have any details about the situations in which these issues arise, it would be great if you could write them down there.

@baixiac (Member) commented Jun 18, 2024

Oh sure. The simple way to replicate it is to create a cat object, and call cat.train or cat.train_supervised in a loop.
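
A reproduction along those lines might look like the sketch below. The model pack and MCT export paths are placeholders, and the exact train_supervised signature may differ between MedCAT versions:

```python
# Hypothetical reproduction sketch: repeated supervised training on the same
# cat instance while watching the process RSS. Paths are placeholders.
import gc

import psutil
from medcat.cat import CAT

cat = CAT.load_model_pack("models/model_pack.zip")
proc = psutil.Process()

for i in range(20):
    cat.train_supervised("data/mct_export.json", nepochs=1)  # or cat.train(...) for unsupervised
    gc.collect()
    print(f"iteration {i}: RSS = {proc.memory_info().rss / 1024 ** 2:.1f} MiB")
```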

@mart-r (Collaborator, Author) commented Jun 18, 2024

> Oh sure. The simple way to replicate it is to create a cat object, and call cat.train or cat.train_supervised in a loop.

I tried that last time (only with supervised training) and wasn't able to produce a meaningful change in RAM usage. Perhaps it was the nature of my MCT export (synthetic data, all annotations marked as correct, only 15 documents, no meta cats). I'm not fully sure.

In any case, I'll be sure to take a look soon.

@baixiac (Member) commented Jun 18, 2024

One thing I should mention is that I didn't wait long enough to see whether it would cause OOMs. So another slim chance is that pymalloc was holding a cache of GC-ed objects and didn't return memory to the OS soon enough, even after a full GC.
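
One rough way to tell allocator caching apart from a genuine leak (a sketch; malloc_trim is glibc/Linux-only, so this is an assumption about the environment):

```python
# Hypothetical sketch: force a full GC, then ask glibc to return freed heap
# memory to the OS. If RSS drops a lot afterwards, the growth was largely
# allocator caching rather than live (leaked) objects. Linux/glibc only.
import ctypes
import gc

import psutil


def rss_mib() -> float:
    return psutil.Process().memory_info().rss / 1024 ** 2


before = rss_mib()
gc.collect()                                # collect all generations
ctypes.CDLL("libc.so.6").malloc_trim(0)     # release free heap pages back to the OS
print(f"RSS before: {before:.1f} MiB, after trim: {rss_mib():.1f} MiB")
```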

mart-r merged commit e11c1da into master on Jun 19, 2024
8 checks passed
mart-r added a commit that referenced this pull request Jun 19, 2024
* Pushing changes for bert-style models for MetaCAT

* Pushing fix for LSTM

* Pushing changes for flake8 and type fixes

* Pushing type fixes

* Fixing type issue

* Pushing changes

1) Added model.zero_grad to clear accumulated gradients
2) Fixed config save issue
3) Re-structured data preparation for oversampled data

* Pushing change and type fixes

Pushing ml_utils file which was missed in the last commit

* Fixing flake8 issues

* Pushing flake8 fixes

* Pushing fixes for flake8

* Pushing flake8 fix

* Adding peft to list of libraries

* Pushing changes with load and train workflow and type fixes

The workflow for inference is: load() and then inference.
For training: init() and then train().
train() will not load the model dict, except when phase_number is set to 2 for the second phase of 2-phase learning.

* Pushing changes with type hints and new documentation

* Pushing type fix

* Fixing type issue

* Adding test case for BERT and reverting config changes

BERT test cases: Testing for BERT model along with 2 phase learning

* Merging changes from master to metacat_bert branch (#431)

* Small addition to contribution guidelines (#420)

* CU-8694cbcpu: Allow specifying an AU Snomed when preprocessing (#421)

* CU-8694dpy1c: Return empty generator upon empty stream (#423)

* CU-8694dpy1c: Return empty generator upon empty stream

* CU-8694dpy1c: Fix empty generator returns

* CU-8694dpy1c: Simplify empty generator returns

* Relation extraction (#173)

* Added files.

* More additions to rel extraction.

* Rel base.

* Update.

* Updates.

* Dependency parsing.

* Updates.

* Added pre-training steps.

* Added training & model utils.

* Cleanup & fixes.

* Update.

* Evaluation updates for pretraining.

* Removed duplicate relation storage.

* Moved RE model file location.

* Structure revisions.

* Added custom config for RE.

* Implemented custom dataset loader for RE.

* More changes.

* Small fix.

* Latest additions to RelCAT (pipe + predictions)

* Setup.py fix.

* RE utils update.

* rel model update.

* rel dataset + tokenizer improvements.

* RelCAT updates.

* RelCAT saving/loading improvements.

* RelCAT saving/loading improvements.

* RelCAT model fixes.

* Attempted gpu learning fix. Dataset label generation fixes.

* Minor train dataset gen fix.

* Minor train dataset gen fix No.2.

* Config updates.

* Gpu support fixes. Added label stats.

* Evaluation stat fixes.

* Cleaned stat output mode during training.

* Build fix.

* removed unused dependencies and fixed code formatting

* Mypy compliance.

* Fixed linting.

* More Gpu mode train fixes.

* Fixed model saving/loading issues when using other base models.

* More fixes to stat evaluation. Added proper CAT integration of RelCAT.

* Setup.py typo fix.

* RelCAT loading fix.

* RelCAT Config changes.

* Type fix. Minor additions to RelCAT model.

* Type fixes.

* Type corrections.

* RelCAT update.

* Type fixes.

* Fixed type issue.

* RelCATConfig: added seed param.

* Adaptations to the new codebase + type fixes.

* Doc/type fixes.

* Fixed input size issue for model.

* Fixed issue(s) with model size and config.

* RelCAT: updated configs to new style.

* RelCAT: removed old refs to logging.

* Fixed GPU training + added extra stat print for train set.

* Type fixes.

* Updated dev requirements.

* Linting.

* Fixed pin_memory issue when training on CPU.

* Updated RelCAT dataset get + default config.

* Updated RelDS generator + default config

* Linting.

* Updated RelDataset + config.

* Pushing updates to model

Made changes to:
1) Extracting given number of context tokens left and right of the entities
2) Extracting hidden state from bert for all the tokens of the entities and performing max pooling on them

* Fixing formatting

* Update rel_dataset.py

* Update rel_dataset.py

* Update rel_dataset.py

* RelCAT: added test resource files.

* RelCAT: Fixed model load/checkpointing.

* RelCAT: updated to pipe spacy doc call.

* RelCAT: added tests.

* Fixed lint/type issues & added rel tag to test DS.

* Fixed ann id to token issue.

* RelCAT: updated test dataset + tests.

* RelCAT: updates to requested changes + dataset improvements.

* RelCAT: updated docs/logs according to comments.

* RelCAT: type fix.

* RelCAT: mct export dataset updates.

* RelCAT: test updates + requested changes p2.

* RelCAT: log for MCT export train.

* Updated docs + split train_test & dataset for benchmarks.

* type fixes.

---------

Co-authored-by: Shubham Agarwal <66172189+shubham-s-agarwal@users.noreply.github.com>
Co-authored-by: mart-r <mart.ratas@gmail.com>

* CU-8694fae3r: Avoid publishing PyPI release when doing GH pre-releases (#424)

* CU-8694fae3r: Avoid publishing PyPI release when doing GH pre-releases

* CU-8694fae3r: Fix pre-releases tagging

* CU-8694fae3r: Allow actions to run on release edit

---------

Co-authored-by: Mart Ratas <mart.ratas@gmail.com>
Co-authored-by: Vlad Dinu <62345326+vladd-bit@users.noreply.github.com>

* Pushing changed tests and removing empty change

* Pushing change for logging

* Revert "Pushing change for logging"

This reverts commit fbcdb70.

* CU-8694hukwm: Document the materialising of generator when multiproce… (#433)

* CU-8694hukwm: Document the materialising of generator when multiprocessing and batching for docs

* CU-8694hukwm: Add TODO note for where the generator is materialised

* CU-8694hukwm: Add warning for when a large amount of generator data (10k items) is materialised by the docs-size mp method

* CU-8694fk90t (almost) only primitive config (#425)

* CU-8694fk90r: Move backwards compatibility method from CDB to config utils

* CU-8694fk90r: Move weighted_average_function from config to CDB; create necessary backwards compatibility workarounds

* CU-8694fk90r: Move usage of weighted_average_function in tests

* CU-8694fk90r: Add JSON encode and decoder for re.Pattern

* CU-8694fk90r: Rebuild custom decoder if needed

* CU-8694fk90r: Add method to detect old style config

* CU-8694fk90r: Use regular json serialisation for config; Retain option to read old jsonpickled config

* CU-8694fk90r: Add test for config serialisation

* CU-8694fk90r: Make sure to fix weighted_average_function upon setting it

* CU-8694fk90t: Add missing tests for config utils

* CU-8694fk90t: Add tests for better raised exception upon old way of using weighted_average_function

* CU-8694fk90t: Fix exception type in an added test

* CU-8694fk90t: Add further tests for exception payload

* CU-8694fk90t: Add improved exceptions when using old/unsupported value of weighted_average_function in config

* CU-8694fk90t: Add typing fix exceptions

* CU-8694fk90t: Make custom exception derive from AttributeError to correctly handle hasattr calls

* CU-8694gza88: Create codeql.yml (#434)

Run CodeQL to identify vulnerabilities.
This will run on any push or pull request to `master`, but also runs once every day in case some new vulnerabilities are discovered (or something else changes).

* CU-8694mbn03: Remove the web app (#441)

* CU-8694n48uw better deprecation (#443)

* CU-8694n493m: Add deprecation and removal versions to deprecation decorator

* CU-8694n493m: Deprecation version to existing deprecated methods.

Made the removal version 2 minor versions from the minor version
in which the method was deprecated, or the next minor version if
the method had been deprecated for longer.

* CU-8694n4ff0: Raise exception upon deprecated method call at test time

* CU-8694n4ff0: Fix usage of deprecated methods call during test time

* CU-8694pey4u: extract cdb load to cls method, to be used in trainer for model pack loading

* CU-8694pey4u: extract meta cat loading also to a cls method

* CU-8694pey4u: docstrings

* CU-8694pey4u: typehints and mypy issues

* CU-8694pey4u: fix flake8

* CU-8694pey4u: fix flake8

* CU-8694pey4u: missing extra config if passed in

* CU-8694py1jr: Fix issue with reuse of opened file when loading old configs

* CU-8694py1jr: Make old config identifier more robust

* CU-8694py1jr: Add doc string to old config identifier

* CU-8694py1jr: Add test for old style MetaCAT config load

* CU-8694py1jr: Add test for old style main config load (functional)

* CU-8694py1jr: Refactor config utils load tests for more flexibility

* CU-8694py1jr: Add config utils load tests for NER and Rel CAT configs

* CU-8694vcvz7: Trust remote code when loading transformers NER dataset (#453)

* CU-8694vcvz7: Trust remote code when loading transformers NER dataset

* CU-8694vcvz7: Add support for older datasets without the remote code trusting kwarg

* CU-8694gzbn3 k fold metrics (#432)

* CU-8694gzbud: Add context manager that is able to snapshot CDB state

* CU-8694gzbud: Add tests to snapshotting CDB state

* CU-8694gzbud: Refactor tests for CDB state snapshotting

* CU-8694gzbud: Remove use of deprecated method in CDB utils and use non-deprecated one instead

* CU-8694gzbud: Add tests for training and CDB state capturing

* CU-8694gzbud: Small refactor in tests

* CU-8694gzbud: Add option to save state on disk

* CU-8694gzbud: Add debug logging output when saving state on disk

* CU-8694gzbud: Remove unused import

* CU-8694gzbud: Add tests for disk-based state save

* CU-8694gzbud: Move CDB state code to its own module

* CU-8694gzbud: Remove unused import

* CU-8694gzbud: Add doc strings to methods

* CU-8694gzbx4: Small optimisation for stats

* CU-8694gzbx4: Add MCTExport related module

* CU-8694gzbx4: Add MCTExport related tests

* CU-8694gzbx4: Add code for k-fold statistics

* CU-8694gzbx4: Add tests for k-fold statistics

* CU-8694gzbx4: Add test-MCT export with fake concepts

* CU-8694gzbx4: Fix a doc string

* CU-8694gzbx4: Fix types in MCT export module

* CU-8694gzbx4: Fix types in k-fold module

* CU-8694gzbx4: Remove accidentally committed test class

* CU-8694gzbn3: Add missing test helper file

* CU-8694gzbn3: Remove whitespace change from otherwise unchanged file

* CU-8694gzbn3: Allow 5 minutes longer for tests

* CU-8694gzbn3: Move to python 3.8-compatible typed dict

* CU-8694gzbn3: Add more time for tests in workflow (now 30 minutes)

* CU-8694gzbn3: Add more time for tests in workflow (now 45 minutes)

* CU-8694gzbn3: Update test-pypi timeout to 45 minutes

* CU-8694gzbn3: Remove timeout from unit tests in main workflow

* CU-8694gzbn3: Make tests stop upon first failure

* CU-8694gzbn3: Fix test stop upon first failure (arg/option order)

* CU-8694gzbn3: Remove debug code and old comments

* CU-8694gzbn3: Remove all timeouts from main workflow

* CU-8694gzbn3: Remove more old / useless comments in tests

* CU-8694gzbn3: Add debug output when running k-fold tests to see where it may be stalling

* CU-8694gzbn3: Add debug output when running ANY tests to see where it may be stalling

* CU-8694gzbn3: Remove explicit debug output from k-fold test cases

* CU-8694gzbn3: Remove timeouts from DEID tests in case they're the ones creating issues

* GHA/test fixes (#437)

* Revert "CU-8694gzbn3: Remove timeouts from DEID tests in case they're the ones creating issues"

This reverts commit faaf7fb.

* Revert "CU-8694gzbn3: Remove explicit debug output from k-fold test cases"

This reverts commit 9b02925.

* Revert "CU-8694gzbn3: Add debug output when ANY tests to see where it may be stalling"

This reverts commit 12c519a.

* Revert "CU-8694gzbn3: Add debug output when running k-fold tests to see where it may be stalling"

This reverts commit 03531da.

* Revert "CU-8694gzbn3: Remove all timeouts from main workflow"

This reverts commit e6debce.

* Revert "CU-8694gzbn3: Fix test stop upon first failure (arg/option order)"

This reverts commit 666c013.

* Revert "CU-8694gzbn3: Make tests stop upon first failure"

This reverts commit 94bce56.

* Revert "CU-8694gzbn3: Remove timeout from unit tests in main workflow"

This reverts commit 3618b9c.

* CU-8694gzbn3: Improve state copy code in CDB state tests

* CU-8694gzbn3: Fix a CDB state test issue

* CU-8694gzbn3: Split all tests into 2 halves

* CU-8694gzbn3: Remove legacy / archived / unused tests

* CU-8694gzbn3: Add doc strings for FoldCreator init

* CU-8694gzbn3: Move to a split-type enum

* CU-8694gzbn3: Add documentation to split-type enum

* CU-8694gzbn3: Create separate fold creators for different types of splitting strategies

* CU-8694gzbn3: Resort document order in test time nullification process

* CU-8694gzbn3: Add option to count number of annotations in doc for MCT export

* CU-8694gzbn3: Add weighted documents based split option along with relevant tests

* CU-8694gzbn3: Update default fold creation split type to weighted documents

* CU-8694gzbn3: Add test to ensure weighted documents split creates a reasonable number of annotations per split

* CU-8693n892x environment/dependency snapshots (#438)

* CU-8693n892x: Save environment/dependency snapshot upon model pack creation

* CU-8693n892x: Fix typing for env snapshot module

* CU-8693n892x: Add test for env file existence in .zip

* CU-8693n892x: Add doc strings

* CU-8693n892x: Centralise env snapshot file name

* CU-8693n892x: Add env snapshot file to exceptions in serialisation tests

* CU-8693n892x: Only list direct dependencies

* CU-8693n892x: Add test that verifies all direct dependencies are listed in environment

* CU-8693n892x: Move requirements to separate file and use that for environment snapshot

* CU-8693n892x: Remove unused constants

* CU-8693n892x: Allow URL based dependencies when using direct dependencies

* CU-8693n892x: Distribute install_requires.txt alongside the package; use correct path in distributed version

* CU-8694p8y0k deprecation GHA check (#445)

* CU-8694p8y0k: Add check for deprecations (code)

* CU-8694p8y0k: Add workflow check for deprecations

* CU-8694p8y0k: Fix (hopefully) workflow check for deprecations

* CU-8694p8y0k: Add option to remove version prefix when checking deprecation

* CU-8694p8y0k: Update deprecation checks with more detail (i.e current/next version).

* CU-8694p8y0k: Only run deprecation checking step when merging master into production

* CU-8694u3yd2 cleanup name removal (#450)

* CU-8694u3yd2: Add logged warning for when using full-unlink

* CU-8694u3yd2: Make CDB.remove_names simply expect an iterable of names

* CU-8694u3yd2: Improve CDB.remove_names doc string

* CU-8694u3yd2: Explicitly pass the keys to CDB.remove_names in CAT.unlink_concept_name

* CU-8694u3yd2: Add note regarding state (and order) dependent tests to some CDB maker tests

* CU-8694u3yd2: Rename/make protected CDB.remove_names method

* CU-8694u3yd2: Create deprecated CDB.remove_names method

* CU-8694vte2g 1.12 depr removal (#454)

* CU-8694vte2g: Remove CDB.add_concept method

* CU-8694vte2g: Remove unused import (deprecated decorator)

* CU-8694vte2g: Remove CAT.get_spacy_nlp method

* CU-8694vte2g: Remove CAT.train_supervised method

* CU-8694vte2g: Remove CAT multiprocessing methods

* CU-8694vte2g: Remove MetaCAT.train method

* CU-8694vte2g: Remove medcat.utils.ner.helper.deid_text method

* CU-8694vte2g: Remove use of deprecated method

* CU-8694vte2g: Add back removed deprecation import

---------

Co-authored-by: Shubham Agarwal <66172189+shubham-s-agarwal@users.noreply.github.com>
Co-authored-by: Vlad Dinu <62345326+vladd-bit@users.noreply.github.com>
Co-authored-by: Tom Searle <tsearle88@gmail.com>
mart-r deleted the CU-8694gzbn3-k-fold-metrics branch on August 12, 2024 at 12:50