printing and logging inconsistencies #2654

Closed
Xabilahu opened this issue Mar 3, 2022 · 2 comments · Fixed by #2665
Assignees
Labels
bug Something isn't working

Comments

@Xabilahu
Contributor

Xabilahu commented Mar 3, 2022

Describe the bug
There are several inconsistencies between printing with the native Python print function and logging through logging.getLogger('flair').

For example:

if save_model_each_k_epochs > 0 and epoch % save_model_each_k_epochs == 0:
print("saving model of current epoch")

if self.verbose:
print("Epoch {:5d}: reducing learning rate" " of group {} to {:.4e}.".format(epoch, i, new_lr))

To Reproduce

from flair.datasets import UD_ENGLISH
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# 1. get the corpus
corpus = UD_ENGLISH().downsample(0.1)

# 2. what label do we want to predict?
label_type = 'upos'

# 3. make the label dictionary from the corpus
label_dict = corpus.make_label_dictionary(label_type=label_type)

# 4. initialize embeddings
embeddings = WordEmbeddings('glove')

# 5. initialize sequence tagger
tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=label_dict,
                        tag_type=label_type,
                        use_crf=True
)

# 6. initialize trainer
trainer = ModelTrainer(tagger, corpus)

# 7. start training
trainer.train('resources/taggers/example-upos',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=10,
              save_model_each_k_epochs=1,
)

Expected behavior

All messages should be logged using logging.getLogger('flair').

Screenshots

Note in the output below that the learning-rate reduction message appears without the logger's timestamp prefix, i.e. it is not being logged.

2022-03-03 15:49:45,287 ----------------------------------------------------------------------------------------------------
2022-03-03 15:49:49,655 epoch 19 - iter 14/141 - loss 0.02138027 - samples/sec: 57.27 - lr: 0.100000
2022-03-03 15:49:53,916 epoch 19 - iter 28/141 - loss 0.02371238 - samples/sec: 53.10 - lr: 0.100000
2022-03-03 15:49:57,994 epoch 19 - iter 42/141 - loss 0.02502675 - samples/sec: 55.68 - lr: 0.100000
2022-03-03 15:50:02,287 epoch 19 - iter 56/141 - loss 0.02439726 - samples/sec: 52.46 - lr: 0.100000
2022-03-03 15:50:06,238 epoch 19 - iter 70/141 - loss 0.02427097 - samples/sec: 57.13 - lr: 0.100000
2022-03-03 15:50:09,520 epoch 19 - iter 84/141 - loss 0.02436758 - samples/sec: 68.99 - lr: 0.100000
2022-03-03 15:50:12,905 epoch 19 - iter 98/141 - loss 0.02479929 - samples/sec: 66.81 - lr: 0.100000
2022-03-03 15:50:17,367 epoch 19 - iter 112/141 - loss 0.02456275 - samples/sec: 54.21 - lr: 0.100000
2022-03-03 15:50:20,890 epoch 19 - iter 126/141 - loss 0.02462007 - samples/sec: 64.00 - lr: 0.100000
2022-03-03 15:50:24,996 epoch 19 - iter 140/141 - loss 0.02425031 - samples/sec: 54.84 - lr: 0.100000
2022-03-03 15:50:25,464 ----------------------------------------------------------------------------------------------------
2022-03-03 15:50:25,466 EPOCH 19 done: loss 0.0241 - lr 0.1000000
2022-03-03 15:50:30,838 DEV : loss 0.031227365136146545 - f1-score (micro avg)  0.8247
Epoch    19: reducing learning rate of group 0 to 5.0000e-02.
2022-03-03 15:50:31,099 BAD EPOCHS (no improvement): 4
2022-03-03 15:50:31,102 ----------------------------------------------------------------------------------------------------
2022-03-03 15:50:34,923 epoch 20 - iter 14/141 - loss 0.02504787 - samples/sec: 65.37 - lr: 0.050000
2022-03-03 15:50:39,327 epoch 20 - iter 28/141 - loss 0.02588841 - samples/sec: 51.29 - lr: 0.050000
2022-03-03 15:50:43,880 epoch 20 - iter 42/141 - loss 0.02613723 - samples/sec: 49.67 - lr: 0.050000
2022-03-03 15:50:47,592 epoch 20 - iter 56/141 - loss 0.02546627 - samples/sec: 60.97 - lr: 0.050000
2022-03-03 15:50:51,840 epoch 20 - iter 70/141 - loss 0.02415541 - samples/sec: 53.63 - lr: 0.050000
2022-03-03 15:50:55,439 epoch 20 - iter 84/141 - loss 0.02404081 - samples/sec: 62.85 - lr: 0.050000
2022-03-03 15:51:00,323 epoch 20 - iter 98/141 - loss 0.02341436 - samples/sec: 51.17 - lr: 0.050000
2022-03-03 15:51:04,282 epoch 20 - iter 112/141 - loss 0.02350316 - samples/sec: 56.94 - lr: 0.050000
2022-03-03 15:51:07,867 epoch 20 - iter 126/141 - loss 0.02345933 - samples/sec: 62.84 - lr: 0.050000
2022-03-03 15:51:11,720 epoch 20 - iter 140/141 - loss 0.02359901 - samples/sec: 58.54 - lr: 0.050000
2022-03-03 15:51:12,204 ----------------------------------------------------------------------------------------------------

Environment (please complete the following information):

  • OS: Arch Linux x86_64
  • Version: flair==0.10

Additional context

Using print while testing is fine, but production-ready code should route these messages through the logger.

@Xabilahu added the bug (Something isn't working) label on Mar 3, 2022
@alanakbik
Collaborator

@Xabilahu thanks for spotting this, agree that the printouts should be replaced with logging statements! @Weyaaron can you take a look?

@alanakbik
Collaborator

@Xabilahu thanks for the PR!

@Weyaaron I guess this is then already taken care of ;)

alanakbik added a commit that referenced this issue Mar 14, 2022
GH-2654: Fixed printing and logging inconsistencies.
patrickjae added a commit to showheroes/flair that referenced this issue May 18, 2022
* flairNLPGH-2632: Revert "Removes hyperparameter features"

This reverts commit 9aff426.

* flairNLPGH-2632: Updating the param selection docs for the v0.10 syntax

* flairNLPGH-2632: Adding hyperopt back to requirements.txt

* flairNLPGH-2632: Fixing paramselection code to work with changes in Flair v0.10

* flairNLPGH-2632: Fixing bug where embeddings got added twice on multiple training runs

* flairNLPGH-2632: Enabling and fixing tests for param selection

* flairNLPGH-2632: Fixing flake, mypy and isort issues

* Dropout for all

* fix first_last

* Fix printouts for SequenceTagger

* 🐛 Fix .pre-commit-config.yaml

While trying to set up pre-commit, I got an indentation error.
Moreover, pycqa/isort does not have a stable rev. I set it to the most recent release tag.

* feat: ✨ initial implementation of JsonlCorpora and Datasets

* flairNLPGH-2654: Fixed printing and logging inconsistencies.

* Adding TransformerDocumentEmbeddings support to TextClassifierParamSelector and applying PR suggestions

* Fixing flake tests

* Using a small transformer in tests to reduce the CI agent memory usage

* Fix find_learning_rate

* Updating korean docs

* removing warning from step()

* fix: patch the missing `document_delmiter` for `lm.__get_state__()`

* updated broken link

* flairNLPGH-2654: Added review comments made by @Weyaaron

* flairNLPGH-2654: Fix breaking gzip import

* Making fine_tune a normal (non-tunable) parameter and defaulting it to True

* refactor: pin pytest in pipfile

* refactor: ♻️ make label_type configurable for Jsonl corpora

* fix: pin isort correctly to major release 5

* refactor: pin isort in pipfile to major release 5

* Fix relation extractor

* datasets: add support for HIPE 2022

* datasets: register NER_HIPE_2022

* tests: add extensive test cases for all sub-datasets for HIPE 2022

* Set default dropouts to 0 for equality to previous approaches

* datasets: fix flake8 errors for HIPE 2022 integration

* Update flair/models/language_model.py

Co-authored-by: Tadej Magajna <tmagajna@gmail.com>

* Formatting

* datasets: add support for v2 of HIPE-2022 dataset

* tests: update cases for v2 of HIPE-2022 dataset

* tests: minor flake fix for datasets

* tests: adjust latest HIPE v2.0 data changes for SONAR and NewsEye dataset

* datasets: switch to main as default branch name for HIPE-2022 data repo

* datasets: introduce some bug fixes for HIPE-2022 (tab as delimiter, ignore empty tokens)

* test: include label checking tests for HIPE-2022

* datasets: beautify empty token fix for HIPE-2022 dataset reader

* tests: fix mypy error (hopefully)

* datasets: fix mypy error (hopefully)

* flairNLPGH-2689: bump version numbers

* different way to exclude labels

* remove comment

* Change printouts for all DataPoints and Labels

* Black formatting

* Update printouts

* Update printouts to round confidence scores

* Add Arabic NER models back in

* Update readmes for new label logic and printouts

* Make DataPoint hashable and add test

* Do not add O tags

* Remap relation labels

* minor formatting

* Changed the documentation on OneHotEmbeddings to reflect changes in the master version: OneHotEmbeddings.from_corpus() instead of OneHotEmbeddings().

* Nicer printouts for make_label_dictionary

* Update documentation

* Black formatting

* small fixes

* Global arrow symbol

* Global arrow symbol

* Update relation model

* Fix unit test

* Fix unit tests

* datasets import

* Update documentation

* Update TUTORIAL_7_TRAINING_A_MODEL.md

* datasets: add possibility to use custom preprocessing function for HIPE-2022

* datasets: fix mypy error for HIPE-2022 preprocessing function

* datasets: revert self from HIPE-2022 preprocessing fn

* datasets: fix preprocessing function handling in HIPE-2022

* Minor fixes for tutorials

* Fix the SciSpacyTokenizer.tokenize() bug.

Makes sure the words are added to the correct list variable and that strings, not SpaCy Token objects, are returned.

* Fixing Hunflair docs that depended on SciSpacyTokenizer

* flairNLPGH-2713: make transformer offset calculation more robust

* flairNLPGH-2717: add option to ignore labels to ColumnCorpus

* flairNLPGH-2717: formatting

* flairNLPGH-2689: bump version numbers to 0.11.1

* flairNLPGH-2720: handle consecutive whitespaces

* add exclude labels parameter to trainer.train and minor change in AIDA corpus

* minor formatting

* minor formatting

* Remove unnecessary more-itertools pin

The dependency and the pin were added in https://github.com/flairNLP/flair/pull/2312/files. more-itertools is a pretty stable library.

* fix wrong initialisations of label (where data_type was missing) and reintroduce working version of "return_probabilities_for_all_classes" for sequence tagger

* datasets: add support for version 2.1 of HIPE-2022

* added missing property decorator

* add encoding=utf-8 to file handles in NER_HIPE_2022 corpus

* minor formatting

* flairNLPGH-2728: add option to force token-level predictions

* Move files to fix unit tests

* Adapt dataset name depending on whether use_ids_and_check_existence is set

* Fix unit tests for GERMEVAL dataset rename

* Ignore deviation in signature in mypy

* Black formatting

* Extend span detection logic

* flairNLPGH-2722: make span detection more robust

* Add missing data

* cache models used in testing to speed up tests

* create cache folder if it doesn't exist

* set cache to local folder

* don't create redundant cache prefix

* fix mypy error

* dummy commit to see how fast tests run with caching

* don't force creation of cache folder (as it should be created whenever needed anyways)

* flairNLPGH-2754: bump version numbers

* Update gdown requirement

Advance gdown to latest release.

* flairNLPGH-2763: remove legacy TransformerXLEmbeddings class

* flairNLPGH-2765: Test with Python 3.7

* fix unit tests

* flairNLPGH-2770: bump version numbers

Co-authored-by: Tadej Magajna <tmagajna@gmail.com>
Co-authored-by: Alan Akbik <alan.akbik@gmail.com>
Co-authored-by: AnotherStranger <AnotherStranger@users.noreply.github.com>
Co-authored-by: Xabier Lahuerta Vázquez <xlahuerta@protonmail.com>
Co-authored-by: Mike Tian-Jian Jiang <tmjiang@gmail.com>
Co-authored-by: Rishivant Singh <rishivant.singh@knoldus.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Marcel <marcelmilch@gmx.de>
Co-authored-by: j <9658618+stw2@users.noreply.github.com>
Co-authored-by: Benedikt Fuchs <e1526472@student.tuwien.ac.at>
Co-authored-by: mauryaland <amaury@fouret.org>
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Co-authored-by: susannaruecker <susanna.ruecker@hu-berlin.de>
Co-authored-by: upgradvisor-bot <92053865+upgradvisor-bot@users.noreply.github.com>
3 participants