printing and logging inconsistencies #2654

Closed
Xabilahu opened this issue Mar 3, 2022 · 2 comments · Fixed by #2665
Assignees
Labels
bug Something isn't working

Comments

@Xabilahu
Contributor

Xabilahu commented Mar 3, 2022

Describe the bug
There are several inconsistencies between printing with the native Python print function and logging through logging.getLogger('flair').

For example:

if save_model_each_k_epochs > 0 and epoch % save_model_each_k_epochs == 0:
print("saving model of current epoch")

if self.verbose:
print("Epoch {:5d}: reducing learning rate" " of group {} to {:.4e}.".format(epoch, i, new_lr))

To Reproduce

from flair.datasets import UD_ENGLISH
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# 1. get the corpus
corpus = UD_ENGLISH().downsample(0.1)

# 2. what label do we want to predict?
label_type = 'upos'

# 3. make the label dictionary from the corpus
label_dict = corpus.make_label_dictionary(label_type=label_type)

# 4. initialize embeddings
embeddings = WordEmbeddings('glove')

# 5. initialize sequence tagger
tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=label_dict,
                        tag_type=label_type,
                        use_crf=True
)

# 6. initialize trainer
trainer = ModelTrainer(tagger, corpus)

# 7. start training
trainer.train('resources/taggers/example-upos',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=10,
              save_model_each_k_epochs=1,
)

Expected behavior

All messages should be logged using logging.getLogger('flair').

Screenshots

Note in the output below that the learning-rate reduction message appears without the logger's timestamp prefix, i.e. it is not being logged.

2022-03-03 15:49:45,287 ----------------------------------------------------------------------------------------------------
2022-03-03 15:49:49,655 epoch 19 - iter 14/141 - loss 0.02138027 - samples/sec: 57.27 - lr: 0.100000
2022-03-03 15:49:53,916 epoch 19 - iter 28/141 - loss 0.02371238 - samples/sec: 53.10 - lr: 0.100000
2022-03-03 15:49:57,994 epoch 19 - iter 42/141 - loss 0.02502675 - samples/sec: 55.68 - lr: 0.100000
2022-03-03 15:50:02,287 epoch 19 - iter 56/141 - loss 0.02439726 - samples/sec: 52.46 - lr: 0.100000
2022-03-03 15:50:06,238 epoch 19 - iter 70/141 - loss 0.02427097 - samples/sec: 57.13 - lr: 0.100000
2022-03-03 15:50:09,520 epoch 19 - iter 84/141 - loss 0.02436758 - samples/sec: 68.99 - lr: 0.100000
2022-03-03 15:50:12,905 epoch 19 - iter 98/141 - loss 0.02479929 - samples/sec: 66.81 - lr: 0.100000
2022-03-03 15:50:17,367 epoch 19 - iter 112/141 - loss 0.02456275 - samples/sec: 54.21 - lr: 0.100000
2022-03-03 15:50:20,890 epoch 19 - iter 126/141 - loss 0.02462007 - samples/sec: 64.00 - lr: 0.100000
2022-03-03 15:50:24,996 epoch 19 - iter 140/141 - loss 0.02425031 - samples/sec: 54.84 - lr: 0.100000
2022-03-03 15:50:25,464 ----------------------------------------------------------------------------------------------------
2022-03-03 15:50:25,466 EPOCH 19 done: loss 0.0241 - lr 0.1000000
2022-03-03 15:50:30,838 DEV : loss 0.031227365136146545 - f1-score (micro avg)  0.8247
Epoch    19: reducing learning rate of group 0 to 5.0000e-02.
2022-03-03 15:50:31,099 BAD EPOCHS (no improvement): 4
2022-03-03 15:50:31,102 ----------------------------------------------------------------------------------------------------
2022-03-03 15:50:34,923 epoch 20 - iter 14/141 - loss 0.02504787 - samples/sec: 65.37 - lr: 0.050000
2022-03-03 15:50:39,327 epoch 20 - iter 28/141 - loss 0.02588841 - samples/sec: 51.29 - lr: 0.050000
2022-03-03 15:50:43,880 epoch 20 - iter 42/141 - loss 0.02613723 - samples/sec: 49.67 - lr: 0.050000
2022-03-03 15:50:47,592 epoch 20 - iter 56/141 - loss 0.02546627 - samples/sec: 60.97 - lr: 0.050000
2022-03-03 15:50:51,840 epoch 20 - iter 70/141 - loss 0.02415541 - samples/sec: 53.63 - lr: 0.050000
2022-03-03 15:50:55,439 epoch 20 - iter 84/141 - loss 0.02404081 - samples/sec: 62.85 - lr: 0.050000
2022-03-03 15:51:00,323 epoch 20 - iter 98/141 - loss 0.02341436 - samples/sec: 51.17 - lr: 0.050000
2022-03-03 15:51:04,282 epoch 20 - iter 112/141 - loss 0.02350316 - samples/sec: 56.94 - lr: 0.050000
2022-03-03 15:51:07,867 epoch 20 - iter 126/141 - loss 0.02345933 - samples/sec: 62.84 - lr: 0.050000
2022-03-03 15:51:11,720 epoch 20 - iter 140/141 - loss 0.02359901 - samples/sec: 58.54 - lr: 0.050000
2022-03-03 15:51:12,204 ----------------------------------------------------------------------------------------------------

Environment (please complete the following information):

  • OS: Arch Linux x86_64
  • Version: flair==0.10

Additional context

Using print while testing is fine, but production-ready code should route these messages through the logger.

@Xabilahu added the bug (Something isn't working) label on Mar 3, 2022
@alanakbik
Collaborator

@Xabilahu thanks for spotting this, agree that the printouts should be replaced with logging statements! @Weyaaron can you take a look?

@alanakbik
Collaborator

@Xabilahu thanks for the PR!

@Weyaaron I guess this is then already taken care of ;)

alanakbik added a commit that referenced this issue Mar 14, 2022
GH-2654: Fixed printing and logging inconsistencies.
patrickjae added a commit to showheroes/flair that referenced this issue May 18, 2022
* flairNLPGH-2632: Revert "Removes hyperparameter features"

This reverts commit 9aff426.

* flairNLPGH-2632: Updating the param selection docs for the v0.10 syntax

* flairNLPGH-2632: Adding hyperopt back to requirements.txt

* flairNLPGH-2632: Fixing paramselection code to work with changes in Flair v0.10

* flairNLPGH-2632: Fixing bug where embeddings got added twice on multiple training runs

* flairNLPGH-2632: Enabling and fixing tests for param selection

* flairNLPGH-2632: Fixing flake, mypy and isort issues

* Dropout for all

* fix first_last

* Fix printouts for SequenceTagger

* 🐛 Fix .pre-commit-config.yaml

While trying to set up pre-commit, I got an indentation error.
Moreover, pycqa/isort does not have a stable rev. I set it to the most recent release tag.

* feat: ✨ initial implementation of JsonlCorpora and Datasets

* flairNLPGH-2654: Fixed printing and logging inconsistencies.

* Adding TransformerDocumentEmbeddings support to TextClassifierParamSelector and applying PR suggestions

* Fixing flake tests

* Using a small transformer in tests to reduce the CI agent memory usage

* Fix find_learning_rate

* Updating korean docs

* removing warning from step()

* fix: patch the missing `document_delmiter` for `lm.__get_state__()`

* updated broken link

* flairNLPGH-2654: Added review comments made by @Weyaaron

* flairNLPGH-2654: Fix breaking gzip import

* Making fine_tune a normal (non-tunable) parameter and defaulting it to True

* refactor: pin pytest in pipfile

* refactor: ♻️ make label_type configurable for Jsonl corpora

* fix: pin isort correctly to major release 5

* refactor: pin isort in pipfile to major release 5

* Fix relation extractor

* datasets: add support for HIPE 2022

* datasets: register NER_HIPE_2022

* tests: add extensive test cases for all sub-datasets for HIPE 2022

* Set default dropouts to 0 for equality to previous approaches

* datasets: fix flake8 errors for HIPE 2022 integration

* Update flair/models/language_model.py

Co-authored-by: Tadej Magajna <tmagajna@gmail.com>

* Formatting

* datasets: add support for v2 of HIPE-2022 dataset

* tests: update cases for v2 of HIPE-2022 dataset

* tests: minor flake fix for datasets

* tests: adjust latest HIPE v2.0 data changes for SONAR and NewsEye dataset

* datasets: switch to main as default branch name for HIPE-2022 data repo

* datasets: introduce some bug fixes for HIPE-2022 (tab as delimiter, ignore empty tokens)

* test: include label checking tests for HIPE-2022

* datasets: beautify empty token fix for HIPE-2022 dataset reader

* tests: fix mypy error (hopefully)

* datasets: fix mypy error (hopefully)

* flairNLPGH-2689: bump version numbers

* different way to exclude labels

* remove comment

* Change printouts for all DataPoints and Labels

* Black formatting

* Update printouts

* Update printouts to round confidence scores

* Add Arabic NER models back in

* Update readmes for new label logic and printouts

* Make DataPoint hashable and add test

* Do not add O tags

* Remap relation labels

* minor formatting

* Changed the documentation on OneHotEmbeddings to reflect changes in the master version: OneHotEmbeddings.from_corpus() instead of OneHotEmbeddings().

* Nicer printouts for make_label_dictionary

* Update documentation

* Black formatting

* small fixes

* Global arrow symbol

* Global arrow symbol

* Update relation model

* Fix unit test

* Fix unit tests

* datasets import

* Update documentation

* Update TUTORIAL_7_TRAINING_A_MODEL.md

* datasets: add possibility to use custom preprocessing function for HIPE-2022

* datasets: fix mypy error for HIPE-2022 preprocessing function

* datasets: revert self from HIPE-2022 preprocessing fn

* datasets: fix preprocessing function handling in HIPE-2022

* Minor fixes for tutorials

* Fix the SciSpacyTokenizer.tokenize() bug.

Makes sure the words are added to the correct list variable and that strings, not SpaCy Token objects, are returned.

* Fixing Hunflair docs that depended on SciSpacyTokenizer

* flairNLPGH-2713: make transformer offset calculation more robust

* flairNLPGH-2717: add option to ignore labels to ColumnCorpus

* flairNLPGH-2717: formatting

* flairNLPGH-2689: bump version numbers to 0.11.1

* flairNLPGH-2720: handle consecutive whitespaces

* add exclude labels parameter to trainer.train and minor change in AIDA corpus

* minor formatting

* minor formatting

* Remove unnecessary more-itertools pin

The dependency and the pin were added in https://github.com/flairNLP/flair/pull/2312/files. more-itertools is a pretty stable library.

* fix wrong initialisations of label (where data_type was missing) and reintroduce working version of "return_probabilities_for_all_classes" for sequence tagger

* datasets: add support for version 2.1 of HIPE-2022

* added missing property decorator

* add encoding=utf-8 to file handles in NER_HIPE_2022 corpus

* minor formatting

* flairNLPGH-2728: add option to force token-level predictions

* Move files to fix unit tests

* Adapt dataset name depending on whether use_ids_and_check_existence is set

* Fix unit tests for GERMEVAL dataset rename

* Ignore deviation in signature in mypy

* Black formatting

* Extend span detection logic

* flairNLPGH-2722: make span detection more robust

* Add missing data

* cache models used in testing to speed up tests

* create cache folder if it doesn't exist

* set cache to local folder

* don't create redundant cache prefix

* fix mypy error

* dummy commit to see how fast tests run with caching

* don't force creation of cache folder (as it should be created whenever needed anyways)

* flairNLPGH-2754: bump version numbers

* Update gdown requirement

Advance gdown to latest release.

* flairNLPGH-2763: remove legacy TransformerXLEmbeddings class

* flairNLPGH-2765: Test with Python 3.7

* fix unit tests

* flairNLPGH-2770: bump version numbers

Co-authored-by: Tadej Magajna <tmagajna@gmail.com>
Co-authored-by: Alan Akbik <alan.akbik@gmail.com>
Co-authored-by: AnotherStranger <AnotherStranger@users.noreply.github.com>
Co-authored-by: Xabier Lahuerta Vázquez <xlahuerta@protonmail.com>
Co-authored-by: Mike Tian-Jian Jiang <tmjiang@gmail.com>
Co-authored-by: Rishivant Singh <rishivant.singh@knoldus.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Marcel <marcelmilch@gmx.de>
Co-authored-by: j <9658618+stw2@users.noreply.github.com>
Co-authored-by: Benedikt Fuchs <e1526472@student.tuwien.ac.at>
Co-authored-by: mauryaland <amaury@fouret.org>
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Co-authored-by: susannaruecker <susanna.ruecker@hu-berlin.de>
Co-authored-by: upgradvisor-bot <92053865+upgradvisor-bot@users.noreply.github.com>
3 participants