build(deps): bump charset-normalizer from 2.1.1 to 3.0.1 #1350

Merged
merged 1 commit into from
Jan 13, 2023

Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Nov 18, 2022

Bumps charset-normalizer from 2.1.1 to 3.0.1.

Release notes

Sourced from charset-normalizer's releases.

Version 3.0.1

3.0.1 (2022-11-18)

Fixed

  • Multi-bytes cutter/chunk generator did not always cut correctly (PR #233)
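To illustrate the kind of problem this fix concerns: slicing an encoded payload at fixed byte offsets can split a multi-byte character, so a chunk may not decode on its own. The sketch below uses only the standard library and is not charset-normalizer's actual code; the helper names `naive_chunks` and `decode_chunks_safely` are made up for this example.

```python
import codecs


def naive_chunks(payload, size):
    """Split bytes at fixed offsets, ignoring character boundaries."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def decode_chunks_safely(payload, size, encoding="utf_8"):
    """Decode fixed-size chunks, carrying incomplete sequences into the next chunk."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = [decoder.decode(chunk) for chunk in naive_chunks(payload, size)]
    # final=True raises if the payload itself ends in a truncated sequence.
    pieces.append(decoder.decode(b"", final=True))
    return [piece for piece in pieces if piece]


payload = "日本語のテキスト".encode("utf_8")  # every character here takes 3 bytes

first = naive_chunks(payload, 4)[0]  # one full character plus one stray byte
try:
    first.decode("utf_8")
except UnicodeDecodeError as exc:
    print("naive chunk failed to decode:", exc.reason)

print("".join(decode_chunks_safely(payload, 4)))
```

The incremental decoder buffers the trailing bytes of each chunk until the rest of the character arrives, which is one common way to make chunk boundaries safe.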

Changed

  • Speedup provided using mypy/c 0.990 on Python >= 3.7

Version 3.0.0

3.0.0 (2022-10-20)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
  • normalizer --version now specifies whether the current version provides extra speedup (meaning a mypyc-compiled wheel)

Changed

  • Build with static metadata (not pyproject.toml yet)
  • Make language detection stricter
  • Optional: Module md.py can be compiled with mypyc for an extra speedup, up to 4x faster than v2.1

Fixed

  • CLI with option --normalize fails when using a full path for files
  • TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it
  • Sphinx warnings when generating the documentation

Removed

  • Coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  • Breaking: Methods first() and best() from CharsetMatch
  • UTF-7 will no longer appear as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
  • Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
  • Breaking: Top-level function normalize
  • Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  • Support for the backport unicodedata2

This is the last version (3.0.x) to support Python 3.6. We plan to drop it for 3.1.x.
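The UTF-7 item in the list above is easy to reproduce with the standard library alone. A minimal sketch (not the library's detection logic): a UTF-7 stream consists of ASCII bytes, so without a signature the same payload is valid under both codecs, and the two readings can differ.

```python
payload = b"a +ADw- b"  # plain ASCII bytes, no UTF-7 signature

as_ascii = payload.decode("ascii")
as_utf7 = payload.decode("utf_7")  # "+ADw-" is the UTF-7 escape for "<"

print(repr(as_ascii))
print(repr(as_utf7))
```

Both decodes succeed, so a detector has no byte-level evidence for preferring UTF-7 unless a mark is present.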

Version 3.0.0rc1

This is the last pre-release. If everything goes well, I will publish the stable tag.

3.0.0rc1 (2022-10-18)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio

... (truncated)

Changelog

Sourced from charset-normalizer's changelog.

3.0.1 (2022-11-18)

Fixed

  • Multi-bytes cutter/chunk generator did not always cut correctly (PR #233)

Changed

  • Speedup provided by mypy/c 0.990 on Python >= 3.7

3.0.0 (2022-10-20)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
  • normalizer --version now specifies whether the current version provides extra speedup (meaning a mypyc-compiled wheel)

Changed

  • Build with static metadata using 'build' frontend
  • Make the language detection stricter
  • Optional: Module md.py can be compiled with mypyc for an extra speedup, up to 4x faster than v2.1

Fixed

  • CLI with option --normalize fails when using a full path for files
  • TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it
  • Sphinx warnings when generating the documentation

Removed

  • Coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  • Breaking: Methods first() and best() from CharsetMatch
  • UTF-7 will no longer appear as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
  • Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
  • Breaking: Top-level function normalize
  • Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  • Support for the backport unicodedata2

3.0.0rc1 (2022-10-18)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio

Changed

  • Build with static metadata using 'build' frontend
  • Make the language detection stricter

Fixed

  • CLI with option --normalize fails when using a full path for files
  • TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it

... (truncated)

Upgrade guide

Sourced from charset-normalizer's upgrade guide.

Guide to upgrade your code from v1 to v2

  • If you only use the legacy detect function, you have nothing to do.

Detection

Before

from charset_normalizer import CharsetNormalizerMatches

results = CharsetNormalizerMatches.from_bytes(
    '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
)

After

from charset_normalizer import from_bytes

results = from_bytes(
    '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
)

Methods that were once static methods of the class CharsetNormalizerMatches are now plain functions; from_bytes and from_fp are among those concerned.

The static methods are scheduled to be removed in version 3.0.
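Since the release notes above list the class aliases (including CharsetNormalizerMatches) as removed in 3.0.0, code running against 3.x needs the function form. A minimal sketch, assuming charset-normalizer 3.x is installed; `from_bytes` and the `best()` method of the returned collection of matches are the library's own, while the variable names are illustrative only.

```python
from charset_normalizer import from_bytes

text = "我没有埋怨,磋砣的只是一些时间。"
payload = text.encode("utf_32")  # the utf_32 codec prepends a byte order mark

results = from_bytes(payload)  # collection of candidate matches
best = results.best()          # most probable match, or None if nothing fits

if best is not None:
    print(best.encoding)       # name of the detected encoding
    print(str(best))           # payload decoded with that encoding
```

Checking for `None` before using the result is worthwhile, because detection can legitimately come back empty for binary or very noisy input.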

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Nov 18, 2022
@github-actions

Looks like a major version upgrade! Skipping auto-merge.
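For reference, the 2.1.1 to 3.0.1 bump counts as major because the first version component changed. A rough sketch of that classification, assuming plain X.Y.Z version strings (pre-releases such as 3.0.0rc1 are not handled); `parse_version` and `classify_bump` are hypothetical helpers, not the code this repository's workflow actually runs.

```python
def parse_version(version):
    """Turn 'X.Y.Z' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_bump(old, new):
    """Name the most significant version component that differs."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v[0] != old_v[0]:
        return "semver-major"
    if new_v[1] != old_v[1]:
        return "semver-minor"
    if new_v[2] != old_v[2]:
        return "semver-patch"
    return "none"


print(classify_bump("2.1.1", "3.0.1"))
```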

@dependabot dependabot bot force-pushed the dependabot/pip/charset-normalizer-3.0.1 branch from 75f8f7e to f2788de Compare December 7, 2022 22:04
@github-actions

github-actions bot commented Dec 7, 2022

Looks like a major version upgrade! Skipping auto-merge.

@snarfed
Owner

snarfed commented Jan 13, 2023

@dependabot rebase

@snarfed snarfed enabled auto-merge (rebase) January 13, 2023 00:40
@dependabot dependabot bot force-pushed the dependabot/pip/charset-normalizer-3.0.1 branch from f2788de to 6d57d4a Compare January 13, 2023 00:41
@github-actions

Looks like a major version upgrade! Skipping auto-merge.

@dependabot dependabot bot force-pushed the dependabot/pip/charset-normalizer-3.0.1 branch from 6d57d4a to 21274b3 Compare January 13, 2023 12:22
@github-actions

Looks like a major version upgrade! Skipping auto-merge.

@snarfed
Owner

snarfed commented Jan 13, 2023

@dependabot rebase

Bumps [charset-normalizer](https://github.com/Ousret/charset_normalizer) from 2.1.1 to 3.0.1.
- [Release notes](https://github.com/Ousret/charset_normalizer/releases)
- [Changelog](https://github.com/Ousret/charset_normalizer/blob/master/CHANGELOG.md)
- [Upgrade guide](https://github.com/Ousret/charset_normalizer/blob/master/UPGRADE.md)
- [Commits](jawah/charset_normalizer@2.1.1...3.0.1)

---
updated-dependencies:
- dependency-name: charset-normalizer
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/charset-normalizer-3.0.1 branch from 21274b3 to 15cf2b3 Compare January 13, 2023 16:57
@github-actions

Looks like a major version upgrade! Skipping auto-merge.

@snarfed snarfed merged commit f30f6f5 into main Jan 13, 2023
@snarfed snarfed deleted the dependabot/pip/charset-normalizer-3.0.1 branch January 13, 2023 17:01