Hanzi decomposition (Chinese character decomposition) | 汉字拆字

Hanzi decomposition (拆字, chaizi) breaks a single Chinese character into multiple component characters according to its basic building blocks, such as strokes and structural components.

Characters with similar shapes yield similar decompositions.

Deep learning models can use this property as one feature of a character: its glyph structure.
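The claim that similar-looking characters share components can be made concrete with a small overlap measure. The following is a minimal sketch, not part of hanzi_chaizi: `TOY_DECOMPOSITIONS` and `component_jaccard` are hypothetical names, and apart from the entry for '名' (taken from the Usage section) the decompositions in the table are illustrative assumptions rather than output of the package.

```python
from typing import Dict, List

# Hypothetical stand-in for a decomposition lookup. Only the entry for '名'
# comes from this README; the other entries are illustrative assumptions.
TOY_DECOMPOSITIONS: Dict[str, List[str]] = {
    "名": ["夕", "口"],
    "明": ["日", "月"],
    "晴": ["日", "青"],
    "睛": ["目", "青"],
}


def component_jaccard(a: str, b: str, table: Dict[str, List[str]]) -> float:
    """Return the Jaccard overlap between the component sets of two characters."""
    set_a = set(table.get(a, []))
    set_b = set(table.get(b, []))
    union = set_a | set_b
    if not union:
        # Neither character is in the table, so there is nothing to compare.
        return 0.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # '晴' and '睛' share one component in the toy table; '名' and '明' share none.
    for first, second in [("晴", "睛"), ("名", "明")]:
        print(first, second, component_jaccard(first, second, TOY_DECOMPOSITIONS))
```

Jaccard overlap is only one possible choice here; it ignores component order and position, which may matter for some models.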

Installation

pip install hanzi_chaizi

Usage

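from hanzi_chaizi import HanziChaizi

# Create the lookup object once and reuse it for every query.
hc = HanziChaizi()

# Look up the components of a single character.
result = hc.query('名')

print(result)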

Output:

['夕', '口']
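A component list like the one above can be mapped to a fixed-length vector before it is fed to a model. The following is a minimal sketch, assuming a multi-hot encoding over a component vocabulary. `build_vocab`, `multi_hot`, and `toy_lookup` are hypothetical names, not part of hanzi_chaizi. `toy_lookup` stands in for `hc.query` so that the block runs without the package installed, and its entries other than '名' are illustrative assumptions.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in data. Only the entry for '名' comes from this README.
_TOY_TABLE: Dict[str, List[str]] = {
    "名": ["夕", "口"],
    "明": ["日", "月"],
    "晴": ["日", "青"],
}


def toy_lookup(char: str) -> List[str]:
    """Return the components of a character, or an empty list if unknown."""
    return _TOY_TABLE.get(char, [])


def build_vocab(chars: Sequence[str],
                decompose: Callable[[str], List[str]]) -> Dict[str, int]:
    """Assign an integer index to each component, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for char in chars:
        for component in decompose(char):
            vocab.setdefault(component, len(vocab))
    return vocab


def multi_hot(char: str,
              decompose: Callable[[str], List[str]],
              vocab: Dict[str, int]) -> List[int]:
    """Encode a character as a 0/1 vector marking which components it contains."""
    vector = [0] * len(vocab)
    for component in decompose(char):
        index = vocab.get(component)
        if index is not None:  # skip components missing from the vocabulary
            vector[index] = 1
    return vector


if __name__ == "__main__":
    vocab = build_vocab(["名", "明", "晴"], toy_lookup)
    print(vocab)
    print(multi_hot("晴", toy_lookup, vocab))
```

With the package installed, `hc.query` could be passed in place of `toy_lookup`, provided every character is present in its data. How the package handles characters missing from its data is not covered here, so that case would need to be checked first.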

Development

Data source

The decomposition data comes from the 漢語拆字字典 (Chinese character decomposition dictionary) project.

Parsing and converting the data format

python dev_scripts/parse.py

Credits

Data from this project: 漢語拆字字典

Citation

@misc{kong2018hanzichaizi,
  title={Hanzi Chaizi},
  author={Xiaoquan Kong},
  howpublished={https://github.com/howl-anderson/hanzi_chaizi},
  year={2018}
}

If you cite the package in a book, seminar, or academic research paper, or use it in a company product, you are welcome (but not required) to email me about it. I'm glad to see the package being used and proving valuable.

