Support CJK index creation and query #802

Closed
xiaoyifang opened this issue Jun 2, 2023 · 2 comments · Fixed by #806

Comments

@xiaoyifang
Contributor

xiaoyifang commented Jun 2, 2023

Current situation:
https://download.kiwix.org/zim/wikipedia/wikipedia_zh_mathematics_mini_2023-05.zim

A full-text search for 平方 returns no results.
[screenshot]

In fact, many articles contain this word.
[screenshot]
[screenshot]

Searching for an English word seems to work.
[screenshot]
The "No" shown there seems to be a separate issue.

CJK characters have to be indexed and queried with the CJK n-gram flags.

Creation:

    Xapian::TermGenerator indexer;

    indexer.set_flags( Xapian::TermGenerator::FLAG_CJK_NGRAM );

Query:

    Xapian::QueryParser::feature_flag flag = Xapian::QueryParser::FLAG_DEFAULT;
    if( searchMode == FTS::Wildcards )
      flag = Xapian::QueryParser::FLAG_WILDCARD;
    Xapian::Query query = qp.parse_query( query_string, flag | Xapian::QueryParser::FLAG_CJK_NGRAM );

Originally posted by @kelson42 in #794 (comment)
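
For anyone wondering why the n-gram flag matters here: as far as I understand it, without the flag an unbroken run of CJK characters ends up as one long term, so a short query like 平方 never equals any indexed term. The sketch below is a simplified, standalone illustration of the idea (overlapping unigrams and bigrams). It is not Xapian's actual implementation. The names `split_utf8`, `is_cjk`, `cjk_ngram_terms` and `matches` are hypothetical, and the code point ranges in `is_cjk` are a deliberately reduced assumption.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One UTF-8 encoded character plus its decoded code point.
struct Utf8Char {
    std::string bytes;
    char32_t code_point;
};

// Split a UTF-8 string into characters (assumes well-formed input).
std::vector<Utf8Char> split_utf8(const std::string& text) {
    std::vector<Utf8Char> chars;
    for (std::size_t i = 0; i < text.size();) {
        unsigned char lead = static_cast<unsigned char>(text[i]);
        std::size_t len = lead < 0x80 ? 1 : (lead >> 5) == 0x6 ? 2 : (lead >> 4) == 0xE ? 3 : 4;
        // Payload bits of the lead byte, then 6 bits per continuation byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len && i + k < text.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(text[i + k]) & 0x3F);
        chars.push_back({text.substr(i, len), cp});
        i += len;
    }
    return chars;
}

// Simplified assumption: only a few common CJK blocks are recognised.
bool is_cjk(char32_t cp) {
    return (cp >= 0x4E00 && cp <= 0x9FFF)     // CJK Unified Ideographs
        || (cp >= 0x3400 && cp <= 0x4DBF)     // CJK Extension A
        || (cp >= 0x3040 && cp <= 0x30FF)     // Hiragana and Katakana
        || (cp >= 0xAC00 && cp <= 0xD7AF);    // Hangul syllables
}

// Emit each CJK character and each pair of adjacent CJK characters.
// Non-CJK text is ignored in this sketch.
std::vector<std::string> cjk_ngram_terms(const std::string& text) {
    std::vector<Utf8Char> chars = split_utf8(text);
    std::vector<std::string> terms;
    for (std::size_t i = 0; i < chars.size(); ++i) {
        if (!is_cjk(chars[i].code_point)) continue;
        terms.push_back(chars[i].bytes);
        if (i + 1 < chars.size() && is_cjk(chars[i + 1].code_point))
            terms.push_back(chars[i].bytes + chars[i + 1].bytes);
    }
    return terms;
}

// A document matches if it contains every n-gram term of the query.
bool matches(const std::string& document, const std::string& query) {
    std::vector<std::string> doc_terms = cjk_ngram_terms(document);
    std::set<std::string> index(doc_terms.begin(), doc_terms.end());
    for (const std::string& term : cjk_ngram_terms(query))
        if (index.count(term) == 0) return false;
    return true;
}
```

With this sketch, `matches("平方根的计算", "平方")` succeeds because both the document and the query are broken into the same overlapping pieces. The real query parser combines the generated terms in its own way, so treat this only as a picture of why the index and the query need the same flag.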

@mgautierfr
Collaborator

@xiaoyifang PR #806 should fix this.
Can you test on your side? I'm not sure I fully understand what the expected behavior is :)

@kelson42 kelson42 added this to the 8.3.0 milestone Jun 7, 2023
@kelson42 kelson42 changed the title support CJK index creation and query Support CJK index creation and query Jun 7, 2023
@kelson42 kelson42 modified the milestones: 8.3.0, 8.2.1 Jun 7, 2023
@xiaoyifang
Contributor Author

From the unit test you provided, it seems OK.
