-
YUL Customizations
YUL has added some mappings and moved all of the mappings out of the Java code into an XML file. The mappings are now loaded in a static block, so they are read once when the class is loaded instead of being re-added to the map by every filter instance.
This is a Lucene filter and filter factory (see lucene.apache.org) that folds certain CJK characters to improve recall. Put it in your analysis chain BEFORE the ICU transform from Traditional->Simplified Han, because it converts modern Japanese kanji to their traditional equivalents.
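The static loading described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name `FoldingMappingsSketch`, the `<mapping from="..." to="..."/>` XML layout, the inline `SAMPLE_XML` string (standing in for the real mapping file), and the two sample entries are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class FoldingMappingsSketch {
    // Hypothetical XML layout; the project's real mapping file may look different.
    // Sample entries: U+4ECF -> U+4F5B and U+56F3 -> U+5716 (modern kanji -> traditional).
    private static final String SAMPLE_XML =
        "<mappings>"
        + "<mapping from=\"\u4ECF\" to=\"\u4F5B\"/>"
        + "<mapping from=\"\u56F3\" to=\"\u5716\"/>"
        + "</mappings>";

    // Counts how many times the mappings were parsed (for demonstration only).
    static int loadCount = 0;

    // Shared by every instance; filled exactly once, when the class is initialized.
    private static final Map<Character, Character> MAPPINGS;

    static {
        Map<Character, Character> m = new HashMap<>();
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(SAMPLE_XML.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("mapping");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                m.put(e.getAttribute("from").charAt(0), e.getAttribute("to").charAt(0));
            }
        } catch (Exception ex) {
            throw new ExceptionInInitializerError(ex);
        }
        MAPPINGS = Collections.unmodifiableMap(m);
        loadCount++;
    }

    // Replaces each mapped character; all other characters pass through unchanged.
    static String fold(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            Character mapped = MAPPINGS.get(c);
            sb.append(mapped != null ? mapped.charValue() : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Creating several instances does not re-run the static block.
        new FoldingMappingsSketch();
        new FoldingMappingsSketch();
        System.out.println(loadCount);                          // prints 1
        System.out.println(fold("\u4ECF").equals("\u4F5B"));    // prints true
    }
}
```

Because the map lives in a static field, the cost of parsing the XML is paid once per class load rather than once per filter instance.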
-
Clone the project
git clone git://github.com/solrmarc/CJKFoldingFilter.git
-
Run the Ant jar task
ant jar
-
Put the CJKFoldingFilter.jar file found in the dist directory into your Solr lib directory
-
Use the CJKFoldingFilterFactory in your Solr schema.xml file:
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="10000" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="edu.stanford.lucene.analysis.CJKFoldingFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
    <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" han="true" hiragana="true" katakana="true" hangul="true" outputUnigrams="true" />
  </analyzer>
</fieldType>
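To see why the folding filter is listed before the Traditional-Simplified transform in the chain above, the following sketch models both steps as tiny lookup tables. It is an assumption-laden toy, not the real filters: the class `FilterOrderSketch` and its two single-entry tables are hypothetical, and the one character chosen (U+56F3, whose traditional form is U+5716 and whose simplified form is U+56FE) is assumed to be representative of what the actual filters map.

```java
import java.util.Map;

class FilterOrderSketch {
    // Illustrative single-entry tables, not the filters' actual data.
    // Modern Japanese kanji U+56F3 -> traditional U+5716.
    static final Map<Character, Character> KANJI_TO_TRADITIONAL = Map.of('\u56F3', '\u5716');
    // Traditional U+5716 -> simplified U+56FE.
    static final Map<Character, Character> TRADITIONAL_TO_SIMPLIFIED = Map.of('\u5716', '\u56FE');

    // Applies one table to every character; unmapped characters are kept as-is.
    static String apply(String s, Map<Character, Character> table) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(table.getOrDefault(c, c).charValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String japanese = "\u56F3";
        String simplified = "\u56FE";

        // Recommended order: fold to traditional first, then Traditional->Simplified.
        String recommended = apply(apply(japanese, KANJI_TO_TRADITIONAL), TRADITIONAL_TO_SIMPLIFIED);
        // Reversed order: the transform sees a kanji it has no entry for, so it does nothing.
        String reversed = apply(apply(japanese, TRADITIONAL_TO_SIMPLIFIED), KANJI_TO_TRADITIONAL);

        System.out.println(recommended.equals(simplified)); // prints true
        System.out.println(reversed.equals(simplified));    // prints false
    }
}
```

In the recommended order the Japanese, traditional, and simplified forms all end up as the same token, so a query in any of the three can match. In the reversed order the Japanese form is left as a traditional character that the simplified form never matches.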
-
Fork it
-
Create your feature branch (`git checkout -b my-new-feature`)
-
Commit your changes (`git commit -am 'Added some feature'`)
-
Push to the branch (`git push origin my-new-feature`)
-
Create a new Pull Request