Plume/plume-models
KITAITI Makoto 92a386277b
Switchable tokenizer (#776)
* [REFACTORING]Rename whitespace_tokenizer to tag_tokenizer for
registration

Name representing its purpose is preferred.

* Add lindera-tantivy to plume-model's dependencies

* Install lindera-tantivy

* Add SearchTokenizerConfig struct

* Add search tokenizers to config option

* Use CONFIG for tokenizers

* Use enum to hold tokenizer config instead of initializing on config phase

* Use guard instead of duplicate default values

* Use as_deref() instead of guard

* Move SearchTokenizer from plume-models to plume-models::search::tokenizer

* Rename SearchTokenizer to TokenizerKind

* Define SearchTokenierConfig::determine_tokenizer()

* Use determine_tokenizer in SearchTokenizerConfig::init()

* Pass tokenizer config to Searcher methods

* Add LowerCase filter to Lindera tokenizer

* Add test for Lindera tokenizer

* Define SEARCH_LANG env to specify tokenizers set

* Run cargo fmt

* Make Lindera tokenizer optional

* Fix typos
2020-06-17 16:57:28 +02:00
..
src Switchable tokenizer (#776) 2020-06-17 16:57:28 +02:00
tests Rust 2018! (#726) 2020-01-21 07:02:03 +01:00
Cargo.toml Switchable tokenizer (#776) 2020-06-17 16:57:28 +02:00