Commit Graph

3 Commits

Author SHA1 Message Date
KITAITI Makoto
92a386277b
Switchable tokenizer (#776)
* [REFACTORING]Rename whitespace_tokenizer to tag_tokenizer for
registration

Name representing its purpose is preferred.

* Add lindera-tantivy to plume-model's dependencies

* Install lindera-tantivy

* Add SearchTokenizerConfig struct

* Add search tokenizers to config option

* Use CONFIG for tokenizers

* Use enum to hold tokenizer config instead of initializing on config phase

* Use guard instead of duplicate default values

* Use as_deref() instead of guard

* Move SearchTokenizer from plume-models to plume-models::search::tokenizer

* Rename SearchTokenizer to TokenizerKind

* Define SearchTokenierConfig::determine_tokenizer()

* Use determine_tokenizer in SearchTokenizerConfig::init()

* Pass tokenizer config to Searcher methods

* Add LowerCase filter to Lindera tokenizer

* Add test for Lindera tokenizer

* Define SEARCH_LANG env to specify tokenizers set

* Run cargo fmt

* Make Lindera tokenizer optional

* Fix typos
2020-06-17 16:57:28 +02:00
KITAITI Makoto
ef70cb93e6
Upgrade Tantivy to v0.12.0 (#771)
* Upgrade Tantivy to 0.12.0

* Follow Tantivy Tokenizer's new type definition

* Wrap tokenizers with TextAnalyzer to use filter methods

* Replace async IndexWriter::garbage_collect_files with sync functions

* Update Cargo.toml
2020-05-20 13:31:45 +02:00
fdb-hiroshima
449641d158
Add a search engine into Plume (#324)
* Add search engine to the model

Add a Tantivy based search engine to the model
Implement most required functions for it

* Implement indexing and plm subcommands

Implement indexation on insert, update and delete
Modify func args to get the indexer where required
Add subcommand to initialize, refill and unlock search db

* Move to a new threadpool engine allowing scheduling

* Autocommit search index every half an hour

* Implement front part of search

Add default fields for search
Add new routes and templates for search and result
Implement FromFormValue for Page to reuse it on search result pagination
Add optional query parameters to paginate template's macro
Update to newer rocket_csrf, don't get csrf token on GET forms

* Handle process termination to release lock

Handle process termination
Add tests to search

* Add proper support for advanced search

Add an advanced search form to /search, in template and route
Modify Tantivy schema, add new tokenizer for some properties
Create new String query parser
Create Tantivy query AST from our own

* Split search.rs, add comment and tests

Split search.rs into multiple submodules
Add comments and tests for Query
Make user@domain be treated as one could assume
2018-12-02 17:37:51 +01:00