Update notes for 2018-05-17

2018-05-17 13:14:29 +03:00
parent 232693d9d3
commit 00dc8241fc
3 changed files with 10 additions and 8 deletions

@ -286,3 +286,4 @@ ga('send', 'pageview', {
- I'm not sure which method is better; perhaps the `solr.ASCIIFoldingFilterFactory` filter, because it doesn't require copying the `mapping-FoldToASCII.txt` file
- And actually I'm not entirely sure about the order in which the filters and the tokenizer should be declared
- Ah, I see that `charFilter` must come before the tokenizer because it works on the raw character stream, whereas `filter` operates on tokenized input, so it must come after the tokenizer
- As for choosing between the `charFilter` (before the tokenizer) and the `filter` (after it), I think it's better to use the `charFilter` to normalize the input stream before tokenizing, as I have no idea what kind of stuff might get removed by the tokenizer
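- For reference, a minimal sketch of how the two options might sit in an analyzer chain; the field type name `text_folded` and the choice of `solr.StandardTokenizerFactory` are assumptions for illustration, not what is actually configured here:

```xml
<!-- Hypothetical field type: the name and the tokenizer are assumptions -->
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <!-- Option 1: a charFilter works on the raw character stream,
         so it is declared before the tokenizer.
         Requires mapping-FoldToASCII.txt in the core's conf directory. -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>

    <tokenizer class="solr.StandardTokenizerFactory"/>

    <!-- Option 2: a filter works on tokens, so it is declared after
         the tokenizer. Needs no mapping file. Commented out here
         because only one of the two options should be necessary. -->
    <!-- <filter class="solr.ASCIIFoldingFilterFactory"/> -->
  </analyzer>
</fieldType>
```

- With option 1 the tokenizer only ever sees text that has already been folded to ASCII, which is the reason for preferring it above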