content/posts/2020-10.md: Fix typo

Alan Orth 2020-10-24 22:23:06 +03:00
parent f60bb8b10f
commit da88f0e7a9
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9

@@ -261,7 +261,7 @@ $ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-438
 - I re-factored the `check-spider-hits.sh` script to read patterns from a text file rather than sed's stdout, and to properly search for spaces in patterns that use `\s` because Lucene's search syntax doesn't support it (and spaces work just fine)
 - Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/regexp-syntax.html
 - Reference: https://lucene.apache.org/core/4_0_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches
-- I added `[Ss]pider` to the Tomcat Crawler Sessions Manager Valve regex because this can catch a few more generic bots and force them to use the same Tomcat JSESSIONID
+- I added `[Ss]pider` to the Tomcat Crawler Session Manager Valve regex because this can catch a few more generic bots and force them to use the same Tomcat JSESSIONID
 - I added a few of the patterns from above to our local agents list and ran the `check-spider-hits.sh` on CGSpace:
 ```