mirror of https://github.com/alanorth/cgspace-notes.git (synced 2024-12-21 12:42:18 +01:00)
content/posts/2020-10.md: Fix typo
parent f60bb8b10f
commit da88f0e7a9
@@ -261,7 +261,7 @@ $ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-438
- I re-factored the `check-spider-hits.sh` script to read patterns from a text file rather than sed's stdout, and to properly search for spaces in patterns that use `\s` because Lucene's search syntax doesn't support it (and spaces work just fine)
- Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/regexp-syntax.html
- Reference: https://lucene.apache.org/core/4_0_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches
-- I added `[Ss]pider` to the Tomcat Crawler Sessions Manager Valve regex because this can catch a few more generic bots and force them to use the same Tomcat JSESSIONID
+- I added `[Ss]pider` to the Tomcat Crawler Session Manager Valve regex because this can catch a few more generic bots and force them to use the same Tomcat JSESSIONID
- I added a few of the patterns from above to our local agents list and ran the `check-spider-hits.sh` on CGSpace:
```