mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2019-11-05
@ -153,5 +153,23 @@ $ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:alanf
- So basically it seems like a win to update the example file with the latest one from Atmire's COUNTER-Robots list
- Even though the "mark by user agent" function is not working (see email to the dspace-tech mailing list), DSpace will still not log Solr events from these user agents
- I'm curious how special characters are matched in Solr, so I will test two requests: one with "www.gnip.com", which is in the spider list, and one with "www.gnyp.com", which isn't:
```
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"www.gnip.com"
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"www.gnyp.com"
```
- Then commit changes to Solr so we don't have to wait:
```
$ http --print b 'http://localhost:8081/solr/statistics/update?commit=true'
$ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:www.gnip.com&fq=dateYearMonth%3A2019-11' | xmllint --format - | grep numFound
<result name="response" numFound="0" start="0"/>
$ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:www.gnyp.com&fq=dateYearMonth%3A2019-11' | xmllint --format - | grep numFound
<result name="response" numFound="1" start="0">
```
- So the blocking seems to be working because "www\.gnip\.com" is one of the new patterns added to the spiders file...
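- As a rough local sanity check (not DSpace's actual matching code), and assuming the entries in the spiders file are applied as regular expressions, the escaped pattern can be tried against both test user agents with `grep -E`. The `matches_pattern` helper name is hypothetical:

```shell
# Hypothetical helper for illustration only: prints "match" if the user agent
# ($2) matches the pattern ($1) as an extended regular expression.
matches_pattern() {
  if printf '%s\n' "$2" | grep -qE -- "$1"; then
    echo "match"
  else
    echo "no match"
  fi
}

# The escaped pattern from the spiders file against the two test agents
matches_pattern 'www\.gnip\.com' 'www.gnip.com'
matches_pattern 'www\.gnip\.com' 'www.gnyp.com'

# Without the backslashes, "." matches any character, so this also matches
matches_pattern 'www.gnip.com' 'wwwxgnip.com'
```

- If that assumption holds, the first call matches and the second does not, which is consistent with the `numFound` results above. The third call shows why the dots are escaped in the pattern.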
<!-- vim: set sw=2 ts=2: -->