mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2018-10-20
@ -446,5 +446,47 @@ ERROR: Error CREATEing SolrCore 'statistics': Unable to create core [statistics]

- Apparently a bunch of variable types were removed in [Solr 5](https://issues.apache.org/jira/browse/SOLR-5936)
- So for now it's actually a huge pain in the ass to run the tests for my dspace-statistics-api
- Linode sent a message that the CPU usage was high on CGSpace (linode18) last night
- According to the nginx logs around that time it was 5.9.6.51 (MegaIndex) again:

```
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Oct/2018:(14|15|16)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
    249 207.46.13.179
    250 157.55.39.173
    301 54.166.207.223
    303 157.55.39.213
    310 66.249.64.95
    362 34.218.226.147
    381 66.249.64.93
    415 35.237.175.180
   1205 66.249.64.91
   1227 5.9.6.51
```

- This bot is only using the XMLUI and it does *not* seem to be re-using its sessions:

```
# grep -c 5.9.6.51 /var/log/nginx/*.log
/var/log/nginx/access.log:9323
/var/log/nginx/error.log:0
/var/log/nginx/library-access.log:0
/var/log/nginx/oai.log:0
/var/log/nginx/rest.log:0
/var/log/nginx/statistics.log:0
# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=5.9.6.51' dspace.log.2018-10-20 | sort | uniq
8915
```
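
- Note that `grep -c` prints a count of matching lines, so the `sort | uniq` after it has nothing to de-duplicate. A minimal sketch for counting distinct session IDs instead, assuming each log line carries a `session_id=...:ip_addr=...` pair as in the pattern above (the function name `count_sessions` is hypothetical; the dots in the IP are escaped so they only match literal dots, and a longer address such as 5.9.6.510 is not counted):

```shell
# Count distinct session IDs logged for one client IP in a DSpace log file.
# Usage: count_sessions <ip> <logfile>
# The body runs in a subshell, so ip_re does not leak into the caller.
count_sessions() (
    # Escape the dots: "5.9.6.51" becomes the regex "5\.9\.6\.51"
    ip_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Keep lines for this IP only (the IP must be followed by a non-digit
    # or end of line), reduce each line to its 32-character session ID,
    # then count the unique IDs.
    grep -E "session_id=[A-Z0-9]{32}:ip_addr=${ip_re}([^0-9]|\$)" "$2" \
        | sed -E 's/.*session_id=([A-Z0-9]{32}).*/\1/' \
        | sort -u \
        | wc -l \
        | tr -d ' '
)
```

- Called as `count_sessions 5.9.6.51 dspace.log.2018-10-20`, this prints the number of unique sessions rather than the number of log lines, which is the figure that matters when checking whether a bot re-uses its session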

- Last month I added "crawl" to the Tomcat Crawler Session Manager Valve's regular expression matching, and it seems to be working for MegaIndex's user agent:

```
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/1' User-Agent:'"Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)"'
```
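
- A rough sketch for checking a user agent against a valve-style pattern offline, without making a request. The pattern here is an assumption for illustration: it is not copied from the server's Tomcat configuration, and the `.*[cC]rawl.*` alternative only stands in for the "crawl" addition. `grep -x` requires the whole string to match, which approximates how the valve applies its regular expression; `CRAWLER_UA_RE` and `is_crawler_ua` are hypothetical names:

```shell
# ASSUMPTION: illustrative pattern only; the real value lives in the
# crawlerUserAgents attribute of the valve in Tomcat's configuration.
CRAWLER_UA_RE='.*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*|.*[cC]rawl.*'

# Exit status 0 when the whole user agent string matches the pattern,
# non-zero otherwise.
# Usage: is_crawler_ua "<user agent string>"
is_crawler_ua() {
    printf '%s\n' "$1" | grep -E -x -q -- "$CRAWLER_UA_RE"
}

# Example: report whether MegaIndex's user agent would be treated as a crawler
if is_crawler_ua 'Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)'; then
    echo "matched: requests should share one session"
else
    echo "not matched: every request may get a new session"
fi
```

- With this pattern the MegaIndex user agent matches via the "crawler" in its URL, since it contains no "bot" substring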

- So I'm not sure why this bot uses so many sessions; is it because it requests very slowly?

## 2018-10-21

<!-- vim: set sw=2 ts=2: -->