mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2018-09-10
@ -138,5 +138,64 @@ UPDATE 15
- Resume work on adding the access and usage rights metadata that we started earlier in 2018 (and also in 2017)
- The current `cg.identifier.status` field will become "Access rights" and `dc.rights` will become "Usage rights"
- I have some work in progress on the [`5_x-rights` branch](https://github.com/alanorth/DSpace/tree/5_x-rights)
- Linode said that CGSpace (linode18) had a high CPU load earlier today
- When I looked, I saw it's the same Russian IP that I noticed last month:

```
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "10/Sep/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
   1459 157.55.39.202
   1579 95.108.181.88
   1615 157.55.39.147
   1714 66.249.64.91
   1924 50.116.102.77
   3696 157.55.39.106
   3763 157.55.39.148
   4470 70.32.83.92
   4724 35.237.175.180
  14132 5.9.6.51
```
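
For repeat checks, the pipeline above could be approximated in Python. This is only a sketch: it assumes the default nginx "combined" log format (client IP first, timestamp in square brackets), and `top_clients` and `read_lines` are hypothetical helper names, not part of any existing tooling here.

```python
import gzip
import re
from collections import Counter

# Assumption: nginx "combined" format, e.g.
# 5.9.6.51 - - [10/Sep/2018:20:43:04 +0000] "GET / HTTP/1.1" 200 ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^:\]]+):')


def read_lines(path):
    """Yield lines from a plain or gzipped log, similar to zcat --force."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        yield from handle


def top_clients(lines, day, n=10):
    """Return the n busiest client IPs on `day` (e.g. "10/Sep/2018")."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Compare the date field itself instead of grepping the whole line
        if match and match.group(2) == day:
            counts[match.group(1)] += 1
    return counts.most_common(n)
```

Something like `top_clients(read_lines("/var/log/nginx/access.log"), "10/Sep/2018")` would then list the busiest IPs first (the file name is an example). Unlike the `grep`, this only matches the date in the timestamp field, so a URL that happens to contain the date string is not counted.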

- And this bot is still creating more Tomcat sessions than Nginx requests (WTF?):

```
# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=5.9.6.51' dspace.log.2018-09-10
14133
```
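
Note that `grep -c` counts matching log lines, not distinct sessions, so 14133 is an upper bound on the number of sessions. A rough way to count unique session IDs, assuming the `session_id=...:ip_addr=...` layout that the grep pattern relies on and IPv4 addresses only (`sessions_for_ip` is a hypothetical helper name):

```python
import re

# Assumption: DSpace log lines contain "session_id=<32 chars>:ip_addr=<IPv4>",
# the same layout the grep pattern in the notes searches for.
SESSION_RE = re.compile(
    r'session_id=([A-Z0-9]{32}):ip_addr=(\d{1,3}(?:\.\d{1,3}){3})'
)


def sessions_for_ip(lines, ip_addr):
    """Return (matching_line_count, unique_session_count) for one client IP."""
    matching_lines = 0
    sessions = set()
    for line in lines:
        match = SESSION_RE.search(line)
        if match and match.group(2) == ip_addr:
            matching_lines += 1
            sessions.add(match.group(1))  # the set keeps each ID only once
    return matching_lines, len(sessions)
```

If the second number comes out close to the first, the bot really is getting a new session on nearly every request. If it is much smaller, the sessions are being reused and the line count overstates the problem.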

- The user agent is still the same:

```
Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
```

- I added `.*crawl.*` to the Tomcat Crawler Session Manager Valve, so I'm not sure why the bot is creating so many sessions...
- I just tested that user agent on CGSpace and it *does not* create a new session:

```
$ http --print Hh https://cgspace.cgiar.org 'User-Agent:Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)'
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: cgspace.cgiar.org
User-Agent: Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)

HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Language: en-US
Content-Type: text/html;charset=utf-8
Date: Mon, 10 Sep 2018 20:43:04 GMT
Server: nginx
Strict-Transport-Security: max-age=15768000
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Cocoon-Version: 2.2.0
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
```
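
As a sanity check on the pattern itself, the `.*crawl.*` expression can be tried against the user agent offline. This sketch assumes the valve matches the pattern against the whole User-Agent header (my understanding of Tomcat's behavior, not verified here) and uses Python's `re` as a stand-in for Java regex, which should behave the same for a pattern this simple. The full `crawlerUserAgents` value on the server is not shown in these notes, so only this one alternative is tested.

```python
import re

# Only the alternative mentioned in the notes; the server's complete
# crawlerUserAgents setting is an unknown here.
CRAWLER_PATTERN = re.compile(r".*crawl.*")

MEGAINDEX_UA = (
    "Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)"
)


def is_crawler(user_agent):
    """True if the whole User-Agent string matches the crawler pattern."""
    return CRAWLER_PATTERN.fullmatch(user_agent) is not None


print(is_crawler(MEGAINDEX_UA))
```

The match is case-sensitive, so a user agent containing only `Crawler` with a capital C would not be caught by this pattern.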

- I will have to keep an eye on it and perhaps add it to the list of "bad bots" that get rate limited

<!-- vim: set sw=2 ts=2: -->