mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2018-12-04 and regenerate
@ -231,4 +231,77 @@ $ gm convert -resize x600 -flatten -quality 85 cover.png cover.jpg
- This has got to be part Ubuntu Tomcat packaging, and part DSpace 5.x Tomcat 8.5 readiness...?

## 2018-12-04

- Last night Linode sent a message that the load on CGSpace (linode18) was too high. Here's a list of the top client IPs during the high-load hours (15:00 to 18:59) and throughout the whole day:

```
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Dec/2018:1(5|6|7|8)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
225 40.77.167.142
226 66.249.64.63
232 46.101.86.248
285 45.5.186.2
333 54.70.40.11
411 193.29.13.85
476 34.218.226.147
962 66.249.70.27
1193 35.237.175.180
1450 2a01:4f8:140:3192::2
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Dec/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
1141 207.46.13.57
1299 197.210.168.174
1341 54.70.40.11
1429 40.77.167.142
1528 34.218.226.147
1973 66.249.70.27
2079 50.116.102.77
2494 78.46.79.71
3210 2a01:4f8:140:3192::2
4190 35.237.175.180
```

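The two pipelines above differ only in the date pattern, so they could be wrapped in a small bash helper. This is a minimal sketch, not something from the original notes: the function name `top_ips` is hypothetical, and it assumes (as the `awk '{print $1}'` step above does) that the client address is the first whitespace-separated field of each log line.

```shell
# Hypothetical helper (not part of the original notes): print the COUNT busiest
# client addresses among log lines matching an extended regular expression.
# Usage: top_ips PATTERN COUNT FILE...
top_ips() {
    local pattern=$1 count=$2
    shift 2
    # zcat --force copies files that are not gzip-compressed through unchanged,
    # so plain and rotated/compressed logs can be mixed in one call.
    zcat --force -- "$@" \
        | grep -E -- "$pattern" \
        | awk '{print $1}' \
        | sort \
        | uniq -c \
        | sort -n \
        | tail -n "$count"
}
```

With that in place, the first listing would correspond to something like `top_ips '03/Dec/2018:1(5|6|7|8)' 10 /var/log/nginx/*.log /var/log/nginx/*.log.1`.
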
- `35.237.175.180` is known to us (CCAFS?), and I've already added it to the list of bot IPs in nginx, which appears to be working:

```
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12-03
4772
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12-03 | sort | uniq | wc -l
630
```

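The pair of `grep` commands above (matching log lines, then distinct session IDs) is repeated for each address, so a small bash function could report both numbers at once. The sketch below is an illustration only: `session_stats` is a hypothetical name, and it assumes the `session_id=...:ip_addr=...` layout seen in the commands above. One deliberate difference is that it escapes the dots in the address, because an unescaped `.` in an extended regular expression matches any character.

```shell
# Hypothetical helper (not part of the original notes): print
# "<matching lines> <distinct sessions>" for one client address.
# Usage: session_stats ADDRESS LOGFILE
session_stats() {
    local ip=$1 log=$2
    local ip_re pattern lines sessions
    # Escape dots so they match literally instead of "any character".
    ip_re=${ip//./\\.}
    pattern="session_id=[A-Z0-9]{32}:ip_addr=${ip_re}"
    # grep -c counts matching lines; it prints 0 when nothing matches.
    lines=$(grep -c -E -- "$pattern" "$log")
    # grep -o prints only the matched text, so sort -u leaves one line
    # per distinct session ID for this address.
    sessions=$(grep -o -E -- "$pattern" "$log" | sort -u | wc -l)
    # $((...)) strips the padding some wc implementations add.
    printf '%s %s\n' "$lines" "$((sessions))"
}
```

For example, `session_stats 35.237.175.180 dspace.log.2018-12-03` would print the two counts on one line. A low second number relative to the first suggests the client is re-using sessions.
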
- I haven't seen `2a01:4f8:140:3192::2` before. Its user agent is some new bot:

```
Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
```

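The notes don't show how the user agent was looked up. One way to do it is sketched below in bash, with the caveat that it rests on an assumption: the nginx access logs use the stock `combined` format, where the user agent is the last double-quoted field. The function name `user_agents_for_ip` is hypothetical.

```shell
# Hypothetical helper (not part of the original notes): list the user agents
# sent by one client address, most frequent first.
# ASSUMPTION: logs are in nginx's "combined" format, so splitting a line on
# double quotes puts the user agent in field 6.
# Usage: user_agents_for_ip ADDRESS FILE...
user_agents_for_ip() {
    local ip=$1
    shift
    zcat --force -- "$@" \
        | awk -F'"' -v ip="$ip" '
            # Keep lines whose first field starts with the address followed
            # by a space; index() compares literally, so dots and colons in
            # IPv4/IPv6 addresses need no escaping.
            index($1, ip " ") == 1 { print $6 }
        ' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

For instance, `user_agents_for_ip '2a01:4f8:140:3192::2' /var/log/nginx/*.log` would print each distinct user agent for that address, prefixed by its request count.
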
- At least it seems the Tomcat Crawler Session Manager Valve is working to re-use the common bot XMLUI sessions:

```
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2018-12-03
5111
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2018-12-03 | sort | uniq | wc -l
419
```

- `78.46.79.71` is another host on Hetzner with the following user agent:

```
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
```

- This is not the first time a host on Hetzner has used a "normal" user agent to make thousands of requests
- At least it is re-using its Tomcat sessions somehow:

```
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03
2044
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03 | sort | uniq | wc -l
1
```

- In other news, it's good to see my re-work of the database connectivity in the [dspace-statistics-api](https://github.com/ilri/dspace-statistics-api) actually caused a reduction of persistent database connections (from 1 to 0, but still!):



<!-- vim: set sw=2 ts=2: -->