Update notes for 2017-11-12

This commit is contained in:
2017-11-12 18:48:52 +02:00
parent f2ef00d1e9
commit 41bdd24079
3 changed files with 94 additions and 8 deletions

View File

@ -555,3 +555,44 @@ $ grep 5.9.6.51 /home/cgspace.cgiar.org/log/dspace.log.2017-11-12 | grep -o -E '
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' /home/cgspace.cgiar.org/log/dspace.log.2017-11-12
991
```
- Move some items and collections on CGSpace for Peter Ballantyne, running [`move_collections.sh`](https://gist.github.com/alanorth/e60b530ed4989df0c731afbb0c640515) with the following configuration:
```
10947/6 10947/1 10568/83389
10947/34 10947/1 10568/83389
10947/2512 10947/1 10568/83389
```
- I explored nginx rate limits as a way to aggressively throttle Baidu bot which doesn't seem to respect disallowed URLs in robots.txt
- There's an interesting [blog post from Nginx's team about rate limiting](https://www.nginx.com/blog/rate-limiting-nginx/) as well as a [clever use of mapping with rate limits](https://gist.github.com/arosenhagen/8aaf5d7f94171778c0e9)
- The solution [I came up with](https://github.com/ilri/rmg-ansible-public/commit/f0646991772660c505bea9c5ac586490e7c86156) uses tricks from both of those
- I deployed the limit on CGSpace and DSpace Test and it seems to work well:
```
$ http --print h https://cgspace.cgiar.org/handle/10568/1 User-Agent:'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Language: en-US
Content-Type: text/html;charset=utf-8
Date: Sun, 12 Nov 2017 16:30:19 GMT
Server: nginx
Strict-Transport-Security: max-age=15768000
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Cocoon-Version: 2.2.0
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
$ http --print h https://cgspace.cgiar.org/handle/10568/1 User-Agent:'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
HTTP/1.1 503 Service Temporarily Unavailable
Connection: keep-alive
Content-Length: 206
Content-Type: text/html
Date: Sun, 12 Nov 2017 16:30:21 GMT
Server: nginx
```
- The first request works, second is denied with an HTTP 503!
- I need to remember to check the Munin graphs for PostgreSQL and JVM next week to see how this affects them