Update notes for 2017-11-09

This commit is contained in:
2017-11-09 17:52:14 +02:00
parent 25cef07639
commit 696c05977c
3 changed files with 24 additions and 24 deletions

View File

@ -253,7 +253,7 @@ $ grep -c 207.46.13.36 /var/log/nginx/access.log.1
- I think I will end up blocking Baidu as well...
- Next is for me to look and see what was happening specifically at 3AM and 7AM when the server crashed
- I should look in nginx access.log, rest.log, oai.log, and DSpace's dspace.log.2017-11-07
- Here are the top IPs making requests to XMLUI from 28 AM:
- Here are the top IPs making requests to XMLUI from 2 to 8 AM:
```
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
@ -270,10 +270,10 @@ $ grep -c 207.46.13.36 /var/log/nginx/access.log.1
```
- Of those, most are Google, Bing, Yahoo, etc, except 63.143.42.244 and 63.143.42.242 which are Uptime Robot
- Here are the top IPs making requests to REST from 28 AM:
- Here are the top IPs making requests to REST from 2 to 8 AM:
```
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
8 207.241.229.237
10 66.249.66.90
16 104.196.152.243
@ -465,9 +465,9 @@ proxy_set_header User-Agent $ua;
- It seems that they rarely even bother checking `robots.txt`, but Google does multiple times per day!
```
# zgrep Baiduspider /var/log/nginx/access.log* | grep -c robots.txt
# zgrep Baiduspider /var/log/nginx/access.log* | grep -c robots.txt
14
# zgrep Googlebot /var/log/nginx/access.log* | grep -c robots.txt
# zgrep Googlebot /var/log/nginx/access.log* | grep -c robots.txt
1134
```
@ -482,7 +482,7 @@ proxy_set_header User-Agent $ua;
- Awesome, it seems my bot mapping stuff in nginx actually reduced the number of Tomcat sessions used by the CIAT scraper today, total requests and unique sessions:
```
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep '09/Nov/2017' | grep -c 104.196.152.243
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep '09/Nov/2017' | grep -c 104.196.152.243
5769
$ grep 104.196.152.243 dspace.log.2017-11-09 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
223
@ -495,9 +495,9 @@ $ grep 104.196.152.243 dspace.log.2017-11-09 | grep -o -E 'session_id=[A-Z0-9]{3
10216
$ grep 104.196.152.243 dspace.log.2017-11-08 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
2592
# zcat -f -- /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep '07/Nov/2017' | grep -c 104.196.152.243
# zcat -f -- /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep '07/Nov/2017' | grep -c 104.196.152.243
8120
$ grep 104.196.152.243 dspace.log.2017-11-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
$ grep 104.196.152.243 dspace.log.2017-11-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
3506
```