mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2017-11-07
@ -253,7 +253,7 @@ $ grep -c 207.46.13.36 /var/log/nginx/access.log.1
- I think I will end up blocking Baidu as well...
- Next is for me to look and see what was happening specifically at 3AM and 7AM when the server crashed
- I should look in nginx access.log, rest.log, oai.log, and DSpace's dspace.log.2017-11-07
- Here are the top IPs during 2–10 AM:
- Here are the top IPs making requests to XMLUI from 2–8 AM:

```
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
@ -270,3 +270,97 @@ $ grep -c 207.46.13.36 /var/log/nginx/access.log.1
```

- Of those, most are Google, Bing, Yahoo, etc, except 63.143.42.244 and 63.143.42.242 which are Uptime Robot
- Here are the top IPs making requests to REST from 2–8 AM:

```
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
8 207.241.229.237
10 66.249.66.90
16 104.196.152.243
25 41.60.238.61
26 157.55.39.161
27 207.46.13.103
27 207.46.13.80
31 207.46.13.36
1498 50.116.102.77
```

- The OAI requests during that same time period are nothing to worry about:

```
# cat /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
1 66.249.66.92
4 66.249.66.90
6 68.180.229.254
```
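
- The three pipelines above differ only in the log files they read, so they could be wrapped in a small helper. This is a minimal sketch, not something that was run on the server: the function name `top_ips` is made up for illustration, and it assumes the client IP is the first whitespace-separated field of each log line (as the `awk '{print $1}'` step already does):

```shell
# top_ips PATTERN FILE...
# Print the ten busiest client IPs among the lines matching PATTERN.
# PATTERN is an extended regex, e.g. '07/Nov/2017:0[2-8]'.
# (Hypothetical helper; assumes the client IP is field 1 of each line.)
top_ips() {
    pattern=$1
    shift
    # Keep lines in the time window, take the first field (client IP),
    # count identical IPs, then sort by count so the busiest come last.
    cat "$@" | grep -E "$pattern" | awk '{print $1}' | sort | uniq -c | sort -n | tail
}
```

- It would be called like `top_ips '07/Nov/2017:0[2-8]' /var/log/nginx/rest.log /var/log/nginx/rest.log.1`; the final sort uses `sort -n` rather than `sort -h` because the counts are plain integers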

- The top IPs from dspace.log during the 2–8 AM period:

```
$ grep -E '2017-11-07 0[2-8]' dspace.log.2017-11-07 | grep -o -E 'ip_addr=[0-9.]+' | sort -n | uniq -c | sort -h | tail
143 ip_addr=213.55.99.121
181 ip_addr=66.249.66.91
223 ip_addr=157.55.39.161
248 ip_addr=207.46.13.80
251 ip_addr=207.46.13.103
291 ip_addr=207.46.13.36
297 ip_addr=197.210.168.174
312 ip_addr=65.49.68.199
462 ip_addr=104.196.152.243
488 ip_addr=66.249.66.90
```

- These aren't actually very interesting, as the top few are Google, CIAT, Bingbot, and a few other unknown scrapers
- The number of requests isn't even that high to be honest
- As I was looking at these logs I noticed another heavy user (124.17.34.59) that was not active during this time period, but made many requests today alone:

```
# zgrep -c 124.17.34.59 /var/log/nginx/access.log*
/var/log/nginx/access.log:22581
/var/log/nginx/access.log.1:0
/var/log/nginx/access.log.2.gz:14
/var/log/nginx/access.log.3.gz:0
/var/log/nginx/access.log.4.gz:0
/var/log/nginx/access.log.5.gz:3
/var/log/nginx/access.log.6.gz:0
/var/log/nginx/access.log.7.gz:0
/var/log/nginx/access.log.8.gz:0
/var/log/nginx/access.log.9.gz:1
```
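
- To get one total across all the rotated logs instead of a per-file list, the `file:count` lines could be summed. A rough sketch, where `total_hits` is a hypothetical name of my own; it passes `-F` so the dots in the IP are matched literally instead of as regex wildcards, which the command above does not do:

```shell
# total_hits IP FILE...
# Sum the per-file match counts that zgrep -c prints for plain and gzipped logs.
# (Hypothetical helper, not part of the original session.)
total_hits() {
    ip=$1
    shift
    # zgrep -c prints "file:count" per file (or just "count" for a single file);
    # splitting on ":" and taking the last field covers both cases.
    zgrep -c -F "$ip" "$@" | awk -F: '{sum += $NF} END {print sum + 0}'
}
```

- Usage would be something like `total_hits 124.17.34.59 /var/log/nginx/access.log*`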

- The whois data shows the IP is from China, but the user agent doesn't really give any clues:

```
# grep 124.17.34.59 /var/log/nginx/access.log | awk -F'" ' '{print $3}' | sort | uniq -c | sort -h
210 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
22610 "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.2; Win64; x64; Trident/7.0; LCTE)"
```
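
- For reference, the `awk -F'" '` trick splits each line on a double quote followed by a space, so in a log line shaped like nginx's stock `combined` format the third field is the quoted user agent. The sketch below assumes that layout (I have not checked it against the exact `log_format` on this server), and `ua_counts` is a hypothetical helper name; unlike the plain `grep`, it only matches the IP when it is the client address in the first field:

```shell
# ua_counts IP FILE...
# Count the user agents sent by one client IP.
# (Hypothetical helper; assumes a combined-style line:
#  IP - user [time] "request" status bytes "referer" "user agent")
ua_counts() {
    ip=$1
    shift
    # First awk: keep only lines whose first field is exactly the IP.
    # Second awk: split on '" ' so the fields are
    #   $1 = everything up to the end of the request,
    #   $2 = status, bytes and referer,
    #   $3 = the quoted user agent.
    awk -v ip="$ip" '$1 == ip' "$@" | awk -F'" ' '{print $3}' | sort | uniq -c | sort -n
}
```

- For example: `ua_counts 124.17.34.59 /var/log/nginx/access.log`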

- A Google search for "LCTE bot" doesn't return anything interesting, but this [Stack Overflow discussion](https://stackoverflow.com/questions/42500881/what-is-lcte-in-user-agent) references the lack of information
- So basically after a few hours of looking at the log files I am no closer to understanding what is going on!
- I do know that we want to block Baidu, though, as it does not respect `robots.txt`
- And as we speak Linode alerted that the outbound traffic rate has been very high for the past two hours (from about 12:00 to 14:00)
- At least for now it seems to be that new Chinese IP (124.17.34.59):

```
# grep -E "07/Nov/2017:1[234]:" /var/log/nginx/access.log | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
198 207.46.13.103
203 207.46.13.80
205 207.46.13.36
218 157.55.39.161
249 45.5.184.221
258 45.5.187.130
386 66.249.66.90
410 197.210.168.174
1896 104.196.152.243
11005 124.17.34.59
```

- It seems 124.17.34.59 is really downloading all our PDFs, compared to the next most active IP during this time!

```
# grep -E "07/Nov/2017:1[234]:" /var/log/nginx/access.log | grep 124.17.34.59 | grep -c pdf
5948
# grep -E "07/Nov/2017:1[234]:" /var/log/nginx/access.log | grep 104.196.152.243 | grep -c pdf
0
```
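
- The same comparison could be made for several IPs at once. This is only a sketch: `pdf_hits` is a made-up helper name, and like the commands above it counts any line in the time window that contains the string `pdf`, which is a crude proxy for PDF downloads:

```shell
# pdf_hits PATTERN FILE IP...
# For each IP, count lines in FILE that match PATTERN (an extended regex for
# the time window), come from that client IP, and contain the string "pdf".
# (Hypothetical helper; assumes the client IP is field 1 of each line.)
pdf_hits() {
    pattern=$1
    file=$2
    shift 2
    for ip in "$@"; do
        # awk does the per-IP filtering and the counting, and prints 0
        # when nothing matches.
        count=$(grep -E "$pattern" "$file" |
            awk -v ip="$ip" '$1 == ip && /pdf/ {n++} END {print n + 0}')
        printf '%s %s\n' "$count" "$ip"
    done
}
```

- For example: `pdf_hits '07/Nov/2017:1[234]:' /var/log/nginx/access.log 124.17.34.59 104.196.152.243`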