Add notes for 2021-10-31

Alan Orth 2021-11-01 10:07:51 +02:00
parent 5bec10a872
commit dbff911e21
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9

@@ -1,4 +1,4 @@
---
title: "October, 2021"
date: 2021-10-01T11:14:07+03:00
author: "Alan Orth"
@@ -531,4 +531,78 @@ $ wc -l /tmp/ips-to-purge.txt
- Help ICARDA colleagues with GLDC reports on AReS
- There was an issue due to differences in CRP metadata between repositories
## 2021-10-28
- Meeting with Medha and a bunch of others about the FAIRscribe tool they have been developing
- Seems it is a submission tool like MEL
## 2021-10-29
- Linode alerted me that CGSpace (linode18) has had high outbound traffic for the last two hours
- This has happened a few other times this week so I decided to go look at the Solr stats for today
- I see 93.158.91.62 is making thousands of requests to Discover with a normal user agent:
```
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36
```
- Even more annoying, they are not re-using their session ID:
```console
$ grep 93.158.91.62 log/dspace.log.2021-10-29 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
4888
```
- This IP has made 36,000 requests to CGSpace...
- The IP is owned by [Internet Vikings](https://internetvikings.com) in Sweden
- I purged their statistics and set up a temporary HTTP 403 telling them to use a real user agent
- I saw another one in Sweden a few days ago (192.36.109.131), using the exact same user agent as above, but belonging to [Resilans AB](http://webb.resilans.se/)
- I purged another 74,619 hits from this bot
- I added these two IPs to the nginx IP bot identifier
- Jesus, I found a few Russian IPs attempting SQL injection and path traversal, for example:
```
45.9.20.71 - - [20/Oct/2021:02:31:15 +0200] "GET /bitstream/handle/10568/1820/Rhodesgrass.pdf?sequence=4&OoxD=6591%20AND%201%3D1%20UNION%20ALL%20SELECT%201%2CNULL%2C%27%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E%27%2Ctable_name%20FROM%20information_schema.tables%20WHERE%202%3E1--%2F%2A%2A%2F%3B%20EXEC%20xp_cmdshell%28%27cat%20..%2F..%2F..%2Fetc%2Fpasswd%27%29%23 HTTP/1.1" 200 143070 "https://cgspace.cgiar.org:443/bitstream/handle/10568/1820/Rhodesgrass.pdf" "Mozilla/5.0 (X11; U; Linux i686; es-AR; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11"
```
- I reported them to AbuseIPDB.com and purged their hits:
```console
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ip.txt -p
Purging 6364 hits from 45.9.20.71 in statistics
Purging 8039 hits from 45.146.166.157 in statistics
Purging 3383 hits from 45.155.204.82 in statistics
Total number of bot hits purged: 17786
```
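The request logged above is hard to read while it is still percent-encoded. As a rough sketch (not part of the tooling used in these notes), the query string can be decoded and checked against a few markers; the function name `suspicious_params` and the marker list are assumptions made up for illustration, not an exhaustive rule set:

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical marker list for illustration only; real detection needs far more than this.
SUSPICIOUS_MARKERS = (
    "union all select",
    "information_schema",
    "xp_cmdshell",
    "../",
    "<script",
)


def suspicious_params(request_target):
    """Return {parameter name: [markers found]} for a request path plus query string."""
    findings = {}
    query = urlsplit(request_target).query
    # parse_qsl percent-decodes parameter names and values for us.
    for name, value in parse_qsl(query, keep_blank_values=True):
        lowered = value.lower()
        hits = [marker for marker in SUSPICIOUS_MARKERS if marker in lowered]
        if hits:
            findings[name] = hits
    return findings
```

Feeding it the request target from the 45.9.20.71 log line would flag only the `OoxD` parameter, since `sequence=4` decodes to a plain number.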
## 2021-10-31
- Update Docker containers for AReS on linode20 and run a fresh harvest
- Found a strange IP (94.71.3.44) making 51,000 requests today with the user agent "Microsoft Internet Explorer"
- It is in Greece, and it seems to be requesting each item's XMLUI full metadata view, so I suspect it's Gardian actually
- I found it making another 25,000 requests yesterday...
- I purged them from Solr
- Found 20,000 hits from Qualys (according to AbuseIPDB.com) using normal user agents... ugh, must be some ILRI ICT scan
- Found more requests from a Swedish IP (93.158.90.34) using that weird Firefox user agent that I noticed a few weeks ago:
```
Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
```
- That's from ASN 12552 (IPO-EU, SE), which is operated by Internet Vikings, though AbuseIPDB.com says it's [Availo Networks AB](https://availo.se)
- There's another IP (3.225.28.105) from Amazon that made a few thousand requests to the REST API, though it's using a normal user agent:
```console
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | wc -l
3991
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | grep -oE 'GET /rest/(collections|handle|items)' | sort | uniq -c
3154 GET /rest/collections
427 GET /rest/handle
410 GET /rest/items
```
- It requested the [CIAT Story Maps](https://cgspace.cgiar.org/handle/10568/75560) collection over 3,000 times last month...
- I will purge those hits
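
The per-endpoint tally from the `zgrep` pipeline above can also be sketched in Python. This is only an illustration: the function name `count_rest_endpoints` is made up, and it assumes the client address is the first space-delimited field of each log line, as in the nginx log line quoted earlier in these notes. Unlike the unescaped `zgrep` pattern, it matches the IP exactly:

```python
import re
from collections import Counter

# Same alternation as the grep -oE pattern used in the notes.
ENDPOINT_RE = re.compile(r"GET /rest/(collections|handle|items)")


def count_rest_endpoints(lines, ip):
    """Count GET requests per REST endpoint for log lines whose first field is `ip`."""
    counts = Counter()
    for line in lines:
        # Assumption: the client address is the first space-delimited field.
        if line.split(" ", 1)[0] != ip:
            continue
        match = ENDPOINT_RE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts
```

For rotated, compressed logs, the lines could come from `gzip.open(path, "rt")` instead of a plain `open()`.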
<!-- vim: set sw=2 ts=2: -->