Mirror of https://github.com/alanorth/cgspace-notes.git, synced 2024-11-22 14:45:03 +01:00
Add notes for 2021-10-31
parent 5bec10a872
commit dbff911e21
@@ -1,4 +1,4 @@
----
+e--
 title: "October, 2021"
 date: 2021-10-01T11:14:07+03:00
 author: "Alan Orth"
@@ -531,4 +531,78 @@ $ wc -l /tmp/ips-to-purge.txt
- Help ICARDA colleagues with GLDC reports on AReS
- There was an issue due to differences in CRP metadata between repositories

## 2021-10-28

- Meeting with Medha and a bunch of others about the FAIRscribe tool they have been developing
- Seems it is a submission tool like MEL

## 2021-10-29

- Linode alerted me that CGSpace (linode18) has had high outbound traffic for the last two hours
- This has happened a few other times this week, so I decided to look at the Solr stats for today
- I see 93.158.91.62 is making thousands of requests to Discover with a normal user agent:

```
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36
```

- Even more annoying, they are not re-using their session ID:

```console
$ grep 93.158.91.62 log/dspace.log.2021-10-29 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
4888
```
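
The same check can be generalized to rank every client IP by how many distinct session IDs it used, which makes clients that never re-use a session stand out. This is only a minimal sketch and not part of the original notes: the function name `sessions_per_ip` is hypothetical, and it assumes the client's IPv4 address directly follows `ip_addr=` on each log line, as the grep pattern above suggests.

```shell
# Sketch: count distinct session IDs per client IP in a DSpace log.
# Assumption: relevant lines contain "session_id=<32 chars>:ip_addr=<IPv4>".
# IPv6 clients are ignored by this pattern.
sessions_per_ip() {
  # $1: path to a DSpace log file
  grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=[0-9.]+' "$1" \
    | sort -u \
    | awk -F'ip_addr=' '{count[$2]++} END {for (ip in count) print count[ip], ip}' \
    | sort -rn
}
```

For example, `sessions_per_ip log/dspace.log.2021-10-29 | head` would print lines of the form `<distinct sessions> <IP>`, with the highest counts first.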

- This IP has made 36,000 requests to CGSpace...
- The IP is owned by [Internet Vikings](https://internetvikings.com) in Sweden
- I purged their statistics and set up a temporary HTTP 403 telling them to use a real user agent
- I see another one from Sweden a few days ago (192.36.109.131), also using the exact same user agent as above, but belonging to [Resilans AB](http://webb.resilans.se/)
- I purged another 74,619 hits from this bot
- I added these two IPs to the nginx IP bot identifier
- Jesus, I found a few Russian IPs attempting SQL injection and path traversal, e.g.:

```
45.9.20.71 - - [20/Oct/2021:02:31:15 +0200] "GET /bitstream/handle/10568/1820/Rhodesgrass.pdf?sequence=4&OoxD=6591%20AND%201%3D1%20UNION%20ALL%20SELECT%201%2CNULL%2C%27%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E%27%2Ctable_name%20FROM%20information_schema.tables%20WHERE%202%3E1--%2F%2A%2A%2F%3B%20EXEC%20xp_cmdshell%28%27cat%20..%2F..%2F..%2Fetc%2Fpasswd%27%29%23 HTTP/1.1" 200 143070 "https://cgspace.cgiar.org:443/bitstream/handle/10568/1820/Rhodesgrass.pdf" "Mozilla/5.0 (X11; U; Linux i686; es-AR; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11"
```

- I reported them to AbuseIPDB.com and purged their hits:

```console
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ip.txt -p
Purging 6364 hits from 45.9.20.71 in statistics
Purging 8039 hits from 45.146.166.157 in statistics
Purging 3383 hits from 45.155.204.82 in statistics

Total number of bot hits purged: 17786
```
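
To look for similar probes in other logs, one option is to search for a few URL-encoded SQL injection and path traversal signatures and count the matching requests per client IP. The sketch below is an illustration under stated assumptions, not the method used in these notes: the function name `suspicious_ips` and the signature list are made up for this example, the list is far from exhaustive, and it assumes an nginx access log where the client IP is the first field, as in the request shown above.

```shell
# Sketch: count requests per client IP that match a few well-known
# SQL injection / path traversal signatures (URL-encoded or plain).
# Assumption: the client IP is the first whitespace-separated field.
suspicious_ips() {
  # $1: path to an nginx access log
  grep -iE 'union(%20|\+)+(all(%20|\+)+)?select|information_schema|xp_cmdshell|(\.\.(%2f|/)){2,}|etc(%2f|/)passwd' "$1" \
    | awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' \
    | sort -rn
}
```

Each output line has the form `<matching requests> <IP>`. The resulting IPs could then be written to a file such as `/tmp/ip.txt` and reviewed before purging anything.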

## 2021-10-31

- Update Docker containers for AReS on linode20 and run a fresh harvest
- Found some strange IP (94.71.3.44) making 51,000 requests today with the user agent "Microsoft Internet Explorer"
- It is in Greece, and it seems to be requesting each item's XMLUI full metadata view, so I suspect it's actually Gardian
- I found it making another 25,000 requests yesterday...
- I purged them from Solr
- Found 20,000 hits from Qualys (according to AbuseIPDB.com) using normal user agents... ugh, must be some ILRI ICT scan
- Found more requests from a Swedish IP (93.158.90.34) using that weird Firefox user agent that I noticed a few weeks ago:

```
Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
```

- That's from ASN 12552 (IPO-EU, SE), which is operated by Internet Vikings, though AbuseIPDB.com says it's [Availo Networks AB](https://availo.se)
- There's another IP (3.225.28.105), from Amazon, that made a few thousand requests to the REST API, though it's using a normal user agent

```console
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | wc -l
3991
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | grep -oE 'GET /rest/(collections|handle|items)' | sort | uniq -c
   3154 GET /rest/collections
    427 GET /rest/handle
    410 GET /rest/items
```

- It requested the [CIAT Story Maps](https://cgspace.cgiar.org/handle/10568/75560) collection over 3,000 times last month...
- I will purge those hits
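
Heavy clients like the ones above can be spotted by ranking client IPs by request count across current and rotated logs. The following is a rough sketch rather than the workflow actually used in these notes: the function name `top_ips` is hypothetical, and it assumes that the client IP is the first field of each log line and that rotated logs may be gzip-compressed.

```shell
# Sketch: show the N busiest client IPs across plain or gzip-compressed logs.
# Assumption: the client IP is the first whitespace-separated field.
top_ips() {
  # $1: how many IPs to show; remaining arguments: log files
  local n=$1
  shift
  # "gzip -cdf" decompresses gzip files and passes plain files through as-is
  gzip -cdf -- "$@" \
    | awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' \
    | sort -rn \
    | head -n "$n"
}
```

For example, `top_ips 10 /var/log/nginx/rest.log.*` would list the ten busiest clients of the REST API as `<requests> <IP>` lines.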

<!-- vim: set sw=2 ts=2: -->