mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-21 22:25:02 +01:00
Add notes for 2021-10-31
parent 5bec10a872
commit dbff911e21
@ -1,4 +1,4 @@
---
e--
title: "October, 2021"
date: 2021-10-01T11:14:07+03:00
author: "Alan Orth"
@ -531,4 +531,78 @@ $ wc -l /tmp/ips-to-purge.txt
- Help ICARDA colleagues with GLDC reports on AReS
- There was an issue due to differences in CRP metadata between repositories

## 2021-10-28

- Meeting with Medha and a bunch of others about the FAIRscribe tool they have been developing
- Seems it is a submission tool like MEL

## 2021-10-29

- Linode alerted me that CGSpace (linode18) has had high outbound traffic for the last two hours
- This has happened a few other times this week so I decided to go look at the Solr stats for today
- I see 93.158.91.62 is making thousands of requests to Discover with a normal user agent:

```
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36
```

- Even more annoying, they are not re-using their session ID:

```console
$ grep 93.158.91.62 log/dspace.log.2021-10-29 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
4888
```
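The same distinct-session count can be done outside the shell. A minimal Python sketch, assuming DSpace log lines carry `session_id=<32 characters>:ip_addr=` as in the grep pattern above; the function name `count_sessions` and the sample lines are made up for illustration and are not real log entries:

```python
import re

# Same pattern as the grep above: a 32-character session ID followed by ":ip_addr=".
SESSION_RE = re.compile(r"session_id=([A-Z0-9]{32}):ip_addr=")


def count_sessions(lines, ip):
    """Return the number of distinct session IDs on lines that mention `ip`."""
    sessions = set()
    for line in lines:
        # Plain substring match; grep's unescaped dots would match any character.
        if ip not in line:
            continue
        sessions.update(SESSION_RE.findall(line))
    return len(sessions)


# Hypothetical, simplified log lines (not copied from a real dspace.log).
sample = [
    "... anonymous:session_id=" + "A" * 32 + ":ip_addr=93.158.91.62:discover",
    "... anonymous:session_id=" + "B" * 32 + ":ip_addr=93.158.91.62:discover",
    "... anonymous:session_id=" + "A" * 32 + ":ip_addr=93.158.91.62:discover",
    "... anonymous:session_id=" + "C" * 32 + ":ip_addr=10.0.0.1:discover",
]
print(count_sessions(sample, "93.158.91.62"))
```

Passing an open file handle for a day's `dspace.log` instead of `sample` should give a figure comparable to the `wc -l` count above, since a file object iterates line by line.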

- This IP has made 36,000 requests to CGSpace...
- The IP is owned by [Internet Vikings](https://internetvikings.com) in Sweden
- I purged their statistics and set up a temporary HTTP 403 telling them to use a real user agent
- I saw another one from Sweden a few days ago (192.36.109.131), also using the exact same user agent as above, but belonging to [Resilans AB](http://webb.resilans.se/)
- I purged another 74,619 hits from this bot
- I added these two IPs to the nginx IP bot identifier
- Jesus, I found a few Russian IPs attempting SQL injection and path traversal, e.g.:

```
45.9.20.71 - - [20/Oct/2021:02:31:15 +0200] "GET /bitstream/handle/10568/1820/Rhodesgrass.pdf?sequence=4&OoxD=6591%20AND%201%3D1%20UNION%20ALL%20SELECT%201%2CNULL%2C%27%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E%27%2Ctable_name%20FROM%20information_schema.tables%20WHERE%202%3E1--%2F%2A%2A%2F%3B%20EXEC%20xp_cmdshell%28%27cat%20..%2F..%2F..%2Fetc%2Fpasswd%27%29%23 HTTP/1.1" 200 143070 "https://cgspace.cgiar.org:443/bitstream/handle/10568/1820/Rhodesgrass.pdf" "Mozilla/5.0 (X11; U; Linux i686; es-AR; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11"
```

- I reported them to AbuseIPDB.com and purged their hits:

```console
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ip.txt -p
Purging 6364 hits from 45.9.20.71 in statistics
Purging 8039 hits from 45.146.166.157 in statistics
Purging 3383 hits from 45.155.204.82 in statistics

Total number of bot hits purged: 17786
```
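The injection attempt in the access log entry above is easier to read once the request target is percent-decoded. A rough Python sketch, assuming nginx's combined log format; the helper names and the short `SUSPICIOUS` marker list are my own illustration, not a complete detection rule set, and the sample line is abbreviated from the entry above:

```python
import re
from urllib.parse import unquote

# Request line inside a combined-format log entry: "METHOD target PROTOCOL".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD|PUT|DELETE|OPTIONS) (\S+) HTTP/[0-9.]+"')

# Hypothetical, deliberately small list of markers for illustration only.
SUSPICIOUS = ("union all select", "information_schema", "xp_cmdshell", "../", "<script")


def decoded_target(log_line):
    """Return the percent-decoded request target, or None if no request line is found."""
    match = REQUEST_RE.search(log_line)
    return unquote(match.group(1)) if match else None


def suspicious_markers(log_line):
    """Return the markers from SUSPICIOUS that appear in the decoded request target."""
    target = decoded_target(log_line)
    if target is None:
        return []
    lowered = target.lower()
    return [marker for marker in SUSPICIOUS if marker in lowered]


# Abbreviated version of the log entry above (middle of the query string removed).
line = (
    '45.9.20.71 - - [20/Oct/2021:02:31:15 +0200] '
    '"GET /bitstream/handle/10568/1820/Rhodesgrass.pdf?sequence=4'
    '&OoxD=6591%20AND%201%3D1%20UNION%20ALL%20SELECT%201%2CNULL'
    '%3B%20EXEC%20xp_cmdshell%28%27cat%20..%2F..%2F..%2Fetc%2Fpasswd%27%29%23'
    ' HTTP/1.1" 200 143070'
)
print(decoded_target(line))
print(suspicious_markers(line))
```

Substring markers like these will produce false positives on legitimate URLs, so this is only useful for shortlisting lines to review by hand.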

## 2021-10-31

- Update Docker containers for AReS on linode20 and run a fresh harvest
- Found a strange IP (94.71.3.44) making 51,000 requests today with the user agent "Microsoft Internet Explorer"
- It is in Greece, and it seems to be requesting each item's XMLUI full metadata view, so I suspect it's Gardian actually
- I found it making another 25,000 requests yesterday...
- I purged them from Solr
- Found 20,000 hits from Qualys (according to AbuseIPDB.com) using normal user agents... ugh, must be some ILRI ICT scan
- Found more requests from a Swedish IP (93.158.90.34) using that weird Firefox user agent that I noticed a few weeks ago:

```
Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
```

- That's from ASN 12552 (IPO-EU, SE), which is operated by Internet Vikings, though AbuseIPDB.com says it's [Availo Networks AB](https://availo.se)
- There's another IP (3.225.28.105), from Amazon, that made a few thousand requests to the REST API, though it's using a normal user agent

```console
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | wc -l
3991
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | grep -oE 'GET /rest/(collections|handle|items)' | sort | uniq -c
3154 GET /rest/collections
427 GET /rest/handle
410 GET /rest/items
```

- It requested the [CIAT Story Maps](https://cgspace.cgiar.org/handle/10568/75560) collection over 3,000 times last month...
- I will purge those hits
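
The per-endpoint breakdown from the `zgrep | grep -oE | sort | uniq -c` pipeline above can also be sketched in Python. This assumes the same `GET /rest/...` prefixes; `endpoint_counts` is a made-up name and the sample lines are simplified, hypothetical entries rather than real `rest.log` lines:

```python
import re
from collections import Counter

# Same alternation as the grep -oE pattern above.
ENDPOINT_RE = re.compile(r"GET /rest/(?:collections|handle|items)")


def endpoint_counts(lines, ip):
    """Count REST endpoint prefixes requested on lines that mention `ip`."""
    counts = Counter()
    for line in lines:
        if ip in line:
            counts.update(ENDPOINT_RE.findall(line))
    return counts


# Hypothetical, simplified log lines for illustration.
sample = [
    '3.225.28.105 - - "GET /rest/collections/1/items HTTP/1.1" 200',
    '3.225.28.105 - - "GET /rest/collections/2/items HTTP/1.1" 200',
    '3.225.28.105 - - "GET /rest/items/3 HTTP/1.1" 200',
    '10.0.0.1 - - "GET /rest/handle/10568/1 HTTP/1.1" 200',
]
for endpoint, hits in endpoint_counts(sample, "3.225.28.105").most_common():
    print(hits, endpoint)
```

For rotated, compressed logs, `gzip.open(path, "rt")` yields text lines that can be passed to the same function.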

<!-- vim: set sw=2 ts=2: -->