Add notes for 2023-03-27

2023-03-27 10:03:45 +03:00
parent 11646971a9
commit 37bdf2645f
31 changed files with 85 additions and 36 deletions

@@ -492,4 +492,29 @@ RL: performed 0 reads and 16 write i/o operations
- I added a Flyway SQL migration for the PNG bitstream format registry changes on DSpace 7.6
## 2023-03-26
- There seems to be a slightly high load on CGSpace
- I don't see any locks in PostgreSQL, but there's some new bot I have never heard of:
```console
92.119.18.13 - - [26/Mar/2023:18:41:47 +0200] "GET /handle/10568/16500/discover?filtertype_0=impactarea&filter_relational_operator_0=equals&filter_0=Climate+adaptation+and+mitigation&filtertype=sdg&filter_relational_operator=equals&filter=SDG+11+-+Sustainable+cities+and+communities HTTP/2.0" 200 7856 "-" "colly - https://github.com/gocolly/colly"
```
- In the last week I see a handful of IPs making requests with this agent:
```console
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.{2,3,4,5,6,7}.gz | grep gocolly | awk '{print $1}' | sort | uniq -c | sort -h
2 194.233.95.37
4304 92.119.18.142
9496 5.180.208.152
27477 92.119.18.13
```
- Most of these come from Packethub S.A. / ASN 62240 (CLOUVIDER Clouvider - Global ASN, GB)
- Oh, I've apparently seen this user agent before, as it is in our ILRI spider user agent overrides
- I exported CGSpace to check for missing Initiative collection mappings
- Start a harvest on AReS
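
The per-IP tally for the colly user agent above could also be sketched in Python, in case it needs to be scripted later. This is only a rough equivalent of the `zcat | grep | awk | sort | uniq -c` pipeline, assuming nginx's combined log format where the user agent is the last quoted field. The names `count_ips_by_agent` and `read_log` are hypothetical helpers, not part of any existing tooling. Unlike the plain `grep`, it matches only against the user agent field, so a request path that happens to contain the string is not counted.

```python
import gzip
import re
from collections import Counter

# Assumes nginx "combined" format: client IP first, user agent as the
# final quoted field on the line.
LINE_RE = re.compile(r'^(\S+) .* "([^"]*)"$')


def count_ips_by_agent(lines, needle):
    """Count requests per client IP whose user agent contains needle."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and needle in match.group(2):
            counts[match.group(1)] += 1
    return counts


def read_log(path):
    """Yield lines from a plain or gzip-compressed log file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        yield from handle
```

Something like `count_ips_by_agent(read_log("/var/log/nginx/access.log"), "gocolly")` would then give a `Counter` for one file, and `Counter.most_common()` lists the busiest IPs first.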
<!-- vim: set sw=2 ts=2: -->