diff --git a/content/posts/2020-04.md b/content/posts/2020-04.md
index 431ba2228..7af45c15c 100644
--- a/content/posts/2020-04.md
+++ b/content/posts/2020-04.md
@@ -395,5 +395,66 @@ $ psql -c 'select * from pg_stat_activity' | grep 'dspaceWeb' | grep -c "idle in
```
- I don't see anything in the PostgreSQL or Tomcat logs suggesting anything is wrong... I think the solution to clear these idle connections is probably to just restart Tomcat
+- I looked at the Solr stats for this month and see lots of suspicious IPs:
+
+```
+$ curl -s 'http://localhost:8081/solr/statistics/select?q=*:*&fq=dateYearMonth:2020-04&rows=0&wt=json&indent=true&facet=true&facet.field=ip'
+
+ "88.99.115.53",23621, # Hetzner, using XMLUI and REST API with no user agent
+ "104.154.216.0",11865, # Google cloud, scraping XMLUI with no user agent
+ "104.198.96.245",4925, # Google cloud, using REST API with no user agent
+ "52.34.238.26",2907, # EcoSearch on XMLUI, user agent: EcoSearch (+https://search.ecointernet.org/)
+```
+
+- And a bunch more... ugh...
+ - 70.32.90.172: scraping REST API for IWMI/WLE pages with no user agent
+ - 2a01:7e00::f03c:91ff:fe16:fcb: Linode, REST API, no user agent
+ - 2607:f298:5:101d:f816:3eff:fed9:a484: DreamHost, XMLUI and REST API, python-requests/2.18.4
+ - 2a00:1768:2001:7a::20: Netherlands, XMLUI, trying SQL injections
+- I need to start blocking requests without a user agent...
+- I purged the hits from these IPs using my `check-spider-ip-hits.sh` script:
+
+```
+$ for year in {2010..2019}; do ./check-spider-ip-hits.sh -f /tmp/ips -s statistics-$year -p; done
+$ ./check-spider-ip-hits.sh -f /tmp/ips -s statistics -p
+```
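- A minimal sketch of how a list like `/tmp/ips` could be pulled out of the facet output above, assuming the JSON is indented with one `"ip",count` pair per line. The function name `ips_over_threshold` and the threshold are hypothetical, not part of my scripts:

```shell
# Hypothetical helper: read Solr facet JSON on stdin and print every IP
# whose hit count is at least the given minimum.
ips_over_threshold() {
    awk -F'"' -v min="$1" '
        NF >= 3 {
            # With a double quote as separator, field 2 is the IP and
            # field 3 starts with ",<count>"
            count = $3
            sub(/^,/, "", count)          # drop the comma after the closing quote
            sub(/[^0-9].*$/, "", count)   # drop everything after the digits
            if (count ~ /^[0-9]+$/ && count + 0 >= min) print $2
        }'
}
```

- Piping the `curl` output through `ips_over_threshold 2000` would print the heaviest IPs one per line, but the list still needs a manual look before purging anything, since a high count alone doesn't make an IP a bot.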
+- Then I added a few of them to the bot mapping in the nginx config because it appears they have been regular harvesters since 2018
+- Looking through the Solr stats faceted by the `userAgent` field I see some interesting ones:
+
+```
+$ curl 'http://localhost:8081/solr/statistics/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userAgent'
+...
+"Delphi 2009",50725,
+"OgScrper/1.0.0",12421,
+```
+
+- Delphi is only used by IP addresses in Greece, so that's obviously the GARDIAN people harvesting us...
+- I have no idea what OgScrper is, but it's not a user!
+- Then there are 276,000 hits from `MEL-API` from Jordanian IPs in 2018, so that's obviously the CodeObia guys...
+- Other user agents:
+ - GigablastOpenSource/1
+ - Owlin Domain Resolver V1
+ - API scraper
+ - MetaURI
+- I don't know why, but my `check-spider-hits.sh` script doesn't seem to handle user agents with spaces properly, so I will delete those manually afterwards
+- First delete the ones without spaces, using a temp file at `/tmp/agents` containing the patterns:
+
+```
+$ for year in {2010..2019}; do ./check-spider-hits.sh -f /tmp/agents -s statistics-$year -p; done
+$ ./check-spider-hits.sh -f /tmp/agents -s statistics -p
+```
+- That's about 300,000 hits purged...
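- For the agents with spaces, a rough sketch of building the delete payloads line by line so the spaces survive. I haven't confirmed that word splitting is what trips up `check-spider-hits.sh`; the function name `agent_delete_payloads` is hypothetical and this only prints the payloads, it doesn't send anything to Solr:

```shell
# Hypothetical helper: turn each line of a patterns file into a Solr
# delete-by-query payload, keeping embedded spaces intact.
agent_delete_payloads() {
    local file="$1" agent
    # IFS= and -r stop read from trimming whitespace or interpreting backslashes
    while IFS= read -r agent || [ -n "$agent" ]; do
        # Skip empty lines
        if [ -z "$agent" ]; then
            continue
        fi
        # Phrase-quote the value so the space is not parsed as a
        # separator between query clauses
        printf '<delete><query>userAgent:"%s"</query></delete>\n' "$agent"
    done < "$file"
}
```

- Each printed line could then be passed to `curl` with `--data-binary` against the `update` handler of each core, after checking the match count with a `select` query first.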
+- Then remove the ones with spaces manually, checking the query syntax first, then deleting in yearly cores and the statistics core:
+
+```
+$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=userAgent:/Delphi 2009/&rows=0"
+...
+
+<lst name="responseHeader"><int name="status">0</int><int name="QTime">52</int><lst name="params"><str name="q">userAgent:/Delphi 2009/</str><str name="rows">0</str></lst></lst><result name="response" numFound="38760" start="0"></result>
+$ for year in {2010..2019}; do curl -s "http://localhost:8081/solr/statistics-$year/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Delphi 2009"</query></delete>'; done
+$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Delphi 2009"</query></delete>'
+```