diff --git a/content/posts/2018-09.md b/content/posts/2018-09.md
index 06f24a5f0..373e112a2 100644
--- a/content/posts/2018-09.md
+++ b/content/posts/2018-09.md
@@ -489,8 +489,26 @@ $ dspace stats-util -f
- I restarted the server with `logBots = false` and after it came back up I see 266 events with `isBots:true` (maybe they were buffered)... I will check again tomorrow
- After a few hours I see there are still only 266 view events with `isBot:true` on DSpace Test's Solr statistics core, so I'm definitely going to deploy this on CGSpace soon
- Also, CGSpace currently has 60,089,394 view events with `isBot:true` in it's Solr statistics core and it is 124GB!
-- Amazing! After running `dspace stats-util -f` on CGSpace the Solr statistics core went from 124GB to 84GB, and there are only 700 events with `isBot:true` so I should really disable logging of bot events!
+- Amazing! After running `dspace stats-util -f` on CGSpace the Solr statistics core went from 124GB to 60GB, and now there are only 700 events with `isBot:true` so I should really disable logging of bot events!
- I'm super curious to see how the JVM heap usage changes...
- I made (and merged) a pull request to disable bot logging on the `5_x-prod` branch ([#387](https://github.com/ilri/DSpace/pull/387))
+- Now I'm wondering if there are other bot requests that aren't classified as bots because the IP lists or user agents are outdated
+- DSpace ships a list of spider IPs, for example: `config/spiders/iplists.com-google.txt`
+- I checked the list against all the IPs we've seen using the "Googlebot" user agent in CGSpace's nginx access logs
+- The first thing I learned is that shit tons of IPs in Russia, Ukraine, Ireland, Brazil, Portugal, the US, Canada, etc. are pretending to be "Googlebot"...
+- According to the [Googlebot FAQ](https://support.google.com/webmasters/answer/80553) the domain name in the reverse DNS lookup should contain either `googlebot.com` or `google.com`
+- In Solr this appears to be an appropriate query that I can maybe use later (returns 81,000 documents):
+
+```
+*:* AND (dns:*googlebot.com. OR dns:*google.com.) AND isBot:false
+```
+
+- I translate that into a delete command using the `/update` handler:
+
+```
+http://localhost:8081/solr/statistics/update?commit=true&stream.body=<delete><query>*:*+AND+(dns:*googlebot.com.+OR+dns:*google.com.)+AND+isBot:false</query></delete>
+```
+
+
+