---
title: "May, 2022"
date: 2022-05-04T09:13:39+03:00
author: "Alan Orth"
categories: ["Notes"]
---

## 2022-05-04

- In the last few days I found a few more IPs making requests with the shady Chrome 44 user agent, so I will add them to the block list too:
  - 18.207.136.176
  - 185.189.36.248
  - 50.118.223.78
  - 52.70.76.123
  - 3.236.10.11
- Looking at the Solr statistics for 2022-04
  - 52.191.137.59 is owned by Microsoft, but it is using a normal user agent and making tens of thousands of requests
  - 64.39.98.62 is owned by Qualys, and all of its requests are probing for paths like `/etc/passwd`
  - 185.192.69.15 is in the Netherlands and is using a normal user agent, but is making excessive automated HTTP requests to paths forbidden in robots.txt
  - 157.55.39.159 is owned by Microsoft and identifies as bingbot, so I don't know why its requests were logged in Solr
  - 52.233.67.176 is owned by Microsoft and uses a normal user agent, but is making excessive automated HTTP requests
  - 157.55.39.144 is owned by Microsoft and uses a normal user agent, but is making excessive automated HTTP requests
  - 207.46.13.177 is owned by Microsoft and identifies as bingbot, so I don't know why its requests were logged in Solr
  - If I query Solr for `time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.` I see a handful of IPs that made 41,000 requests
  - I purged 93,974 hits from these IPs using my `check-spider-ip-hits.sh` script
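
A per-IP lookup like the ones above could be scripted. The following is a minimal Python sketch, not the contents of `check-spider-ip-hits.sh`: it only builds a query in the same style as the one above, plus a count-only request URL. The Solr URL, the `ip` field name, and both function names are assumptions for illustration.

```python
import ipaddress
from urllib.parse import urlencode

# Assumption: this URL is a placeholder, adjust it to the real statistics core
SOLR_SELECT_URL = "http://localhost:8983/solr/statistics/select"


def build_ip_query(ips, month="2022-04"):
    """Return a Solr query matching hits from any of the given IPs in a month."""
    unique_ips = []
    for ip in ips:
        # ip_address() raises ValueError on malformed input
        normalized = str(ipaddress.ip_address(ip.strip()))
        if normalized not in unique_ips:
            unique_ips.append(normalized)
    # Assumption: the statistics core stores the client address in a field named "ip"
    ip_clause = " OR ".join(f'"{ip}"' for ip in unique_ips)
    return f"time:{month}* AND ip:({ip_clause})"


def build_count_url(query):
    """Return a select URL that asks only for the number of matching hits."""
    # rows=0 returns the match count without any documents
    return SOLR_SELECT_URL + "?" + urlencode({"q": query, "rows": 0, "wt": "json"})


ips = ["52.191.137.59", "64.39.98.62", "52.191.137.59"]
query = build_ip_query(ips)
print(query)
print(build_count_url(query))
```

Checking the count first, before any delete, makes it easier to confirm that a query matches only the intended hits.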

<!--more-->

- Now looking at the Solr statistics by user agent I see:
  - `SomeRandomText`
  - `RestSharp/106.11.7.0`
  - `MetaInspector/5.7.0 (+https://github.com/jaimeiniesta/metainspector)`
  - `wp_is_mobile`
  - `Mozilla/5.0 (compatible; um-LN/1.0; mailto: techinfo@ubermetrics-technologies.com; Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1"`
  - `insomnia/2022.2.1`
  - `ZoteroTranslationServer`
  - `omgili/0.5 +http://omgili.com`
  - `curb`
  - `Sprout Social (Link Attachment)`
  - I purged 2,900 hits from these user agents from Solr using my `check-spider-hits.sh` script
- I made a [pull request to COUNTER-Robots](https://github.com/atmire/COUNTER-Robots/pull/54) for some of these agents
  - In the meantime I will add them to our local overrides in DSpace
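
To check candidate patterns against sample user agent strings before adding them to an override list, something like the following Python sketch may help. It is not how `check-spider-hits.sh`, COUNTER-Robots, or DSpace match agents: the pattern list, the case-insensitive substring matching, and the `is_spider` name are all assumptions for illustration.

```python
import re

# Substrings taken from the user agents listed above. Treating them as
# case-insensitive literal substrings is an assumption of this sketch.
AGENT_PATTERNS = [
    "SomeRandomText",
    "RestSharp",
    "MetaInspector",
    "wp_is_mobile",
    "um-LN",
    "insomnia",
    "ZoteroTranslationServer",
    "omgili",
    "curb",
    "Sprout Social",
]

# re.escape() keeps characters like "-" and "_" literal
SPIDER_RE = re.compile(
    "|".join(re.escape(pattern) for pattern in AGENT_PATTERNS),
    re.IGNORECASE,
)


def is_spider(user_agent):
    """Return True if the user agent contains any of the listed patterns."""
    return SPIDER_RE.search(user_agent) is not None


samples = [
    "RestSharp/106.11.7.0",
    "omgili/0.5 +http://omgili.com",
    "Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0",
]
for sample in samples:
    print(is_spider(sample), sample)
```

Short tokens such as `curb` can match unrelated strings when used as plain substrings, so a real pattern list may need anchors or longer patterns.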

<!-- vim: set sw=2 ts=2: -->