mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2022-07-18
@@ -335,5 +335,23 @@ geo $ua {
- This allows me to accomplish the original goal while still only using one bot-networks.conf file for the `limit_req_zone` and the user agent mapping that we pass to Tomcat
- Unfortunately this means I will have hundreds of thousands of requests in Solr with a literal `$http_user_agent` as the user agent
- I might try to purge some by enumerating all the networks in my block file and running them through `check-spider-ip-hits.sh`
- I extracted all the IPs/subnets from `bot-networks.conf` and prepared them so I could enumerate their IPs
- I had to add `/32` to all single IPs, which I did with this crazy vim invocation:
```console
:g!/\/\d\+$/s/^\(\d\+\.\d\+\.\d\+\.\d\+\)$/\1\/32/
```
- Explanation:
  - `g!`: global, lines *not* matching (the opposite of `g`)
  - `/\/\d\+$/`: pattern matching `/` with one or more digits at the end of the line
  - `s/^\(\d\+\.\d\+\.\d\+\.\d\+\)$/\1\/32/`: for lines not matching above, capture the IPv4 address and add `/32` at the end
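- A rough Python equivalent of the substitution above, as a minimal sketch rather than what was actually run (the helper name `add_host_prefix` and the sample addresses are made up for illustration):

```python
import re

# Matches a bare dotted-quad IPv4 address occupying the whole line,
# mirroring the vim pattern ^\(\d\+\.\d\+\.\d\+\.\d\+\)$
BARE_IPV4 = re.compile(r"^(\d+\.\d+\.\d+\.\d+)$")


def add_host_prefix(line: str) -> str:
    """Append /32 to a bare IPv4 address; leave every other line unchanged."""
    return BARE_IPV4.sub(r"\1/32", line)


if __name__ == "__main__":
    # Hypothetical sample entries (documentation address ranges)
    for entry in ["192.0.2.10", "198.51.100.0/24"]:
        print(add_host_prefix(entry))
```

- Because the pattern is anchored at both ends, a line that already ends in a prefix length cannot match, so the sketch needs no separate counterpart to the `g!` guard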
- Then I ran the list through prips to enumerate the IPs:
```console
$ while read -r line; do prips "$line" | sed -e '1d; $d'; done < /tmp/bot-networks.conf > /tmp/bot-ips.txt
$ wc -l /tmp/bot-ips.txt
1946968 /tmp/bot-ips.txt
```
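- The same enumeration could be sketched with Python's standard `ipaddress` module instead of prips. This is an illustration under assumptions, not the method used above: the function name `enumerate_ips` is hypothetical, all entries are assumed to be IPv4, and the `sed -e '1d; $d'` step is assumed to be there to drop the network and broadcast addresses. One deliberate difference: `sed` deletes the only line of a one-line input, so if prips prints a single address for a `/32`, that address would not reach the output, whereas this sketch keeps it:

```python
import ipaddress
from typing import Iterator


def enumerate_ips(cidr: str) -> Iterator[str]:
    """Yield the addresses in an IPv4 network given in CIDR notation.

    For networks with more than two addresses the first (network) and
    last (broadcast) address are skipped; /31 and /32 are yielded whole.
    """
    # strict=False tolerates entries with host bits set (an assumption
    # about the input; use strict=True to reject them instead)
    net = ipaddress.IPv4Network(cidr, strict=False)
    skip = set()
    if net.num_addresses > 2:
        skip = {net.network_address, net.broadcast_address}
    for ip in net:
        if ip not in skip:
            yield str(ip)


if __name__ == "__main__":
    # Hypothetical sample entries (documentation address ranges)
    for entry in ["192.0.2.0/30", "192.0.2.10/32"]:
        for ip in enumerate_ips(entry):
            print(ip)
```

- To process a whole file, the function can be called once per line and the results written out, much like the `while read` loop does with prips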
<!-- vim: set sw=2 ts=2: -->