diff --git a/content/posts/2022-07.md b/content/posts/2022-07.md
index 08f9acefa..17db51db4 100644
--- a/content/posts/2022-07.md
+++ b/content/posts/2022-07.md
@@ -318,5 +318,22 @@ geo $ua {
- But I can't get it to work, neither for the default value or for matching my IP...
- I will have to ask on the nginx mailing list
+- The total number of requests and unique hosts was not even very high (the counts below were taken around midnight, so they cover almost the whole day):
+
+```console
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort -u | wc -l
+2776
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | wc -l
+40325
+```
+
+## 2022-07-18
+
+- Reading more about nginx's geo/map and doing some tests on DSpace Test, it appears that the [geo module cannot do dynamic values](https://stackoverflow.com/questions/47011497/nginx-geo-module-wont-use-variables)
+ - So this issue with the literal `$http_user_agent` is due to the geo block I put in place earlier this month
+ - I reworked the logic so that the geo block sets "bot" or an empty string depending on whether a network matches, and then I re-use that value in a map that passes through the host's user agent when geo has set it to an empty string
+ - This allows me to accomplish the original goal while still only using one bot-networks.conf file for the `limit_req_zone` and the user agent mapping that we pass to Tomcat
+ - Unfortunately this means I will have hundreds of thousands of requests in Solr with a literal `$http_user_agent`
+ - I might try to purge some by enumerating all the networks in my block file and running them through `check-spider-ip-hits.sh`
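+
+A minimal sketch of the geo/map arrangement described above, not the exact production config. The include path, the `$ua_effective` variable name, and the zone name, size, and rate are all hypothetical; only `$ua`, `bot-networks.conf`, and `limit_req_zone` come from these notes:
+
+```nginx
+# All directives below belong in the http context.
+
+# geo can only assign static values, so it sets "bot" or an empty string.
+geo $ua {
+    default '';
+    # Assumed path; each entry is assumed to look like: 192.0.2.0/24 bot;
+    include /etc/nginx/bot-networks.conf;
+}
+
+# map values may contain variables, so the real user agent is restored here
+# whenever geo left $ua empty. $ua_effective is a hypothetical name.
+map $ua $ua_effective {
+    default $ua;
+    ''      $http_user_agent;
+}
+
+# Requests with an empty key are not accounted, so only matched networks
+# are rate limited. Zone name, size, and rate are placeholders.
+limit_req_zone $ua zone=bots:16m rate=1r/s;
+
+# User agent passed to the backend (Tomcat).
+proxy_set_header User-Agent $ua_effective;
+```
+
+With this layout the network list is maintained in one file, and both the rate limit key and the forwarded user agent derive from the same geo lookup.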
diff --git a/docs/2022-07/index.html b/docs/2022-07/index.html
index e82e146c0..c9ec84202 100644
--- a/docs/2022-07/index.html
+++ b/docs/2022-07/index.html
@@ -19,7 +19,7 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
-
+
@@ -44,9 +44,9 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
"@type": "BlogPosting",
"headline": "July, 2022",
"url": "https://alanorth.github.io/cgspace-notes/2022-07/",
- "wordCount": "1959",
+ "wordCount": "2156",
"datePublished": "2022-07-02T14:07:36+03:00",
- "dateModified": "2022-07-14T16:46:24+03:00",
+ "dateModified": "2022-07-17T22:45:16+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -484,6 +484,23 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
I will have to ask on the nginx mailing list
+The total number of requests and unique hosts was not even very high (the counts below were taken around midnight, so they cover almost the whole day):
+
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort -u | wc -l
+2776
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | wc -l
+40325
+
2022-07-18
+
+- Reading more about nginx’s geo/map and doing some tests on DSpace Test, it appears that the geo module cannot do dynamic values
+
+- So this issue with the literal
$http_user_agent
is due to the geo block I put in place earlier this month
+- I reworked the logic so that the geo block sets “bot” or an empty string depending on whether a network matches, and then I re-use that value in a map that passes through the host’s user agent when geo has set it to an empty string
+- This allows me to accomplish the original goal while still only using one bot-networks.conf file for the
limit_req_zone
and the user agent mapping that we pass to Tomcat
+- Unfortunately this means I will have hundreds of thousands of requests in Solr with a literal
$http_user_agent
+- I might try to purge some by enumerating all the networks in my block file and running them through
check-spider-ip-hits.sh
+
+
diff --git a/docs/categories/index.html b/docs/categories/index.html
index ee2d35a07..6002bcddb 100644
--- a/docs/categories/index.html
+++ b/docs/categories/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html
index ea7168db8..369b23af0 100644
--- a/docs/categories/notes/index.html
+++ b/docs/categories/notes/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html
index 4ecb1ea90..71f31f177 100644
--- a/docs/categories/notes/page/2/index.html
+++ b/docs/categories/notes/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html
index ca3363911..de25155c0 100644
--- a/docs/categories/notes/page/3/index.html
+++ b/docs/categories/notes/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html
index 35e781797..aa245dc46 100644
--- a/docs/categories/notes/page/4/index.html
+++ b/docs/categories/notes/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html
index e75c761bb..380a03c86 100644
--- a/docs/categories/notes/page/5/index.html
+++ b/docs/categories/notes/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html
index a97ddf2e3..b3eba0fd7 100644
--- a/docs/categories/notes/page/6/index.html
+++ b/docs/categories/notes/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html
index fb9457a5e..081af3025 100644
--- a/docs/categories/notes/page/7/index.html
+++ b/docs/categories/notes/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/index.html b/docs/index.html
index 709375011..b14ac3726 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/2/index.html b/docs/page/2/index.html
index 2993b444d..7bb02089f 100644
--- a/docs/page/2/index.html
+++ b/docs/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/3/index.html b/docs/page/3/index.html
index 2b6399522..8c25e8ef7 100644
--- a/docs/page/3/index.html
+++ b/docs/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/4/index.html b/docs/page/4/index.html
index fdd30b5cf..70d129ee2 100644
--- a/docs/page/4/index.html
+++ b/docs/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/5/index.html b/docs/page/5/index.html
index bc1ea6f4b..a6f5c7bff 100644
--- a/docs/page/5/index.html
+++ b/docs/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/6/index.html b/docs/page/6/index.html
index 72112afc9..b0719b969 100644
--- a/docs/page/6/index.html
+++ b/docs/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/7/index.html b/docs/page/7/index.html
index 57491675a..fc65a45ea 100644
--- a/docs/page/7/index.html
+++ b/docs/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/8/index.html b/docs/page/8/index.html
index a0111ad52..410cea6b1 100644
--- a/docs/page/8/index.html
+++ b/docs/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/9/index.html b/docs/page/9/index.html
index 38a1720bc..c4b902cc5 100644
--- a/docs/page/9/index.html
+++ b/docs/page/9/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/index.html b/docs/posts/index.html
index d342f1783..3ce7aa806 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html
index 2fe05f29d..58011fee2 100644
--- a/docs/posts/page/2/index.html
+++ b/docs/posts/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html
index 2c1b21707..b09a58363 100644
--- a/docs/posts/page/3/index.html
+++ b/docs/posts/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html
index 32ffea037..3887f7c06 100644
--- a/docs/posts/page/4/index.html
+++ b/docs/posts/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html
index 5af087bd1..be099cd2d 100644
--- a/docs/posts/page/5/index.html
+++ b/docs/posts/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html
index 548022590..38a07eb9a 100644
--- a/docs/posts/page/6/index.html
+++ b/docs/posts/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html
index bd85d2a8f..ecbbf5746 100644
--- a/docs/posts/page/7/index.html
+++ b/docs/posts/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html
index 7fbb94ada..700fe5ba5 100644
--- a/docs/posts/page/8/index.html
+++ b/docs/posts/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html
index 24009dc59..826087e8a 100644
--- a/docs/posts/page/9/index.html
+++ b/docs/posts/page/9/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 30f141627..a903298c9 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
https://alanorth.github.io/cgspace-notes/categories/
- 2022-07-14T16:46:24+03:00
+ 2022-07-17T22:45:16+03:00
https://alanorth.github.io/cgspace-notes/
- 2022-07-14T16:46:24+03:00
+ 2022-07-17T22:45:16+03:00
https://alanorth.github.io/cgspace-notes/2022-07/
- 2022-07-14T16:46:24+03:00
+ 2022-07-17T22:45:16+03:00
https://alanorth.github.io/cgspace-notes/categories/notes/
- 2022-07-14T16:46:24+03:00
+ 2022-07-17T22:45:16+03:00
https://alanorth.github.io/cgspace-notes/posts/
- 2022-07-14T16:46:24+03:00
+ 2022-07-17T22:45:16+03:00
https://alanorth.github.io/cgspace-notes/2022-06/
2022-07-04T09:25:14+03:00